Lecture 7:

Please see 2009/l6.txt and Chapter 17 for the details of adaptive lock implementations. We also touched upon cache coherence logic as it applies to spinlocks; see the video at cliff-click-on-x86-url.txt from the 36min mark.

Lecture 8:

OpenSolaris optimizes its mutexes based on two assumptions:

1. Critical sections are short and, once entered by a CPU, will be over very soon (sooner than the context switches involved in blocking a waiting thread on a turnstile so that some other thread can run, and then waking it up). Hence the idea that a thread should spin if the mutex is held by a thread currently running on a CPU, and block otherwise.

2. Most mutexes in the kernel can be adaptive (i.e., can block without causing scheduling trouble) and are not hotly contested, so most threads in most cases will find a mutex BOTH *adaptive* and *not taken*.

Hence this path -- adaptive && not taken -- and the mutex implementation data structure (see mutex_impl.c union) are *co-optimized* in assembly to a single lock-ed (atomic) instruction and return:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/intel/ia32/ml/lock_prim.s#512 -- comment
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/intel/ia32/ml/lock_prim.s#558 -- code

Note: according to the AMD64 calling convention, RDI holds the first argument to a function, which for mutex_enter is the pointer to the mutex data structure from mutex_impl.c

Question: what happens if pre-emption hits on this path?

----

We were looking at the usage and allocation of turnstiles.

Turnstiles, like many other objects that share a type _and_ size, are managed by a Kmem "slab cache"-type allocator. For background:

http://www.usenix.org/publications/library/proceedings/bos94/full_papers/bonwick.ps -- original paper, bonwick-slab-allocator-usenix.pdf local copy
http://www.ibm.com/developerworks/linux/library/l-linux-slab-allocator/ -- in Linux

Things to spot:

1. Static data structures allocated on boot, such as the first process' proc struct p0, are found in
   http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/conf/param.c

2. Hash table structures for collections of objects -- but not the objects themselves -- such as the hash array and collision lists ("chains") are initialized at boot time (e.g., kern_setup1):
   http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/intel/ia32/os/sundep.c#197
   which in turn calls thread_init:
   http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/disp/thread.c#166
   There we find the creation of the turnstile_cache (l. 209) and then the allocation of the first turnstile object in the all-ancestor thread (l. 233). This is boot-time run-once code. Finally, in the same file, thread_create allocates new turnstiles (l. 345).

---

Turnstiles are acquired with turnstile_lookup *with the associated mutex's pointer used as a hash key* and released in turnstile_exit. These two functions also, respectively, acquire and release the high-priority dispatcher lock that protects the hash table's collision list:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/os/turnstile.c#turnstile_lookup

These two functions are used as "balanced parentheses" in the code (e.g., mutex_vector_enter) and must be balanced on each code path.

The innards of the hash table:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/os/turnstile.c#160 -- 174.

Also see the comment above it:

    /* In general, when two turnstile locks
     * must be held at the same time, the lock order must be the address order.
     * Therefore, to prevent deadlock in turnstile_pi_waive(), we must ensure
     * that upimutextab[] locks *always* hash to lower addresses than any
     * other locks.  You think this is cheesy?  Let's see you do better.
     */
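
To make the two rules above concrete (a chain lock is found by hashing the lock's address, and two chain locks are always taken in address order), here is a minimal user-space sketch in C. It is NOT the OpenSolaris code: chain_t, chain_table, CHAIN_COUNT, chain_for, lookup_enter, lookup_exit, lock_pair and unlock_pair are all hypothetical names, the hash function is an assumption, and a C11 atomic flag stands in for the dispatcher lock. The collision list itself is omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define CHAIN_COUNT 64   /* hypothetical table size; must be a power of two */

/* One hash bucket: a spin lock that would guard a collision list. */
typedef struct chain {
    atomic_bool locked;  /* zero-initialized static storage means "unlocked" */
} chain_t;

chain_t chain_table[CHAIN_COUNT];

/* Hash a lock's address to its bucket (assumed hash: drop low alignment bits). */
chain_t *chain_for(const void *key) {
    uintptr_t k = (uintptr_t)key;
    return &chain_table[(k >> 4) & (CHAIN_COUNT - 1)];
}

void chain_lock(chain_t *c) {
    /* Spin until we are the one who flips the flag from false to true. */
    while (atomic_exchange_explicit(&c->locked, true, memory_order_acquire)) {
        /* busy-wait */
    }
}

void chain_unlock(chain_t *c) {
    assert(atomic_load(&c->locked));  /* catches an unbalanced exit */
    atomic_store_explicit(&c->locked, false, memory_order_release);
}

/* "Opening parenthesis": find the bucket for this lock and take its chain lock. */
chain_t *lookup_enter(const void *key) {
    chain_t *c = chain_for(key);
    chain_lock(c);
    return c;
}

/* "Closing parenthesis": must be reached on every path after lookup_enter. */
void lookup_exit(const void *key) {
    chain_unlock(chain_for(key));
}

/*
 * Take two chain locks in address order, so that two threads locking the
 * same pair can never each hold one and wait for the other.
 * Returns the chain that was locked first (the lower address).
 */
chain_t *lock_pair(chain_t *a, chain_t *b) {
    if (a == b) {                      /* both keys hashed to one bucket */
        chain_lock(a);
        return a;
    }
    if ((uintptr_t)a > (uintptr_t)b) { /* swap so that a is the lower address */
        chain_t *tmp = a;
        a = b;
        b = tmp;
    }
    chain_lock(a);
    chain_lock(b);
    return a;
}

void unlock_pair(chain_t *a, chain_t *b) {
    chain_unlock(a);
    if (a != b)
        chain_unlock(b);
}
```

A caller would bracket its work as lookup_enter(&m); ... ; lookup_exit(&m); on every code path, including early returns. Ordering by address is only one way to impose a total order on the locks; it is used here because it is the rule the quoted comment states.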