---------------------[ Memory ordering and barriers ]---------------------

Looking through the mutex_exit() implementation, you may have noticed that
it does not contain any atomic instructions: there is neither a LOCK prefix
nor any natively atomic/bus-locking instruction.

    698         ENTRY(mutex_exit)
    699 mutex_exit_critical_start:          /* If interrupted, restart here */
    700         movq    %gs:CPU_THREAD, %rdx    // current thread ptr
    701         cmpq    %rdx, (%rdi)            // NOT atomic, no LOCK
    702         jne     mutex_vector_exit       /* wrong type or wrong owner */
    703         movq    $0, (%rdi)              /* clear owner AND lock */
    704 .mutex_exit_critical_end:
    705 .mutex_exit_lockstat_patch_point:
    706         ret

But aren't synchronization primitives supposed to be atomic? Modern
processors may reorder memory loads and stores quite liberally (see the
links on memory reordering below), so what two CPUs observe when they
access the lock's opaque memory on a multiprocessor system may come as a
surprise.

Indeed, the comment at line 63 of mutex.c

    http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/os/mutex.c#63

explains the problem: we might block on a mutex that's just been released!
The comment goes on to explain that the problem can be solved without an
atomic instruction in mutex_exit(), and how it was solved. Note line 124:

    124  * It has been verified by exhaustive simulation that all possible global
    125  * memory orderings of (2M) interleaved with (3M) result in correct
    126  * behavior.
    ...

This is quite amazing and contrary to every textbook example, but that's
how it works!

This explanation mentions memory barriers, or "membars" (restrictions on
certain reorderings of memory operations).
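
To look at the asymmetry between taking and dropping a lock in isolation,
here is a minimal user-space sketch in C11. It is not the illumos code:
toy_mutex_t, toy_tryenter() and toy_exit() are made-up names, and the toy
has no waiters and never blocks, so it sidesteps exactly the race that the
mutex.c comment is about. It only shows that the acquire side needs an
atomic compare-and-swap, while the release side can get by with a load, a
compare and an ordinary store with release ordering.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy lock: the word holds the owner's id; 0 means "free". */
typedef struct {
    _Atomic uintptr_t owner;
} toy_mutex_t;

/*
 * Acquire side: several CPUs may race to turn 0 into their own id, so this
 * has to be an atomic read-modify-write (a compare-and-swap).
 */
static bool toy_tryenter(toy_mutex_t *m, uintptr_t self)
{
    uintptr_t expected = 0;
    return atomic_compare_exchange_strong_explicit(&m->owner, &expected, self,
        memory_order_acquire, memory_order_relaxed);
}

/*
 * Release side: only the current owner ever stores 0, so no read-modify-write
 * is needed. The release ordering keeps the stores made inside the critical
 * section from being reordered after the store that frees the lock.
 */
static bool toy_exit(toy_mutex_t *m, uintptr_t self)
{
    if (atomic_load_explicit(&m->owner, memory_order_relaxed) != self)
        return false;   /* wrong owner: real code would take a slow path */
    atomic_store_explicit(&m->owner, 0, memory_order_release);
    return true;
}
```

On x86-64, compilers typically emit a LOCK CMPXCHG for toy_tryenter() and a
plain MOV for the release store in toy_exit(); compile with -S to check what
your toolchain does.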
More on memory barriers:

---------------------[ On memory reordering & membars ]---------------------

A short and simple explanation:
    https://www.ibm.com/support/knowledgecenter/linuxonibm/liaaw/ordering.2006.03.13a.pdf

An in-depth explanation (you should read this before you graduate!):
    https://www.akkadia.org/drepper/cpumemory.pdf

Linux kernel doc on memory ordering & memory barriers:
    https://www.kernel.org/doc/Documentation/memory-barriers.txt

---------------------[ A side-note on preemption ]---------------------

So we can do without a LOCK. But what about preemption? That is, what if an
interrupt hits the CPU on which mutex_exit() is running before the movq on
line 703, which clears the lock?

Note the labels mutex_exit_critical_start and mutex_exit_critical_end. The
comment at

    http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/intel/ia32/ml/lock_prim.s#512

explains how they are used to avoid this race condition without disabling
interrupts: the interrupt handler, if invoked, checks whether the
interrupted program counter lies between these labeled addresses and, if
so, sets the PC back to the beginning of mutex_exit():

    http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/i86pc/os/intr.c#1495

---------------------[ Misc links ]---------------------

Discussion of systemd design failures:
    http://ewontfix.com/14/ , http://ewontfix.com/15/

Example of a file descriptor leak at close() due to thread cancellation:
    http://ewontfix.com/2/ , http://ewontfix.com/4/
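
---------------------[ Sketch: restarting an interrupted critical section ]---------------------

The PC check described in the side-note on preemption can be sketched in a
few lines of C. This is an illustration only, not the code in intr.c:
saved_regs_t and restart_if_in_critical() are hypothetical names, and
treating the range as half-open (start included, end excluded) is an
assumption made for the sketch. Rewinding is safe because nothing before the
final store has a side effect: reloading the thread pointer and redoing the
compare just repeats the ownership check.

```c
#include <stdint.h>

/* Hypothetical stand-in for the saved registers of an interrupted thread. */
typedef struct {
    uintptr_t pc;   /* program counter at the time of the interrupt */
} saved_regs_t;

/*
 * If the interrupted PC lies inside [start, end), rewind it to start so the
 * whole check-then-clear sequence runs again from the top on resume.
 * Any PC outside the range is left alone.
 */
static void restart_if_in_critical(saved_regs_t *rp, uintptr_t start,
    uintptr_t end)
{
    if (rp->pc >= start && rp->pc < end)
        rp->pc = start;
}
```

A caller would pass the addresses of the two labels as start and end. In the
sketch, a PC of 0x1008 with a range of 0x1000 to 0x1010 is rewound to
0x1000, while a PC of 0x1010 is left untouched.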