;;;; Disassembling lexical binding closures in Elisp

;;; This session explains how Emacs 24 implements closures.
;;; There are a couple of tricks to it, but looking at them
;;; carefully helps to understand the general idea.

(setq lexical-binding t)
t

;; This function makes lambdas that close over "x" and
;; keep it as a private updatable cell, away from all other code:

(defun make-counter (x)
  (lambda (&optional step)
    (if (numberp step)
        (setq x (+ x step))
      x)))
make-counter

(setq c (make-counter 0))
#[256 "\211\247\203 \300\211\242\\\240\207\300\242\207" [(0)] 4 "

(fn &optional STEP)"]

(setq d (make-counter 100))
#[256 "\211\247\203 \300\211\242\\\240\207\300\242\207" [(100)] 4 "

(fn &optional STEP)"]

(funcall c)
0
(funcall d)
100

(funcall c 3)
3
(funcall c)
3

(funcall d 1)
101
(funcall d)
101

;; This could be done even more concisely:

(defalias 'c c)
c
(defalias 'd d)
d

(c)
3
(d)
101
(d 10)
111
(d)
111

;; Now, how does this work?

(byte-compile #'make-counter)
#[257 "\211C\300\301\302\303\304!\305\"\306\307%\207" [make-byte-code 256 "\211\247\203 \300\211\242\\\240\207\300\242\207" vconcat vector [] 4 "

(fn &optional STEP)"] 8 "

(fn X)"]

;; You can see right away there will be calls to make-byte-code to make new byte-compiled objects at runtime
;; (these will be the returned lambdas) and their cooked-to-order constant vectors (hence the vector manipulation
;; functions: vector & vconcat). There is also a line of pre-compiled bytecode for the lambdas,
;; "\211\247..."

;; All these lambdas will have that same bytecode, but a different constant vector associated with each,
;; one per lambda created.

;; So "constant vector" is really a misnomer. It is the *environment* associated with each lambda;
;; having a separate environment is what makes it a "closure", not just a piece of code!

;; More carefully: each call to make-counter will create a byte-compiled object #[...]. Let's say
;; such a returned object gets bound to a function, like c or d.
;; Then, whenever (c) or (d) is called,
;; the object is looked up (it is just the function definition now), and the bytecode string
;; is fed to the VM, together with the constant vector saved in the object. So each
;; call to c evaluates c's bytecode (same as d's), but always with the vector attached to c.

;; You can actually see that VM code in http://cvs.savannah.gnu.org/viewvc/emacs/emacs/src/bytecode.c?view=markup
;; if you search for "Fbyte_code". Arguments to that C function are (bytestr, vector, maxdepth),
;; and vector determines what values the opcodes in bytestr actually get to work on.

;; So, again, "constant vector" is really the "environment", saved with the code. Being private
;; to every byte-compiled lambda made by make-counter, it gives each its own variables.

;; This should remind you of stack frames for C functions: the code of the function is the same
;; sequence of x86 opcodes, but each invocation gets its own frame (while it lasts).
;; Vectors are more powerful than frames, though, because they persist between calls!
;; (Compare with C's static variables defined inside functions and preserved between calls;
;; think of how they can be compiled!)

;; Just (disassemble #'make-counter) would do once it has been byte-compiled.
(disassemble (byte-compile #'make-counter))
nil

;; Pasted:
byte code:
  doc:   ...
  args: 257
0       dup
1       list1                     ; this makes a cons cell around the argument "x", (x).
                                  ; The car of that cons cell will hold "x"'s value!
2       constant  make-byte-code  ; this will make the byte-code objects, a new one for each call to make-counter
3       constant  256             ; argument descriptor bitmask for the lambda
4       constant  "\211\247\203\f^@\300\211\242^B\\\240\207\300\242\207"  ; compiled opcodes for the lambda
5       constant  vconcat
6       constant  vector          ; this makes a vector around the cons cell, [(x)].
                                  ; That vector will persist, linked from inside the #[..] object that represents the closure.
7       stack-ref 5               ; pushes a reference to (x) on top of the stack
8       call      1               ; calls "vector", makes [(x)], the new environment
9       constant  []
10      call      2               ; calls "vconcat", appends the (empty) vector of symbols needed by the lambda. Our lambda
                                  ; happens to need none, because numberp has its own opcode; but see below for the evenp example.
11      constant  4               ; max depth of stack
12      constant  "\n\n(fn &optional STEP)"  ; doc string
13      call      5               ; call "make-byte-code" with 5 args: descriptor, opcodes, vector, max depth, doc string
14      return                    ; that's it: a new #[..] object is made, the new environment is "married" to the opcodes; return it.

;; Now, what's that lambda opcode string? Let's repeat that call to make-byte-code and disasm:
;; the function c made previously by make-counter is just a byte-compiled object. We can disasm it!

(disassemble #'c)
nil

;;; This is what's in our lambda. How does it handle its internal cell for the captured "x"?

;; pasted:
byte code:
  doc:   ...
  args: 256
0       dup
1       numberp                   ; test via a special opcode
2       goto-if-nil 1
5       constant  (3)             ; push the cons cell; the opcode gets it from inside the vector. You see 3 now,
                                  ; because that's what is in its vector now, but see below!
6       dup
7       car-safe
8       stack-ref 2
9       plus
10      setcar
11      return
12:1    constant  (3)
13      car-safe
14      return

(c 5)
8

(disassemble #'c)
nil

;; pasted:
byte code for c:
  doc:   ...
  args: 256
0       dup
1       numberp
2       goto-if-nil 1
5       constant  (8)             ; now the disassembler sees 8. It just shows what's there at the moment; what else can it do?
                                  ; But the car of the cons attached to the vector attached to c's bound #[..] can change & does.
                                  ; In essence, that car is the "slot" for the private variable "x" closed into the lambda!
6       dup
7       car-safe                  ; gets the current value of "x" from the cons cell
8       stack-ref 2
9       plus                      ; adds the "step" argument
10      setcar                    ; saves the new "x" value into the car of the cons cell!
                                  ; Next time it'll be picked up by the opcode (and by the disassembler).
11      return
12:1    constant  (8)             ; gets the current value of "x"
13      car-safe
14      return                    ; and returns it.

;; Same deal with d and its private copy of "x", linked to the cons, to the vector, to d's #[..] object.

(disassemble #'d)
nil

;; pasted:
byte code for d:
  doc:   ...
  args: 256
0       dup
1       numberp
2       goto-if-nil 1
5       constant  (111)
6       dup
7       car-safe
8       stack-ref 2
9       plus
10      setcar
11      return
12:1    constant  (111)
13      car-safe
14      return

;; This is how closures got implemented in Emacs 24! The key is the linkage
;; between the code and the environment: the so-called constants vector, which can
;; point to things with non-constant insides, like a cons cell!

;;;---------------------------------------------------------------

;; Forcing some symbols to appear in the constants vector. This is
;; a bit contrived, but shows where they come in:

(defun make-counter (x)
  (lambda (&optional step)
    (if (and (numberp step) (evenp step))
        (setq x (+ x step))
      x)))
make-counter

(byte-compile #'make-counter)
#[257 "\211C\300\301\302\303\304!\305\"\306\307%\207"
      [make-byte-code 256 "\211\247\203\301!\203\300\211\242\\\240\207\300\242\207"
       vconcat vector [evenp] 4 "
;                      ^^^^^
(fn &optional STEP)"] 8 "

(fn X)"]
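
;;;---------------------------------------------------------------

;; A minimal sketch, not how Emacs itself does it: the same "shared code plus
;; private environment" arrangement from the session above, written out by hand
;; in plain Elisp so that it does not depend on lexical-binding at all.
;; The names my-counter-code, my-make-counter and my-call-counter are
;; hypothetical, made up for this illustration; the [CODE ENV] vector merely
;; stands in for the real #[..] object, and ENV plays the role of the
;; "constants" vector [(x)].

```elisp
(defun my-counter-code (env &optional step)
  "Code shared by every counter.
ENV is a one-element vector holding a cons cell; the car of that
cell is the private slot for x, just like [(x)] in the session."
  (let ((cell (aref env 0)))
    (if (numberp step)
        (setcar cell (+ (car cell) step)) ; like plus + setcar; returns the new value
      (car cell))))                       ; like car-safe + return

(defun my-make-counter (x)
  "Return a hand-made closure [CODE ENV] with a fresh environment [(X)]."
  (vector #'my-counter-code (vector (list x))))

(defun my-call-counter (closure &optional step)
  "Feed CLOSURE's code its own environment, roughly what the VM does."
  (funcall (aref closure 0) (aref closure 1) step))

(setq my-c (my-make-counter 0))
(setq my-d (my-make-counter 100))

(my-call-counter my-c 3)            ; => 3
(my-call-counter my-d 1)            ; => 101
(my-call-counter my-c)              ; => 3
(aref my-c 1)                       ; => [(3)]  the environment changed in place
(eq (aref my-c 0) (aref my-d 0))    ; => t      the code is shared
```

;; As with c and d above, the code is identical for both counters; only the
;; cons cell inside each environment vector differs, and it persists between calls.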