==========================[ Readings ]==========================

Complete the readings of OnLisp chapters 2--5.

UPDATED: Read chapters 2 and 4.2 of the Mitchell textbook about the
history of modeling computation and the place of Lambda calculus in it.

===============[ Dynamic and static scope in Emacs ]===============

Emacs' LISP, Elisp, by default implements _dynamic_ scoping, just as
various other early LISPs did (the textbook mentions this in sec. 3.4,
under "Static and Dynamic Scope"). Even though dynamic scoping is
easier to implement, lexical scoping turns out to be a lot more
interesting---and subsequent LISPs made it the default, as it is in
Common Lisp (GCL, SBCL, Clozure CL) and in Scheme (which heavily
modernized LISP, and was to programming language research what Haskell
is now). So Emacs added lexical scoping in version 24. Still, both
dynamic and lexical scoping are important to understand down to their
implementations.

------------[ How dynamic scoping is typically implemented ]------------

Dynamic scoping essentially relies on a global symbol table in which
symbols (primarily variables and functions) can be looked up by name.
When a local binding for a symbol is desired (say, within a LET or for
the parameters of a LAMBDA or DEFUN), the global table's entry for
that name is treated as a _stack_: a new structure representing the
symbol is pushed on top of the stack, and it is that structure that
looking up the symbol by its name returns. The new binding is said to
_shadow_ the older ones. When the LET or LAMBDA is done, that binding
is popped off the stack, and the next look-up by name for the symbol
will return the previous structure, i.e., its previous binding. LETs
and LAMBDAs can nest deeply, burying the global binding (if any) under
local bindings, but eventually they all get popped.

What's most important, though, is that under this dynamic scoping
scheme, a name is always looked up afresh at the point when and where
it is referenced.
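You can watch this name-lookup-at-call-time behavior in Emacs itself.
Below is a minimal sketch (the names `x' and `get-x' are just
illustration, not from any reading); it assumes dynamic binding, which
is Elisp's default for variables declared with DEFVAR:

```elisp
;; Dynamic binding sketch: the callee sees whatever binding of `x' is
;; currently on top of that symbol's stack, even though `get-x' never
;; declared or received x itself.
(defvar x 'global)      ; global (special) binding of x

(defun get-x () x)      ; refers to x by *name*, looked up at call time

(get-x)                 ; => global

(let ((x 'shadowed))    ; pushes a new binding of x ...
  (get-x))              ; => shadowed -- the LET's binding is found

(get-x)                 ; => global -- the LET's binding was popped
```

Note that `get-x' never mentions the LET, and the LET never mentions
`get-x'; they meet only through the global symbol table.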
The symbol's value will always be whatever is "fresh" on top of the
stack---unlike lexical scoping with closures, you can't squirrel away
the value of a symbol or function _at the time it was created_; you
get whatever that value is at the time your code referencing it is
_executed_ (if any). This may or may not be what you want---and it is
therefore important to know what your language does.

------------[ How dynamic scoping is implemented in Emacs ]------------

From Emacs' documentation:
https://www.gnu.org/software/emacs/manual/html_node/elisp/Dynamic-Binding.html

"Dynamic binding is implemented in Emacs Lisp in a simple way. Each
symbol has a value cell, which specifies its current dynamic value (or
absence of value). See Symbol Components. When a symbol is given a
dynamic local binding, Emacs records the contents of the value cell
(or absence thereof) in a stack, and stores the new local value in the
value cell. When the binding construct finishes executing, Emacs pops
the old value off the stack, and puts it in the value cell."

(Read the above page for examples.)

------------[ Seeing it in Emacs bytecode ]------------

Emacs compiles its Elisp to bytecode, which is executed by a simple
stack machine. This is similar to how Java programs are run: they are
first compiled to bytecode for the JVM, and then executed by the JVM,
which is also stack-based (but _not_ on the Android platform; there,
the virtual machine is register-based and thus resembles a typical CPU
like x86 or ARM).

The Emacs bytecode and the Emacs stack machine are described in
http://nullprogram.com/blog/2014/01/04/

Quoting from it:

-----------------------[ begin quote ]-----------------------

People do not write byte-code; that job is left to the byte compiler.
But we provide a disassembler to satisfy a cat-like curiosity.
-----------------------[ end quote ]-----------------------

As you may have guessed already, cat-like curiosity is the recurrent
theme of this class :)

The two bytecode instructions essential for dynamic binding are
_varbind_ and _varref_. _varbind_ creates and pushes a new symbol
binding for the named variable (it is always named, because dynamic
scoping operates with name lookups). _varref_ looks up the current
binding and pushes its value on top of the virtual machine's stack
(which is completely separate from the stacks of bindings hanging off
of the global symbol table). Notice that byte-compiled Elisp functions
under dynamic scoping always carry their names with them, because
look-ups of values are based on names, even in the bytecode VM
internals.

----------------[ Lexical scoping in Elisp ]-----------------

Under lexical scoping, LAMBDAs and DEFUNs "close" over the values of
symbols referenced at the time of their creation. The cons cells and
other memory structures representing the values referenced by symbols
within the body of a function at the moment of its creation will be
kept around, and it is these structures, that very memory, that you
will get when you execute the function. Naturally, this requires a
very different implementation, one no longer based on names and name
lookups, but rather on pointers and offsets (and thus much closer to
compiled C).

Here is how lexical scoping changed Elisp:
http://prog-elisp.blogspot.com/2012/05/lexical-scope.html

In particular, the names of the variables a LAMBDA or a function
closes over are no longer saved with the compiled code. The
_stack-ref_ bytecode is used to access them rather than _varref_.
This makes perfect sense---by the time the code is called, the names
may have completely different bindings, but only the bindings at the
time the code is created actually matter. Lexical scoping need not
depend on any global structures like the symbol table.
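Here is a small sketch of that independence (the names `get-n' and
`n' are just for illustration); it assumes a buffer or file with
lexical binding enabled:

```elisp
;; -*- lexical-binding: t; -*-
;; Lexical scoping sketch: the returned LAMBDA keeps a reference to
;; the LET's binding of `n'; no name lookup happens at call time.
(setq get-n (let ((n 42))
              (lambda () n)))   ; closes over this particular `n'

(funcall get-n)                 ; => 42, though the LET is long gone

;; A later, unrelated binding of `n' does not affect the closure:
(let ((n 0))
  (funcall get-n))              ; still => 42
```

Under dynamic binding, the last call would instead find whatever `n'
is on top of the symbol's stack at call time---which is exactly the
difference the two bytecode instruction sets reflect.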
That property turns out to make it very interesting, because even
without such a dependence---and without any global names---it turns
out to be possible to implement not just "objects" with state, but
also recursion.

===============[ Making stateful objects with LAMBDAs ]===============

First, note that if we didn't have LET, we could implement it via
LAMBDAs. For example,

  (LET ((x 10)) (+ 1 x))

could be

  ((LAMBDA (x) (+ 1 x)) 10)

or even

  ((LAMBDA (x) (progn (setq x (+ x 1)) x)) 10)

and neither would affect the value of the global X or of any X outside
of this code, because the code closes over the LET's local binding or
the LAMBDA's argument list, respectively. Moreover, these work the
same regardless of whether scoping is dynamic or lexical.

However, the following is only possible with _lexical_ scoping. The
make-adderb function described on p. 19 of the OnLisp book not only
makes and returns a function, but also captures its internal state,
and allows that state to be changed when the function is called with
two arguments, the second of which is T. Observe this in
http://www.cs.dartmouth.edu/~sergey/cs59/lisp/emacs-dynamic-and-static-bindings-more.txt

===============[ Lambda calculus, Y-combinator, recursion ]===============

You may have noticed that LAMBDA functions do not have names by which
they can be called inside their own bodies. This seems to imply that
anonymous functions created by LAMBDAs _cannot be called recursively_
(unlike those created by DEFUNs, which require a name and can use it,
but also explicitly save that name in the global symbol table via
SETF, which acts similarly to SETQ, but is more general). It turns
out, though, that recursion *is* possible with just pure LAMBDAs,
without any global symbol tables. This is a deep theoretical result,
and the reason that "Lambda calculus" is taught as the foundation of
CS, and that the "Y-combinator" is one of the most famous theory
constructs in CS.
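As a preview of the trick, here is a sketch (not from the readings) of
an anonymous factorial in Elisp with lexical binding: no DEFUN, no
global name, the LAMBDA simply receives _itself_ as an argument and
calls it to recurse:

```elisp
;; -*- lexical-binding: t; -*-
;; Self-application sketch: `self' is the recursive LAMBDA itself,
;; passed in as an ordinary argument, so no name lookup is needed.
(funcall
 (lambda (self n) (funcall self self n))   ; kick off self-application
 (lambda (self n)
   (if (zerop n)
       1
     (* n (funcall self self (1- n)))))
 5)
;; => 120
```

The Y-combinator is, roughly, this same self-application pattern
factored out once and for all, so that the factorial body no longer
has to mention `self self' explicitly.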
It may, in fact, be the reason why we have LAMBDAs all over modern
programming languages :)

Read chapter 2 for an intro to what Lambda calculus is. Do not feel
discouraged by the formalism---it is nothing but a mathematical way of
recording the most basic and intuitive idea that a function is a
formula written for substituting some of its letters ("formal
arguments") with other formulas to which the function is applied.
This is the idea that powers all of algebra: functions are formulas
designed for "plugging in" values. At its most basic, it's all about
mechanically rewriting formulas for as long as any rules for rewriting
apply, possibly forever. In computing, this is known as Symbolic
Computation. LISP was designed to automate this process.

Wikipedia's
https://en.wikipedia.org/wiki/Fixed-point_combinator#Fixed_point_combinators_in_lambda_calculus
shows one step of the (endless) expansion of the Y-combinator applied
to a function. This fundamental CS formula is *only* about variable
substitution---one can understand it with just a basic algebraic
rewriting/substitution intuition, without any of the formal Lambda
calculus.

Work through the implementation of Y in Common Lisp:
https://rosettacode.org/wiki/Y_combinator#Common_Lisp

It's not as wonderfully symmetric as the theoretical notation or
Scheme's (https://rosettacode.org/wiki/Y_combinator#Scheme). This is
due to LISP's separation of a symbol's "value" and "function"
bindings, which essentially creates two pointer slots in the structs
implementing a symbol where Scheme uses one.
As a result, Common Lisp needs FUNCALL, APPLY, and &REST to call a
symbol bound to a LAMBDA as a function on a list of arguments:

  ;; from https://rosettacode.org/wiki/Y_combinator#Common_Lisp
  (defun fac (f)
    (lambda (n)
      (if (zerop n)
          1
          (* n (funcall f (1- n))))))

  (defun Y (f)
    ((lambda (x) (funcall x x))
     (lambda (y)
       (funcall f (lambda (&rest args)
                    (apply (funcall y y) args))))))

  (funcall (Y #'fac) 5)
  120

  (mapcar (Y #'fac) '(1 2 3 4 5 6 7 8 9 10))
  (1 2 6 24 120 720 5040 40320 362880 3628800)

Scheme, by contrast, has only one slot for a symbol's binding, and
uses the LAMBDA stored there for a call when the symbol appears first
in an evaluated list:

  ;; from https://rosettacode.org/wiki/Y_combinator#Scheme
  (define Y
    (lambda (h)
      ((lambda (x) (x x))
       (lambda (g)
         (h (lambda args (apply (g g) args)))))))
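Since this course runs inside Emacs, it is worth noting (as a sketch,
not part of the Rosetta Code page) that with lexical binding enabled
the Common Lisp version transplants to Elisp nearly verbatim, FUNCALLs
and all, because Elisp shares CL's separate value and function slots:

```elisp
;; -*- lexical-binding: t; -*-
;; Same shape as the Common Lisp Y above, running in Emacs.
(defun Y (f)
  ((lambda (x) (funcall x x))
   (lambda (y)
     (funcall f (lambda (&rest args)
                  (apply (funcall y y) args))))))

(defun fac (f)
  (lambda (n)
    (if (zerop n)
        1
      (* n (funcall f (1- n))))))

(funcall (Y #'fac) 5)         ; => 120
(mapcar (Y #'fac) '(1 2 3))   ; => (1 2 6)
```

Under Elisp's default dynamic binding this would fail: the inner
LAMBDAs could not close over `y' and `f', which is exactly the point
of the lexical-scoping sections above.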