---------------[ Readings ]---------------

We've covered parts of the Ruby book Chapters 4, 5, 6, and 8. I suggest
reading Chapters 4 and 8 carefully, walking through the bytecode to see how
lexical scope and closures work in Ruby.

For the mechanism of function calls, read J. Coglan's excellent and succinct
blogpost https://blog.jcoglan.com/2013/05/08/how-ruby-method-dispatch-works/
and see the add-cdr-to-array.rb example.

My short note on what happens with "def" is in
http://www.cs.dartmouth.edu/~sergey/cs59/ruby/how-defs-compile.txt
It explains the creation and bytecode execution of the simple function in
addtwo-bytecode.rb.

----------[ Ruby closures vs Elisp closures internals ]-----------

Recall that for Elisp we worked through this in
http://www.cs.dartmouth.edu/~sergey/cs59/lisp/elisp-bytecode-lexical-binding-updated.txt

Elisp created a new data stack for every function, and destroyed that stack
on return (pushing the function's return value from the top of the stack
about to be destroyed onto the top of the caller function's stack). These
stacks were allocated via alloca()---which allocates on the stack, not on
the heap, and so doesn't require a free()---in the C stack frames of the C
function Fbyte_code that interprets bytecode strings (explained in
http://www.cs.dartmouth.edu/~sergey/cs59/lisp/reading-elisp-vm-code.txt).

Elisp implemented closures by compiling (lambda ...) expressions into code
that (a) created a "constants vector" at runtime, (b) created a compiled
bytecode object that pointed to that vector, and (c) made that object also
point to the bytecode string for the body of the lambda. The bytecode of
that body referenced its variables through the constants vector, i.e., got
and set the values of the saved-away cells created at the time (a), (b), and
(c) were evaluated. So each evaluation of (lambda ...) got its own cells,
its own private "environment" in which to run the bytecode string of the
body. That combined bytecode object was what got bound to whatever received
the result of evaluating the (lambda ...); whenever it was called, the
"constants vector" provided the environment, and the pre-compiled shared
bytecode string provided the logic.

In a nutshell, captured variables were copied into the constants vectors,
one vector per evaluation of (lambda ...); these vectors lived on the heap.
When the compiled bytecode objects referencing them went out of scope, the
vectors were garbage-collected at some point after.

Ruby, as Chapters 4 and 8 explain, does things differently. It keeps a true
(not piggy-backing on C) control stack of rb_control_frame_t structs, and
one common data stack shared by all function and lambda invocations (in
fact, by all iseq sequences, no matter how executed). These
rb_control_frame_t structs look remarkably like Elisp's byte_stack structs
(see reading-elisp-vm-code.txt again), but now they form their own stack,
and are cross-linked, via EP pointers, with Ruby's data stack frames, where
local variables and function arguments live.

You can see the definition of rb_control_frame_t in the Ruby source code, in
"vm_core.h": https://github.com/ruby/ruby/blob/trunk/vm_core.h#L630

typedef struct rb_control_frame_struct {
    const VALUE *pc;            /* cfp[0] */
    VALUE *sp;                  /* cfp[1] */
    const rb_iseq_t *iseq;      /* cfp[2] */
    VALUE self;                 /* cfp[3] / block[0] */
    const VALUE *ep;            /* cfp[4] / block[1] */
    const void *block_code;     /* cfp[5] / block[2] */ /* iseq or ifunc */
    ...
} rb_control_frame_t;

I find it easier to download and search a local copy of the ruby-trunk.
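You don't need the C sources just to see a frame's locals, though:
disassembling a trivial function from Ruby itself shows the local table of
its iseq and the "getlocal"/"setlocal" opcodes that address those locals
through the frame's EP. Here is a minimal sketch of mine in the spirit of
addtwo-bytecode.rb (it is not the actual course file; the function and
variable names are made up, and the exact disassembly output varies between
Ruby versions):

-------- addtwo-sketch.rb (a sketch, not the actual addtwo-bytecode.rb) --------
# Compile a trivial two-argument function and dump its bytecode, then run it.
# In the disassembly, look for addtwo's local table and for the
# getlocal/setlocal instructions that read and write its locals.
code = <<__CODE__
def addtwo(x, y)
  sum = x + y          # sum, x, and y all live in addtwo's own frame
  sum
end

puts addtwo(2, 3)      # prints 5
__CODE__

puts RubyVM::InstructionSequence.compile(code).disasm
RubyVM::InstructionSequence.compile(code).eval
---------------------------------------------------------------------------------

Compare what you see with the walkthrough in how-defs-compile.txt.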
http://www.cs.dartmouth.edu/~sergey/cs59/ruby-compiler-and-vm-code.txt
explains how to get a local copy of the Ruby sources and set it up for
browsing (search for "Getting Ruby sources & setting up browsing").

This shared-stack design makes memory management easier---no free()s are
required for the stack frames---but there's a gotcha. In the following
example, the variable "str", captured by the lambda returned from
"message_function", lives in message_function's data stack frame. That frame
will be created when message_function is called, and will be popped when it
returns, leaving just a pointer to the generated lambda on top of the stack
(to be bound to function_value next). How does the lambda, when called for
"dog", "cat", and "badger", address that "str"?

------------------- lambda1.rb -------------------
code = <<__CODE__
def message_function
  str = "The quick brown fox"
  lambda do |animal|
    puts "\#{str} jumps over the lazy \#{animal}."
  end
end

function_value = message_function  # creates the frame where str lives
function_value.call('dog')         # that frame has already been popped here!
function_value.call('cat')
function_value.call('badger')
__CODE__

puts RubyVM::InstructionSequence.compile(code).disasm
RubyVM::InstructionSequence.compile(code).eval
---------------------------------------------------

The answer (as Chapter 8 explains) is that lexical analysis of the Ruby code
(think of the byte-compiler walking the sexp of the code---the scope
information is all there) tells the byte-compiler that "str"'s stack frame
will not be there when needed. Ruby knows that the lambda's code will need
to walk one frame "up" the stack, along the EP chain, to get "str", and will
generate "getlocal" and "setlocal" opcodes with the appropriate level to do
so; but where is "up"? So Ruby copies that frame from the stack onto the
heap, and repoints the EP chain there. That copied frame will need to be
garbage-collected eventually, but that is a small price to pay. Figure 8.17
on p. 208 in the book (p. 232 in the PDF) shows this.

---------------[ Mixins ]---------------

A great feature of Ruby is the ability to add methods to any classes or
objects on the fly. The example add-cdr-to-array.txt shows this (in it, I
add the methods :cdr and :null? to Array, and :null? to nil). My transcript
from class is in add-cdr-to-array-log.txt (I added a bunch of examples to
what we did in class, to clarify things).

The Ruby on Rails web framework depends on mixins a lot, and so do many
other Ruby frameworks. For more detail on what "class" does, see
how-mixins-compile.txt.
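To give the flavor of that kind of reopening, here is a minimal sketch of
mine (it is not the actual add-cdr-to-array.rb; for instance, the real file
may define null? on nil's singleton rather than on NilClass):

--------- cdr-mixin-sketch.rb (a sketch, not the actual add-cdr-to-array.rb) ---------
# Reopen Array to add Lisp-style :cdr and :null?, and reopen NilClass so
# that nil also answers :null?.  "class Array ... end" on an existing class
# simply reopens it and adds the methods defined inside.
class Array
  def cdr
    self[1..-1]        # everything after the first element
  end

  def null?
    empty?
  end
end

class NilClass           # nil is the sole instance of NilClass
  def null?
    true
  end
end

p [1, 2, 3].cdr          # => [2, 3]
p [1, 2, 3].cdr.cdr.cdr  # => []
p [].null?               # => true
p nil.null?              # => true
----------------------------------------------------------------------------------------

If you compile and disassemble this the same way as lambda1.rb above, you
can see the bytecode that "class Array ... end" produces;
how-mixins-compile.txt has the details.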