========================[ Readings ]========================

Read Chapters 1--3 of Ruby Under the Microscope. Pay special attention to the
cut-ins that discuss the source code of the Ruby implementation. Follow the
book's discussion with the code open in your favorite editor.

Your objective is to see how Ruby code gets compiled to bytecode from
s-expressions and how the bytecode gets evaluated by the virtual machine, and
to notice the patterns of recursion, handling different scopes, and creating
closures of functions and blocks over lexical scopes.

It also helps to note the tricks with which master programmers like Yukihiro
Matsumoto organize their C code. You will notice plenty of macros that treat
C pointers as objects or symbols, creating a de-facto type system and a macro
language not unlike LISP. In effect, a project such as Ruby creates a new
language on top of C---deliberately or unwittingly. In the latter case, as
the joke goes,

   Any sufficiently complicated C or Fortran program contains an ad hoc,
   informally-specified, bug-ridden, slow implementation of half of
   Common Lisp.
       -- https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule

In fact, many PL researchers chose to simply implement their ideas in
Scheme :)

Compare the style of the Ruby implementation with that of the Emacs Elisp
virtual machine:
https://github.com/emacs-mirror/emacs/blob/master/src/bytecode.c

Read ahead: Chapters 4 and 5.

-------[ Getting Ruby sources & setting up browsing ]-------

Get the sources from https://github.com/ruby/ruby .

I use the ctags and etags programs to browse code in Vim and Emacs
respectively. Both editors let you jump around the tags and return to the
places in the code you detoured from to look up a definition of a macro or a
function. "ctags" is a separate package under MacPorts; etags is a part of
the "emacs" package. The "man ctags" page's "HOW TO" section gives a brief
intro for both editors. Running ctags with different options is explained
here: http://arjanvandergaag.nl/blog/combining-vim-and-ctags.html

I use Vim and "view" a lot, usually as follows: "alias v=view +'syn on'",
then "v -t function_var_or_macro_name" from the command line to find a tag.
From within Vim/view, pressing Ctrl+] on a name and Ctrl+t to get back work
best for me, but there are many others:
http://vim.wikia.com/wiki/Browsing_programs_with_tags

With Emacs, I use M-. and M-* , occasionally C-u M-. (but I have to look it
up as often as not). If you wonder why "go see a tag" is M-. and "go back" is
M-*, so do I :) At one point it occurred to me that a useful mnemonic might
be to think of an archery arrow facing away from you, with . for the tip and
* for the fletching. It works for me :)

To index specific files, do "ctags *.h *.c" or some such. To index an entire
project directory recursively, do "ctags -R ." in it. Use etags for Emacs :)

Another good command-line tool is cscope. If you prefer using your own web
browser, then OpenGrok + a local webserver are a good bet. If you find a free
GUI that's nice, not heavy like Eclipse, and does nice tab completion (unlike
Visual Studio), let me know :)

----------[ Getting the .inc files generated ]-------

The actual code for VM bytecode execution is only generated when you "make"
ruby. In the Ruby source directory, see README.md; I had to do "autoconf",
then "./configure", then "make". The .inc files such as vm.inc are among the
first things to get made (with Ruby! I had to switch my default old ruby 1.8
to a newer one for it to succeed).
You need not wait till the end of the compilation if you only want to browse
the code, not build and install ruby 2.4.

=================[ Why not just eval the AST? ]=================

In principle, an interpreted language could work straight off of the parsed
AST, evaluating it recursively from the top: call the evaluator function on
the top node. If the node represents a value, return it. If it represents
some operation on the child nodes, call the evaluator recursively to evaluate
the child nodes and replace them with the resulting values, then perform the
operation. Ruby 1.8 did just that, and so did early LISPs.

This approach works, and evaluator functions for it come out naturally
recursive & nice; however, such recursion can be quite heavy, because it
essentially traverses the AST both from the top down and then back up. Thus,
faster representations of code were eventually devised.

Still, recursive evaluation of the AST is a fundamental programming pattern
that you must master---because the very same pattern is now used by
compilers, as Chapter 3 of the Ruby book explains. It's easier to learn it in
LISP, though (also, see the above point about Scheme long being the language
of choice for PL research).

Suggestion: Write such a recursive evaluator for arithmetic expressions
represented as S-expressions in LISP (one possible sketch appears at the end
of this section). For example:

   (arithm-eval '(+ 2 (* 2 3)))   => 8
   (arithm-eval 1)                => 1
   (arithm-eval '(- 5 3))         => 2
   etc.

It should be a very simple function with a few cases for the actual
arithmetic operations, and writing such a function should come easily to you.
Remember, trees and recursion are made for each other; you almost never need
to write a function that does something for the whole tree---you only need to
handle a node, call the function on its immediate children (if any), and then
take some action on the return values if needed.

Suggestion: write a (recursive) function that takes s-expressions ("sexps",
for short) of arithmetic expressions as above, and prints the corresponding
arithmetic expression in the ordinary notation. For example:

   (arithm-print '(+ 2 (* 2 3)))   => 2 + 2 * 3
   (arithm-print 1)                => 1
   (arithm-print '(- 5 3))         => 5 - 3
   etc.

Note that this function will be somewhat similar to _flatten_ from OnLisp's
Fig. 4.3 if done efficiently---except flatten gets the symbols in the wrong
order.

Note that an "arithm-read" that would take strings of ordinary infix-syntax
arithmetic expressions and make a sexp out of them _with the proper operator
precedence_ is a more complex problem than either of the above. Bison and
similar tools solve it with variations of what's known as the shift-reduce
algorithm:
http://blog.reverberate.org/2013/07/ll-and-lr-parsing-demystified.html
(or https://en.wikipedia.org/wiki/Shift-reduce_parser)

Personally, I prefer the Reverse Polish notation; it's a bytecode ready for
execution by a stack machine ("3 2 * 2 +", read and executed left to right;
you can think of execution as pushing each number onto a stack and, whenever
you encounter an operation, consuming the two top values from the stack and
replacing them with the op's result). My first programmable computer was a
clone of the Texas Instruments microcalculator that did just that; the top of
the stack was what showed on the calculator's numeric display. It was
glorious :)

Suggestion: write a (recursive) evaluator for a list of tokens representing a
Reverse Polish notation arithmetic expression, e.g.:

   (rev-polish-eval '(3 2 * 2 +))   => 8

The "accumulator" argument in this case will hold the current contents of the
stack.
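
For reference, here is one possible shape of the first two functions: a
minimal sketch in Common Lisp (and a spoiler, so try the exercises yourself
first). It assumes only the binary operators +, -, *, and / shown in the
examples above and does no error handling; arithm-print returns the infix
string rather than printing it.

   (defun arithm-eval (expr)
     "Recursively evaluate an arithmetic sexp such as (+ 2 (* 2 3))."
     (if (numberp expr)
         expr                              ; a leaf: just a number
         (destructuring-bind (op left right) expr
           (let ((l (arithm-eval left))    ; recurse into both children...
                 (r (arithm-eval right)))
             (ecase op                     ; ...then apply this node's operation
               (+ (+ l r))
               (- (- l r))
               (* (* l r))
               (/ (/ l r)))))))

   (defun arithm-print (expr)
     "Return the infix string for an arithmetic sexp, e.g. 2 + 2 * 3."
     (if (numberp expr)
         (princ-to-string expr)
         (destructuring-bind (op left right) expr
           (format nil "~a ~a ~a"
                   (arithm-print left) op (arithm-print right)))))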
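
Likewise, one possible sketch of the RPN evaluator in Common Lisp, under the
same assumptions (numbers and the four binary operators only, no error
checking); the optional stack argument plays the accumulator role described
above.

   (defun rev-polish-eval (tokens &optional stack)
     "Evaluate an RPN token list such as (3 2 * 2 +)."
     (cond
       ;; no tokens left: the result is the single value left on the stack
       ((null tokens) (first stack))
       ;; a number: push it onto the stack and keep going
       ((numberp (first tokens))
        (rev-polish-eval (rest tokens) (cons (first tokens) stack)))
       ;; an operator symbol: pop two values, push the result, keep going
       (t (destructuring-bind (b a &rest rest-stack) stack
            (rev-polish-eval (rest tokens)
                             (cons (funcall (first tokens) a b)
                                   rest-stack))))))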
=================[ So why bytecode and JITs? ]=================

Compare this recursive up-and-down traversal with a CPU's linear
fetch-decode-execute pass over its instruction stream. If any part of the
code executes in a tight loop---which is not infrequent in real
programs---the up-and-down cost of evaluating the AST is paid on every
iteration. For the economics facet of CS, this is too much---and that facet
happens to pay many PL researchers' salaries.

Thus the solution is to push as much of the work needed to "interpret" the
program---i.e., to translate it to faster, simpler operations---up front, and
then, in the tight loops, to only run the results of this translation. It
turns out that the gains of translation can be huge (and this is, perhaps,
one of the most important lessons of PL practice to date).

This is why the recursive patterns now mostly apply to _compilation_ of ASTs
rather than their direct execution. The ASTs are compiled into
simpler-to-execute stack machine bytecode, such as Elisp's, Ruby's YARV,
Python's, etc. The compilation recursively visits all of the AST's nodes, as
described in the Ch. 3 treatment of compile.c and iseq_compile_each() on
p. 42. This is what the book means when it speaks of the tight relationship
between the compiler and the VM.

The only hard limits here are the CPU's speed at its own elementary
instructions and---most importantly---the limits of knowledge about values
that we have at translation time vs at run time. For example, when we make
functions on the fly, like the lambdas returned by _make-adder_ and
_make-multiplier_, we *cannot* know the state (e.g., the values of variables)
that these functions will close over---that state will only be
prepared/computed at runtime.

Thus, a major challenge of PL has been to push as much program interpretation
work as possible ahead of the tight loops, to "front-load" it up to the limit
of the knowledge that we cannot have until actual runtime. We compile as much
of a program as possible to either bytecode or native code, and parametrize
this code with what we don't know. This parametrization can take the shape of
referencing these unknown-till-runtime values in some table, stack frame, or
environment struct, where they reside at some known positions or offsets; it
can also take the shape of references-by-name in a hash table (such as with
LISP's global symbol tables for dynamic scoping). Whatever it is, its point
is to separate what we know for certain at compile time, plus whatever clues
the programmer provides, from those things we cannot possibly know.

The bytecodes of VMs such as Elisp's, Perl's, Python's, Ruby's, and Java's
various implementations---from the stack-based JVM to Android's
register-based Dalvik and the newer ART---are really bets on this
compile-vs-runtime separation and parametrization, taken on their designers'
intuition.

Other modern environments such as Node.js actually de-emphasize the language
clues in favor of just-in-time (JIT) compilation of whatever code seems to be
most in need of it---e.g., they compile the code or bytecode snippets that
appear to be executed repeatedly into native binary code such as x86. With
enough repetition, the cost of this compilation is amortized; also, the
actual execution environment may feature more constants than variables, and
may also exclude some execution paths, which makes for simpler native
assembly. This approach bets on the runtime being able to guess where a
program's bottlenecks are better than the programmer can; we'll see how this
bet pans out.
These JITs are getting very fast; the problem is debugging them or your code when something goes wrong. In the meantime, understanding the language's internal VM's bytecode remains important---and this is what we are trying for with LISP and Ruby.
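
To make this compile-time/run-time split concrete on a toy scale, here is a
sketch in Common Lisp that compiles the arithmetic sexps from the earlier
exercises into a flat list of stack-machine instructions and then runs that
list in a plain fetch-decode-execute loop. The instruction names (push, add,
sub, mul, div) and the functions arithm-compile and stack-run are made up for
this illustration; real bytecodes such as YARV's or Elisp's are, of course,
far richer.

   (defun arithm-compile (expr)
     "Recursively compile an arithmetic sexp into a flat instruction list."
     (if (numberp expr)
         (list (list 'push expr))
         (destructuring-bind (op left right) expr
           (append (arithm-compile left)   ; code leaving left's value on the stack,
                   (arithm-compile right)  ; then right's value on top of it,
                   (list (list (ecase op   ; then the operation itself
                                 (+ 'add) (- 'sub) (* 'mul) (/ 'div))))))))

   (defun stack-run (code)
     "A toy VM: a flat fetch-decode-execute loop; no recursion into an AST."
     (let ((stack '()))
       (dolist (insn code (first stack))
         (destructuring-bind (opcode &rest args) insn
           (setf stack
                 (case opcode
                   (push (cons (first args) stack))
                   (add  (cons (+ (second stack) (first stack)) (cddr stack)))
                   (sub  (cons (- (second stack) (first stack)) (cddr stack)))
                   (mul  (cons (* (second stack) (first stack)) (cddr stack)))
                   (div  (cons (/ (second stack) (first stack)) (cddr stack)))))))))

   ;; (arithm-compile '(+ 2 (* 2 3)))
   ;;    => ((PUSH 2) (PUSH 2) (PUSH 3) (MUL) (ADD))
   ;; (stack-run (arithm-compile '(+ 2 (* 2 3))))   => 8

The recursion happens once, in arithm-compile; a tight loop can then re-run
the flat instruction list as many times as needed without ever touching the
tree again.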