File Generator.html    Author McKeeman    Copyright © 2007    index

Generate Semantic Action Sequences

Overview

The generator walks the IR once, calling the semantic actions in execution sequence, and keeping track of values shared between actions. All state is kept in the local variables of the IR walking routines. The generator has no dependencies on the target CPU, but rather accesses its emitter object as needed.

Expressions

An expression is always represented by a subtree of the syntax tree. The leaves of the subtree are literals, variables and builtins. Since the symbol table is complete before the Generator object is constructed, the type of all variables is known. The type of literal constants and builtins is manifest in their form.

The tree walker does a depth-first left-to-right sweep of the syntax tree, and therefore each expression subtree. The interior nodes of the expression subtree represent computations. By the final visit to an internal node, all the leafward nodes have already been processed. The return value of an expression-walking Generator method is an operand. The operands are opaque to the Generator object, it gets them from calls to the emitter object and passes them back unexamined. Whatever values are going to be needed are stored in the local variables of the walker methods. Here is a worked example.

Builtin Functions

A builtin function can either be inlined (just emit the right code) or implemented in the runtime in C (see getCfun.c). If a builtin is implemented in C, it must be called from the running machine code. The calling code is similar to, but simpler than, the call of an X subprogram. The inputs (if any) are passed on the hardware stack and the result is left in a designated register (the Intel x86 calling sequence).

Implementing a builtin function in C is relatively straightforward. The mechanism is in place in getCfun.c. Just add to it. Note that operators '/' and '//' are implemented as builtins because it was simpler that emitting the proper code sequence.

Control

Conditional and iterative constructs need to be turned into segments of code that are reached via branch logic. A forward branch needs the destination before the depth-first tree walk has reached that point. The solution is for the emitter to make code for the segment, leaving a "hole" in which to put the eventual forward branch distance, and opaquely passing back the location of the hole to the walker, to be held in the local variables of the walking method. Before returning, the walker method passes the collection of places for holes back to the emitter. The emitter then does a fixup, poking the now-known branch destinations into the holes that were left behind.

Subprograms

Calling a subprogram requires allocating a frame, collecting the inputs and placing them in the frame, then passing the frame and transfering control to the subprogram. Once the subprogram has completed, it will have left the results in its frame. The results have to be extracted from the frame, placed in the caller output variables, and the frame freed The location of inputs and outputs in the frame is found in the symbol table.

Prolog and Epilog

The calling sequence requires manipulating the hardware stack. Upon entry, the program must save registers and prepare to access its frame. Upon exit, the program must restore registers and clean up the hardware run stack. These standard code sequences are called the prolog and epilog.