File Answers.html    Author McKeeman    Copyright © 2007

X

  1. A Discipline of Programming, E.W. Dijkstra
  2. `FILE: avg.x
    z := (x+y)/2.0
          
  3.         
    `FILE: testAvg.x
    av := avg := 1.0,2.0; 
    if av<=(1.0+2.0)/2.0 & av>=(1.0+2.0)/2.0? fi
          
    1. real, integer and boolean computation
    2. assignment
    3. strong typing
    4. type inference
    5. conditional execution
    6. repetitive execution
    7. subprograms
    8. recursion
    1. declarations
    2. data structures
    3. explicit input/output
    4. variable scope
    5. threads
  4. Any program is a subprogram. A special assignment syntax maps the actual arguments onto the formal inputs and outputs of the subprogram.

Grammars

  1. Define the term Context-free Grammar. See page 4.
  2. D
       N N
       N P P P P P
       P P P P P P P P P P
    N
       5
    P
       1

  3. D='5' '5' | '5' '1' '1' '1' '1' '1' | '1' '1' '1' '1' '1' '1' '1' '1' '1' '1' ;

  4. D = '5'2 | '5' '1'5 | '1' 10;

  5. Define the term Finite Automaton in terms of a CFG. See page 3.
  6. S =
    P = S 1
    P2 = P 1
    P3 = P2 1
    P4 = P3 1
    P5 = P4 1
    P5 = S 5
    P6 = P5 1
    P7 = P6 1
    P8 = P7 1
    P9 = P8 1
    D = P5 5
    D = P9 1

  7. The CFG determines the syntax tree. The relation between operand(s) and operator is explicit in the tree.

Lexing

  1. The sequence of input characters is broken into a stream of lexemes representing the symbols in VI of the CFG.
  2. The sequence of lexemes is pruned to remove insignificant data, such as whitespace and comments; The remaining lexemes become tokens by gathering information about what kind of symbol they represent and where they are in the input text.
  3. The lexing can be done in one pass which is cache-friendly. The conventional lexer steps in and out, leaving its memory image vulnerable to pre-emption.

Parsing

  1. The CFG describes the semantics of the language. The IOG is a template for constructing recursive parsers.
  2. As a consequence of our preference for left-to-right associativity, the CFG is left-recursive in the expression rules. The rules need to be identified (append rule number to each rule as a member of VO). Turn left recursion into regular expression interation (apply formal rules).
  3. The LR(1) table builder produces the shift/reduce sequence which is exactly what the recursive parser does with the IOG. In fact, historically, recursive parsers produced any old kind of intermediate representation that suited the writer. The IOG is a way to bring top-down and bottom-up into the same framework.
  4. The parser produces a shift/reduce sequence.

Trees

  1. The syntax tree is built by stacking each token and building a node for each rule in the shift/reduce sequence. An AST is built in xcom by ignoring some rules.
  2. An abstract syntax tree is a subtree of the syntax tree. The simplest view is that an AST is everything that is needed, and nothing more. The AST has no chains, and may depend on the operators being implicitly defined. See page 4.
  3.         program
             /   \
          stmts  eof
            |
           stmt
            |
        assignment
        /   |    \
      vars  :=   exprs
       |          |
      var        expr
       |          |
      id      disjunction
                  |
              conjunction
                  |
               negation
                  |
               relation
                  |
                 sum
                  |
                term
                  |
               factor
                  |
               integer
          
  4.         program
             /   \
          stmts  eof
            |
           stmt
            |
        assignment
        /   |     \
      vars  :=    exprs
       |             \
      id              sum
                    /  |  \
                   /   -   \
                  /         \
                 /           \
                /             \
            term               term
           /  |  \            /  |  \
         id   * integer  integer /   id 
          
  5. Tree transformations are usually required for optimization. Furthermore, "decorating" trees with information is sometimes simpler than building associated tables for the same purpose.

Symbols

  1. The use of symbols in a program needs to be consistent (i.e. strong typing) and whatever attributes they have must be available for use by later stages of compilation. In most programming languages, the symbol table is more elaborate than the one in xcom, dealing in a rich set of attributes and name scopes.
    • lookup: given a name, get the symbol table index
    • enter: put a name into the symbol table
    • getName: given index, get name
    • getType: given index, get type
    • getUse: given index get left/right usage
    • getSubr: given index, find out if it is a subprogram
    • getFun: the name of the current function
    • getLftCt: the output count for current function
    • getLeft: get the index of the ith output
    • getRgtCt: the input count for the current function
    • getRight: get the index of the ith input
    • size: get size of the symbol table (frame)
  2. Both contain an entry for each variable.
  3. Variable x not fully defined.
  4. Integer type not allowed by context.

Generator

  1. It puts execution actions in the right order.
  2. In xcom the generator walks the tree and calls the emitter for each action. It holds output from one emitter in a local variable to pass to another emitter. These values are opaque to the generator to keep it target independent.

Emitter

  1. The emitter selects code to turn semantic actions into hadrware instruction sequences.

Assembler

  1. The assembler formats machine instructions into executable bits. It also provides for entering execution since the bits are local to the asembler.
  2. Using a C pseudo code (and therefore C arithmetic):
    1. addRR EAX,ECX
      EAX += ECX;
    2. faddp
      FPS(sp+1) += FPS(sp); sp -= 1;
    3. movMR x,EAX
      *(ESI+offset(x)) = EAX;
    4. movRM EAX,y
      EAX = *(ESI+offset(y));
    5. bge n
      if (flags('>=')) EIP += n;

Emulator

  1. The emulator allows instruction-by-instruction examination of the effect of emitted code.
  2. Each instruction is obeyed as it is encountered. The ith execution puts trace output in the ith element of a cell array. If a negative step size is given, the previously recorded trace is played out. Stepping forward just plays out the cell array until new territory is reached, in which case the cell array is again extended.

Optimization

  1. A more efficient code sequence must behave exactly like the one it replaced (except for performance).
  2. x:=1+2 compiled as x:=3
  3. x := y + z; w := y + z;
    compiled as
    tmp := y+z; x := tmp; w := tmp;
  4. do x<10? x := y + a + b; od
    compiled as
    tmp := a+b; do x<10? x := y + tmp; od
  5. Because the value of i will still in a register, so it does not need to be loaded again for the +1.

Runtime

  1. The runtime supplies services supporting X program execution. It is written in MEX.
    1. The frame is an int32 MATLAB array. It holds 1-bit logicals, 32-bit ints and 32-bit singles. xcom keeps track of which is which in the symbol table. frame(i) is the bits for the i-th variable. The puns f2i and i2f are used to make the values work in M.
    2. The frame is allocated and initialized with input values (from the command line) before the subprogram is called. After execution, the output values are extracted from the frame and displayed on the command line. Then the frame is discarded.
    3. Subprograms use the i/o mechanism (above) to pass parameters and return values. Both frames must be available during the link.