|
File Answers.html
Author McKeeman
Copyright © 2007
X
-
A Discipline of Programming, E.W. Dijkstra
-
`FILE: avg.x
z := (x+y)/2.0
-
`FILE: testAvg.x
av := avg := 1.0,2.0;
if av<=(1.0+2.0)/2.0 & av>=(1.0+2.0)/2.0? fi
-
- real, integer and boolean computation
- assignment
- strong typing
- type inference
- conditional execution
- repetitive execution
- subprograms
- recursion
-
- declarations
- data structures
- explicit input/output
- variable scope
- threads
-
Any program is a subprogram.
A special assignment syntax maps the actual arguments
onto the formal inputs and outputs of the subprogram.
Grammars
- Define the term Context-free Grammar.
See page 4.
-
D
N N
N P P P P P
P P P P P P P P P P
N
5
P
1
-
D='5' '5' | '5' '1' '1' '1' '1' '1' | '1' '1' '1' '1' '1' '1' '1' '1' '1' '1' ;
-
D = '5'2 | '5' '1'5 | '1' 10;
- Define the term Finite Automaton in terms of a CFG.
See page 3.
-
S =
P = S 1
P2 = P 1
P3 = P2 1
P4 = P3 1
P5 = P4 1
P5 = S 5
P6 = P5 1
P7 = P6 1
P8 = P7 1
P9 = P8 1
D = P5 5
D = P9 1
-
The CFG determines the syntax tree.
The relation between operand(s) and operator is explicit in the tree.
Lexing
-
The sequence of input characters is broken into a stream of
lexemes representing the symbols in VI
of the CFG.
-
The sequence of lexemes is pruned to remove insignificant
data, such as whitespace and comments; The remaining lexemes
become tokens by gathering information about what kind of symbol
they represent and where they are in the input text.
-
The lexing can be done in one pass which is cache-friendly.
The conventional lexer steps in and out, leaving its memory image
vulnerable to pre-emption.
Parsing
-
The CFG describes the semantics of the language.
The IOG is a template for constructing recursive parsers.
-
As a consequence of our preference for left-to-right associativity,
the CFG is left-recursive in the expression rules.
The rules need to be identified (append rule number to each rule
as a member of VO). Turn left recursion into
regular expression interation (apply formal rules).
-
The LR(1) table builder produces the shift/reduce sequence
which is exactly what the recursive parser does with the IOG.
In fact, historically, recursive parsers produced any old kind
of intermediate representation that suited the writer. The IOG
is a way to bring top-down and bottom-up into the same framework.
-
The parser produces a shift/reduce sequence.
Trees
-
The syntax tree is built by stacking each token and building
a node for each rule in the shift/reduce sequence.
An AST is built in xcom by ignoring some rules.
-
An abstract syntax tree is a subtree of the syntax tree.
The simplest view is that an AST is everything that is needed,
and nothing more.
The AST has no chains,
and may depend on the operators being implicitly defined.
See page 4.
-
program
/ \
stmts eof
|
stmt
|
assignment
/ | \
vars := exprs
| |
var expr
| |
id disjunction
|
conjunction
|
negation
|
relation
|
sum
|
term
|
factor
|
integer
-
program
/ \
stmts eof
|
stmt
|
assignment
/ | \
vars := exprs
| \
id sum
/ | \
/ - \
/ \
/ \
/ \
term term
/ | \ / | \
id * integer integer / id
-
Tree transformations are usually required for optimization.
Furthermore, "decorating" trees with information is sometimes
simpler than building associated tables for the same purpose.
Symbols
-
The use of symbols in a program needs to be consistent
(i.e. strong typing) and whatever attributes they have must
be available for use by later stages of compilation.
In most programming languages, the symbol table is more elaborate
than the one in xcom, dealing in a rich set of attributes and
name scopes.
-
- lookup: given a name, get the symbol table index
- enter: put a name into the symbol table
- getName: given index, get name
- getType: given index, get type
- getUse: given index get left/right usage
- getSubr: given index, find out if it is a subprogram
- getFun: the name of the current function
- getLftCt: the output count for current function
- getLeft: get the index of the ith output
- getRgtCt: the input count for the current function
- getRight: get the index of the ith input
- size: get size of the symbol table (frame)
-
Both contain an entry for each variable.
-
Variable x not fully defined.
-
Integer type not allowed by context.
Generator
-
It puts execution actions in the right order.
-
In xcom the generator walks the tree
and calls the emitter for each action.
It holds output from one emitter in a local variable
to pass to another emitter.
These values are opaque to the generator
to keep it target independent.
Emitter
-
The emitter selects code to turn semantic actions
into hadrware instruction sequences.
Assembler
-
The assembler formats machine instructions into executable bits.
It also provides for entering execution since the bits are local
to the asembler.
-
Using a C pseudo code (and therefore C arithmetic):
- addRR EAX,ECX
EAX += ECX;
- faddp
FPS(sp+1) += FPS(sp); sp -= 1;
- movMR x,EAX
*(ESI+offset(x)) = EAX;
- movRM EAX,y
EAX = *(ESI+offset(y));
- bge n
if (flags('>=')) EIP += n;
Emulator
-
The emulator allows instruction-by-instruction examination
of the effect of emitted code.
-
Each instruction is obeyed as it is encountered.
The ith execution puts trace output in the
ith element of a cell array.
If a negative step size is given,
the previously recorded trace is played out.
Stepping forward just plays out the cell array until
new territory is reached,
in which case the cell array is again extended.
Optimization
-
A more efficient code sequence must behave exactly like the
one it replaced (except for performance).
-
x:=1+2 compiled as x:=3
-
x := y + z; w := y + z;
compiled as
tmp := y+z; x := tmp; w := tmp;
-
do x<10? x := y + a + b; od
compiled as
tmp := a+b; do x<10? x := y + tmp; od
-
Because the value of i will still in a register,
so it does not need to be loaded again for the +1.
Runtime
-
The runtime supplies services supporting X program execution.
It is written in MEX.
-
-
The frame is an int32 MATLAB array.
It holds 1-bit logicals, 32-bit ints and 32-bit singles.
xcom keeps track of which is which in the symbol table.
frame(i) is the bits for the i-th variable.
The puns f2i and i2f are used to make the values work in M.
-
The frame is allocated and initialized with input values
(from the command line) before the subprogram is called.
After execution, the output values are extracted from the
frame and displayed on the command line.
Then the frame is discarded.
-
Subprograms use the i/o mechanism (above) to pass parameters
and return values.
Both frames must be available during the link.
| |