## Table-driven ParsingRead a short history of bottom-up parsing. ## The LR(1) ActionsLR(1) parsing is based on precomputing a data structure that holds
Each shift entry causes the parser to
Each reduce entry represents the recognition of a grammar rule.
The reduce entries are the final states of the DFA.
When a final state is reached, the parser breaks out of the DFA
and executes a non-DFA reduce action.
The reducing rule is ## Coding the LR(1) TableThe LR(1) table contains small integer values. Zero represents a syntax error. A shift entry contains the new state. A reduce entry contains the rule to apply. In this implementation, reduce rules are distinguished from shifts by encoding the negative of the rule number. For example, the simple grammar knuth3.cfgrules numbers G S eof 1 S a A d 2 b A B 3 A c A 4 c 5 B d 6gives rise to the LR(1) transition table a b c d eof A B G S 1: 2 3 0 0 0 0 0 0 4 2: 0 0 5 0 0 6 0 0 0 3: 0 0 7 0 0 8 0 0 0 4: 0 0 0 0 9 0 0 0 0 5: 0 0 5 -5 0 10 0 0 0 6: 0 0 0 11 0 0 0 0 0 7: 0 0 7 -5 -5 12 -5 0 0 8: 0 0 0 13 -7 0 14 0 0 9: -1 -1 -1 -1 -1 -1 -1 -1 -1 10: 0 0 0 -4 0 0 0 0 0 11: 0 0 0 0 -2 0 0 0 0 12: 0 0 0 -4 -4 0 -4 0 0 13: 0 0 0 0 -6 0 0 0 0 14: 0 0 0 0 -3 0 0 0 0and the rule lengths
## Using the LR(1) TablesSuppose the input text is "a c d eof". Then the following actions take place. The application of a reduction takes two steps; one to apply the rule and one to compute the new state.
## Building the LR(1) Tables (operational viewpoint).The LR(1) table construction deals with sets of marked rules, each with a lookahead set. The initial LR(1) set, corresponding to the LR(1) machine start state, contains the goal rule marked in the leftmost position. Its lookahead is empty. For the grammar knuth3 above, that would be the set
{ G =
where the dot is the mark.
^{.}S eof {} }
The construction alternates A state is closed by (recursively) adding marked rules for all definitions of each phrase name immediately to the right of the mark. The newly added rules are always marked at the leftmost position. The lookahead for each new rule is all things that could follow it, which in turn is computed from the heads of its right context. Shifts from a closed state are computed by moving the mark one place to the right. The shift is on the symbol over which the mark moves. This has the effect of separating the marked rules into small groups, each of which is the initial contents of a new state. The shifted rules carry their lookahead along. The collection of states and shifts is a DFA that recognizes all possible parse stacks. If the mark has arrived at the right end of a rule, it signifies a final state of the automaton, and therefore a reduce action. There can be several shifts and several reduce actions for one state. The requirement is that their lookahead sets be disjoint. If they are not, the LR(1) machine contains a shift/reduce conflict or a reduce/reduce conflict.
Read a description of table building. ## Building the LR(1) Tables (mathematical viewpoint).
Read Knuth's 1965 paper,
Footnote: I first learned bottom-up compiling as a research assistant to Niklaus Wirth at Stanford. I attended a presentation of LR(k) by Donald Knuth at Stanford well in advance of the material appearing in print. |