|
File Symbols.html Author McKeeman Copyright © 2007 index Symbol Tables and TypeFrames in xcom
A symbol table is a block of name/attribute pairs. The symbol table is built during analysis. The runtime frame is allocated just before execution begins. Just to keep things simple in xcom, the entries in the two storage blocks correspond one-to-one. Suppose the variables are named B, I and R.
Actually, things are slightly more complicated. Temporary variables are allocated during code generation. Places for them are tacked on to the end of the runtime frame but there is no need to put them in the symbol table.
The Symbol TableThe symbol table is built by walking the syntax tree. As each variable is encountered at a leaf node, it is looked up in the symbol table, and entered if not there. When an operation is encountered, its type signature is used to refine the types in the symbol table. Here is a worked example. In a production compiler, symbol table lookup can be a performance bottleneck, and is therefore normally uses a hash-table lookup. In xcom a simple linear lookup is used. ScopeMany programming languages have a concept of scope. A scope has a beginning and an end. A variable can be seen in its scope and not otherwise. Scopes often can be nested leading to rules resolving the interpretation of a single name occurring in more than one level of nesting. In X there is one scope per subprogram. All variables are local. Values are passed in and out via parameters. Subprogram names (listed on the xcom command line) are global. Symbol AttributesType and use information is gathered for each variable. The information comes from the syntax tree. A symbol table is constructed (it is initially empty). The information in the tree does not change (read-only) but the symbol table gradually acquires more information during type analysis. The X language is strongly typed, meaning that all operations require type-correctness. Some operations, such as '+', are overloaded; real can be added to real and integer can be added to integer but mixed type addition is not allowed. The initial entry for a symbol necessarily records all types as possible since nothing is yet known about the variable. If the type of a variable has not yet been fully determined, and a usage is found that requires a more specific set of types, then the other possibilities are removed as attributes for that variable in the table. In expression processing, a node (always representing an X operand) may have a known type because of its use (for example, the expression argument to r2i must have type real). This type requirement is carried down the tree as it is walked. If there is an intact requirement when a variable is reached (at a leaf), that requirement is recorded in the symbol table for that variable. Subsequent references to that variable elsewhere in the tree use the recorded information. Thus the type information hops from leaf to leaf as well as travelling up and down the branches. Suppose, for example, the subexpression is r2i(x+y). The node for the parameter is forced to type real. When the node for addition is reached, its result is forced to type real, which forces both of its operands to type real. When the leaves are reached, both are then also forced to type real and the type is recorded in the symbol table for each of them. The result of r2i(x+y) is forced to type integer and passed back up the tree. If a type conflict is encountered (meaning that there are no more remaining possibilities), or the type is still not defined after all information has been gathered, a diagnostic is issued and xcom exits. The use of the variable is also recorded. If the variable is on the left of assignment, the address is needed. If it is used in an expression, the value is needed. The names given to these properties are left and right. A variable may have either or both such attributes. Type InferenceSome type-determining situations do not give complete information. In X these are the opportunities and what can be recorded, based on the signatures of the xcom operators. The result types are passed root-ward; the operand types are passed leaf-ward. These are actions represent the inherited and synthesized attributes. Knuth has written a nice history of how attribute grammars developed and eventually entered into the literature.
An xcom assignment insures the types of the left and right hand sides of the assignment are the same. But, if neither left nor right type is known at the moment of examination, the correct type cannot be entered into the symbol table. It may be that, after walking the syntax tree once, some variables do not yet have fully defined types. A sequence of assignments, each leaving the types equal but undefined, is the worst case. The symbol table information is refined by additional passes over the syntax tree extending the type-effect of an assignment one step at a time. The symbol table implementation, therefore, repeatedly walks the syntax tree until no further type changes are found. Some small modifications of the depth-first tree walk are implemented to improve the efficiency of type analysis. More could be done. Languages, such as C, with declare-before-use do not need multiple passes to build the symbol table because the type information is already fully determined. |