Growing a Compiler
by Bill McKeeman and Lu He
MathWorks and Dartmouth, May 2009
Self-compiling compilers are common. The question is: How far can one go, bootstrapping a (very) small compiler-compiler into more capable compilers?
Context-free grammars are extended to accommodate output. A grammar-executing machine (GEM) is introduced which accepts an input text and a grammar, and outputs another text. Both the input text and the output text can themselves be grammars, permitting the production of ever more powerful grammars. GEM itself can be extended to build in the capabilities of the previous grammars. The rules of the game require that changing GEM does not add to its original capability; it merely makes the implementation more robust or faster.
The grammars and the machine have some simple symmetries that lead to actions such as backtracking and decompiling. It is also possible to directly execute bit-strings on Intel x86 hardware.
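The idea of a grammar that both recognizes input and emits output can be illustrated with a minimal sketch. This is not the GEM described in this paper: the rule encoding (`"in"`, `"out"`, `"call"` items in Python lists), the function `run`, and the sample grammar `GRAMMAR` are all hypothetical names chosen for illustration, and the grammar is held as a Python data structure rather than as text. The sketch tries the alternatives of a rule in order and, when one fails, discards any output that alternative produced.

```python
def run(grammar, start, text):
    """Run `grammar` from rule `start` over `text`.

    Returns the output text, or None if the whole input is not matched.
    Item kinds (hypothetical encoding, not the paper's notation):
      ("in", s)   match the literal s in the input
      ("out", s)  append s to the output
      ("call", r) apply rule r
    """
    out = []

    def parse(name, pos):
        for alt in grammar[name]:
            mark = len(out)              # output length before this alternative
            p = pos
            for kind, arg in alt:
                if kind == "in":
                    p = p + len(arg) if text.startswith(arg, p) else None
                elif kind == "out":
                    out.append(arg)
                else:                    # "call"
                    p = parse(arg, p)
                if p is None:
                    break
            if p is not None:
                return p                 # this alternative succeeded
            del out[mark:]               # undo output of the failed alternative
        return None

    end = parse(start, 0)
    return "".join(out) if end == len(text) else None


# Sample grammar: single-digit sums, infix in, postfix out.
GRAMMAR = {
    "expr": [[("call", "digit"), ("call", "rest")]],
    "rest": [
        [("in", "+"), ("call", "digit"), ("out", "+"), ("call", "rest")],
        [],                              # empty alternative: nothing more to read
    ],
    "digit": [[("in", d), ("out", d)] for d in "0123456789"],
}

print(run(GRAMMAR, "expr", "1+2+3"))     # prints 12+3+
```

Because the output is just text, nothing in this arrangement prevents the output from being another grammar; the sketch leaves that step out, since it would need a textual grammar format.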
- Base GEM
- statement of the problem
- executable grammars
- simple examples
- Robust GEM
- pre-entered character classes
- using nowhite, pretty, invert
with builtin nowhite and chars
- using multi-character input and output symbols
- left-associative arithmetic expressions
- X86 floating point stack
with builtin multichar symbols
- using Kleene * and + in executable grammars
- Running Intel X86 code
- X86 Assembler
- Plenty Phrase Names
The origin of the idea is an undergraduate thesis (UC Santa Cruz, 1978) written by Doug Michels under the supervision of Bill McKeeman.
The title is inspired by Guy Steele's 1998 OOPSLA talk Growing a Language.
Thanks to Steve Johnson for critical advice in the preparation of this presentation.
The default font sizes in Firefox are uncomfortably large for this paper. Try [view][zoom][text only][zoom out][zoom out].
- Bill McKeeman Compiler and Compiler Course
- Wikipedia Mealy Machine
- Guy Steele Growing a Language
- Doug Michels A concise extensible metalanguage for translator implementation
- Bill McKeeman, MathWorks Fellow
- Lu He, Computer Science Department, Dartmouth
An earlier version was presented to the Computer Science Colloquium, Stanford, March 4, 2009