Growing a Compiler

by Bill McKeeman and Lu He

MathWorks and Dartmouth, May 2009



Self-compiling compilers are common. The question is: How far can one go, bootstrapping a (very) small compiler-compiler into more capable compilers?

Context-free grammars are extended to accommodate output. A grammar-executing machine (GEM) is introduced which accepts an input text and a grammar, and outputs another text. Both the input text and the output text can themselves be grammars, permitting the production of ever more powerful grammars. GEM itself can be extended to build in the capabilities of the previous grammars. The rules of the game require that changing GEM does not add to its original capability -- it merely makes the implementation more robust or faster.
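To make the idea of a grammar that produces output concrete, here is a minimal sketch in Python. It is not the authors' GEM and does not use the paper's grammar notation: the rule encoding (tuples tagged "in", "out" and "call"), the function names, and the toy infix-to-postfix grammar are all assumptions made for illustration. Each rule has ordered alternatives; an alternative may match input, emit output, or call another rule, and a failed alternative is abandoned in favor of the next one.

```python
# Toy "executable grammar" interpreter (illustrative only, not the paper's GEM).
# Item kinds, all hypothetical:
#   ("in", s)    match the string s at the current input position
#   ("out", s)   append s to the output text
#   ("call", r)  run the rule named r


def make_toy_grammar():
    """Grammar translating sums of digits, e.g. 1+2+3, to postfix 12+3+."""
    digit = [[("in", d), ("out", d)] for d in "0123456789"]
    return {
        "expr": [[("call", "digit"), ("call", "rest")]],
        "rest": [
            [("in", "+"), ("call", "digit"), ("out", "+"), ("call", "rest")],
            [],  # empty alternative: nothing left to translate
        ],
        "digit": digit,
    }


def run_rule(grammar, name, text, pos):
    """Try each alternative of a rule in order; return (new_pos, output) or None."""
    for alternative in grammar[name]:
        cursor, emitted = pos, []  # fresh state per alternative = simple backtracking
        for kind, value in alternative:
            if kind == "in":
                if not text.startswith(value, cursor):
                    break
                cursor += len(value)
            elif kind == "out":
                emitted.append(value)
            else:  # "call"
                result = run_rule(grammar, value, text, cursor)
                if result is None:
                    break
                cursor, sub_output = result
                emitted.extend(sub_output)
        else:
            return cursor, emitted  # every item of this alternative succeeded
    return None


def execute(grammar, start, text):
    """Run the grammar on text; return the output text, or None on failure."""
    result = run_rule(grammar, start, text, 0)
    if result is None or result[0] != len(text):
        return None
    return "".join(result[1])


if __name__ == "__main__":
    print(execute(make_toy_grammar(), "expr", "1+2+3"))
```

In this sketch the first alternative that succeeds wins, which is a simplification. The output here is ordinary text, but nothing prevents it from being another grammar in the same encoding, which is the bootstrapping step the paragraph above describes.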

The grammars and the machine have some simple symmetries that lead to actions such as backtracking and decompiling. It is also possible to directly execute bit-strings on Intel x86 hardware.


  1. Base GEM
    • statement of the problem
    • executable grammars
    • simple examples
  2. Robust GEM
    • pre-entered character classes
    • using nowhite, pretty, invert
  3. GEM with builtin nowhite and chars
    • using multi-character input and output symbols
    • left-associative arithmetic expressions
    • X86 floating point stack
  4. GEM with builtin multichar symbols
    • using Kleene * and + in executable grammars
  5. Running Intel X86 code
    • X86 Assembler
    • calculator
    • atoi
  6. Plenty Phrase Names
    • BNF
    • self
    • pretty


The origin of the idea is an undergraduate thesis (UC Santa Cruz, 1978) written by Doug Michels under the supervision of Bill McKeeman.

The title is inspired by Guy Steele's 1998 OOPSLA talk, Growing a Language.

Thanks to Steve Johnson for critical advice in the preparation of this presentation.

The default font sizes in Firefox are uncomfortably large for this paper. Try [view][edit][zoom][text only][zoom out][zoom out].



An earlier version was presented to the Computer Science Colloquium, Stanford, March 4, 2009.