At rock bottom we have the machine itself, which we identify as an interpreter, interpM, for machine language M. We are not concerned about what language interpM is written in.
interpC_C, an interpreter of C written in C, is probably not directly executable, but we can imagine the task of executing it by hand. Were it executed, it would be run with two arguments: a program (written in C) and an input to the program.
result = interpC_C prog_C data
To run a program on the machine we use the machine-language version of the program:
result = interpM prog_M data
prog_M and prog_C are required to give the same results when run on the same data. The requirement implies an equivalence criterion:
interpM prog_M == interpC_C prog_C
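The machine-language version is produced by running a compiler, compCtoM_M, on the machine:
prog_M = interpM compCtoM_M prog_C
Substituting for prog_M in the equivalence criterion gives
interpM (interpM compCtoM_M prog_C) == interpC_C prog_C
The inner instance of interpM runs the compiler; the outer instance runs the resulting code.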
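The equivalence criterion can be checked on a toy model. The sketch below is an illustration only: the expression language standing in for C, the stack-machine instructions standing in for machine language, and the names `Expr`, `Instr`, `interpSrc`, `compile` and `agrees` are all assumptions made for the example. Unlike compCtoM_M in the text, `compile` here is an ordinary host-language function rather than a machine program run by interpM.

```haskell
-- Toy source language (stands in for C): expressions over a single input.
data Expr = Input | Lit Int | Add Expr Expr | Mul Expr Expr

-- Toy machine language (stands in for M): stack-machine instructions.
data Instr = PushInput | Push Int | IAdd | IMul

-- Source-level interpreter, playing the role of interpC_C.
interpSrc :: Expr -> Int -> Int
interpSrc Input     x = x
interpSrc (Lit n)   _ = n
interpSrc (Add a b) x = interpSrc a x + interpSrc b x
interpSrc (Mul a b) x = interpSrc a x * interpSrc b x

-- The machine, playing the role of interpM: runs code on a value stack.
interpM :: [Instr] -> Int -> Int
interpM code x = go code []
  where
    go []               (v : _)      = v
    go (PushInput : is) st           = go is (x : st)
    go (Push n : is)    st           = go is (n : st)
    go (IAdd : is)      (b : a : st) = go is (a + b : st)
    go (IMul : is)      (b : a : st) = go is (a * b : st)
    go _                _            = error "malformed machine program"

-- The compiler, playing the role of compCtoM (a host function in this sketch).
compile :: Expr -> [Instr]
compile Input     = [PushInput]
compile (Lit n)   = [Push n]
compile (Add a b) = compile a ++ compile b ++ [IAdd]
compile (Mul a b) = compile a ++ compile b ++ [IMul]

-- The equivalence criterion, checked for one program and one input.
agrees :: Expr -> Int -> Bool
agrees prog x = interpM (compile prog) x == interpSrc prog x

-- Example program: x * x + 1
example :: Expr
example = Add (Mul Input Input) (Lit 1)
```

Evaluating `all (agrees example) [-3 .. 3]` tests the criterion on a handful of inputs; it is a spot check, not a proof that the compiler is correct.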
One example of a program written in C is the compiler itself. It can be built for use on the machine thus:
compCtoM_M = interpM compCtoM_M compCtoM_C
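Now suppose the language C is extended to a language CC, and a compiler for CC is written in C. It can be built with the existing C compiler:
compCCtoM_M = interpM compCtoM_M compCCtoM_C
Since CC is an extension of C, a compiler for it written in C can equally well be regarded as being written in CC: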
compCCtoM_CC = compCCtoM_C
To confirm that the compiler works we use it to recompile itself.
compCCtoM_M' = interpM compCCtoM_M compCCtoM_CC
Although code compiled by the recompiled version (distinguished by the ' symbol) should be the same as that compiled by the previous version, the machine-language texts of the two versions themselves can't be expected to be identical, because they were made by different compilers (compCCtoM vs compCtoM). However, if we do another round of recompilation
compCCtoM_M'' = interpM compCCtoM_M' compCCtoM_C
We expect the texts of the ' and '' versions to agree because they were made by the same compiler (compCCtoM). However, one should be aware that it's possible to hide magic in a compiler so it might, for example, update an embedded version number upon recompilation.
A program p1 is a specialization of a program p2 to a fixed first argument x0 if, for every y,
p1 y == p2 x0 y
For example, the squaring function may be defined to be a specialization of a general integer-power function:
pow :: Int -> Double -> Double
square = pow 2
This naive definition, however, is not likely to be more efficient than calling the power function directly. To get a good square, we need the services of a specializer, called mix for historical reasons. Roughly speaking,
square = mix pow 2
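To make the difference between the naive `square` and the specialized one concrete, here is a minimal sketch. It assumes a particular recursive definition of `pow` for non-negative exponents (the text gives only its type), and it replaces the general mix with a hypothetical `mixPow` that knows how to specialize this one function. The residual-program type `Resid` and the runner `runResid` are likewise inventions for the example.

```haskell
-- One possible definition of the general power function (exponent first).
-- Assumes a non-negative exponent.
pow :: Int -> Double -> Double
pow 0 _ = 1
pow n x = x * pow (n - 1) x

-- Naive specialization: every call still recurses on the exponent.
squareNaive :: Double -> Double
squareNaive = pow 2

-- Residual programs: expressions over the one remaining input x.
data Resid = X | One | Times Resid Resid
  deriving (Eq, Show)

-- A specializer for pow only (hypothetical; a real mix accepts any program).
-- All recursion on the exponent happens here, at specialization time.
mixPow :: Int -> Resid
mixPow 0 = One
mixPow 1 = X
mixPow n = Times X (mixPow (n - 1))

-- Run a residual program on the remaining input.
runResid :: Resid -> Double -> Double
runResid X           x = x
runResid One         _ = 1
runResid (Times a b) x = runResid a x * runResid b x

-- The "good" square: its residual program is just Times X X.
square :: Double -> Double
square = runResid (mixPow 2)
```

Here `mixPow 2` evaluates to `Times X X`: the exponent and the recursion over it have disappeared from the residual program, which is the effect the text asks of mix.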
A program with two inputs would normally be run by an interpreter thus
result = interp prog data1 data2
(Our original definition of interpreter allowed only one data argument. That could be a tuple, which we show here in curried form.)
The specializer converts the program to a residual program:
resid = mix prog data1
We can run the residual thus
result = interp resid data2
The faithfulness criterion that the specializer needs to respect is that the following two computations yield the same result, for every possible data2:
result = interp prog data1 data2
result = interp resid data2
Upon replacing resid by its definition, and omitting data2 from both sides, the faithfulness criterion becomes
interp prog data1 == interp (mix prog data1)
We note that mix prog takes any chosen data1 and makes a version of prog specialized for that datum. Thus we may call mix prog a particularized specializer.
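The faithfulness criterion can be exercised with a very small specializer. The following is a sketch under stated assumptions, not a real mix: programs are expressions over two inputs, `mix` merely substitutes data1 and folds whatever becomes constant, and the names `Expr`, `runResid`, `prog` and `faithful` are invented for the example. Because this `interp` always takes two data arguments, a residual program (which never mentions `In1`) is run with a dummy first input.

```haskell
-- Program texts with two inputs: In1 (known early) and In2 (known late).
data Expr
  = In1 | In2 | Lit Int
  | Add Expr Expr | Mul Expr Expr
  | IfZero Expr Expr Expr          -- IfZero c t f: t if c == 0, else f
  deriving (Eq, Show)

-- interp prog data1 data2
interp :: Expr -> Int -> Int -> Int
interp e d1 d2 = go e
  where
    go In1            = d1
    go In2            = d2
    go (Lit n)        = n
    go (Add a b)      = go a + go b
    go (Mul a b)      = go a * go b
    go (IfZero c t f) = if go c == 0 then go t else go f

-- mix prog data1: substitute data1 for In1 and fold what becomes constant.
mix :: Expr -> Int -> Expr
mix e d1 = go e
  where
    go In1            = Lit d1
    go In2            = In2
    go (Lit n)        = Lit n
    go (Add a b)      = case (go a, go b) of
      (Lit m, Lit n) -> Lit (m + n)
      (a', b')       -> Add a' b'
    go (Mul a b)      = case (go a, go b) of
      (Lit m, Lit n) -> Lit (m * n)
      (a', b')       -> Mul a' b'
    go (IfZero c t f) = case go c of
      Lit 0 -> go t                -- test decided at specialization time
      Lit _ -> go f
      c'    -> IfZero c' (go t) (go f)

-- Run a residual program on the remaining input (first input is a dummy).
runResid :: Expr -> Int -> Int
runResid resid = interp resid 0

-- Example: if data1 == 0 then data2 else data1 * data1 + data2
prog :: Expr
prog = IfZero In1 In2 (Add (Mul In1 In1) In2)

-- The faithfulness criterion for prog, at one choice of data1 and data2.
faithful :: Int -> Int -> Bool
faithful d1 d2 = interp prog d1 d2 == runResid (mix prog d1) d2
```

For instance, `mix prog 3` yields `Add (Lit 9) In2` and `mix prog 0` yields `In2`: the conditional and the multiplication have been resolved in advance, and only work that depends on data2 remains.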
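Now let the program being specialized be an interpreter, and let its first datum be the program it is to interpret:
prog' = mix interp prog
The specialized interpreter can be run thus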
result = interp prog' data1 data2
This doesn't look like we've gotten anywhere: we now run prog' instead of prog; and prog' probably contains some Cheshire grin of interp in addition to prog.
However, if we decorate the formula with what languages are involved at each stage, we see an interesting possibility. prog might have been in some new language L, while the interpreter is written in C, and the specializer is for C. Since prog' is in C, we can call it progC.
progC = mixC interpL_C progL
Notice that mixC interpL_C does exactly what a compiler compLtoC_C would have done. Thus we have schematically
compLtoC_C == mixC interpL_C
However, this compiler is fairly expensive: it requires the interpreter as a hidden input to every compiler run. But an input that is present in every run is itself a candidate for specialization:
mix mix interp
Here mix mix takes an interpreter as argument. The residual program created by the outer mix is a compiler because mix interp is a compiler. Hence mix mix is a maker of compilers, or a compiler-compiler. It can make a compiler for any language for which an interpreter exists, provided the interpreter is written in the language processed by mix. (In our examples that language has been C.)
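The claim that a specialized interpreter behaves like a compiler can be illustrated by staging an interpreter by hand. This is only a sketch of the idea: no automatic mix is involved, the toy language `ProgL` and the names `interpL`, `compileL`, `example` and `target` are assumptions made for the example, and Haskell closures stand in for the residual program text that mixC would emit.

```haskell
-- Toy language L: expressions over one input.
data ProgL = Input | Lit Int | Add ProgL ProgL | Mul ProgL ProgL

-- interpL: dispatches on the syntax tree every time the program is run.
interpL :: ProgL -> Int -> Int
interpL Input     x = x
interpL (Lit n)   _ = n
interpL (Add a b) x = interpL a x + interpL b x
interpL (Mul a b) x = interpL a x * interpL b x

-- Hand-staged interpreter: dispatch on the syntax tree is separated from the
-- arithmetic on the input. The closures for subprograms are built once
-- (lazily, on first use) and then shared by every later run.
compileL :: ProgL -> (Int -> Int)
compileL Input     = \x -> x
compileL (Lit n)   = \_ -> n
compileL (Add a b) = \x -> f x + g x
  where f = compileL a
        g = compileL b
compileL (Mul a b) = \x -> f x * g x
  where f = compileL a
        g = compileL b

-- Example program: (x + 1) * x
example :: ProgL
example = Mul (Add Input (Lit 1)) Input

-- The "compiled" program: reusable across inputs without re-reading example.
target :: Int -> Int
target = compileL example
```

`compileL` is related to `interpL` as compLtoC_C is related to interpL_C in the text: same results for every program and input, with the work that depends only on the program moved to an earlier stage. A real mix would derive such a staged version from the interpreter's text instead of having it written by hand.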
| first projection | mix prog | particularized specializer |
| second projection | mix interp | compiler |
| third projection | mix mix | compiler-compiler |