define(`_',`dnl')_ unobtrusive dnl, used for readability define(`__',`')_ indentation macros define(`____',`')_ think of them as spaces that can't define(`______',`')_ stick around and get in the way _ _ This program is distributed under a Creative Commons _ Share-alike license. An accompanying suite of tests _ may be run thus: "m4 barem4.m4 testbarem4.m4". _ _ _ Pure macros as a programming language _ _ m4 is Turing complete even when stripped to the bare minimum _ of one builtin: `define'. This is not news; Christopher _ Strachey demonstrated it in his ancestral GPM, described in _ "A general- purpose macrogenerator", The Computer Journal 8 _ (1965) 225-241. _ _ This m4 program more fully illustrates universality by _ building familiar programming capabilities: unlimited _ precision integer arithmetic, boolean algebra, conditional _ execution, case-switching, and some higher-level operators _ from functional programming. In support of these normal _ facilities, however, the program exploits some unusual _ programming idioms: _ _ 1. Case-switching via macro names constructed on the fly. _ 2. Equality testing by redefining macros. _ 3. Representing data structures by nested parenthesized lists. _ 4. Using macros as associative memory. _ 5. Inserting nested parameter symbols on the fly. _ _ Idioms 2 and 5 are "reflective": the program writes code _ for itself. _ _ Note. There is a `dnl' on every line of this program. The _ indulgence of using this builtin is logically unnecessary. _ Stripped of `dnl's and indentation macros, the program would _ consist entirely of `define's with no comments or line breaks. _ It would be less than 1/4 the current size and would yield the _ same results, but would be inscrutable. _ _ In examples of stepwise macro expansion, all occurrences of _ `dnl' and indentation have been elided. _ _ For consistency, we quote all arguments of `define' unless _ they contain macro calls that must be expanded at the _ time of definition. _ _ Case switch (Idiom 1) _ _ An essential feature of programming is conditional execution. _ Suppose there are predicates that yield `T' or `F' for true _ or false. A conditional switch of the form _ _ if(, `', `') _ _ constructs calls for `if_T', or `if_F' that select the _ appropriate action: _ define(`if', `if_$1(`$2', `$3')')_ define(`if_T', `$1')_ define(`if_F', `$2')_ _ _ Example _ _ if(T,`yes',`no') => if_T(`yes',`no') => yes _ if(F,`yes',`no') => if_F(`yes',`no') => no _ _ (An arrow => signifies a single macro-expansion step.) _ _ Again for consistency, we quote the second and third arguments _ of `if' unless necessity dictates otherwise. _ _ Basic Boolean functions are easy to define in terms of `if'. _ define(`not', `if(`$1', `F', `T')')_ define(`and', `if(`$1', `$2', `F')')_ define(`or', `if(`$1', `T', `$2')')_ define(`nor', `not(or($1,$2))')_ _ _ List representation (Idiom 3) _ _ In order to provide access to individual members, sequences _ of data values may be represented as right-associated lists. _ In particular, binary integers may be represented in this _ manner: _ _ (1,(0,(1,(1,())))) _ _ To facilitate arithmetic algorithms, the presentation is _ little-endian. In customary base-2 notation the example _ becomes 1101 (decimal 13). An empty list `()' acts as a _ terminator. Zero is normally represented by the empty _ list. Other values normally end in (1,()). Although _ insignificant 0 bits after the last 1 in a list are _ generally harmless, algorithms should not gratuitously _ create them. _ _ Data may be saved under arbitrary names (Idiom 4). Here _ we give more readable names for some small numbers. _ Quotes have been omitted from the definition for `two', _ so that `one' gets expanded at definition time rather _ than at each call of `two', paying a small amount of _ space to save time at each call. _ define(`zero', `()')_ define(`one', `(1,())')_ define(`two', (0,one))_ _ _ Individual list elements may be accessed by a pun, in which _ the outer parentheses of a list are taken to be the _ argument-list delimiters in a macro call. Two basic macros _ use this scheme to extract "head" and "tail" parts, _ and , from a nonempty list: _ _ head((,)) ==> _ tail((,)) ==> _ _ (A long arrow ==> signifies more than one macro expansion.) _ define(`head', `_head$1')_ define(`_head', `$1')_ _ define(`tail', `_tail$1')_ define(`_tail', `$2')_ _ _ Examples _ _ head(two) => head((0,(1,()))) => _head(0,(1,())) => 0 _ tail(two) => tail((0,(1,()))) => _tail(0,(1,())) => (1,()) _ _ According to the rules of m4, `head' and `tail' also work on _ the empty list, `()', in which case both functions yield an _ empty string, `'. This behavior will be exploited. _ _ Testing equality of primitive values (Idiom 2) _ _ The `if' macro checks which of two possible values (`T' or `F') _ appears as its first argument. More trickery is required to _ check equality of arbitrary primitive values--sequences of _ characters that can occur in macro names, e.g; `Abc', `3_2_1'. _ Following Lisp, we call such a sequence an "atom". _ _ We first define the outcome of the test of the given atoms to _ be F. Then we define the test of one of the atoms against _ itself to be T. If the two atoms are the same, the second _ definition replaces the first. Finally we call the first. _ define(`eq',_ __`define(`eq_$1_$2', `F')'_ __`define(`eq_$1_$1', `T')'_ __`eq_$1_$2')_ _ _ Examples _ _ eq(a,b) => define(`eq_a_b',`F')define(`eq_a_a',`T')eq_a_b _ ==> eq_a_b => F _ eq(a,a) => define(`eq_a_a',`F')define(`eq_a_a',`T')eq_a_a _ ==> eq_a_a => T _ _ If the T or F result of `eq' or other boolean-valued macro _ is to be juxtaposed with succeeding text, care must be _ taken to prevent unwanted catenation. An intervening _ quoted empty string suffices to make the separation. _ _ eq(a,b)`'eq(c,d) ==> FF _ eq(a,b)eq(c,d) ==> Feq(c,d) _ _ Unfortunately the temporary definitions stick around as _ "ghosts" that waste memory. We could relax our notion of _ purity and use the builtin macro `undefine' to clean up. _ _ A test for the empty list: _ define(`isempty', `eq(head($1), `')')_ _ _ Equality of lists of atoms can be tested by `eql', _ which uses `eq' recursively. The indentation in this _ definition and others like it is intended to imitate a _ sequence of else-less if's. _ define(`eql',_ __`if(isempty($1), `isempty($2)',_ __`if(eq(head($1),head($2)), `eql(tail($1),tail($2))',_ __`F')')')_ _ _ Example _ _ eql((W,(H,(O,()))), (W,(H,(O,(M,()))))) _ => if(isempty((W,(H,(O,())))), ...) _ => if(eq(W,`'), ...) _ ==> if(eq(W,W), `eql((H,(O,())),(H,(O,(M())))), ...) _ ==> if(isempty(()), `isempty((M,())),, ...) _ ==> F _ _ Here we use `eq' to define a predicate for testing whether _ a number, possibly with insignificant zero digits, is zero. _ define(`iszero',_ __`if(isempty($1), `T',_ __`if(eq(head($1),0), `iszero(tail($1))',_ __`F')')')_ _ _ Example _ _ iszero(()) _ ==> if(eq(,),`T',`if(...)') _ ==> if(T,`T',`if(...)' ==> T _ _ iszero((0,(1,()))) _ ==> if(eq(0,),...,`if(eq(head((0,(1,()))),0),...)') _ ==> iszero(tail(0,(1,()))) _ ==> iszero((1,())) ==> F _ _ Basic to all arithmeic is the successor function. _ define(`succ',_ __`if(iszero($1), one,_ __`if(eq(head($1),0), `(1,tail($1))',_ __`(0,succ(tail($1)))')')')_ _ define(`three', succ(two))_ define(`four', succ(three))_ _ _ In the definition of `succ', the behavior for each of the _ three possible values of the head digit of the argument _ (empty, `0' and `1') is presented in a different context: _ as the first branch of an `if', as the first branch of _ a nested `if', and as the second branch of a nested `if'. _ Parentheses become hard to match. It would be cleaner to _ use a 3-way switch. _ _ Case-switching on attributes of arguments _ _ Unfortunately, overt switch code relies on Idiom 1, a _ glaring deviation from the mainstream functional style _ used to define `iszero' and `succ'. A remedy is at hand: _ higher-level switch-defining macros that hide Idiom 1. _ _ Switching is based on one or more "decision attributes", _ atomic values derived from the arguments. The type of _ a switch depends on the numbers of decision attributes _ and parameters. _ _ switch decision parameters _ type attributes _ switch1of1 1 1 _ switch1of2 1 2 _ switch2of2 2 2 _ _ Other switch types are named similarly. _ _ Warning. The discussion from here through the definition _ of `switch1of1' may be heavy going. You may wish to skim _ it then skip around to bring the whole into focus. _ _ Each switch declaration defines a function and the _ decision attributes expressed as if they were in _ the body of the macro to be declared. A switch _ declaration must be accompanied by `case' macros _ for all valid sets of values of the decision _ attributes. (`case', is merely a distinctive _ synonym for `define'.) _ define(`case', `define(`$1', `$2')')_ _ _ The name of a case macro comprises the name of the _ original function and values of the decison _ attributes separated by underscores. For example, _ to define `iszero' in switch style, the decision _ attribute is the head of the argument; and we must _ define cases, `iszero_', `iszero_0' and `iszero_1' _ for the three possible values of the attribute _ _ switch1of1(`iszero',`head($1)')_ _ case(`iszero_', `T')_ _ case(`iszero_0', `iszero(tail($1))')_ _ case(`iszero_1', `F')_ _ _ In the definition of the switch-declaring macro, _ `switch1of1', $1 is the name of the function to be _ declared, notionally like this _ _ define(`switch1of1',_ _ __`define(`$1', `(`$1',)()')') _ _ where pastes together the function name and a _ value of the decision attribute, . The constructed name _ will be called with argument . When $1 is `iszero', say, _ is `head($1)' and is $1. Unfortunately $1 _ already means `iszero'. To resolve the clash, we arrange _ for the appearance of $1 in to be inserted only when _ the inner `define' is executed. Thus such instances of $1 _ do not occur in the replacement text of `switch1of1'. _ _ The trick (Idiom 5) is to replace $ by a "dollar macro", _ `DOL()', which expands to `$'. _ define(`DOL',``$$1'')_ _ _ The doubled quotes will be explained later. For `DOL' to be _ expanded when the inner `define' is executed, the two _ constructs must appear at the same quotation level. This _ typically entails flanking `DOL' with outward-facing quotes _ as in the definition of `switch1of1' below. _ _ As the number of decision attributes differs among switch _ types, needs multiple realizations. We call the _ macro that expands to a for one decision _ attribute `_switch1'. _ define(`switch1of1',_ __`define(`$1', `_switch1(`$1',$2)('DOL(1)`)')')_ define(`_switch1', `$1_$2')_ _ _ Here is the successor function programmed in switch style. _ switch1of1(`succ', `head($1)')_ case(`succ_', one)_ case(`succ_0', `(1,tail($1))')_ case(`succ_1', `(0,succ(tail($1)))')_ _ _ Examples _ _ switch1of1(`succ', `head($1)') _ ==> define(`succ', `_switch1(`succ',head($1))($1)') _ _ succ((1,())) _ => _switch1(`succ', head((1,()))((1,()))) _ => _switch1(`succ', 1)((1,()))) _ => succ_1((1,())) _ => (0,succ(tail(1,()))) _ ==> (0,succ(())) _ ==> (0,(1,())) _ _ Different numbers of switching attributes and parameters _ are needed for defining other functions. The following _ definitions are routine variations on `switch1of1'. _ define(`switch2of2',_ __`define(`$1',_ ____`_switch2(`$1',$2,$3)('DOL(1)`,'DOL(2)`)')')_ define(`_switch2', `$1_$2_$3')_ _ define(`switch1of2',_ __`define(`$1',_ ____`_switch1(`$1',$2)('DOL(1)`,'DOL(2)`)')')_ _ define(`switch1of3',_ __`define(`$1',_ ____`_switch1(`$1',$2)('DOL(1)`,'DOL(2)`,'DOL(3)`)')')_ _ define(`switch2of3',_ __`define(`$1',_ ____`_switch2(`$1',$2,$3)('DOL(1)`,'DOL(2)`,'DOL(3)`)')')_ _ _ Here we define some basic arithmetic functions. For _ illustrative purposes, addition is defined by switching _ on both arguments, while multiplication is defined by _ switching on only the first argument. Both definitions _ can produce insignificant zero bits when inputs have _ insignificant zero bits. _ switch2of2(`add', `head($1)',`head($2)')_ case(`add__', `()')_ case(`add__0', `$2')_ case(`add__1', `$2')_ case(`add_0_', `$1')_ case(`add_0_0', `(0,add(tail($1),tail($2)))')_ case(`add_0_1', `(1,add(tail($1),tail($2)))')_ case(`add_1_', `$1')_ case(`add_1_0', `(1,add(tail($1),tail($2)))')_ case(`add_1_1', `(0,succ(add(tail($1),tail($2))))')_ _ define(`mul', `if(iszero($2), `()', `_mul($1,$2)')')_ switch1of2(`_mul', `head($1)')_ case(`_mul_', `()')_ case(`_mul_0', `(0,_mul(tail($1),$2))')_ case(`_mul_1', `add($2,_mul_0($1,$2))')_ _ _ Subtraction. Because of the lack of negative numbers, _ we provide a "diminish" function, `dim', that clamps _ negative results at zero. _ _ dim(a,b) = max(a-b,0) _ _ The definition of `_dim' exploits the fact that $1>$2 _ when it is called from `dim'. Thus at every recursive _ level $1>=$2. The last line of `_dim', where $1=(0,...) _ and $2=(1,...), implements traditonal "borrowing" from _ the tail of $1 by incrementing the tail of $2. _ define(`dim', `if(le($1,$2), `()', `trim(_dim($1,$2))')')_ define(`_dim',_ __`if(iszero($2), `$1',_ __`if(eq(head($1),head($2)), `(0,_dim(tail($1),tail($2)))',_ __`if(eq(head($1),1), `(1,_dim(tail($1),tail($2)))'_ __,`(1,_dim(tail($1),succ(tail($2))))')')')')_ _ _ `if' can be defined via `switch1of3', however the resulting _ implementation is less efficient than the original. _ _ switch1of3(`if',`$1')_ _ case(`if_T', `$2')_ _ case(`if_F', `$3')_ _ _ The newer definition depends crucially on the doubled _ quotes in the definition of `DOL' to protect the _ arguments of a generated call of a case macro from _ being scanned for macro calls, which would result _ in unintended executions. _ _ `rev' reverses the order of elements in a list. The _ result is accumulated in the second argument of `_rev'. _ define(`rev', `_rev($1,())')_ define(`_rev',_ __`if(isempty($1), $2,_ __`_rev(tail($1),(head($1),$2))')')_ _ _ `trim' deletes insignifcant zero bits from a number. _ define(`trim',_ __`if(eq(head($1),`'), `()',_ __`_trim(head($1),trim(tail($1)))')')_ define(`_trim',_ __`if(and(isempty($2),eq($1,0)), `()',_ __`($1,$2)')')_ _ _ Example _ _ trim(rev((0,(0,(1,()))))) _ ==> trim(_rev((0,(0,(1,()))),())) _ ==> trim(_rev((0,(1,())),(0,()))) _ ==> trim((1,(0,(0,())))) _ => _trim(1,(0,(0,()))) _ ==> (1,()) _ _ For human consumption, `base2' converts results from _ lugubrious list form to ordinary binary notation. _ define(`base2',_ __`if(eq(head($1),`'), `0', `_base2($1)')')_ define(`_base2',_ __`if(eq(head($1),`'), `', `_base2(tail($1))head($1)')')_ _ _ A counter value kept in a macro (per Idiom 4) may be _ updated by `incr' or read out by expanding it. _ define(`incr', `define(`$1',succ($1))')_ _ _ Examples _ _ base2((0,(1,(0,())))) ==> 010 _ base2(trim((0,(1,(0,()))))) _ ==> base2((0,(1,()))) ==> 10 _ _ define(mycounter,zero) _ incr(`mycounter') _ incr(`mycounter') _ base2(mycounter) ==> 10 _ _ `nth(,)' yields item number (counted from 0) in list _ , or `' if is too big. _ define(`nth',_ __`if(isempty($2), `',_ __`if(iszero($1), `head($2)',_ __`nth(dim($1,one),tail($2))')')')_ _ _ Some functional programming staples are `map', `fold' and _ `zip'. `map' applies a given function to every element of a _ list. Its use with m4 is hampered by the restriction of list _ elements to atoms. _ _ map(, (,(,...(,())))) = _ ((),((),...((),())...)) _ define(`map',_ __`if(isempty($2), `()',_ __`($1(head($2)),map(`$1',tail($2)))')')_ _ _ Folds combine all the elements of a list into a single value by _ applying a binary operation iteratively, beginning with , _ and proceeding in right-associative order for `foldr' and left- _ associative for `foldl'. _ _ foldr(, , (,(,...(,())))) = _ (...()...) _ is a function of two variables, written here in infix form _ define(`foldr',_ __`if(isempty($3), $2,_ __`$1(head($3),foldr(`$1',`$2',tail($3)))')')_ _ _ foldl(, , (,(,...(,())))) = _ (...(())...) _ define(`foldl',_ __`if(isempty($3), $2,_ __`foldl(`$1',$1($2,head($3)),tail($3))')')_ _ _ `trim', `rev' and the `base2' workhorse function `_base2' _ can be defined as folds. `_trim' is as before. _ define(`trim', `foldr(`_trim',`()',`$1')')_ _ define(`rev', `foldl(`_rev',`()',`$1')')_ define(`_rev',`($2,$1)')_ _ define(`_base2', `foldr(`_base2_',`',`$1')')_ define(`_base2_',`$2$1')_ _ _ Length of a list defined as a fold. The fold calls `succ' _ with two arguments; `succ' uses only the first one. _ define(`length', `foldl(`succ',zero,$1)')_ _ _ zip(, , ) makes a list of (,) where , _ are corresponding elements of lists and _ define(`zip',_ __`if(isempty($2), `()',_ __`if(isempty($3), `()',_ __`($1(head($2),head($3)),zip(`$1',tail($2),tail($3)))')')')_ _ _ Iteration _ _ We define an iterative "for statement" that repeatedly calls _ a macro with an indexed argument. Its meaning may be described _ informally thus: _ _ for(lo,hi,`func') = for i from lo to hi do func(i) _ define(`for',_ __`if(gt($1,$2), `',_ __`$3($1)'_ __`for(succ($1),$2,`$3')')')_ _ _ `forA' allows a further argument to be passed to `func'. _ A hack for passing multiple arguments is to hide the _ separating commas in quotes. _ _ forA(lo,hi,`func',arg) = for i from low to high do func(arg,i) _ define(`forA',_ __`if(gt($1,$2), `',_ __`$3($4,$1)'_ __`forA(succ($1),$2,`$3',`$4')')')_ _ _ The double loop below prints a binary addition table. _ _ define(`table', `for(zero,one,`row')') _ define(`row', `forA(zero,one,`cell',$1)newline()') _ define(`cell', `base2(add($1,$2))tab()') _ define(`tab', ` ') _ define(`newline',` _ ')_ _ _ Example _ _ table() ==> _ 0 1 _ 1 10 _ _ _ Comparison functions _ _ Numerical comparison operators may be built on `compare', _ a function that yields `LT', `EQ', or `GT' according as _ its first argument is less than, equal to, or greater _ than the second. The auxiliary function `_compare' breaks _ ties between tails. _ switch2of2(`compare', `head($1)',`head($2)')_ case(`compare__', `EQ')_ case(`compare__0', `if(iszero($2), `EQ', `LT')')_ case(`compare__1', `LT')_ case(`compare_0_', `if(iszero($1), `EQ', `GT')')_ case(`compare_0_0', `compare(tail($1),tail($2))')_ case(`compare_0_1', `_compare(compare(tail($1),tail($2)), LT)')_ case(`compare_1_', `GT')_ case(`compare_1_0', `_compare(compare(tail($1),tail($2)), GT)')_ case(`compare_1_1', `compare(tail($1),tail($2))')_ _ define(`_compare', `if(eq($1,EQ), `$2', `$1')')_ _ _ The customary two-way comparison predicates can be defined _ in terms of `compare'. A typical example is the "less than" _ predicate, `lt'. _ switch1of2(`lt',`compare($1,$2)')_ case(`lt_LT', `T')_ case(`lt_EQ', `F')_ case(`lt_GT', `F')_ _ _ To avoid the tedium of six such declarations, we _ capture their pattern in `mkcomp'. `mkcomp' is a _ third-order function; it defines definers of _ numerical comparison predicates. _ define(`mkcomp',_ __`switch1of2(`$1',`compare($'`1,$'`2)')'_ __`case(`$1_LT', `$2')'_ __`case(`$1_EQ', `$3')'_ __`case(`$1_GT', `$4')')_ _ mkcomp(`lt', `T', `F', `F')_ mkcomp(`le', `T', `T', `F')_ mkcomp(`et', `F', `T', `F')_ equal to, differs from eq mkcomp(`ne', `T', `F', `T')_ mkcomp(`ge', `F', `T', `T')_ mkcomp(`gt', `F', `F', `T')_ _ _ Example _ _ et((0,()),()) ==> T _ eql((0,()),()) ==> F _ _ Comparing atoms for, say, alphabetic order is not a _ simple task; we need somehow to decide the question _ consistently for every possible pair of values. The _ function `cmp' does so for single alphanumerics by _ finding which member of the pair comes first in a _ list of the alphabet. _ define(`cmp',_ __`if(eq(`$1',`$2'), `EQ',_ __`_cmp(`$1',`$2',head(alphabet),tail(alphabet))')')_ _ define(`_cmp',_ __`if(eq(`$1', `$3'), `LT',_ __`if(eq(`$2', `$3'), `GT',_ __`if(isempty($4), `?',_ __`_cmp(`$1',`$2',head(`$4'),tail(`$4'))')')')')_ _ define(`alphabet',_ `(`0',(`1',(`2',(`3',(`4',(`5',(`6',(`7',(`8',(`9',_ (`A',(`a',(`B',(`b',(`C',(`c',(`D',(`d',(`E',(`e',_ (`F',(`f',(`G',(`g',(`H',(`h',(`I',(`i',(`J',(`j',_ (`K',(`k',(`L',(`l',(`M',(`m',(`N',(`n',(`O',(`o',_ (`P',(`p',(`Q',(`q',(`R',(`r',(`S',(`s',(`T',(`t',_ (`U',(`u',(`V',(`v',(`W',(`w',(`X',(`x',(`Y',(`y',_ (`Z',(`z',()))))))))))))))))))))))))))))))))))))))_ ))))))))))))))))))))))))')_ _ _ `alphabet' is fragile. The quotes in `alphabet' _ disappear from the tail copied as an argument of _ `_cmp'. Then one-letter macro names will affect _ the alphabet and ruin comparisons as here, where _ A has effectively been eliminated from the alphabet. _ _ define(`A',`B')cmp(`z',`A') ==> LT _ _ Although it is possible with more processing to _ retain quotes in each copied tail, `cmp' is already _ costly in both time and space. Iteration over the _ alphabet involves copying N tails of decreasing _ length, for a total volume of O(N^2) list elements, _ where N is the size of the alphabet. Each such _ iteration creates ghost definitions, up to N^2 _ of which can accumulate over multiple executions _ of `cmp'. _ _ Ghost definitions are not only numerous; they are _ repeatedy redefined. If all alphanumeric pairs are _ equally likely, an average of about 120 redefinitions _ and 60 copies of tails will occur for each execution _ of `cmp'. _ _ The overheads of redefinition and alphabet-copying can _ be avoided by using an N^2-way switch instead of linear _ search in the alphabet. We automate the construction _ of the N^2 case declarations with `mkcmp' and a few _ helper functions defined below. _ _ Making the case macros all at once will in the long _ run beat making a lot of ghost definitions for every _ character comparison. Moreover, the noted fragility _ of `alphabet' may be mitigated, for no comparison _ macros refer to `alphabet' after the 3844 case _ macros for `cmp' have been defined. _ switch2of2(`cmp', `$1', `$2')_ _ _ The declaration of case macros is automated by`mkcmp', _ defined below after some helper macros. _ _ `mkcmp0' makes comparison functions for $1 against _ $2, which is known to follow $1 in alphabetic order _ define(`_mkcmp0',_ __`case(`cmp_$1_$2', `LT')'_ __`case(`cmp_$2_$1', `GT')')_ _ _ `_mkcmp1' makes comparison functions for $1 against all _ characters in list $2. _ define(`_mkcmp1',_ __`if(isempty($2), `',_ ____`_mkcmp0($1,head($2))'_ ____`_mkcmp1($1,tail($2))')')_ _ _ `_mkcmp2' makes comparison functions for all pairs _ constructible from $1 and elements of list $2 of _ alphabetically following characters. _ define(`_mkcmp2',_ __`case(`cmp_$1_$1', `EQ')'_ __`if(isempty($2), `',_ ____`_mkcmp1($1,$2)'_ ____`_mkcmp2(head($2),tail($2))')')_ _ define(`mkcmp', `_mkcmp2(head($1),tail($1))')_ mkcmp(alphabet)_ _ _ Strings can be defined as lists of characters, and _ compared lexicographically by a list-comparison _ function, `cmpl'. For the benefit of string _ manipulation, one-character macro names should be _ avoided. _ define(`cmpl',_ __`if(and(isempty($1),isempty($2)), `EQ',_ __`if(isempty($1), `LT',_ __`if(isempty($2), `GT',_ __`_cmpl($1,$2)')')')')_ _ switch1of2(`_cmpl',`cmp(head($1),head($2))')_ case(`_cmpl_LT', `LT')_ case(`_cmpl_EQ', `cmpl(tail($1),tail($2))')_ case(`_cmpl_GT', `GT')_ _ _ Example _ _ cmpl((A,(N,())), (A,(M,()))) _ ==> _cmpl((A,(N,())), (A,(M,()))) _ ==> switch1(`_cmpl', cmp(A,A))((A,(N,())), (A,(M,()))) _ ==> _cmpl_EQ((A,(N,())), (A,(M,()))) _ ==> cmpl(tail((A,(N,()))), tail((A,(M,())))) _ ==> GT _ _ For An N-character alphabet, `mkcmp' creates N^2 macros, _ and `cmp' takes O(1) time. A set of N macros, `ord_', _ that give the ordinal position of each character _ in the alphabet would achieve a different time-space _ tradeoff, in which `cmp' becomes `compare(ord_$1,ord_$2)' _ and takes O(log N) time. _ _ _ Universality _ _ It's not hard to envision simulating a simple computer. Thus, _ aside from resource limitations, bare m4 is Turing complete. _ Here's a workable plan: _ _ The program counter is maintained by `incr', and converted _ to string form by `base2'. _ _ Word addresses are binary numbers. _ _ The contents of word n are held in a macro named W, where _ is the base-2 form of the address. _ _ Words need not be limited in size. Numbers may be kept _ in the form (sign,magnitude). (Caveat: the sign can't _ be `+' or `-', because programs can only switch on _ alphanumerics.) _ _ Instructions may be of variable length--one word of opcode, _ and following words that contain immediate operands or _ addresses expressed in binary form. _ _ Instruction codes are a limited set of binary numbers , _ and may be switched on by converting them to base-2 form I. _ _ An assignment redefines the designated word. _ _ Other instructions are programmed in styles illustrated above. _ _ Each instruction updates the location counter appropriately. _ _ A control macro fetches an opcode according to the location _ counter. If the opcode is a halt instruction, the control _ terminates. Otherwise it calls the designated operation, _ then calls the control recursively. _ _ Eric Blake has carried out this general plan for an _ "intcode" computer and has been able to run impressive _ demos; see https://adventofcode.com/2019/day/2 and _ https://repo.or.cz/aoc_eblake.git/commitdiff/ _ 798d9cd09c7b499316713bbe064b420544fbc0f8 _ _ _ Footnotes _ _ Some efficiency may be gained by replacing separate calls _ of `head' and `tail' with a single call of a `split' macro, _ at some loss of clarity. In the version of the succcessor _ function below, `_succ' has head and tail arguments, yet _ no commas are visible in calls for `_succ'. _ define(`split', `_split$1')_ define(`_split', `$1,$2')_ _ define(`succ', `_succ(split($1))')_ switch1of2(`_succ', `$1')_ case(`_succ_', one)_ case(`_succ_0', `(1,$2)')_ case(`_succ_1', `(0,_succ(split($2)))')_ _ _ _ Acknowledgement and Colophon _ _ I am indebted to Eric Blake for his careful reading of, and _ insightful comments on a previous version of this program. _ He inspired the current implementation of the critical _ dollar macro and suggested `split'. _ _ M. Douglas McIlroy _ Dartmouth College _ August 2020 _ January 2021: comments tweaked _ September 2024: replace eq to work on arbitrary atoms _ June-July 2025: revise switch declarations; add iteration, _ comparison functions, map, fold, zip and appendix _ _ Appendix _ _ This appendix rounds out the usual complement of _ arithmetic operators on natural numbers. No new _ programming techniques are introduced. Like `add' and _ `mul', these operators yield "normal" numbers with _ no insignificant zeros when given normal inputs, but _ are not guaranteed to do so otherwise. _ _ Square. _ define(`square', `mul($1,$1)')_ _ _ Exponentiation. pow(a,n) returns a^n. If n=b+2*n', where _ b is 0 or 1, pow(a,n) = a^b*pow(a^n')^2. As the code is _ written, 0^0 is taken to be 1. _ define(`pow',_ __`if(iszero($2), `one',_ __`if(eq(head($2),0), `square(pow($1,tail($2)))',_ __`mul($1,square(pow($1,tail($2))))')')')_ _ _ Division. The customary integer operators `div' and `mod'_ _ are defined in terms of a quotient-remainder operator, `qr'. _ define(`div',`head(qr($1,$2))')_ define(`mod',`tail(qr($1,$2))')_ _ _ qr(n,d) yields a pair (q,r), the integer quotient and _ remainder for the division operation n/d, where d>0. _ The result is expressed in "strict (q-r) form". _ _ For expository purposes, we assume d fixed and write n _ in "q-r form", [q,r], where n = q*d+r. When r