File Projects.html    Author McKeeman    Copyright © 2007    index

Projects

If you cannot solve the proposed problem
try to solve first some related problem.
Could you imagine a more accessible related problem?
A more general problem?
A more special problem?
An analogous problem?
George Polya, How to Solve It (1945)

The compiler project is to extend X in some way that you find useful. This does not, however, have to be done all in one step. One can, to paraphrase Polya, "try a simpler similar problem first." Extending xcom carries some expectations, such as the ability to do input/output for all types of values and preserving the typesetting pretty printer.

The result of the compiler project is

  1. A live demonstration.
  2. A report signed by all team members.

  • Warm up exercise.

    Add the absolute value function to X. You should follow each of the recommended steps to make a little change in X. This warm up exercise avoids any language design, and any serious pitfalls. It will be hard enough without them.

  • Change xcom

    The recommended method for changing xcom involves the order of events and also the use of the unit tests. When you get stuck, ask for help.

  • Adding a scalar type (relatively easy)

    Suppose you wanted to do more accurate scalar computations.

    • Double Precision

      You might prefer 64-bit IEEE to the current xcom 32-bit IEEE implementation of real variables. In fact, the hardware always does 64-bit arithmetic; xcom goes to some trouble to truncate the results after every operation. Expanding the storage for scalars in the run-time frame is the only substantive part of this extension. Not much to be learned about compiler writing. It is probably reasonable to extend the integer precision to 64 bits at the same time so as to have a consistent storage model.

    • Loooong Integers

      Some computations require many digits of integer accuracy. One can extend xcom without changing the grammar at all by simply changing the meaning of integer constants. The (difficult) implementation of the arithmetic can be left to the BIGNUM package. Your problems are accessing its functions using the same mechanism that xcom currently uses for /, // and rand and also how to handle the operands that BIGNUM gives you.

    • Extended Accuracy

      The BIGNUM package offers much more than big integers. One could implement big rational numbers or super-accurate reals.

    • Complex

      Replace real with complex, or add a new type complex. You may want to reserve "i" for the square root of -1. Do not push too much off onto the lexer. Expression

          a+i*b
        
      is already syntactically correct. You might want to add builtin functions modulus and phase.

    • People

      One group decided to make a scalar type for people. The language had to do with relationships. Identifiers starting with a single uppercase letter such as John were constants.

  • Adding an operator

    While one can invent arbitrary operators composed out of ASCII characters, it is often best to overload an existing operator to keep X from getting cluttered. The addition of set union and string catenation are discussed below under the corresponding data structures. The warm up exercise is an example where the operator is abs.

    Consider

    • adding real-valued builtins inf and nan and logical-valued builtins isinf and isnan.
    • adding   ~~   &&   ||   to mean short-circuit logical evaluation
    • extending   ~   & and   |   to type integer (giving 32-bit results). Pehaps add ^ meaning exclusive or.

    All of mathematics awaits your implementation.

  • Adding a data structure

    The X language has only scalar variables. This makes it nearly impossible to write useful programs. One is, then, highly motivated to add structured data variables.

    You should first write a number of X programs in your proposed extension. This provides both a set of tests and a confirmation that you will actually find the extension useful. You should then extend the X reference manual to document your language. If you do not want to grapple with LaTeX, you can publish an HTML appendix instead. Then you are ready to extend xcom. Here are some suggestions but don't let them interfere with your creativity. By the way, python has worked out many of the language design problems for data structures. It is OK to peek.

    • Sets of scalar values.

      One can imagine writing

          X := {1,2,3};
          y := choice(X);       ` a randomly chosen member
          Z := {i2r(y)} | {3.141592};
          if ~(4.0 in Z) ? fi;  ` assert 4.0 is NOT there
          X := X - {y};         ` discard y from X
        
      wherein finite sets of values are allowed. In essence this means adding a few new symbols to X.cfg, three new types (logicalSetType, integerSetType, realSetType) to the symbol table, and then carrying through the implementation.

      A set computation will necessarily allocate new storage for each result. The storage needs to be accessible from your emitted code, so it is more convenient to let C allocate and free the storage. The mechanism for getting into and out of C is in MEX file getCfun.c. You might consider how to avoid storage leaks. You might also consider a set-complement operation.

      xcom contains a convenient mechanism for dropping into C for such actions (see the implementaton of /, // and rand). Note the implied overloading of the '|' operator to mean set union above. You may, of course, choose one of many other ASCII notational devices, perhaps '||' or '|_|' or '\/' or 'U'.

      The sets idea quickly leads to sets of anything, including sets. The mathematics can get deep fast. Step carefully.

      The pretty printing typesetter can choose different glyphs to represent a single operator ('|', for example might map into logical or   ∨   for booleans and set union   ∪   when the operands are sets).

    • Strings

      One can imagine writing

          X := "abc" + "def";
          y := 3;
          Y := X[y:y+1];
          z := c2i(X[2]);
        

    • Vectors of scalar values

      One can imagine writing

          y := 3;
          X := [1,2,y];
          Y[3] := X[y-1];
        

      As in MATLAB, vectors expand storage to hold any value sent their way. Accessing an undefined value causes a run error.

    • Lists of lists

      One can imagine writing

            y := 3;
            X := <1,2,y>;
            Y := <X, <7,9.0>> + <3.12>;
            Z := tail(Y);
          
  • Doing science

    Not all computation are just grown-up FORTRAN. Some science requires algebraic manipulations. Some has data that is nothing like the computers give us. Compiler writers are rarely competent in such science, and such scientists are rarely competent in compiler writing. There is an opportunity for a big advance here.

    If one is going to achieve Dijkstra's "compelling and deep logical beauty," one must forgo much of the paraphernalia of conventional general purpose programming languages. This spartan attitude is most likely to succeed when the objective of the language is rigorously limited, the linguistic additions are few, and additions are well suited to meeting that objective. See some ideas on computer language design to get a fresh perspective.

    • Chemistry

      A chemist writes formulas in which the operands are chemicals. The equations enforce constraints and imply reactions. The HTML presentation here does not do justice to this idea but X can be typeset by xcom or even rendered by xcom graphics.

      water := 2H + O;
      h := water.enthalpy;

      methane := C + 2H2

      caffeine.name = '3,7-Dihydro-1,3,7-trimethyl-1H-purine-2,6-dione'
      caffeine.diagram

  • Little Languages

    Many problems are easier to solve with a little language . There are many successful examples. Instead of modifying xcom, you start from scratch. All things considered, it best to make this your second project.