CS118(2) homework #2

Monad-based parsing

Here is a recursive-descent parser for part of the language X. Besides standard library functions, it imports parsing primitives and combinators from Parser.hs. An easy way to get Parser.hs into scope for importing is to copy it to your working directory.

The parser closely follows the grammar given in in ~/mckeeman/xcom/fe/cfg/X.cfg, with the exception that it does not allow silly assignments with empty left-hand sides.

The parser, X.hs

-- Parser for McKeeman's X, above the level of expression

import Parser
import Data.Char	-- for isAlpha, etc
import System(getArgs)

-- ------------------- TOKENS ------------------------

letter = char isAlpha

white, identifier, integer :: Parser String

white = some (char (`elem` " \t\n")) `orelse` nil

identifier = do
		c <- letter
		cs <- some (char isAlphaNum) `orelse` nil
		return (c:cs)

integer = some (char isDigit)

keyword s = do
		white
		string s
		white

eof :: Parser ()
eof = do	
		b<-isPresent next
		if b then zero else return ()

-- ------------- SYNTACTIC CATEGORIES ----------------------

data Cat = Prog [Cat]
	 | If [Cat]
	 | Do [Cat]
	 | Cond Expr [Cat]
	 | Assign [Var] [Expr]
	 | Call [Var] Name [Expr]
	 | Bad
	   deriving Show

data Expr = Expr Name deriving Show
type Var = Name
type Name = String

program :: Parser Cat
program = do
		white
		xs<-stmts
		eof
		return (Prog xs)
	  `orelse` return Bad

stmts = seplist stmt (keyword ";")

stmt = selection `orelse` iteration `orelse` assignment

selection = do
		keyword "if"
		as<-alts
		keyword "fi"
		return (If as)

iteration = do
		keyword "do"
		as<-alts
		keyword "od"
		return (Do as)

alts = seplist alt (keyword "::")

alt = do
		g<-expr
		keyword "?"
		xs<-stmts
		return (Cond g xs)

expr = do
		i<-identifier
		return (Expr i)

exprs = seplist expr (keyword ",")

vars = do
		vs<-seplist identifier (keyword ",")
		return vs

assignment = (do
		vs<-vars
		keyword ":="
		p<-identifier
		keyword ":="
               	es<-(exprs `orelse` nil)
		return (Call vs p es))
     `orelse`(do
		vs<-vars
		keyword ":="
		es<-exprs
		return (Assign vs es))

-- --------------- INTERACTIVE TESTING -----------------------

{- Read programs, one per line, and print their parses.
   Interrupt to terminate. 

   Hugs usage (assuming this code is in Xparse.hs
  	hugs Xparse.hs
  	tests
  	
   GHC usage--warning: ghci does raw IO by default
  	ghci Xparse.hs
  	tests
  	
-}

tests = do 
		s<-getLine
		print [fst x| x<-apply program s]
		tests ::IO ()

-- --------------- STANDALONE PARSER -------------------------

-- Hugs usage, to parse an X program named p.x
--	runhugs Xparse.hs p.x
-- GHC usage
--	ghc --make Xparse.hs
--      ./a.out p.x

main = do
	[f]<-getArgs
	src<-readFile f
	print[fst x| x<-apply program src]

Exercises

  1. What happens if you reverse the order of the two do's in assignment?
  2. Augment the parser and the list of syntactic categories to parse expressions as defined in X.cfg. Test the result.
  3. Make a monadic pretty printer that converts a parse tree rooted at Prog into a sequence of lines. No line should contain more than one stmt, and nested stmts should be appropriately indented.

    Don't worry about line-length limitations or minimizing the number of parentheses in expressions. The point is to exploit monads to avoid explicitly carrying state; one state variable (indentation) is enough for proof of concept.

    Note. To print a sequence rs with each element terminated by a newline, you may use

             mapM print rs
    
    Why mapM and not simply map? To find out, experiment; and consult the standard prelude, Prelude.hs.