Topic: Introduction to Monads
Date: Nov. 23, 2009
Number: 27
Examples: Monads.hs, Sheep.hs
Reading: Chapter 18

Monads

In Chap. 19 there is a system to control a graphical robot. It combines features from Karel the Robot and Logo Turtle Graphics, for those who know of these systems. It shows how something close to imperative programming can be done in a functional language using a monad. This allows the commands to be listed in a "do" statement, which it turns out is syntactic sugar for monad commands.

Monads are probably the most difficult topic that we will discuss this term. I considered not including them at all, and approach them with great trepidation. But we have been tiptoeing around them all term with the IO monad and other things that are monads (e.g. lists, through list comprehensions), so I want you to have at least some understanding of them. I hope that you will appreciate what can be done with them.

Monads are basically ways to sequence and combine computations. A place where they are particularly useful is in dealing with state. We saw when dealing with stacks and queues, and in passing the map in the memoized EditDistance function, that it can be a pain in the neck to pass state from one part of the computation to the next.

Remember that a Parser was a function:

    type Parser a = String -> Maybe (a,String)

We passed in the current string, and got back a value and the remaining string. So again we were passing around state. The convenient thing about the combinators #, !, ?, >->, etc. was that they hid away all of the passing of the remaining string. As a side effect, they also caused the parsers to act on the string in a given order, because a parser could only act on the string being passed to it from the previous parser. So passing state also gives a way to sequence actions.
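The string threading that such combinators hide can be sketched as follows. This is only an illustration: the names item, andThen and twoChars are invented here and are not the course's #, !, ? or >-> definitions, which may differ in detail.

```haskell
type Parser a = String -> Maybe (a, String)

-- Hypothetical primitive parser: consume one character, if there is one.
item :: Parser Char
item []       = Nothing
item (c : cs) = Just (c, cs)

-- Hypothetical sequencing combinator: run p, then run q on whatever
-- string p left over. The remaining string is passed along here, so
-- callers never have to mention it.
andThen :: Parser a -> Parser b -> Parser (a, b)
andThen p q s =
  case p s of
    Nothing      -> Nothing
    Just (a, s') ->
      case q s' of
        Nothing       -> Nothing
        Just (b, s'') -> Just ((a, b), s'')

-- No string appears in this definition; andThen does the passing.
twoChars :: Parser (Char, Char)
twoChars = item `andThen` item

main :: IO ()
main = do
  print (twoChars "abc")  -- Just (('a','b'),"c")
  print (twoChars "a")    -- Nothing
```

Note that q can only run on the string produced by p, which is how passing the state also fixes the order of the two parsers.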
We will see that a state Monad basically uses the same idea to pass state: a few functions deal with passing state "under the hood" and the higher-level functions do not need to explicitly deal with state.

Both of these things come up in I/O. Things need to be sequenced, so that a series of print statements is executed in the correct order. Also, there is an implicit state, which we might call RealWorld. This RealWorld state needs to be passed on in some sense, and monads will give us a way to do that also.

Before we get into Monads, I first want to look at a simpler example. This is the Functor class. It came up in the spell checker program.

-- Functor class

The Functor class is defined:

    class Functor f where
      fmap :: (a -> b) -> f a -> f b

Here f is something new. It is not a function that returns a value. It is a function that returns a TYPE. We have always considered something like "Tree String" or "Tree Int" or "Tree a" to be a type. But what is "Tree"? It is a type constructor: it takes a type as an argument and returns a type as a result. This is the type of "function" that f should be. If we talk about

    instance Functor Tree where

then we are saying that fmap takes two parameters, a function from a -> b and a value of type Tree a. It returns a value of type Tree b. So how do we use this? We must define a function fmap:

    instance Functor Tree where
      fmap f (Leaf x)       = Leaf (f x)
      fmap f (Branch t1 t2) = Branch (fmap f t1) (fmap f t2)

Note that this is the same as treeMap defined before. What functors do is generalize map to structures other than lists, so "fmap" can be called on any type that is an instance of Functor. This includes lists, whose type constructor is written "[]":

    instance Functor [] where
      fmap = map

As we saw with other type classes, we should have some "meaning" for what fmap is. We want it to satisfy two properties:

    fmap id = id
    fmap (f . g) = fmap f . fmap g

The first says that the identity function for type a maps to the identity function for type f a. So if my functor is Tree, then looking at the definition above I can prove that this is true by induction, because it is true for the base case:

    fmap id (Leaf x) = Leaf (id x)
                     => Leaf x

and if I assume that it is true for trees smaller than Branch t1 t2, then:

    fmap id (Branch t1 t2) = Branch (fmap id t1) (fmap id t2)

Using the inductive hypothesis on t1 and t2 gives

                           => Branch t1 t2

which is what we want.

The second property is that fmap of a composition is the composition of the fmaps. It can also be proved via induction, as we saw for regular maps.

-- An example using sheep to show why we need monads

A problem arises when using functions that return a Maybe. To compose them, you have to deal with the situation where they return Nothing. Furthermore, if the function is (a -> Maybe a), then to compose two of these functions we have to take the value v out of the (Just v) before we can apply a second such function.

I had a boring example with functions that computed sqrt and asin, which have invalid input values (e.g. negative numbers for sqrt). But I found an example written by Jeff Newbern, appearing at http://www.haskell.org/all_about_monads/html/index.html which I am adapting instead.

The idea is that sheep have parents, but cloned sheep (e.g. Dolly) have only one parent. And the first sheep (called Adam and Eve in this example) have no parents (or at least their parents were not sheep!). So the mother and father functions have to be (Sheep -> Maybe Sheep). To compute grandparents, great grandparents, etc. requires dealing with the Nothing possibility. So to compute maternalGrandfather we would have to do something like:

    maternalGrandfather s = case mother s of
                              Nothing  -> Nothing
                              (Just a) -> father a

Great-grandparents are messier still: we get a structure like that of the (#) operator in the Parser. There ought to be a better way.
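To make the "messier still" concrete, here is a sketch of one great-grandparent function written with explicit cases. The Sheep record and the small flock are assumptions made for this illustration; the definitions in Sheep.hs may be different.

```haskell
-- Hypothetical representation of a sheep; mother and father are record
-- fields, so they have type Sheep -> Maybe Sheep as in the notes.
data Sheep = Sheep
  { name   :: String
  , mother :: Maybe Sheep
  , father :: Maybe Sheep
  }

-- One step of ancestry: a single case expression.
maternalGrandfather :: Sheep -> Maybe Sheep
maternalGrandfather s =
  case mother s of
    Nothing -> Nothing
    Just m  -> father m

-- Two steps: the case expressions nest, one level per step.
mothersPaternalGrandfather :: Sheep -> Maybe Sheep
mothersPaternalGrandfather s =
  case mother s of
    Nothing -> Nothing
    Just m  ->
      case father m of
        Nothing -> Nothing
        Just gf -> father gf

-- A made-up flock: dolly is cloned, so she has no father.
adam, eve, bert, molly, dolly :: Sheep
adam  = Sheep "Adam"  Nothing      Nothing
eve   = Sheep "Eve"   Nothing      Nothing
bert  = Sheep "Bert"  (Just eve)   (Just adam)
molly = Sheep "Molly" (Just eve)   (Just bert)
dolly = Sheep "Dolly" (Just molly) Nothing

main :: IO ()
main = do
  print (fmap name (maternalGrandfather dolly))         -- Just "Bert"
  print (fmap name (mothersPaternalGrandfather dolly))  -- Just "Adam"
  print (fmap name (mothersPaternalGrandfather molly))  -- Nothing
```

Every extra generation adds another level of nesting, and every level repeats the same "Nothing -> Nothing" line.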
There is a better way: we can write a combinator that combines two functions that return Maybe:

    -- comb is a combinator for sequencing operations that return Maybe
    comb :: Maybe a -> (a -> Maybe b) -> Maybe b
    comb Nothing  _ = Nothing
    comb (Just x) f = f x

Using this, we can say:

    maternalGrandfatherC :: Sheep -> Maybe Sheep
    maternalGrandfatherC s = (Just s) `comb` mother `comb` father

    fathersMaternalGrandmotherC :: Sheep -> Maybe Sheep
    fathersMaternalGrandmotherC s = (Just s) `comb` father `comb` mother `comb` mother

    mothersPaternalGrandfatherC :: Sheep -> Maybe Sheep
    mothersPaternalGrandfatherC s = (Just s) `comb` mother `comb` father `comb` father

and so on. A big win! Demo on data in the example.

This problem comes up in other areas, also. We read something of type IO String. We then want to do something like write the string to a file, which requires that we get the String out of the IO String, then return something of type IO (). This is the problem that Monads solve.

Monad has 4 operations, two of which have default definitions that usually work.

    class Monad m where
      (>>=)  :: m a -> (a -> m b) -> m b
      return :: a -> m a
      fail   :: String -> m a
      (>>)   :: m a -> m b -> m b

      m >> k = m >>= \_ -> k
      fail s = error s

The first is called bind, written >>=. It is exactly what comb was above: it takes a monadic value of type m a and a function from type a to a monadic value of type m b, and connects them up. Basically it applies the function to the value in the first monad to get the monad that is returned.

The return function is poorly named. It should really be called "putIntoMonadForm" or some such thing. It takes a value v and makes it into a monad containing v. It is called return because it is often the last statement in a "do", which is syntactic sugar for monad operations.

The fail function usually just raises an error with the given message (see the default), but we will see a couple of cases where it does something different.

The sequence operation >> just says to do one operation after another.
This is the operation that lets us write things in a "do" so that they are done in the order written. We have seen that this is needed in I/O, where the prompt message needs to be printed before the requested value is read!

The default definition of "m >> k" looks mysterious, but it really isn't difficult. It says to start with the monad value m, take out the value inside the monad, and pass it on to an anonymous function that IGNORES the input value and performs action k. So k can't be evaluated until after m has been evaluated and has passed on a meaningless value to it. The result is that m is performed before k.

For Maybe, we have:

    instance Monad Maybe where
      Nothing  >>= f = Nothing
      (Just x) >>= f = f x
      return         = Just
      fail s         = Nothing

Note that >>= is defined exactly as our combinator was above. To put a value v into Maybe monad form we make it into (Just v). And the fail indicator is simply Nothing.

Given this it is easy to modify the earlier functions:

    maternalGrandfatherM :: Sheep -> Maybe Sheep
    maternalGrandfatherM s = return s >>= mother >>= father

    fathersMaternalGrandmotherM :: Sheep -> Maybe Sheep
    fathersMaternalGrandmotherM s = return s >>= father >>= mother >>= mother

    mothersPaternalGrandfatherM :: Sheep -> Maybe Sheep
    mothersPaternalGrandfatherM s = return s >>= mother >>= father >>= father

Note here that "return s" does NOT end the function call. It simply turns s into Just s.

-- Monadic laws

What laws should the monad functions follow? They are:

    return a >>= k = k a

Return takes a and puts it into monad form. The bind should remove the a from the monad and pass it to k, so the result is k a.

    m >>= return = m

If m is a monad, passing it through bind will pass the monad's value to return, which converts that value back to monad form. It should be the same one!

    m >>= (\x -> k x >>= h) = (m >>= k) >>= h

This is conceptually easy, but notationally complex. It is basically an associative law.
You can perform action m and pass its value to the combined k x >>= h, or you can evaluate (m >>= k) and pass its value to h. Same result. It is clearer in the special case of the sequence operator:

    m1 >> (m2 >> m3) = (m1 >> m2) >> m3

An additional law connects monads and functors, for types that are instances of both:

    fmap f xs = xs >>= return . f

-- Conversion from "do" statements

A natural way to write the sheep functions, similar to what you would do in an imperative language, is to use a do:

    maternalGrandfather :: Sheep -> Maybe Sheep
    maternalGrandfather s = do m <- mother s
                               father m

    fathersMaternalGrandmother :: Sheep -> Maybe Sheep
    fathersMaternalGrandmother s = do f  <- father s
                                      gm <- mother f
                                      mother gm

    mothersPaternalGrandfather :: Sheep -> Maybe Sheep
    mothersPaternalGrandfather s = do m  <- mother s
                                      gf <- father m
                                      father gf

This works correctly, because a "do" is simply syntactic sugar for monadic operations. The translations are:

    do e  ==>  e

A single action in a do is simply performed. The do is superfluous. But it is the base case of our definitions below.

    do e1; e2; ...; en  ==>  e1 >> do e2 ; ...; en

If e1 is a simple expression (action), just sequence it with a "do" of the rest. An example would be:

    do putStr "enter your name"
       getLine

becomes

    putStr "enter your name" >> getLine

You just want to do them in the correct order.

Let's look at the typical case of using <-, when the lhs is a variable:

    do x <- e1 ; e2 ; ...; en  ==>  e1 >>= \x -> do e2 ; ...; en

Basically this says to perform action e1 and pass the resulting value as x in the anonymous function

    \x -> do e2 ; ...; en

However, it is valid to use <- with general patterns. Therefore the real conversion is more complex, because it has to deal with failure to match. The book has the details.
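The effect of a failed pattern match on the left of <- can be seen in a small sketch. The functions firstOf and justs are invented for this illustration. When the pattern does not match, the translation calls fail, which gives Nothing for Maybe and the empty list for lists. (In current GHC releases fail lives in a separate MonadFail class rather than in Monad itself; the behavior for these two types is the same.)

```haskell
-- The pattern (x:_) can fail to match, so the translation of this do
-- has to handle the empty-list case; for Maybe the result is Nothing.
firstOf :: Maybe [Int] -> Maybe Int
firstOf m = do
  (x : _) <- m
  return x

-- In the list monad a failed match contributes no results, so the
-- Nothing elements are silently dropped.
justs :: [Maybe Int] -> [Int]
justs ms = do
  Just x <- ms
  return x

main :: IO ()
main = do
  print (firstOf (Just [4, 5]))              -- Just 4
  print (firstOf (Just []))                  -- Nothing
  print (firstOf Nothing)                    -- Nothing
  print (justs [Just 1, Nothing, Just 3])    -- [1,3]
```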
Consider the function above:

    maternalGrandfather :: Sheep -> Maybe Sheep
    maternalGrandfather s = do m <- mother s
                               father m

This translates to:

    maternalGrandfather s = mother s >>= (\m -> do father m)

or

    maternalGrandfather s = mother s >>= (\m -> father m)

which is the same as:

    maternalGrandfather s = mother s >>= father

We saw that we can use a "let" statement within a do when doing I/O. We deal with let as follows:

    do let decllist ; e2 ; ...; en  ==>  let decllist in do e2 ; ...; en

So you can actually have a series of declarations in a let, but you don't need an "in" because the translation includes the "in".

It may be clearer to write the monad laws in do form:

    do x <- return a ; k x          =  k a
    do x <- m ; return x            =  m
    do x <- m ; y <- k x ; h y      =  do y <- (do x <- m ; k x) ; h y
    do m1 ; m2 ; m3                 =  do (do m1 ; m2) ; m3
    fmap f xs                       =  do x <- xs ; return (f x)
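The laws can be spot-checked for the Maybe monad on a few sample values. This is only a sanity check on chosen inputs, not a proof. The helper actions half and dec are made up for this sketch and play the roles of k and h.

```haskell
-- Hypothetical actions that can fail.
half :: Int -> Maybe Int
half n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

dec :: Int -> Maybe Int
dec n
  | n > 0     = Just (n - 1)
  | otherwise = Nothing

-- return a >>= k  =  k a
leftIdentity :: Int -> Bool
leftIdentity a = (return a >>= half) == half a

-- m >>= return  =  m
rightIdentity :: Maybe Int -> Bool
rightIdentity m = (m >>= return) == m

-- m >>= (\x -> k x >>= h)  =  (m >>= k) >>= h
associativity :: Maybe Int -> Bool
associativity m =
  (m >>= (\x -> half x >>= dec)) == ((m >>= half) >>= dec)

-- fmap f xs  =  xs >>= return . f
functorLaw :: Maybe Int -> Bool
functorLaw m = fmap (+ 1) m == (m >>= return . (+ 1))

-- The associative law again, this time in do form.
associativityDo :: Maybe Int -> Bool
associativityDo m = lhs == rhs
  where
    lhs = do { x <- m ; y <- half x ; dec y }
    rhs = do { y <- (do { x <- m ; half x }) ; dec y }

samples :: [Maybe Int]
samples = [Nothing, Just 0, Just 3, Just 8]

main :: IO ()
main = do
  print (all leftIdentity [0, 3, 8])   -- True
  print (all rightIdentity samples)    -- True
  print (all associativity samples)    -- True
  print (all functorLaw samples)       -- True
  print (all associativityDo samples)  -- True
```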