Topic: Currying, Sections, Anonymous Functions
Date: Jan. 16, 2009
Number: 7
Examples: appendReverse.hs, moreHOF.hs
Reading: Chap. 9

PS 1 due on Wed.
Next class: finish Chap. 9, go to Chap. 6.

----

Functional programming ideas in real life:

When Google wants to create an index for web pages (or do lots of the other things that they do), they run the program on thousands of computers in parallel. The way that they control all of this is a program called "Map-Reduce". It is proprietary, but there is a simpler open-source version available. "Reduce" is another name for "fold". So Map-Reduce first maps a computation over a huge database spread over hundreds or thousands of computers, and then "Reduce" combines the individual answers to get a global answer.

------

Why two folds? It sometimes matters, either for correctness (e.g. foldr (-) 0 lst won't give you what you expect: foldr (-) 0 [1,2,3] is 1 - (2 - (3 - 0)) = 2, while foldl (-) 0 [1,2,3] is ((0 - 1) - 2) - 3 = -6) or for efficiency.

Consider the append function:

    (++) :: [a] -> [a] -> [a]      -- Note () around operator
    []     ++ ys = ys
    (x:xs) ++ ys = x : (xs ++ ys)

Time for x ++ y? Proportional to length x. Note that length y doesn't figure into it!

So which way should we group x ++ y ++ z?

    x ++ (y ++ z)

Time is proportional to the sum of the lengths of x and y: each element of x and y is on the left side of exactly one ":". But if we do it as

    (x ++ y) ++ z

the elements of x are involved in 2 ":"s: once when appending y, and again when appending z to (x ++ y).

So Haskell defines

    infixr 5 ++

(This says that ++ is an infix operator that associates to the right (the "r") with priority 5. Higher priority happens first.)

What does this have to do with foldl and foldr? There is a builtin concat function:

    concat :: [[a]] -> [a]
    concat xxs = foldr (++) [] xxs

If we use foldr on n lists of length len, the time is n*len, so linear in the total length of the lists.

If we used foldl on n lists of length len (say [lst1, lst2, lst3, ..., lstn]), then:

    list #    Number of (:) with list element on left side
    n         0
    n-1       1
    n-2       2
    ...
    2         n-2
    1         n-1

so n(n-1)/2 times len. (Book lost the /2.)
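A small sketch to try in ghci, to make the two points above concrete: the folds can disagree on the answer, and for concat they agree on the answer but not on the work done. The names subRight, subLeft, concatR, concatL and leftNestedConses are made up for this sketch; they are not from the book or the example files.

```haskell
-- Hypothetical names for illustration; none of these appear in the notes.

-- Subtraction is not associative, so the two folds disagree.
subRight, subLeft :: Int
subRight = foldr (-) 0 [1, 2, 3]   -- 1 - (2 - (3 - 0))   = 2
subLeft  = foldl (-) 0 [1, 2, 3]   -- ((0 - 1) - 2) - 3   = -6

-- concat written both ways. The results are equal because (++) is
-- associative with [] as its identity; only the amount of work differs.
concatR, concatL :: [[a]] -> [a]
concatR = foldr (++) []   -- lst1 ++ (lst2 ++ (... ++ (lstn ++ [])))
concatL = foldl (++) []   -- ((([] ++ lst1) ++ lst2) ++ ...) ++ lstn

-- Count of (:) steps spent re-copying elements in the left-nested
-- version, following the table: the elements of list k sit on the
-- left side of (n - k) later appends.
leftNestedConses :: Int -> Int -> Int
leftNestedConses n len = sum [ (n - k) * len | k <- [1 .. n] ]
```

For example, leftNestedConses 4 3 adds up (3 + 2 + 1 + 0) * 3 = 18, which matches n(n-1)/2 times len with n = 4 and len = 3.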
----

Return to the reverse function, which we saw in the second or third class. Straightforward definition (this time working on general lists instead of just strings):

    reverse :: [a] -> [a]
    reverse []     = []
    reverse (x:xs) = reverse xs ++ [x]

We noted that this takes O(n^2) time: we concatenate to the end of lists of length 0, 1, 2, ..., n-1.

But we did better by creating an auxiliary parameter to accumulate the reversed list:

    reverse xs = rev [] xs
      where rev acc []     = acc
            rev acc (x:xs) = rev (x:acc) xs

Keep taking the first thing off the list and putting it on the front of acc, returning acc when done. O(n).

TRICK that comes up often. In fact, we saw it with foldl:

    foldl op init []     = init
    foldl op init (x:xs) = foldl op (init `op` x) xs

Similar, if foldl = rev, init = acc, and `op` = (:). But in rev we have (x : acc), while in foldl we have (init `op` x), which would map to (acc : x) if we substitute. Backwards! So define:

    revOp a b = b : a

So it reverses the order of the parameters to (:). Then

    reverse xs = foldl revOp [] xs

Takes O(n) time.

I told you earlier that foldl and foldr could be used in situations where you would not expect to use them. This is one example.

----

Will see something strange when you look at Chapter 23, where you can find LOTS of useful list functions. In there we have (approximately):

    sum :: (Num a) => [a] -> a
    sum xs = foldl (+) 0 xs

What is that "(Num a) =>" for? It means that for sum to make sense, a must be a numeric type. Similarly, (Eq a) and (Ord a) mean that equality tests and ordering comparisons (<, <=, etc.) must make sense on the type a. Type classes - we will see them later.

-- Currying

If you look at the solution to SA 3, you see two pieces of code:

    = map makeSq [sm, (sm+step) .. lg]
      where makeSq r = (makeSquare clr ctr r)

and

    graphicsToIO w list = map drawGraphic list
      where drawGraphic g = drawInWindow w g

In both cases I defined an auxiliary function where I took another function (makeSquare, drawInWindow), supplied one or two of its parameters, and left one to be supplied.
Lots of the functions in the last SA had the same property. It would be nice if we could say something like:

    = map (makeSquare clr ctr) [sm, (sm+step) .. lg]

or

    graphicsToIO w list = map (drawInWindow w) list

We can! The process of creating a new function by supplying one or more arguments to a function with more parameters is called "currying", after the logician Haskell Curry, who popularized the idea. (Actually invented by Schoenfinkel.) (Want to guess where our programming language got its name?) (Strictly speaking, "currying" names the representation of a multi-parameter function as a chain of one-parameter functions; supplying only some of the arguments is usually called "partial application".)

Why does this work? Same reason that we give the type signature of makeSquare as:

    makeSquare :: Color -> Point -> Radius -> Graphic

instead of

    makeSquare :: (Color, Point, Radius) -> Graphic

which is more what we would do in mathematics. The -> in a type associates to the right, so the first signature really means Color -> (Point -> (Radius -> Graphic)). Function application is left associative, so

    makeSquare clr ctr r

is really

    (((makeSquare clr) ctr) r)

So (makeSquare clr) is a function which, when applied to ctr, gives the function (makeSquare clr ctr). So I could have defined:

    makeSq = makeSquare clr ctr

There is one parameter not supplied to makeSquare, so makeSq is a one-parameter function. Notice that we don't even need to name that parameter. (Book calls leaving out the parameter name the "currying simplification", but notes that the formal name is "eta simplification".)

So I could have defined

    mapLen = map length

A one-parameter function. Enter it into ghci, demo.

We can simplify our definition of reverse above:

    reverse xs = foldl revOp [] xs

The function flip is defined in the Standard Prelude:

    flip :: (a -> b -> c) -> b -> a -> c
    flip f x y = f y x

Then we can define:

    revOp acc x = flip (:) acc x

which can be simplified to:

    revOp = flip (:)

Supplied 1 of 3 parameters to flip, so 2 left, and revOp is a function of 2 parameters. So we can then write reverse as:

    reverse = foldl (flip (:)) []

We have supplied 2 parameters to foldl, so 1 left.
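The three versions of reverse from these notes, side by side, plus a tiny example of supplying only some arguments. This is only a sketch for loading into ghci: the functions are renamed (slowReverse, accReverse, foldReverse) so they do not clash with the Prelude's reverse, and scaleAndShift / doublePlusOne are made-up stand-ins for makeSquare, which needs the graphics library.

```haskell
-- All names here are hypothetical, chosen to avoid clashing with the Prelude.

-- Quadratic: each step appends one element to the end of a growing list.
slowReverse :: [a] -> [a]
slowReverse []     = []
slowReverse (x:xs) = slowReverse xs ++ [x]

-- Linear: accumulate the answer in an extra parameter.
accReverse :: [a] -> [a]
accReverse = rev []
  where
    rev acc []     = acc
    rev acc (x:xs) = rev (x : acc) xs

-- The same accumulator pattern, written with foldl and flip.
-- Two of foldl's three parameters are supplied, so one is left.
foldReverse :: [a] -> [a]
foldReverse = foldl (flip (:)) []

-- A three-parameter function (a stand-in for makeSquare).
scaleAndShift :: Int -> Int -> Int -> Int
scaleAndShift m b x = m * x + b

-- Supply two of the three parameters; the result takes the last one.
doublePlusOne :: Int -> Int
doublePlusOne = scaleAndShift 2 1

-- The mapLen example, with an explicit signature so that it also
-- compiles as a top-level definition in a file.
mapLen :: [[a]] -> [Int]
mapLen = map length
```

In ghci, map doublePlusOne [0, 1, 2] gives [1,3,5], and all three reverse functions give [3,2,1] on [1,2,3].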
-- Sections

Currying can be extended to binary infix operators, as we have seen since the first day with (+1) and (>0). Writing `op` for any binary operator:

    (`op` y)  is equivalent to  f1 x   = (x `op` y)
    (x `op`)  is equivalent to  f2 y   = (x `op` y)
    (`op`)    is equivalent to  f3 x y = (x `op` y)

We have been using the first form since day 1, and the third form shows why enclosing an operator in () is consistent with making it a function that can be passed. (Doesn't explain why () when defining, but nice that it is all consistent!) So the second form is new, but not complicated.

-- Anonymous functions

Currying and sections make it easier to supply arguments to higher-order functions like map and foldl and foldr. But sometimes you just want to map a function like:

    f2 (x,y) = x + y

or zipWith a function like

    f1 x y = (x + y)/2

It is a pain in the neck to have to define it just so that you can use it in one place in a map or a zipWith. It would be useful if you could create an unnamed (thus anonymous) function "on the fly" and pass it as a parameter. You can.

    \ (x, y) -> x + y
    \ x y -> (x + y)/2

So I might define:

    pairwiseAverageLists = zipWith (\ x y -> (x + y)/2)

or

    notSame = filter (\(x,y) -> x /= y)
    reversePairs = map (\(x,y) -> (y,x))

Church used a notation similar to this called lambda notation, because the first character was the Greek letter lambda. (Look at the home page to see an example in the title in a green circle.) The \ was as close to lambda as they could get on a standard keyboard.

So we can define sections:

    (x+)  =>  \y -> x + y
    (+y)  =>  \x -> x + y
    (+)   =>  \x y -> x + y
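A short sketch pairing each kind of section with the lambda it abbreviates, followed by the three lambda examples from above with type signatures filled in. The signatures are my assumption about the intended types, and halves, intoTen and decrementAll are made-up names for illustration.

```haskell
-- Right argument supplied: (/ 2) means \x -> x / 2.
halves, halves' :: [Double] -> [Double]
halves  = map (/ 2)
halves' = map (\x -> x / 2)

-- Left argument supplied: (10 /) means \y -> 10 / y.
intoTen, intoTen' :: [Double] -> [Double]
intoTen  = map (10 /)
intoTen' = map (\y -> 10 / y)

-- The definitions from the notes; the type signatures are assumed.
pairwiseAverageLists :: [Double] -> [Double] -> [Double]
pairwiseAverageLists = zipWith (\x y -> (x + y) / 2)

notSame :: Eq a => [(a, a)] -> [(a, a)]
notSame = filter (\(x, y) -> x /= y)

reversePairs :: [(a, b)] -> [(b, a)]
reversePairs = map (\(x, y) -> (y, x))

-- One caveat: (- 1) is the number negative one, not a section.
-- The Prelude's subtract fills that gap: subtract 1 is \x -> x - 1.
decrementAll :: [Int] -> [Int]
decrementAll = map (subtract 1)
```

The order matters for operators that are not commutative: halves [2, 4] gives [1.0,2.0], while intoTen [2, 5] gives [5.0,2.0].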