Topic: ADTs, Map ADT, List implementation
Date: Oct. 12, 2009
Number: 10
Examples: ListMap.hs
Reading: Chap. 7

PS 2 due on Wed.

We finished the material from last time on Trees. (Specifically, expression trees.)

-- Abstract Data Type: Maps

In Java, there are three basic collection interfaces.

Lists: We have seen those in Haskell!

Sets: They are implemented in a Haskell library, but we haven't seen them yet. Basic operations are insert, delete, "is element" testing, union, intersection, etc. The "elem" function on lists is basically using lists as a set representation.

Maps: Also implemented in a Haskell library. Store (key, value) pairs. The keys are unique. The values need not be. Basic operations are inserting a (key, value) pair and, given a key, looking up and retrieving the corresponding value. So a basic staple of databases. Your student ID number is the key, all of your student record is the associated value. Association lists are maps. The "lookup" function is the way you retrieve the corresponding value.

These Java interfaces define what is called an Abstract Data Type. You know the types of the data, and more important, the OPERATIONS on the data. (Usually the same operations work on a wide class of data types, so it should probably be called "abstract operation types". But it is too late to change the name now.)

The idea is to hide the way that the data is represented. Before a paper by Parnas in the 70's everyone assumed that the natural thing to do was for every function to know how the data was represented and to work with that representation.

With an ADT, you can perform certain operations, but you can't get at the data representation directly. It is like an office with a window where a clerk sits and handles your requests to save or retrieve documents. You can't tell if the documents are saved in file cabinets, are scanned into a computer database, etc.

You can have multiple implementations of the same ADT. In Java, Lists are implemented using Arrays and Linked Lists.
Maps and Sets are implemented using Hash Tables (will learn in CS 19 or 25) and Binary Search Trees.

The basic underlying premise: you should be able to swap one implementation of an ADT for another, and everything should continue to work! Speed and memory requirements may change, but function does not.

(Haskell does not have anything as convenient as Interfaces, but the concept of a Type Class comes close.) We will see that we pick our ADT implementation by changing an import statement.

There is a built-in Map implementation that uses a balanced binary search tree as its data structure. The documentation is at:

    http://www.haskell.org/ghc/docs/latest/html/libraries/containers/Data-Map.html

and to use it you:

    import Data.Map

Actually, usually we will:

    import qualified Data.Map as Map

The reason for "qualifying" it is that many function names in Data.Map (lookup, for example) are also defined in Prelude, and the names clash. (It would be nice if you could use the same name, and the compiler could tell by data type!) Then you can say Map.insert and Map.lookup and it is clear which function you mean. And it doesn't matter whether it is ListMap or BSTMap or Data.Map that gets imported - the code talks about Map.something. (If you use "qualified" without the "as", you put the whole module name, in this case Data.Map, in front of the function name.)

But back to our implementations of the Map ADT. Look at:

    module ListMap (
      Map,
      (!), lookup, insert, member, insertWith, delete,
      size, empty, keys, toList, fromList
    ) where

Note that the module name is ListMap, but the type is Map. Later I can use either

    import qualified Data.Map as Map

or

    import qualified ListMap as Map

and I will not need to make any other changes, because the type name is the same.

The functions are:

!
    aMap ! key - returns the associated value, calls error if not present.

lookup
    lookup key aMap - Uses the Maybe type. Maybe is useful in situations like this. What should lookup return when the key is not present? Sometimes you want to go on, not die.
    What is a safe value? Anything you return could be an appropriate value in some context. So a result of type "Maybe a" is either

        Nothing   or   (Just value)    -- value is of type a

    You can then pattern-match on it. Nothing and Just are constructors. The Maybe type is defined in Standard Prelude as:

        data Maybe a = Nothing | Just a

insert
    What you would expect. Replaces the old value if you insert a key again.

insertWith
    insertWith f k v mp - Lets you compute a function of the new value and the old one when you insert a key again. So you might keep a list of all the values ever inserted by letting f be (++).

delete, size, toList, fromList
    Do what you expect. (toList and fromList seem useless when the data structure is a list, but not when the data structure is a tree.)

keys
    Returns a list of the keys in the map.

empty
    Returns an empty map.

Important point: all mutators (insert, delete, etc.) return the new map! Things we construct are immutable, so the only way to modify a map (or list or ...) is to construct a new one. You can reuse all unchanged parts of the old one, because nothing else can modify them, either! So if you have a list:

    l1 = [1,2,3]
    l2 = 4:l1
    l3 = 5:l1

the l1 list exists only once internally, and is shared in the representation.

Looked at size, empty, keys, toList, fromList. Explained difference between (!) and lookup, and why Maybe is important. See above.

    -- Find the key and return the value, or Nothing if not present
    lookup :: Eq a => a -> Map a b -> Maybe b
    lookup k [] = Nothing
    lookup k ((k1, v) : rest)
      | k == k1   = Just v
      | otherwise = lookup k rest

Note the use of "guards". After a pattern we can have a series of things of the form "| predicate = value". Haskell tests the predicates in order until it finds one that is True and then returns the value on the right side of the "=". If none is True, Haskell falls through to the next equation; if no equation matches at all, it is a runtime error. "otherwise" is equivalent to True.

So how do we use this? One example - member.
    -- Determines if key is in the Map
    member :: Eq a => a -> Map a b -> Bool
    member k t = case lookup k t of
                   Nothing  -> False
                   (Just _) -> True

Two things to note. First, we are pattern matching against Nothing and (Just _). If lookup returns Nothing the key wasn't there, so return False. If lookup returns Just something the key was there, so return True.

But what is this funny "case" expression, and why do we need it? Note that in lookup we pattern-matched on PARAMETERS. We can then give an expression, use guards to select between expressions, have an if-then-else, etc.

But in member things are different. We don't want to pattern-match on parameters. We want to CALCULATE something and pattern-match on the calculated value. To do that we need this alternate "case" form. Same idea, different syntax. The expression to be evaluated goes between "case" and "of", and because the case is after the "=" (like a let), we use -> rather than = to separate the patterns from what we do with them.

Let's look at (!):

    (!) :: Eq a => Map a b -> a -> b
    t ! k = case lookup k t of
              Nothing  -> error "Key not found in map"
              (Just v) -> v

Again we use case, but the choices are different. Here Nothing is an error, and (Just v) returns the value.

This leaves the ones that change the map. First, delete.

    delete :: Eq a => a -> Map a b -> Map a b
    delete k [] = []
    delete k (p@(k1, v) : rest)
      | k == k1   = rest
      | otherwise = p : delete k rest

Note that because the map is immutable we can't just remove the value from the list. We have to re-build the part of the list before the thing to be removed. The recursion does this. When we find the thing to be removed (pair p, also called (k1, v) ) we drop it by returning the rest of the list without p. Otherwise we keep the item, and the rest of the list is what we get by deleting from the rest.

Insert is interesting, because we have two versions. The question is what to do when you insert a key that is already there. One obvious choice is to replace the old value.
But sometimes you might want to do something else - maybe keep a list of all values ever associated with that key. In that case you want to be able to compute a function f of the new and old values. This is what insertWith allows:

    -- Insert key k and value v into this Map.
    -- Replace value with f(new_val,old_val) if key already present.
    insertWith :: Eq a => (b -> b -> b) -> a -> b -> Map a b -> Map a b
    insertWith _ k v [] = [(k,v)]
    insertWith f k v (p@(k1, v1) : rest)
      | k == k1   = (k, f v v1) : rest    -- Replace value using f
      | otherwise = p : insertWith f k v rest

Note that if the map is empty, we just insert the pair as the only item in the map. Otherwise we have two cases, just like delete. If the current pair doesn't have the key we want, we keep our current p and insertWith in the rest of the list. If the key is found, we replace the pair by (k, f v v1) and keep the rest of the list.

Then insert is easy:

    insert :: Eq a => a -> b -> Map a b -> Map a b
    insert k v list = insertWith (\x y->x) k v list

The anonymous function just returns the new value, so it replaces the old one.
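
To see how insertWith and insert differ in practice, here is a small sketch that can be loaded into GHCi on its own. It repeats lookup, insertWith and insert from above so the file is self-contained, and it ASSUMES that Map is a type synonym for a list of pairs (the notes pattern-match a Map against [] and (:), but ListMap.hs itself is not shown here). The names grades and replaced, and the data in them, are made up for illustration.

```haskell
import Prelude hiding (lookup)

-- Assumption: the list implementation represents a Map as an association list.
type Map k v = [(k, v)]

empty :: Map k v
empty = []

-- Same definitions as in the notes, repeated so this file stands alone.
lookup :: Eq k => k -> Map k v -> Maybe v
lookup _ [] = Nothing
lookup k ((k1, v) : rest)
  | k == k1   = Just v
  | otherwise = lookup k rest

insertWith :: Eq k => (v -> v -> v) -> k -> v -> Map k v -> Map k v
insertWith _ k v [] = [(k, v)]
insertWith f k v (p@(k1, v1) : rest)
  | k == k1   = (k, f v v1) : rest
  | otherwise = p : insertWith f k v rest

insert :: Eq k => k -> v -> Map k v -> Map k v
insert = insertWith (\new _old -> new)

-- Hypothetical data: with f = (++), every value inserted for a key is kept.
-- The new value is the first argument of f, so newer scores come first.
grades :: Map String [Int]
grades = insertWith (++) "ann" [95]
           (insertWith (++) "bob" [70]
             (insertWith (++) "ann" [80] empty))

-- Plain insert throws the old value away instead of combining.
replaced :: Map String [Int]
replaced = insert "ann" [100] grades
```

In GHCi, lookup "ann" grades gives Just [95,80], while lookup "ann" replaced gives Just [100]. Note that grades itself is unchanged after building replaced: insert returned a new map, as described in the "important point" earlier.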