Topic: Type Classes
Date: Nov. 11, 2009
Number: 23
Examples: Qualified-types.hs, Class-tour.hs
Reading: Chapter 12, and skim Chapter 24

------ Type Classes ----------

We have been using type classes for a long time now - they are what allow us to say things like:

    insertVertex :: (Eq a) => a -> Digraph a b -> Digraph a b

But what are they actually, and what do they mean? And why did we have to change it to

    insertVertex :: (Ord a) => a -> Digraph a b -> Digraph a b

in DigraphMap?

Type classes are probably closest to Java interfaces. They guarantee that all members of the type class are able to perform certain operations. They are NOT much like Java classes - they define no instance or class fields, and one cannot create instances of them. Haskell is not OO.

In the examples above, the first says that insertVertex won't work for all possible types - it only works for types in the class Eq. What is this class? It is defined in the Standard Prelude as:

    class Eq a where
        (==), (/=) :: a -> a -> Bool
        x /= y = not (x == y)
        x == y = not (x /= y)

Only the signature line is really needed - it gives the functions that any instance of class Eq must define. The rest is sort of clever. These are default definitions. But they are circular! The trick is that you only need to define one. Once you do, the other is defined by default.

So some examples from the book:

    instance Eq Integer where
        x == y = integerEq x y

    instance Eq Float where
        x == y = floatEq x y

    data Tree a = Leaf a | Branch (Tree a) (Tree a)

    instance Eq a => Eq (Tree a) where
        Leaf a == Leaf b = a == b
        Branch l1 r1 == Branch l2 r2 = l1==l2 && r1==r2
        _ == _ = False

So suppose I want to say that two Digraphs are ==. Well, first I have to define a data type - not just a type synonym. But I could say:

    data AdjList v e = Adj [(v,e)]
    data Digraph v e = Graph [(v, AdjList v e)]

So what would it mean for two AdjLists to be ==? One possibility is that the underlying lists are ==. This means that they are the same element by element.
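To make the element-by-element reading concrete, here is a minimal sketch. It assumes the instance is obtained with deriving (Eq), which the notes do not specify, and the values adjA and adjB are hypothetical.

```haskell
-- Element-by-element equality for adjacency lists, obtained here by
-- deriving Eq (an assumption made for this sketch).
data AdjList v e = Adj [(v, e)]
  deriving (Eq, Show)

-- Two hypothetical adjacency lists holding the same pairs in a different order.
adjA, adjB :: AdjList Int Char
adjA = Adj [(1, 'a'), (2, 'b')]
adjB = Adj [(2, 'b'), (1, 'a')]

-- List equality is positional, so reordering the pairs breaks it.
sameOrder, differentOrder :: Bool
sameOrder      = adjA == adjA   -- True
differentOrder = adjA == adjB   -- False
```

Under this definition the order in which edges happen to be stored decides whether two adjacency lists count as equal.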
But this is not really what we usually mean by AdjLists being equal. The usual understanding is that two adjacency lists are equal if they have the same set of (vertex,edge) pairs. So we might say something like:

    instance (Ord v, Ord e) => Eq (AdjList v e) where
        (Adj adj1) == (Adj adj2) = sort adj1 == sort adj2

    instance (Ord v, Ord e) => Eq (Digraph v e) where
        (Graph vs1) == (Graph vs2) = sortBy (comparing fst) vs1 == sortBy (comparing fst) vs2

(Note - needed sortBy, because I haven't defined AdjList to be an instance of Ord! sort and sortBy come from Data.List, comparing from Data.Ord.)

Can define OWN type classes. For instance, we have two contains functions, containsS for Shape and containsR for Region. To make a single contains function, could:

    class PC t where
        contains :: t -> Point -> Bool

Then can declare:

    instance PC Shape where
        contains = containsS

    instance PC Region where
        contains = containsR

Now I can use contains with EITHER type - the sort of polymorphism that we wished we had before, when we got name clashes on imports when we had several different data structures, each of which had its own insert or lookup.

-- Inheritance --

Just as we can inherit from Interfaces, we can inherit from type classes. The easiest example is Ord, which inherits from Eq.

    class Eq a => Ord a where
        (<), (<=), (>=), (>) :: a -> a -> Bool
        max, min :: a -> a -> a

(Actually more complicated, but see Chap. 24.) Note no mention of ==, /=. But Eq a => says that every instance of Ord must also be an instance of Eq, so it has all the functions required by Eq.

Example - compare trees.

NOTE: On p. 161 of the book this example defines the operator (<), but fails to define (<=). This is an error. The way that Ord is defined, the defaults require you to define (<=). If you do what the book does, then when you try to compare two trees using (>) you get infinite recursion! I added a definition for (<=) on two trees in terms of (<) as is done on p. 153.

    instance Ord a => Ord (Tree a) where
        Leaf _ < Branch _ _ = True
        Leaf a < Leaf b = a < b
        Branch _ _ < Leaf _ = False
        Branch l1 r1 < Branch l2 r2 | l1= b.
But:

    t1 = Branch (Leaf 1) (Leaf 3)
    t2 = Branch (Leaf 2) (Leaf 2)

are not comparable by this alternate definition.

Things you expect of classes:

Eq:
    Transitivity: (a == b) && (b == c) implies (a == c)
    Consistency:  (x /= y) = not (x == y)
    Symmetry:     a == b implies b == a
    Reflexivity:  a == a

Ord:
    Reflexivity:  a <= a
    Transitivity: a <= b && b <= c implies a <= c
    Antisymmetry: a <= b && b <= a implies a == b
    Totality:     a /= b implies a < b || a > b

Why is it important to follow these rules? Users should be able to assume that they are true. The whole idea of type classes is to abstract similar relationships in disparate circumstances. The rules define what the "relationship" is.

Have to be careful, though. A problem with floating point numbers is that they are often not exact, so == comparison can fail because of roundoff error. A typical way to get around the problem is to say that things are == if they are "close enough". That is,

    instance Eq Float where
        x == y = abs (x - y) < 0.0000001

Problem = not transitive. Consider

    a = 1.0
    b = 1.00000008
    c = 1.00000015

Then a == b and b == c, but a /= c. Violated transitivity! (Treat the three values as exact decimals. This instance is a thought experiment: the Prelude already defines Eq Float, so a second instance would be rejected.)

-- Deriving --

Writing all of these tree functions is a pain. In fact, we seldom need to define how Eq, Ord, Show, etc. are to be implemented. Instead say "deriving (Eq, Ord, Show)" or whatever is needed. There are built-in rules that follow the normal expectations:

    For lists, tuples, etc., compare lexicographically.
    For Trees, etc. also compare lexicographically, but where is the order? Between different constructors, earlier constructors are smaller. In a constructor with several fields (e.g. Branch (Tree a) (Tree a)), compare the fields left to right.

Note - USUALLY works, but not always. Consider digraph equality above. You could derive Eq and Ord, but what you would get would not be the usual interpretation of digraph equality.

-- Show and Read --

Show converts data into a string. Read reverses the process. Note that this means that show will make the output look like what read wants to see. So show of a string has " " around it.
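A small sketch of that round trip, using only Prelude functions. The names shown, shownLength, roundTripString and roundTripInt are hypothetical, chosen for this illustration.

```haskell
-- What show produces for a string: the quotes are part of the result.
shown :: String
shown = show "hi"

-- Four characters: the two quote marks plus 'h' and 'i'.
shownLength :: Int
shownLength = length shown

-- read undoes show, provided we say which type to read back.
roundTripString :: Bool
roundTripString = (read shown :: String) == "hi"

roundTripInt :: Bool
roundTripInt = (read (show (42 :: Int)) :: Int) == 42
```

The type annotation on read is needed because read can produce any type in class Read, so the compiler has to be told which one is wanted.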
If there is a quote in the string, it is escaped: \"
Also escape newline: \n and tab: \t. Can even escape backslash! \\

We can take advantage of this to write a self-printing program. Consider:

    main = putStr (quine q)
    quine s = s ++ show s
    q = "main = putStr (quine q)\nquine s = s ++ show s\nq = "

Can do in most languages, but Haskell is quite clean. Make the whole program a string (up to the point of the string), and then print the string twice, once without escapes and the second time with escape characters!

-- Other important built-in classes --

Num : Lots of stuff:

    class (Eq a, Show a) => Num a where
        (+), (-), (*) :: a -> a -> a
        negate :: a -> a
        abs, signum :: a -> a
        fromInteger :: Integer -> a

Note: no division in Num. Also, all instances must have fromInteger. (This is the Haskell 98 definition; newer versions of GHC drop the Eq and Show superclasses.)

    class (Num a, Ord a) => Real a where
        toRational :: a -> Rational

    class (Real a, Enum a) => Integral a where
        quot, rem, div, mod :: a -> a -> a
        quotRem, divMod :: a -> a -> (a,a)
        toInteger :: a -> Integer

Note: Integral types must have toInteger, and have rem, mod, div, quot.

    class (Num a) => Fractional a where
        (/) :: a -> a -> a
        recip :: a -> a
        fromRational :: Rational -> a

    class (Fractional a) => Floating a where
        pi :: a
        exp, log, sqrt :: a -> a
        (**), logBase :: a -> a -> a
        sin, cos, tan :: a -> a
        asin, acos, atan :: a -> a
        sinh, cosh, tanh :: a -> a
        asinh, acosh, atanh :: a -> a

    class (Real a, Fractional a) => RealFrac a where
        properFraction :: (Integral b) => a -> (b,a)
        truncate, round :: (Integral b) => a -> b
        ceiling, floor :: (Integral b) => a -> b

    class (RealFrac a, Floating a) => RealFloat a where
        floatRadix :: a -> Integer
        floatDigits :: a -> Int
        floatRange :: a -> (Int,Int)
        decodeFloat :: a -> (Integer,Int)
        encodeFloat :: Integer -> Int -> a
        exponent :: a -> Int
        significand :: a -> a
        scaleFloat :: Int -> a -> a
        isNaN, isInfinite, isDenormalized, isNegativeZero, isIEEE :: a -> Bool

I can never remember it all, but know where to find it.
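As a rough illustration of how these classes show up in everyday code, the sketch below uses only Prelude functions. The names average, divVsQuot and roundings are hypothetical, introduced just for this example.

```haskell
-- There is no (/) in Num or Integral, so dividing Int values needs an
-- explicit conversion into a Fractional type first.
-- (For an empty list this computes 0/0, which is NaN.)
average :: [Int] -> Double
average xs = fromIntegral (sum xs) / fromIntegral (length xs)

-- Integral offers two flavors of division that differ on negative operands:
-- divMod rounds toward negative infinity, quotRem rounds toward zero.
divVsQuot :: ((Int, Int), (Int, Int))
divVsQuot = ((-7) `divMod` 2, (-7) `quotRem` 2)   -- ((-4,1),(-3,-1))

-- The RealFrac methods convert back to any Integral type.
roundings :: (Int, Int, Int, Int)
roundings = (truncate x, round x, ceiling x, floor x)   -- (2,3,3,2)
  where
    x = 2.7 :: Double
```

fromIntegral is the usual bridge from the Integral side of the hierarchy to the rest of Num; the RealFrac methods go the other way.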
The important points are that Int and Integer are instances of Integral, and Double and Float are instances of RealFloat. Table on p. 156 shows the whole mess and what classes are part of what.

Other stuff (walk through quickly):

Full thing for Ord:

    class (Eq a) => Ord a where
        compare :: a -> a -> Ordering
        (<), (<=), (>=), (>) :: a -> a -> Bool
        max, min :: a -> a -> a

        compare x y | x == y = EQ
                    | x <= y = LT
                    | otherwise = GT

        x <= y = compare x y /= GT
        x < y = compare x y == LT
        x >= y = compare x y /= LT
        x > y = compare x y == GT

        max x y | x >= y = x
                | otherwise = y
        min x y | x < y = x
                | otherwise = y

    data Ordering = LT | EQ | GT
        deriving (Eq, Ord, Enum, Read, Show, Bounded)

Note: Must define either <= or compare when you define Ord. Defining only <, >, or >= will not work, because the default compare is written in terms of <=, and the default <= is written in terms of compare.

Have seen colors:

    data Color = Red | Orange | Yellow | Green | Blue | Indigo | Violet

Enum is used when there is a way to enumerate the values. For things like Color, the order in which the constructors are defined is the order used. The Enum methods underlie the list abbreviations like [1..n].

    class Enum a where
        succ, pred :: a -> a
        toEnum :: Int -> a
        fromEnum :: a -> Int
        enumFrom :: a -> [a]                  -- [n..]
        enumFromThen :: a -> a -> [a]         -- [n,n'..]
        enumFromTo :: a -> a -> [a]           -- [n..m]
        enumFromThenTo :: a -> a -> a -> [a]  -- [n,n'..m]

        -- Minimal complete definition: toEnum, fromEnum
        succ = toEnum . (+1) . fromEnum
        pred = toEnum . (subtract 1) . fromEnum
        enumFrom x = map toEnum [fromEnum x ..]
        enumFromThen x y = map toEnum [fromEnum x, fromEnum y .. ]
        enumFromTo x y = map toEnum [fromEnum x .. fromEnum y]
        enumFromThenTo x y z = map toEnum [fromEnum x, fromEnum y .. fromEnum z]

Bounded is used when there are min and max values (so things like Int and Color):

    class Bounded a where
        minBound :: a
        maxBound :: a
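A short sketch of Enum and Bounded at work on Color. It assumes the instances are derived, which the notes do not show, and the names allColors, warm, everyOther and greenIndex are hypothetical.

```haskell
-- Color as in the notes, with the instances derived (an assumption here).
data Color = Red | Orange | Yellow | Green | Blue | Indigo | Violet
  deriving (Eq, Ord, Show, Enum, Bounded)

-- Bounded supplies the endpoints, Enum supplies the range syntax.
allColors :: [Color]
allColors = [minBound .. maxBound]

-- [n..m] is sugar for enumFromTo n m.
warm :: [Color]
warm = [Red .. Yellow]          -- [Red,Orange,Yellow]

-- [n,n'..] is sugar for enumFromThen n n'; for a derived instance the
-- list stops at the last constructor.
everyOther :: [Color]
everyOther = [Red, Yellow ..]   -- [Red,Yellow,Blue,Violet]

-- Constructors are numbered from 0 in declaration order.
greenIndex :: Int
greenIndex = fromEnum Green     -- 3
```

Note the spaces around .. in [Red .. Yellow]: without them, Red.. would be read as the start of a qualified name.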