Topic: Breadth First Search, Queues, Stacks
Date: Oct. 23, 2009
Number: 15
Examples: Stack.hs, Queue1.hs, Queue.hs
Reading:

-- Kevin Bacon Game (or Erdos Numbers) --

Idea - build a graph with actors as vertices and an edge between two actors if they appear in a movie together. (Edge label is the movie.) Find the shortest path from any actor to Kevin Bacon.

For the geekier version, vertices are authors of mathematical papers and you get an edge between two mathematicians if they published a joint paper. (Edge label is the joint paper.) Find the shortest path from any mathematician to Paul Erdos.

Remarkably, there are a number of people who have both small Erdos numbers and small Bacon numbers. Check out:

http://en.wikipedia.org/wiki/Erdős–Bacon_number

Dan Kleitman has a total Erdos-Bacon number of 3 (Erdos 1, Bacon 2), but the Bacon number is due to a role as an extra. Danica McKellar has an Erdos-Bacon number of 6, and is both a professional actress and the author of a published math paper.

Or you could use this to analyze social networks like Facebook or MySpace - vertices are members, and there is an edge from person 1 to person 2 if person 1 lists person 2 as a friend. (Note this last version can be directed, if "friending" is not required to be reciprocal, while the other two always have an edge in both directions if there is an edge in one direction. We will only deal with undirected graphs in this lecture.)

But once you build the graph, you want to find shortest paths between a root (e.g. Kevin Bacon) and everyone else in the graph. It turns out that the easiest way to do this is to build a shortest-path tree whose edges point back towards the root (sometimes called parent links). Then look up the person and follow parent links to get back to the root, reporting every edge on the path.

Basic idea:

1) Start with the root (Kevin Bacon). Make him the only vertex in an otherwise empty graph T (because it will hold a tree of edges).
2) For all vertices connected to the root (in its adjacency list), add the vertex to the new graph T and add an edge back to the root (labeled appropriately).
3) For all vertices next to the level-1 vertices, if the vertex is not already in the graph (there could be edges between two level-1 vertices), add it to the graph and connect it back to the level-1 vertex.
4) For all vertices next to level-2 vertices, if they aren't in the graph T (so haven't been found yet), add them to T with edges back to the vertices that they came from.
... until no reachable vertices are left.

Draw example, do the operations.

So how do we do this? The easiest way is to use a data structure called a Queue. First-in, first-out. Then the idea is:

1) Put root into the queue and into a new graph T
2) Until the queue is empty
     dequeue to get next vertex v to process
     for each thing (v', e') in v's adjacency list in G
       if v' not in T
         add v' to T and add an edge labeled e' from v' to v
         enqueue v'
3) When you are done, T holds a shortest-path or BFS tree.

To find the Bacon number of anyone, look them up in T. If they are not there, they are not connected to the root. If they are there, follow edges back, counting (or printing) movies (edge labels) and actors (vertices) along the way.

--- Queues

Queue is a standard ADT. Used to simulate people standing in line, handle queues of jobs waiting for a printer in an operating system, etc. Operations are:

makeQueue (create empty queue)
enqueue (add to end)
dequeue (remove the front item)
isEmpty (true if queue is empty)

There is an easy way to implement a queue. Keep a list of things in the queue. Add to the end, remove from the front. Code below.

data Queue a = Queue [a] deriving Show

-- Create an empty queue
makeQueue :: Queue a
makeQueue = Queue []

-- Add x to end of queue, returning the updated queue
enqueue :: a -> Queue a -> Queue a
enqueue x (Queue s) = Queue (s ++ [x])

-- return the front of the queue paired with the updated queue
dequeue :: Queue a -> (a, Queue a)
dequeue (Queue (x:s)) = (x, Queue s)
dequeue _ = error "Cannot dequeue an empty Queue"

-- returns true if queue is empty
isEmpty :: Queue a -> Bool
isEmpty (Queue []) = True
isEmpty _ = False

Note that we can't modify the queue. We have to return a modified version, which we then pass on. Just like the PQ in PS 3. (But that separated the getMin from the deleteMin. This shows the other approach - return a pair.)

There is a problem with this, though. Appending a single item to the end of a queue takes time proportional to the length of the queue. That is bad. Enqueueing and dequeueing n items can take O(n^2) time!

So how can we do better? Adding to the front and removing from the rear is no better. There is a better way, though. But first I need to introduce another ADT, the Stack. Operations are:

makeStack (make empty stack)
push (add to TOP of stack)
pop (remove from TOP of stack)
isEmpty (test for empty stack)

Stacks are last-in, first-out. Stacks are used for runtime environments to handle recursion. They are also used for things like expression evaluation without using recursion. They can also be easily implemented with lists:

data Stack a = Stack [a] deriving Show

-- Creates an empty stack
makeStack :: Stack a
makeStack = Stack []

-- Adds a new value to the top of a stack, returning the updated stack
push :: a -> Stack a -> Stack a
push x (Stack s) = Stack (x:s)

-- Removes the top value in a stack, returning both the value and
-- the updated stack as a pair.
pop :: Stack a -> (a, Stack a)
pop (Stack (x:s)) = (x, Stack s)
pop _ = error "Cannot pop an empty stack"

-- returns true if stack is empty
isEmpty :: Stack a -> Bool
isEmpty (Stack []) = True
isEmpty _ = False

Note that both push and pop only deal with the first item in the list, so both are constant time!

Why am I talking about stacks when we want a queue? Well, there is a way to use TWO stacks to implement a queue so that each item is pushed at most twice and popped at most twice. Therefore enqueueing and dequeueing n items takes O(n) time.

Show how the idea works by enqueueing and dequeueing a number of items.

The code:

module Queue (Queue (Queue), makeQueue, enqueue, dequeue, isEmpty) where

import qualified Stack as S

data Queue a = Queue {inStack :: S.Stack a, outStack :: S.Stack a}
  deriving Show

-- Create an empty queue
makeQueue :: Queue a
makeQueue = Queue S.makeStack S.makeStack

-- Add x to end of queue, returning the updated queue
enqueue :: a -> Queue a -> Queue a
enqueue x q = Queue (S.push x (inStack q)) (outStack q)

-- return the front of the queue paired with the updated queue
dequeue :: Queue a -> (a, Queue a)
dequeue q@(Queue inSt outSt)
  | (isEmpty q) = error "Cannot dequeue an empty Queue"
  -- outSt is empty here, so it doubles as the new (empty) inStack
  | (S.isEmpty outSt) = dequeue (Queue outSt (transfer inSt outSt))
  | otherwise = let (x,s) = S.pop outSt in (x, Queue inSt s)

-- returns true if queue is empty
isEmpty :: Queue a -> Bool
isEmpty q = S.isEmpty (inStack q) && S.isEmpty (outStack q)

-- Transfer everything in stack 1 to stack 2 and return the new s2.
transfer :: S.Stack a -> S.Stack a -> S.Stack a
transfer s1 s2
  | S.isEmpty s1 = s2
  | otherwise = let (x, s) = S.pop s1 in transfer s (S.push x s2)

Note that enqueueing is easy - always push onto inStack. dequeue is trickier. If the outStack is not empty, we simply pop something off of it. But if it is empty, we transfer everything from the inStack to the outStack.
This reverses their order, so they come off the outStack in the opposite order from the one in which they would have come off the inStack - the last item inserted goes to the bottom, and the one that was inserted first goes to the top. All of these will be popped before anything that is enqueued later.
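
To tie the queue back to the Kevin Bacon problem, here is a minimal sketch of the BFS loop from the start of the lecture. It is not the code from the example files. It assumes the graph is stored as a Data.Map from each vertex to its adjacency list, and the names Graph, Tree, bfsTree and pathToRoot are made up for this illustration. To stay self-contained it uses two plain lists in place of the two stacks, and its dequeue returns a Maybe instead of calling error.

```haskell
import qualified Data.Map as M

-- Two-list queue: the same idea as the two-stack Queue, with plain
-- lists standing in for inStack and outStack (an assumption of this sketch).
data Queue a = Queue [a] [a]

makeQueue :: Queue a
makeQueue = Queue [] []

enqueue :: a -> Queue a -> Queue a
enqueue x (Queue ins outs) = Queue (x:ins) outs

-- Returns Nothing on an empty queue instead of calling error.
dequeue :: Queue a -> Maybe (a, Queue a)
dequeue (Queue [] [])      = Nothing
dequeue (Queue ins [])     = dequeue (Queue [] (reverse ins))
dequeue (Queue ins (x:xs)) = Just (x, Queue ins xs)

-- Hypothetical representation: vertex -> [(neighbor, edge label)]
type Graph v e = M.Map v [(v, e)]

-- The tree T as parent links: each vertex maps to Just (parent, label),
-- and the root maps to Nothing.
type Tree v e = M.Map v (Maybe (v, e))

-- Build the BFS tree of everything reachable from root.
bfsTree :: Ord v => Graph v e -> v -> Tree v e
bfsTree g root = go (enqueue root makeQueue) (M.singleton root Nothing)
  where
    -- Until the queue is empty, dequeue v and scan its adjacency list.
    go q t = case dequeue q of
      Nothing      -> t
      Just (v, q') ->
        let (q'', t') = foldl (visit v) (q', t) (M.findWithDefault [] v g)
        in go q'' t'
    -- If v' is not in T yet, link it back to v and enqueue it.
    visit v (q, t) (v', e)
      | M.member v' t = (q, t)
      | otherwise     = (enqueue v' q, M.insert v' (Just (v, e)) t)

-- Follow parent links back to the root, collecting (vertex, edge label)
-- pairs. Nothing means the vertex is not connected to the root.
pathToRoot :: Ord v => Tree v e -> v -> Maybe [(v, e)]
pathToRoot t v = case M.lookup v t of
  Nothing            -> Nothing
  Just Nothing       -> Just []
  Just (Just (p, e)) -> ((v, e) :) <$> pathToRoot t p
```

For a vertex that is found in T, the length of the list returned by pathToRoot is its Bacon number, and the edge labels in the list are the movies along the path.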