Assignment: Dartmouth pathfinder

In this assignment, you will create a graph that models part of the Dartmouth campus, find paths from one vertex in this graph to another vertex, and display them. Here is a scaled-down screenshot from the program:

This is a map of the Dartmouth campus and it has a graph overlaid on top of it, with vertices drawn at their locations on the map. Many of the vertices that are close together and have a footpath between them have an edge between them. The vertices are shown as blue circles. The edges are blue lines. To get this screenshot, I pressed and released the mouse button when it was on the vertex with a small red circle on it in front of Sudikoff. Then, with the mouse button not pressed, I moved the mouse over to a vertex near 1953 Commons. The program performed a breadth-first search on the graph and drew the resulting connecting path in red on the graph. My program also displayed the names of the starting and ending vertices, but that’s an easy extra-credit option.

Checkpoint

For the checkpoint, you have to do the following:

Define a Vertex class
Read the input file, and create a vertex dictionary for the given graph
Draw the vertices and edges of the graph on the given Dartmouth map using functions in cs1lib.py

How to do

Create a Vertex class to hold the information about each vertex in the graph. The class definition should be in its own Python file named vertex.py. A Vertex object should have instance variables that hold its name, its x and y locations on the map, and a Python list to hold the references of adjacent vertices in the graph. You’ll write some methods for the Vertex class but we’ll look at methods later.

Creating the vertex dictionary

Download the dartmouth_graph.txt file and add it to your assignment on PyCharm. This file has the pixel location of each vertex and the connections between vertices.

In a new file named load_graph.py, define a function to use the information from dartmouth_graph.txt file and create a vertex dictionary. Call this function create_vertex_dictionary. This function takes the name of the input file as a parameter.

The create_vertex_dictionary function should create one Vertex object per line in the data file and add a reference to each Vertex object it creates to a dictionary. The Vertex references (addresses) will be the values in the dictionary and the names of the vertices will be the keys. When done reading and processing all the information in the data file, the function should return the reference (address) of the dictionary, which will have information about all the vertices.

Your function create_vertex_dictionary should make two passes over the data file. (It may just read it in once and save all the lines in a Python list of strings if you like.) The first pass creates all the Vertex objects, storing the coordinates in the objects, and it also creates the dictionary mapping vertex names to references to Vertex objects. But the first pass doesn’t create the adjacency lists in the Vertex objects. Why not? Because you need to have created all the Vertex objects before you can create an adjacency list of references.

Once you’ve created all the Vertex objects in the first pass—but without filling in their adjacency lists— the second pass creates the adjacency lists in the Vertex objects. If you make two passes over the file, you must close it after the first pass and open it again before the second pass.

Let’s look more closely at the format and content of the data file. Here is the first line:

    Green Southwest; Robinson/Collis, Green South, Green West, Green Center, Green East; 510, 798

The name before the first semicolon, Green Southwest, is a name that uniquely identifies a vertex. This name goes into the Vertex object and is stored as an instance variable, and it also serves as the key into the dictionary that stores the addresses of all the Vertex objects.

After the first semicolon, there may be several names that identify the vertices adjacent to the current vertex. These names are separated by commas. So the vertices with names Robinson/Collis, Green South, Green West, Green Center, and Green East are adjacent to the vertex named Green Southwest.

After the second semicolon, there will be exactly two numbers, separated by a comma. These numbers are the x- and y-coordinates of the vertex on the map, in pixels, which you’ll want to store in instance variables of the corresponding Vertex object.

I would expect your create_vertex_dictionary function to do something like this. To deal with each line:

Split up the line into three pieces: the vertex name, the list of names of adjacent vertices, and the x- and y-coordinates.
Create a new Vertex object with the name and coordinate data stored as instance variables. (The name is a string, and the coordinates are integers.) The adjacency list, for now, is an empty list.
Put the new Vertex object into a dictionary with the string "Green Southwest" as the key.

Notice that we haven’t done anything with Robinson/Collis, Green South, Green West, Green Center, and Green East, the names of the vertices adjacent to this vertex. That’s because, as we’ve seen, their Vertex objects don’t exist yet (we’ve created a Vertex object only for Green Southwest), and so we cannot yet add them to the adjacency list of the Vertex object we have just created.

After looping over all lines in the file and creating all the Vertex objects with coordinates, it’s time for the second pass. Close the file and open it again and loop over all lines of the file again. For each line, do the following:

Get its name and the names of its adjacent vertices.
Look up the current vertex in the dictionary.
For each vertex that is adjacent to the current vertex, look it up in the dictionary using its name. When you look up a vertex in the dictionary, you get a reference to a Vertex object. For each adjacent vertex, append a reference to its Vertex object to the adjacency list in the Vertex object of the current vertex.

When create_vertex_dictionary has finished both passes, you should have a dictionary that has references to Vertex objects in it as values, accessed by using the keys Green Southwest (from the first line of the data file) through Robinson/Collis (from the last line of the data file). Each Vertex object should have instance variables filled in for its name, x- and y-coordinates and for its adjacency list—the Python list of references to adjacent Vertex objects, as indicated in the data file.

__init__ : This method will initialize the instance variables name, x- and y-coordinates using the given parameters and set the adjacency list to a empty list.
draw_vertex: This method takes floating point values r, g and b, defining a color as parameters. It draws a circle of the given color at x and y coordinates saved as instance variables.
draw_edge: This method takes a reference to another Vertex object and r, g, b as parameters, and it draws an edge between the Vertex that the method is called on (i.e., self) and the other vertex, in the color given by r, g, b.
draw_all_edges: This method takes floating point values r, g and b, defining a color as parameters, and draws all the edges between the vertex and all the vertices in its adjacency list, in the color given by r, g, b.

In addition to the methods, I define constants in my vertex.py file to set the radius of the circle drawn for each vertex and the width of the line drawn for each edge.

__str__: This method should return a string that is created by concatenating all the strings listed below exactly in the order given below:
```
 -   the vertex name
 -   a semicolon
 -   the string "Location: "
 -   the vertex's *x*-coordinate
 -   a comma
 -   the vertex's *y*-coordinate
 -   a semicolon
 -   the string "Adjacent vertices: "
 -   the names of all the adjacent vertices, separated by commas [Hint: You will need a loop in your `__str__` method].
```
Before each punctuation mark (colon, semicolon, and comma), there should be no space, and there should be exactly one space after. Note that there’s no comma after the last adjacent vertex. For example, for the first line of dartmouth_graph.txt, if you call str on its Vertex object, you should get the string:

Green Southwest; Location: 510, 798; Adjacent vertices: Robinson/Collis, Green South, Green West, Green Center, Green East

You probably won’t need the __str__ function after the checkpoint, but it allows us (and you) to determine whether your create_vertex_dictionary function works correctly.

In addition to the methods, I define constants in my vertex.py file to set the radius of the circle drawn for each vertex and the width of the line drawn for each edge.

Displaying the graph

Create another file, let’s call it map_plot.py. In this file add code to import and call the create_vertex_dictionary function to create a vertex dictionary. Then add code to do the following:

Draw the map background. The image file is dartmouth_map.png. Use the load_image and draw_image functions to load and display the map image. You should load the image only once; it’s a slow operation. Notice that the image is bigger than the default size for the graphics window. It’s 1012 pixels wide and 811 pixels high, so use these values to set the values of optional parameters width and height, respectively, when you call start_graphics.
Draw the graph. You have the dictionary containing information about the graph’s vertices. Each Vertex object has information that allows you to get x- and y-coordinates for the vertex, and it also has a list of references to Vertex objects for all adjacent vertices. You can loop over all items in the dictionary using the approach described in the notes.

In addition to drawing the vertices, you should draw the edges connecting the vertices. Because the graph is undirected, each edge will appear in two adjacency lists. It’s fine to draw each edge twice.To draw vertices and edges, you should use the methods in the Vertex class.

What to submit for the checkpoint

Submit a zip file containing the following:

vertex.py with Vertex class definition.
load_graph.py with definition of the function to create vertex dictionary.
map_plot.py with code to call the function defined in load_graph.py and draw the given Dartmouth map.
A screenshot of the Dartmouth map drawn when you run map_plot.py.

How to proceed after the checkpoint

Breadth-first search

To find a path from a start vertex to an end vertex in graph, you will implement the breadth-first search algorithm. Implement the breadth-first search algorithm as a function in a separate file, bfs.py. Your breadth-first search function should take references to the start and goal vertices as parameters and should return a path of vertices connecting them. To store the path, you can use a Python list of references to Vertex objects for all vertices on the path, including the start and the goal vertex.

You can use the pseudocode of the breadth-first search we discussed in class. Since you need to identify all the vertices on the path between the start and the goal, you will need to keep track of backpointers and to construct the path once the search has found the goal.

How to maintain the queue for the frontier

Recall that you need to maintain a queue for the frontier of your breadth-first search: the vertices that have been reached from the start vertex but have not yet been explored. The queue needs to be first-in, first-out.

You could use a Python list for the queue. Let’s call the list q. You would insert an item, say x, by calling q.append(x), and you would remove an item by writing del q[0]. The problem is that each time you remove an item from the queue, all other items have to shift to the next lower index, which takes time linear in the size of the queue. That’s not good. We want it to take constant time to insert an item into the queue and to remove an item from the queue. The Python collections module provides a class named deque that works as a queue with constant-time operations. In fact, it’s implemented internally with a linked list! A deque is a “double-ended queue”: you can insert into and remove from either end. You’ll use a deque as a regular queue, however. To insert, call the append method, just as you would on a Python list. To remove, call the popleft method; it takes no parameters and it removes the item in the queue that has been there the longest, returning this item. The len function returns how many items are in the deque.

There is one important thing to do. To understand it run the code given below to check what happens. Did you get an error? What was the error?

Example:

from collections import deque
q = deque()
q.append("Iowa")
q.append("Ohio")
print(len(q))
print(q.popleft())
print(len(q))
q.append("Idaho")
print(q.popleft())
print(q.popleft())
print(q.popleft())

As you can see, you should make sure not to call popleft when the queue is empty. You can use len function to do so.

How to keep track of backpointers

You can keep track of backpointers in one of two ways. One way is to add an instance variable for a backpointer in each Vertex object. You’ll have to initialize all the backpointer instance variables for all the vertices to None upon starting each breadth-first search. The other is to make a dictionary whose keys are references to Vertex objects and whose values are also references to Vertex objects. In particular, the value of a particular key is its backpointer. For the start vertex, its value should be None, since it has no backpointer but is on the path.

For the dictionary approach, just make an empty dictionary at the start of a breadth-first search.

How to determine whether a vertex has been visited

Breadth-first search needs to determine whether a vertex has already been visited because if it has, then it should not be inserted again into the queue for the frontier. One way would be to maintain a boolean in each Vertex object that tells you whether the vertex has been visited. But you’d have to initialize the boolean values for all vertices (other than the start vertex) to False each time you start a breadth-first search. There’s an easier way. A vertex has been visited if and only if it has a backpointer. If you store backpointers in a dictionary, then to determine whether a vertex has been visited, you just need to determine whether a reference to its Vertex object appears in the dictionary. Remember that you can use the in operator to determine whether an item is present in a dictionary.

Testing your implementation of the Breadth-first search algorithm

To test your implementation of the breadth-first search algorithm, do the following:

Import your BFS function from bfs.py into map_plot.py that you created for the checkpoint of this assignment.
After creating the vertex dictionary by calling create_vertex_dictionary function, get the reference to the Vertex object of the vertex named Sudikoff from the vertex dictionary. Let’s call it start_vertex. Get the reference to the Vertex object of the vertex named 1953 Commons from the vertex dictionary and let’s call goal_vertex.
To test your BFS implementation call your bfs function my passing the references start_vertex and goal_vertex to BFS function.
Your BFS function must return a Python list containing the references to vertices that make the path from start_vertex to goal_vertex. Get the name of each vertex in the path list (it is stored as an instance variable inside the Vertex object), and print it. How many vertices are on this path? [Hint: Check the screenshot I shared in the checkpoint of this assignment series.]

Once you are sure that your BFS implementation is working correctly, you can write code to select start and goal vertices by clicking on the vertices.

Selecting start and goal vertices, and running BFS

To select a vertex as the start vertex, the user should hover the mouse pointer over it and press and release the mouse button. If the user repeats this on a different vertex, then that should become a new start vertex.

Once a start vertex is picked, the user should hover the mouse pointer over another vertex (without clicking the mouse) to make that vertex the goal vertex.

How to do it

It is not always possible for a user to click precisely at the location of a vertex. You can call the mouse location as “on” a vertex if it is within the smallest square that surrounds the vertex. In other words, the mouse doesn’t have to be in the circle for the vertex but just in the smallest surrounding square; that makes the test for inclusion really simple.

To implement this strategy, the best way is to define a method is_on_vertex in the Vertex class. This method takes the x and y coordinates of the mouse location as parameters and returns a True if the mouse is on the vertex and False otherwise.

Finding the path from the start vertex to the goal vertex

Once the start and goal vertices are selected, your code should call your BFS function, to run the BFS algorithm on these vertices to find the path between them and display it on the Dartmouth map.

Drawing the path on the Dartmouth map

Once the BFS algorithm returns the path, draw the vertices on the path and the edges between the vertices in a color that is different from the color that you used to draw the vertices and edges in the rest of the map. You can use draw_vertex and draw_edge methods from the Vertex class to do so.

What to submit

Submit a zip file containing the following:

vertex.py with Vertex class definition.
load_graph.py with a definition of the function to create a vertex dictionary.
bfs.py with a definition of the function that implements the BFS algorithm.
Updated map_plot.py.
Two screenshots showing the path drawn between two different start and goal vertex pairs.

Extra credit ideas

As usual, I encourage you to be creative in coming up with ideas for extra credit. Here are some that I thought of.

Displaying vertex names: 5 points. I demonstrated this option in class, and it appears in the screenshot above. When you display the start and goal vertices of a path, display their names, which you have in the Vertex objects.
Animate breadth-first search: 15 points. Animate the breadth-first search process, so that we can see the breadth-first search unfold. Draw the start vertex in one color, the goal vertex in another color, and draw vertices that have been visited and searched from (that is, are no longer on the frontier) in another color, and vertices in the frontier on yet another color.
Breadcrumb search: 20 points. When searching from a specific start vertex for a specific goal vertex, the breadth-first search tree can get rather bushy. One way to cut down on the bushiness is to run two breadth-first searches simultaneously, one from the start vertex and one from the goal vertex. Each search leaves “breadcrumbs” at the vertices it has visited, and whenever a search from one of the two vertices finds a breadcrumb from the other search, it can complete a path.

Honor code

The consequences of violating the Honor Code can be severe. Please always keep in mind the word and spirit of the Honor Code.