CS 76/276 – index

Here is a maze, drawn in the venerable tradition of ASCII art:

.......
.##....
..##...
.......
..##...
#.###..
....##.

The periods represent open floor tiles (places that can be walked on or over), and the number signs represents walls, places in the maze where a simple robot cannot go.

You have a robot. It can move in any of four directions: North (towards the top of the screen), East, South, and West. There are coordinates on the maze. (0, 0) is the bottom left corner of the maze. (6, 0) is the bottom right corner of the maze.

In this particular example, perhaps we’d like to plan a path for the robot to go through the maze from the bottom left corner to the bottom right corner, without going through any walls.

I’ve provided some code that will do things like load in mazes, print them out, and ask simple questions like (Maze.py) whether there is a robot or a floor tile at a particular location (there might be both).

As a warm-up, I’d recommend you create some example mazes, and use a simple BFS to search for a path from start to goal. You are welcome to use any code you like from your solution to the first assignment. This is just a warmup; the real work begins with A*, multi-robot path planning, and the blind robot problem, below.

Report

You’ll write your report in markdown again. I won’t give a detailed description of what should be in the report; it’s up to you to describe your implementation and results clearly and convincingly. Submit your .md, .pdf, and Python source files, bundled into a single zip. Please also include a very short README file that explains how to run your code.

A-star search

Implement A* search in SearchProblem.java (or wherever you like in Python). The description of the algorithm is in the textbook.

If you wish, you might implement Uniform-Cost-Search first. It’s very much like A*, but without the heuristic. Figure 3.14 in the book shows the algorithm. There is a bit of a problem, however. There is a step near the end:

if child.state is in frontier with higher path-cost
  replace that frontier node with child

How do you efficiently check if a node is in a priority queue, or replace it efficiently? It depends on the data structure used to implement the priority queue. A standard choice is a heap: an array that maintains items in a particular order based on their priority. (No, it’s not an array sorted by priority. That would be hard to insert new items into in a time-efficient manner. Heaps are more clever than that.)

Section 8.4.2 of the Python heapq documentation discusses exactly this issue, though in a slightly different context. The suggested solution is to mark the higher-cost item as removed, and add a new item with a lower cost.

It should be pointed out that this operation does incur some computational cost when compared to the approach in the book of removing the higher-cost item. The higher-cost item stays in the heap, and as items are added to the heap, the increased size of the heap will mean increased heapify costs.

The most efficient approach appears to use a Fibonacci heap, but we will not discuss Fibonacci heaps in this class. Here’s an interesting discussion at stack overflow.

An acceptable substitute for the purposes of this class is to do the marking as suggested by the Python docs. How do you mark things? If a state is being considered for addition to the frontier, check if it has been visited for a cheaper cost already. If not, add to frontier, and update the cost in a dictionary of state costs. Otherwise, do not add the node to the frontier.

Multi-robot coordination

k robots live in an n x n rectangular maze. The coordinates of each robot are (x_i, y_i), and each coordinate is an integer in the range 0…n-1. For example, maybe the maze looks like this:

. B . . . . .
. # # . C . .
. . # # . . .
. . . . # . .
. . # # . . .
. . # . . . .
A . . . # # .

That’s three robots A, B, C in a 7x7 maze. You’d like to get the robots to another configuration. For example:

. . . . . . .
. # # . . . .
. . # # . . .
. . . . # . .
. . # # . . B
. . # . . . A
. . . . # # C

There are some rules. The robots only move in four directions, north, south, east, and west. The robots cannot pass through each other, and may not occupy the same square. The robots move one at a time. First robot A moves, then robot B, then robot C, then D, E, and eventually, A gets another turn. Any robot may decide to give up its turn and not move. So there are five possible actions from any state.

Let’s make the cost function the total fuel expended by the robots. A robot expends one unit of fuel if it moves, and no fuel if it waits a turn.

(Potential bonus problem: have the robots move simultaneously, and/or use a cost metric of shortest total time. Write any bonus solution in a different Python file, with a filename starting with the word BONUS_ and describe your bonus work clearly in the writeup. )

Only one robot may occupy one square at a time. You are given a map ahead of time, and it will not change during the course of the game.

If there are k robots, how would you represent the state of the system? Hint – how many numbers are needed to exactly reconstruct the locations of all the robots, if we somehow forgot where all of the robots were? Further hint. Do you need to know anything else to determine exactly what actions are available from this state?
Give an upper bound on the number of states in the system, in terms of n and k.
Give a rough estimate on how many of these states represent collisions if the number of wall squares is w, and n is much larger than k.
If there are not many walls, n is large (say 100x100), and several robots (say 10), do you expect a straightforwards breadth-first search on the state space to be computationally feasible for all start and goal pairs? Why or why not?
Describe a useful, monotonic heuristic function for this search space. Show that your heuristic is monotonic. See the textbook for a formal definition of monotonic.

It may seem that you can just plan paths for the robots independently using three different breadth-first-searches, but this approach won’t work very well if the robots get close to one another or have to move around one another. Therefore, you should plan paths in the complete state space for the system.

Implement a model of the system and use A* search to find some paths. Test your program on mazes with between one and three robots, of sizes varying from 5x5 to 40x40. (You might want to randomly generate the 40x40 mazes.) I’ll leave it up to you to devise some cool examples – but give me at least five and describe why they are interesting. (For example, what if the robots were in some sort of corridor, in the “wrong” order, and had to do something tricky to reverse their order?)

You may include output of your program (as ASCII text) in the markdown to show that reasonable paths are found by your algorithm.

Describe why the 8-puzzle in the book is a special case of this problem. Is the heuristic function you chose a good one for the 8-puzzle?
The state space of the 8-puzzle is made of two disjoint sets. Describe how you would modify your program to prove this. (You do not have to implement this.)

Assume that you now only have a single robot in mazeworld, but there’s a catch. The robot is blind, and doesn’t know where it starts! The robot does know the map of the maze.

The robot has a sensor that can tell it what direction is North (so that it can still move in intended directions). However, the robot has no other sensors. No, it really can’t tell when it hits a wall.

Model and implementation

If you execute the action “west” from a configuration where “west” is blocked, the robot simply doesn’t move. In this problem, you will write a planning algorithm that gives a plan that even the blind robot can follow. Write an algorithm that causes the robot to go to the goal no matter where it starts. Specifically, let the ‘state’ in the search be the current set of possible locations of the robot. For example, in a 4x3 maze, the first state in the search will be the set of all 12 possible starting locations: { (0, 0), (1, 0), (2, 0), … (3,2) }. The actions will be {north, south, east, west}. The goal test function tests if the state is a singleton, with just one element in the set. If there is just one element, then the robot knows where it is.

Sometimes people ask questions like “can the robot tell the difference between a wall and the edge of the maze?” No. The robot can’t even tell that it hit something. All we know is that if you ask the robot to go west, it will either go west one square (if there is no wall there) or not (if there is a wall there). The robot doesn’t know which happened. You know the shape of the maze, however, and this will allow you to set the search up so that the resulting sequence of robot commands is robust to either situation.

Write a model of the search problem in Python and use the model together with the A* search algorithm you wrote previously to find motions of the robot. Animate the complete path, showing how the robot belief state changes as the path is executed. (By animate, I mean print out a sequence of text mazes with the robots in the proper locations.)

Try a few different mazes ranging from small (6 x 6) to somewhat larger.

Describe what heuristic you used for the A* search. Is the heuristic optimistic? Are there other heuristics you might use? (An excellent might compare a few different heuristics for effectiveness and make an argument about optimality.)

Hint. How does the successor function work? Take an example. Assume the current state is {(3, 0), (3, 1), (3, 2)}, which means the robot knows that it has x coordinate 3, but isn’t sure about the y coordinate. Consider the action “north”. If the location was initially (3, 0), then the location is (3, 1). If the location was initially (3, 1), then the new location is (3, 2). What if there is a wall north of (3, 2)? If the old location was (3, 2), then the new location is (3, 2), since the robot bumps into the north wall. So the new state is the set { (3, 1), (3, 2) }. In other words, the robot is now more sure of its location, even though it has no sensors.

You may use a few screen-captures of a Java-generated animation window to show that your algorithm generates good plans, even though screen-captures are bitmaps. In the text, you can describe plans as a sequence of actions. For example, “north, west, north, east, west, south, south” would be a plan for this system.

This next part is a worthy challenge. I expect you to make some attempt at it, but you can still score full points for the assignment without answering this question correctly.

Prove that a sensorless plan of the type you found above always exists, as long as the size of the maze is finite and the goal is in the same connected component of the maze as the start.

Now describe (but do not implement) a motion planner that runs in time that is linear or polynomial in the number of cells in the maze. (Notice that the size of the belief space is exponential in the number of cells, so this would be a very interesting result!)

I will give you a big hint. As the robot moves, the number of robot locations in the belief state tends to decrease. Given a map of a maze, can you always find a plan (or a sequence of plans) that decreases the number of robot locations by one? Such a plan would depend on the particular map, so some (polynomial or linear time) algorithm might have to be run in order to generate such a plan.

Literature Review (required for graduate students, bonus for undergrads)

This part of the assignment is required for graduate students in the class. Undergraduate students are very welcome to do this part (I think it quite interesting), and will receive some additional credit. (Please note in this section of the report whether you are a graduate student or not.)

The multi-robot problem from this assignment is a well-known one, and it has been studied extensively in the AI research community. I wrote my friend Professor Kostas Bekris at Rutgers who has done some work in this area, and he recommended the following five papers. Graduate students in the course should pick at least one of these papers, and read enough of it to get a sense of the approach. Briefly discuss the paper in a section entitled “previous work” in your report. (3-5 sentences will be sufficient for this discussion.) The discussion should describe the problem attacked by the paper, give a quick summary of the main result(s), and discuss the basic approach of the paper.

Here’s what Professor Bekris wrote me:

Some good foundational papers for the pebble problem are the following ones:

Auletta, V.; Monti, A.; Persiano, P.; Parente, M.; and Parente, M. 1999. A Linear Time Algorithm for the Feasibility of Pebble Motion on Trees. Algorithmica 23(3):223–245

Goraly, G., and Hassin, R. 2010. Multi-color Pebble Motion on Graphs. Algorithmica 58(3):610–636.

The above papers, however, are not following an A* search approach for the problem. If you are interested more in such a direction (and a complete/optimal solution) then one of the following papers may be a better reference:

Standley, T. 2010. Finding optimal solutions to cooperative pathfinding problems. In The Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI’10), 173–178.
Standley, T., and Korf, R. 2011. Complete algorithms for cooperative pathfinding problems. In Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Volume One, IJCAI’11, 668–673. AAAI Press.

Sharon, G.; Stern, R.; Goldenberg, M.; and Felner, A. 2011. The increasing cost tree search for optimal multi-agent pathfinding. In The International Joint Conference on Artificial Intelligence (IJCAI’11), 662–667.

Grading rubric

Your code should be efficient and well-designed, with excellent style, formatting, comments. It should be brief if possible; a long chain of if statements is not good code if it can be replaced with something terser and easier to read. Follow good style guidelines, although we will not be strict here. For example, Python style guidelines suggest that there should be spaces after ‘#’, spaces around operators like ‘+’. Use whitespace! Group lines of code into logical units that are no more than 3-8 lines long using blank lines. Factor code out into functions. Make sure to claim authorship of your code in a comment at the top of the file (include date as well.)

Multi-robot coordination model: 3
Sensor-less robot model: 3
Heuristics for multi-robot and sensor-less problems: 3
A* implementation: 3
Overall code style and efficiency: 3
Discussion questions and discussion of implementation (report): 5
Extensions beyond the basic requirements: up to 5 points

The assignment is out of 25 points, but 20 points is a pretty good score.