
CS 76 / 176

Week 1: Uninformed Search

Professor Devin Balkcom
devin@cs.dartmouth.edu
office: Sudikoff 206

http://www.cs.dartmouth.edu/~cs76

This course

The goal of this course is to give a perspective on one of the most misunderstood areas of computer science.

You'll write a lot of code, too.

A toolkit

We will look at several standard problems and algorithms.

  • Model A.I. problems
  • Represent knowledge or information
  • Design systems for making plans, inferences, choices

What is AI?

  • Games/search vs. data
  • the idea of state
  • agents: the sense-plan-act paradigm
  • search: considering possibilities
  • planning over states
  • estimation of state

Pitfalls

(How A.I. is viewed.) Over-emphasis on:

  • Ideas with catchy titles. Genetic algorithms. Neural nets.
  • Thinking humanly, at the cost of thinking rationally.

There are many good ideas here, but beware of snake oil.

Thinking about A.I. problems

  1. Identify the state.
  2. Is the state observable? How?
  3. How does the state change? What are the rules? Who controls state change?
  4. How can the state be represented? Simple numbers? First-order logic? A probability distribution?

State vs. knowledge

  • Knowledge agents. A richer representation of state, perhaps with some component of belief about the world, or about rules about the world.
  • Logic: A language for describing rules or knowledge

Examples

  • Games: chess, go, backgammon, cards
  • Robotics
  • Computer vision
  • Natural language processing
  • Machine learning? Computational biology?

Agents and search

  • An agent begins in some state, the initial or start state.
  • The agent would like to get to some goal state.
  • The agent has certain actions available.
  • The agent knows the results of each action.
  • The agent might have some preference for "better" paths.

States and nodes are often created 'on the fly'.

Formal search problems

  1. A start state
  2. A `goal_test` function that checks if a state is a goal state
  3. A `get_actions` function that finds the legal *actions* from a state, and a `transition` function that takes a state and an action and returns the resulting state. Alternatively, a `get_successors` function (wrapping `get_actions` and `transition`) that returns the list of states reachable from a given state.
  4. A `path_cost` function that gives the cost of a path between a pair of states.
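Taken together, these four components can be sketched as a small Python interface. The class and method names here simply mirror the list above; a concrete problem would subclass it and fill in the pieces:

```python
class SearchProblem:
    """The four components of a formal search problem."""

    def __init__(self, start_state):
        self.start_state = start_state

    def goal_test(self, state):
        # Is this state a goal state?
        raise NotImplementedError

    def get_actions(self, state):
        # Legal actions from this state.
        raise NotImplementedError

    def transition(self, state, action):
        # The state that results from taking `action` in `state`.
        raise NotImplementedError

    def get_successors(self, state):
        # Wraps get_actions and transition, as described above.
        return [self.transition(state, a) for a in self.get_actions(state)]

    def path_cost(self, state, next_state):
        return 1  # uniform step cost by default
```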

Transition functions

Can be tricky. One (perhaps inefficient) way: internally generate some superset of the possible actions from a state. An `is_legal` function can then be applied to each hypothetical resulting state; if that state is legal, `get_actions` adds the action to the list.
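A minimal sketch of this generate-and-filter idea for a maze robot at integer coordinates; `walls`, `width`, and `height` are assumed inputs describing the maze:

```python
def get_successors(state, walls, width, height):
    # Superset of candidate moves: up, down, right, left.
    candidate_actions = [(0, 1), (0, -1), (1, 0), (-1, 0)]

    def is_legal(s):
        # A hypothetical state is legal if it is in bounds and not a wall.
        x, y = s
        return 0 <= x < width and 0 <= y < height and s not in walls

    successors = []
    for dx, dy in candidate_actions:
        hypothetical = (state[0] + dx, state[1] + dy)
        if is_legal(hypothetical):
            successors.append(hypothetical)
    return successors
```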

Breadth-first search (tree)

In order to reconstruct the path to the goal after the goal has been found, some book-keeping has to be done as the search progresses.

A node data structure is an object that contains:

  1. state
  2. backpointer to other nodes
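In Python, such a node and the backchaining step might look like this (the class name and helper are illustrative):

```python
class Node:
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent  # backpointer to the node this one was reached from

def backchain(node):
    # Follow backpointers from the goal node to the start, then reverse.
    path = []
    while node is not None:
        path.append(node.state)
        node = node.parent
    return list(reversed(path))
```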

Pseudocode: BFS on a tree

frontier = new queue
pack start state into a node
add node to frontier

while frontier is not empty:
  get current_node from the frontier
  get current_state from current_node

  if current_state is the goal:
    backchain from current_node and return solution

  for each child of current_state:
    pack child state into a node, with backpointer to current_node
    add the node to the frontier

Once the goal is found, backchain along the backpointers to recover the path.

BFS (graph)

frontier = new queue
pack start_state into a node
add node to frontier

explored = new set
add start_state to explored

while frontier is not empty:
  get current_node from the frontier
  get current_state from current_node

  if current_state is the goal:
    backchain from current_node and return solution

  for each child of current_state:
    if child not in explored:
      add child to explored
      pack child state into a node, with backpointer to current_node
      add the node to the frontier

return failure
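The graph-search pseudocode above translates fairly directly into Python. A sketch, with `goal_test` and `get_successors` passed in as functions:

```python
from collections import deque

class Node:
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent  # backpointer

def backchain(node):
    path = []
    while node is not None:
        path.append(node.state)
        node = node.parent
    return list(reversed(path))

def bfs(start_state, goal_test, get_successors):
    frontier = deque([Node(start_state)])
    explored = {start_state}
    while frontier:
        current_node = frontier.popleft()  # FIFO: breadth-first
        if goal_test(current_node.state):
            return backchain(current_node)
        for child in get_successors(current_node.state):
            if child not in explored:
                explored.add(child)
                frontier.append(Node(child, current_node))
    return None  # failure
```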

Discussion

What is the state for each of these problems? How many dimensions (variables) would you use to represent the state of each problem?

  • mazeworld: a robot at integer coordinates (x, y) in a maze
  • Three robots in mazeworld, not allowed to collide
  • 8-puzzle
  • Eight queens
  • Rubik's cube

(Figures from Wikipedia.)

DFS (non-recursive, with visited set)

depth_first_search(start_state, goal_test_fn, transition_fn, get_legal_actions_fn)

  frontier = new stack
  pack start_state into a node
  add node to frontier

  explored = new set
  add start_state to explored

  while frontier is not empty:
    get current_node from the frontier
    get current_state from current_node

    if current_state is the goal:
      backchain from current_node and return solution

    for each child of current_state:
      if child not in explored:
        add child to explored
        pack child state into node, add backpointer
        add the node to the frontier

  return failure
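A Python sketch of the same idea; here a `parent` dictionary serves as both the explored set and the backpointers, which keeps the block self-contained:

```python
def dfs(start_state, goal_test, get_successors):
    frontier = [start_state]          # a Python list's append/pop is LIFO
    parent = {start_state: None}      # backpointers; doubles as explored set
    while frontier:
        state = frontier.pop()        # LIFO: depth-first
        if goal_test(state):
            # Backchain through the parent dictionary.
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return list(reversed(path))
        for child in get_successors(state):
            if child not in parent:
                parent[child] = state
                frontier.append(child)
    return None  # failure
```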

Comparing BFS, DFS

Finite state space of size $n$:

  • Run-time for each is $O(n)$
  • Memory for each is $O(n)$. Consider explored, frontier.

Infinite state space with reachable goal:

  • Runtime for BFS is $O(b^d)$, where $d$ is the depth of the goal and $b$ is the branching factor.
  • DFS probably won't terminate.

In finite state spaces, BFS finds a shortest path (fewest actions), while DFS may return a much longer one. In infinite state spaces, DFS may never terminate.

Path-checking DFS

We have seen memoizing BFS and DFS: versions that keep a complete explored set.

  • For BFS, the memory cost of memoizing is not so bad: the frontier is already big, $O(n)$.
  • For DFS, memoizing seems expensive by comparison, since the frontier is tiny: $O(bm)$ for maximum search depth $m$.

Can we avoid building complete explored set for DFS?

Path-checking: keep track of states on the current path only, and don't loop.

Does not eliminate redundant paths. Consider large number of loop-free paths in a graph: $O(b^d)$. Ouch.
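A sketch of path-checking DFS in Python; the current path doubles as the loop check, so memory stays $O(m)$:

```python
def path_checking_dfs(state, goal_test, get_successors, path=None):
    if path is None:
        path = [state]
    if goal_test(state):
        return list(path)
    for child in get_successors(state):
        if child not in path:  # loop check against the current path only
            path.append(child)
            result = path_checking_dfs(child, goal_test, get_successors, path)
            if result is not None:
                return result
            path.pop()  # backtrack: child is no longer on the current path
    return None  # failure
```

Note that the same state may be re-expanded on different branches; that is exactly the redundant-path cost described above.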

Depth-limited DFS

Memoizing and path-checking DFS fail in infinite spaces.

Simple solution: bound depth of search, $l$.

When would you use memoizing DFS?

When would you use path-checking DFS?

Iterative Deepening

  1. DFS to depth 1
  2. DFS to depth 2
  3. DFS to depth 3
  4. ...

Short paths, and good memory characteristics, but:

  • Each new DFS throws away results of previous work (unlike BFS).
  • Even in a particular layer, explores redundant paths.
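A sketch of iterative deepening in Python, built from a depth-limited, path-checking DFS (the `max_depth` cap is an assumption, just to keep the loop finite):

```python
def depth_limited_dfs(state, goal_test, get_successors, limit, path):
    if goal_test(state):
        return list(path)
    if limit == 0:
        return None  # hit the depth bound
    for child in get_successors(state):
        if child not in path:  # path-checking within each pass
            path.append(child)
            result = depth_limited_dfs(child, goal_test, get_successors,
                                       limit - 1, path)
            if result is not None:
                return result
            path.pop()
    return None

def ids(start_state, goal_test, get_successors, max_depth=50):
    # Run depth-limited DFS with limits 1, 2, 3, ...; each pass redoes the
    # previous pass's work, but memory stays O(d).
    for limit in range(1, max_depth + 1):
        result = depth_limited_dfs(start_state, goal_test, get_successors,
                                   limit, [start_state])
        if result is not None:
            return result
    return None  # failure within max_depth
```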

Discussion question #2

$n$: state space size, $b$: branching factor,
$d$: depth of goal, $m$: length of longest path

What are the time and space complexities of BFS, DFS-mem, DFS-pc, and IDS, in terms of $b$, $d$, $m$?

Comparing search

$n$: state space size, $b$: branching factor,
$d$: depth of goal, $m$: length of longest path

  Alg.      Opt   Time                     Space
  BFS       y     $O(\min(n, b^d))$        $O(\min(n, b^d))$
  DFS-mem   n     $O(\min(n, b^m))$        $O(\min(n, b^m))$
  DFS-pc    n     $O(\min(n^m, m b^m))$    $O(m)$
  IDS       y     $O(\min(n^d, m b^d))$    $O(d)$

Searching from the goal

Sometimes you can search backwards:

  • a single identifiable goal
  • inverse transition function available

But why?

  • Search from highly constrained states.
  • Bi-directional search.

Constrained goal states

This can matter a lot in continuous-state environments. (Refrigerator-mover's problem.)

Bi-directional search

$(d/2)^2 + (d/2)^2 = d^2 / 2 < d^2$

"Twice" as fast on a graph in "2" dimensions.
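A sketch of bidirectional BFS on a graph, where `get_predecessors` plays the role of the inverse transition function. For brevity this version returns only the state where the two frontiers meet; a full implementation would also keep backpointers on both sides and splice the two half-paths together:

```python
from collections import deque

def bidirectional_bfs(start, goal, get_successors, get_predecessors):
    if start == goal:
        return start
    seen_fwd, seen_bwd = {start}, {goal}
    frontier_fwd, frontier_bwd = deque([start]), deque([goal])
    while frontier_fwd and frontier_bwd:
        # Alternate: one full BFS layer forward, one full layer backward.
        for frontier, seen, other_seen, neighbors in (
                (frontier_fwd, seen_fwd, seen_bwd, get_successors),
                (frontier_bwd, seen_bwd, seen_fwd, get_predecessors)):
            for _ in range(len(frontier)):  # expand exactly one layer
                state = frontier.popleft()
                for child in neighbors(state):
                    if child in other_seen:
                        return child  # the frontiers have met
                    if child not in seen:
                        seen.add(child)
                        frontier.append(child)
    return None  # failure: one side exhausted its frontier
```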