Lecture 19:  Complexity Classes


  1. The Halting Problem
  2. Reduction
  3. The Class NP
  4. Complexity Classes

1. The Halting Problem

One of the requirements on an algorithm is that it must eventually stop with an answer. So far, when defining an algorithm, we have given an argument why a particular algorithm halts. Since we are in computer science, however, we can define the question itself as a problem and try to solve it with an algorithm.

Problem (Halting): Given an algorithm A and input I, does A halt on I?

We could now ask the first fundamental question: Does this problem have an algorithm? It turns out the answer is no.

Proof by Contradiction

We will prove that the halting problem does not have an algorithm using the technique of proof by contradiction. The main idea is to assume the opposite of what we are trying to prove, and show that this leads to a paradox that we cannot resolve. In spirit, the technique is similar to trying to understand the truth value of the following sentence:

This sentence is false.

If the above sentence is true, then we have to believe what it is saying, namely that it is false. This is a contradiction, as the sentence cannot be both true and false. On the other hand, if we assume the sentence is false, then we have to believe the opposite of what it claims. Since it claims to be false, we have to believe it is true, reaching another contradiction. This technique is a classic one in logic and is named reductio ad absurdum in Latin, which translates roughly to "reduction to absurdity."

Now, let us do the proof using this technique. Suppose we have an algorithm for the problem called halts, which takes two parameters alg and inp and returns whether the algorithm alg halts on input inp. In JavaScript, the function would look roughly like this:


function halts( alg, inp)
{
  // some code here that analyzes the algorithm alg on the input inp
  return true;       // alg halts on inp

  // maybe some more code

  return false;      // alg loops forever on inp
}

Remember, we do not know how halts works, but we are assuming that it exists and works correctly on any algorithm and any input. In particular, we could call it on an algorithm with the algorithm itself as input. That is, halts( alg, alg) is either true or false. We will now show that our assumption leads us to a contradiction. We begin by defining a new function called K as below:


function K( alg)
{
  if( halts( alg, alg)) {
    while( true);    // makes K loop forever
  } else {
    return;          // makes K halt 
  }
}

Now, the question we ask is: Does K( K) halt? When we call K with itself as the algorithm, the if statement calls halts( K, K) which must be true or false:

  • If halts( K, K) is true, it is saying that K halts on input K. But we then make K loop forever using an infinite while loop.
  • If halts( K, K) is false, it is saying that K loops forever on input K. But then we make K halt.

This means that halts cannot possibly work correctly on K as whichever answer it gives leads to a contradiction. Since the only assumption we made in this proof is that halts exists, that is the assumption that is causing us to reach a contradiction. Therefore, halts does not exist, and there is no algorithm for the halting problem.
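
The argument can be played out in code for any concrete candidate. The following is only a sketch under stated assumptions: the names makeK, candidateIsRightAboutK, alwaysYes, and alwaysNo are hypothetical and not part of the lecture, and the infinite loop inside K is replaced by a returned label so that the sketch itself terminates.

```javascript
// Hypothetical helper: builds the function K from the proof for a given
// candidate `halts`. Instead of really looping forever, K returns a label
// describing what it would do (an assumption made so this sketch terminates).
function makeK(halts) {
  return function K(alg) {
    if (halts(alg, alg)) {
      return "loops forever"; // stands in for while( true);
    } else {
      return "halts";
    }
  };
}

// Returns true if the candidate's prediction about K(K) matches what K does.
function candidateIsRightAboutK(halts) {
  const K = makeK(halts);
  const predictedToHalt = halts(K, K);
  const actuallyHalts = K(K) === "halts";
  return predictedToHalt === actuallyHalts;
}

// Two trivial candidates, used only for illustration.
const alwaysYes = (alg, inp) => true;
const alwaysNo = (alg, inp) => false;

console.log(candidateIsRightAboutK(alwaysYes)); // false
console.log(candidateIsRightAboutK(alwaysNo));  // false
```

Whatever a candidate answers for K on K, the function K does the opposite, so the check fails for both candidates here. The proof shows the same happens for every possible candidate.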

Undecidability

We place all problems without algorithms into a single class. The halting problem was the first problem found to belong to this class. The existence of such problems means that computing is limited: we are not able to solve all problems using a computer. There are problems that are unsolvable by algorithms.

Concept 18: Undecidable problems do not have algorithms.

Note that not all hope is lost for these problems. We may be able to find algorithms that work for a subset of algorithms and inputs. For example, it is easy to see that any algorithm that reaches an infinite while loop like the one in algorithm K runs forever. When we say a problem is undecidable, we mean that no algorithm works for all instances of the problem.
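
As one illustration of an algorithm that works on a subset of instances, the sketch below runs an algorithm for a bounded number of steps. It assumes, purely for this example, that algorithms are written as JavaScript generator functions that yield once per step. The names haltsWithin, countdown, and loopForever are hypothetical.

```javascript
// Hypothetical partial checker: runs a generator-based algorithm for at most
// maxSteps steps. It answers "halts" only when it has seen the algorithm
// finish, and "unknown" otherwise. It never answers "loops forever".
function haltsWithin(alg, inp, maxSteps) {
  const run = alg(inp);
  for (let step = 0; step < maxSteps; step++) {
    if (run.next().done) {
      return "halts";
    }
  }
  return "unknown";
}

// Counts down from n to 0, yielding once per step, then finishes.
function* countdown(n) {
  while (n > 0) {
    n--;
    yield;
  }
}

// Never finishes, yielding once per step.
function* loopForever(n) {
  while (true) {
    yield;
  }
}

console.log(haltsWithin(countdown, 3, 100));    // "halts"
console.log(haltsWithin(loopForever, 0, 100));  // "unknown"
console.log(haltsWithin(countdown, 1000, 100)); // "unknown", although it halts
```

The last call shows the limitation: the checker is correct whenever it says "halts", but no step budget is large enough to cover every instance.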

2. Reduction

Once we have a single undecidable problem, we may expand this class using the idea of reduction. Suppose we have a new problem called Problem A. We try to reduce the halting problem to A. That is, we show how we can solve the halting problem using any algorithm for A. Now, if A were decidable, that is, it had an algorithm, we could solve the halting problem, too. But since the halting problem is undecidable, we know that A cannot possibly have an algorithm. So, A is also undecidable, and we have expanded our class of undecidable problems.

Naturally, this only works if the halting problem is reducible to the new problem. Intuitively, what a reduction shows is that the new problem is at least as hard as the old problem, which is already undecidable. So, the new problem is also undecidable.
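
A reduction can be sketched in code. As an assumed example (not from the lecture), let Problem A be: given an algorithm and an input, does the algorithm return 0 on that input? The hypothetical function haltsUsing below turns any solver for Problem A into a solver for the halting problem. The stand-in solver naiveReturnsZero simply runs the algorithm, so it is correct only on algorithms that halt. It is there only to make the sketch runnable.

```javascript
// Builds a halting checker from any solver for the hypothetical Problem A:
// "does algorithm alg return 0 on input inp?"
function haltsUsing(returnsZero) {
  return function halts(alg, inp) {
    // wrapped returns 0 exactly when alg halts on the given input,
    // no matter what alg itself returns.
    const wrapped = function (x) {
      alg(x);
      return 0;
    };
    return returnsZero(wrapped, inp);
  };
}

// Stand-in solver for demonstration only: it runs the algorithm, so it never
// answers on algorithms that loop forever. A correct solver cannot exist.
const naiveReturnsZero = (alg, inp) => alg(inp) === 0;

const halts = haltsUsing(naiveReturnsZero);
console.log(halts((x) => x + 1, 5)); // true
```

If a correct solver for Problem A existed, haltsUsing would give a correct algorithm for the halting problem, which is impossible. So, under this construction, Problem A is undecidable as well.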

3. The Class NP

In Lecture 17, we discussed the class of problems called P, consisting of problems that have polynomial time algorithms. When we do not know a polynomial time algorithm for a problem, we lower our expectations and ask for a weaker property. The class NP consists of problems for which we have a polynomial time algorithm for verifying the answer. That is, if we get the answer somehow, we may at least check it in polynomial time.

It should be clear that all problems in P are also in NP: if we can find the answer in polynomial time, we can also check a proposed answer in polynomial time by computing the answer ourselves and comparing. So, P is a subset of NP, although we do not yet know whether NP is any larger, that is, whether P is a proper subset of NP.
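
The containment of P in NP can be sketched as a small wrapper that turns any solver into a verifier. The names verifierFromSolver and hasDuplicate are hypothetical, and the duplicate-detection problem is chosen only as a simple example of a problem with a polynomial time algorithm.

```javascript
// Turns a solver into a verifier: to check a proposed answer, solve the
// instance again and compare. If solve runs in polynomial time, so does verify.
function verifierFromSolver(solve) {
  return function verify(instance, proposedAnswer) {
    return solve(instance) === proposedAnswer;
  };
}

// Example polynomial time solver: does the list contain a repeated value?
function hasDuplicate(list) {
  return new Set(list).size !== list.length;
}

const verifyDuplicate = verifierFromSolver(hasDuplicate);
console.log(verifyDuplicate([3, 1, 3], true)); // true
console.log(verifyDuplicate([1, 2, 3], true)); // false
```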

Traveling Salesman Problem

The Traveling Salesman Problem or TSP is a problem in NP.

Problem (TSP): Given a set of cities, the distances between them, and a number k, is there a circuit that visits each city exactly once and has length less than k?

Now, it is clear that once someone gives us a sample tour, we can check quickly, in polynomial time, whether the tour visits each city once and has length less than k. To date, no one has found a polynomial time algorithm for the TSP. The brute force algorithm of running through all possible solutions has running time O(n!), where n! is n factorial, a function that is definitely not polynomial and grows faster than exponential.
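
The difference between checking a tour and searching for one can be sketched as follows. This is a minimal illustration, assuming the cities are numbered 0 to n-1 and the distances are given as an n by n matrix dist. The function names and the sample data are hypothetical.

```javascript
// Length of the circuit that visits the cities in the given order and
// returns to the first city.
function tourLength(dist, tour) {
  let total = 0;
  for (let i = 0; i < tour.length; i++) {
    const from = tour[i];
    const to = tour[(i + 1) % tour.length];
    total += dist[from][to];
  }
  return total;
}

// Polynomial time verifier: is tour a circuit through every city exactly
// once with length less than k?
function verifyTour(dist, tour, k) {
  const n = dist.length;
  if (tour.length !== n || new Set(tour).size !== n) return false;
  for (const city of tour) {
    if (!Number.isInteger(city) || city < 0 || city >= n) return false;
  }
  return tourLength(dist, tour) < k;
}

// Brute force search: tries every ordering that starts at city 0, which is
// (n-1)! circuits, and so still factorial in n.
function bruteForceTSP(dist, k) {
  const n = dist.length;
  const tour = [0];
  const used = new Array(n).fill(false);
  used[0] = true;
  function extend() {
    if (tour.length === n) return tourLength(dist, tour) < k;
    for (let city = 1; city < n; city++) {
      if (used[city]) continue;
      used[city] = true;
      tour.push(city);
      if (extend()) return true;
      tour.pop();
      used[city] = false;
    }
    return false;
  }
  return extend();
}

// Four cities on a square; sides have length 1, diagonals are rounded to 2.
const dist = [
  [0, 1, 2, 1],
  [1, 0, 1, 2],
  [2, 1, 0, 1],
  [1, 2, 1, 0],
];
console.log(verifyTour(dist, [0, 1, 2, 3], 5)); // true  (length 4)
console.log(verifyTour(dist, [0, 2, 1, 3], 5)); // false (length 6)
console.log(bruteForceTSP(dist, 5));            // true
console.log(bruteForceTSP(dist, 4));            // false (shortest circuit is 4)
```

The verifier does work proportional to n, while the search may examine every ordering of the cities before it can answer no.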

TSP is such an important problem that it has entered popular culture. Here is a 100,000 city instance of TSP that represents Leonardo da Vinci's Mona Lisa as a single continuous loop. This representation is by Robert Bosch.

Mona Lisa TSP Image by Robert Bosch

And here is a comparison of the complexity of selling using TSP versus on eBay by xkcd.

xkcd 399

NP-Complete

Using the idea of reduction, we may identify the hardest problems in NP. These problems are placed in a class called NP-Complete. They are complete in the sense that finding a polynomial time algorithm for a single problem in this class gives a polynomial time algorithm for all problems in NP. The problem TSP as defined above is an NP-complete problem.

P = NP?

Suppose we find a polynomial time algorithm for an NP-complete problem. We have then shown that all NP problems have polynomial time algorithms and belong to P. This, in turn, would show that P = NP. To date, no such algorithm has been found. The question P ?= NP is a major unsolved problem in computer science.

The Clay Mathematics Institute included this problem as one of the 7 Millennium Problems in 2000, with a $1 million prize for solving it. There have been a number of attempts to settle this question one way or the other, including one in August 2010.

4. Complexity Classes

Complexity Class Image

So far, we have defined a few complexity classes in the space of all problems. In order of complexity, the classes are:

  • P

    Problems with polynomial time algorithms.

  • NP

    Problems that are verifiable by polynomial time algorithms.

  • NP-Complete

    The hardest NP problems. Solving one of these in polynomial time implies P = NP.

  • Undecidable

    Problems without algorithms.

That is, we have the complexity map on the right for the space of all problems. On the bottom are the easiest problems, and on top are unsolvable problems.

Theoretical computer scientists have further partitioned this map into smaller complexity classes. At the bottom is a diagram showing other complexity classes. For example, the class PSPACE is the class of problems that can be solved using a polynomial amount of space (memory). You may identify the classes we have seen so far in the diagram, including NPC, which stands for NP-Complete.

Currently, the Complexity Zoo lists 489 complexity classes, including the ones we have seen in this class: P ("The class that started it all"), NP ("The class of dashed hopes and idle dreams"), and NP-Complete.

Complexity Class Image