Information / CS 1971

The Complexity of Theorem-Proving Procedures

Stephen A. Cook

If you could solve SAT fast, you could solve a whole universe of hard problems fast.

Choose your version

In depth · the introduction

Thousands of famous problems look completely different — yet Cook proved they are secretly the same problem, and cracking any one of them would crack them all.

The big idea

Some questions are easy to check but seem terribly hard to solve. Give me a finished Sudoku and I can verify it in a minute; finding the solution from scratch can take ages. Computer scientists call the family of “easy to check, maybe hard to solve” questions NP.

Cook found a single problem — SAT, “Is there a way to set these true/false switches so a logical formula comes out true?” — and proved something startling: every problem in NP can be rewritten as a SAT question, quickly. So SAT is the hardest problem in the family. If anyone ever finds a genuinely fast method for SAT, that method would instantly make thousands of other hard problems fast too. Problems like SAT are called NP-complete: each one holds the whole family on its shoulders.

How it came about

By 1971 computers were fast, but a stubborn pattern had appeared: for whole families of problems — scheduling, routing, packing, puzzle-solving — every known method amounted, in the worst case, to trying all possibilities, and the number of possibilities exploded as the problem grew. Were these problems truly hard, or were people just not clever enough yet? Nobody could tell, because there was no way to compare one hard problem with another.

Stephen Cook, a young theorist who had just moved from Berkeley to the University of Toronto, gave the field that missing yardstick. In an eight-page paper at a 1971 conference, he defined what it means for one problem to be “no harder than” another, and then proved that satisfiability sits at the very top — every quickly-checkable problem reduces to it. The next year Richard Karp showed twenty-one famous problems were all in this same top tier; the idea spread like wildfire.

Why it mattered

Before Cook, “this problem seems hard” was just a feeling. After Cook, it became a precise claim you could prove: show your problem is NP-complete, and you have shown that a fast solution to it would solve a vast web of other problems no one has ever cracked — strong evidence to stop hunting for a perfect fast algorithm and look for good-enough shortcuts instead. The question of whether such fast solutions exist at all — P versus NP — became the central open problem of computer science, now carrying a million-dollar prize.

A way to picture it

Imagine a giant ring of locked doors, and a rumor that they all share one master key. Cook didn't find the key — he proved the rumor true: he showed that a key fitting the SAT door would fit every door in the ring. So no one needs to pick all the locks; the entire fate of the ring hangs on that single master door. Half a century on, no one has found the key, and many suspect it doesn't exist — but the doors are still provably linked.

Where it sits

Cook's paper is the keystone of a trio about the limits of computing. Alan Turing (1936) drew the line between what machines can and cannot do at all; Cook drew the line between what they can do quickly and what seems to take forever. Karp's 1972 catalogue then populated Cook's class with everyday problems, and Leonid Levin reached the same idea independently in the Soviet Union — which is why we say the Cook–Levin theorem. Every time your phone schedules tasks, routes a map, or your bank trusts that breaking its encryption is too slow to bother, you are living inside the landscape Cook charted.

The original document

Original source text

Summary

Stephen A. Cook · The Complexity of Theorem-Proving Procedures · STOC 1971, pp. 151–158

It is shown that any recognition problem solved by a polynomial time-bounded nondeterministic Turing machine can be “reduced” to the problem of determining whether a given propositional formula is a tautology.

Here “reduced” means, roughly speaking, that the first problem can be solved deterministically in polynomial time provided an oracle is available for solving the second. From this notion of reducible, polynomial degrees of difficulty are defined, and it is shown that the problem of determining tautologyhood has the same polynomial degree as the problem of determining whether the first of two given graphs is isomorphic to a subgraph of the second. Other examples are discussed. A method of measuring the complexity of proof procedures for the predicate calculus is introduced and discussed.

P-reducibility

§1 · Tautologies and Polynomial Re-Reducibility

The set of tautologies (denoted by {tautologies}) is a certain recursive set of strings on this alphabet, and we are interested in the problem of finding a good lower bound on its possible recognition times. We provide no such lower bound here, but theorem 1 will give evidence that {tautologies} is a difficult set to recognize, since many apparently difficult problems can be reduced to determining tautologyhood.

Definition. A set S of strings is P-reducible (P for polynomial) to a set T of strings iff there is some query machine M and a polynomial Q(n) such that for each input string w, the T-computation of M with input w halts within Q(|w|) steps (|w| is the length of w) and ends in an accepting state iff w ∈ S.

Theorem 1 (and its corollary)

§1 · The reduction of every NP problem to satisfiability

Theorem 1. If a set S of strings is accepted by some nondeterministic Turing machine within polynomial time, then S is P-reducible to {DNF tautologies}.

Corollary. Each of the sets in definitions 1)–5) is P-reducible to {DNF tautologies}.

Proof of the theorem. Suppose a nondeterministic Turing machine M accepts a set S of strings within time Q(n), where Q(n) is a polynomial. Given an input w for M, we will construct a proposition formula A(w) in conjunctive normal form such that A(w) is satisfiable iff M accepts w. Thus ¬A(w) is easily put in disjunctive normal form (using De Morgan’s laws), and ¬A(w) is a tautology if and only if w ∉ S. Since the whole construction can be carried out in time bounded by a polynomial in |w| (the length of w), the theorem will be proved.

Theorem 2 — the equivalent problems

§1 · A first class of problems all equally hard

Theorem 2. The following sets are P-reducible to each other in pairs (and hence each has the same polynomial degree of difficulty): {tautologies}, {DNF tautologies}, D3, {subgraph pairs}.

Remark. We have not been able to add either {primes} or {isomorphic graphpairs} to the above list. To show {tautologies} is P-reducible to {primes} would seem to require some deep results in number theory, while showing {tautologies} is P-reducible to {isomorphic graphpairs} would probably upset a conjecture of Corneil’s from which he deduces that the graph isomorphism problem can be solved in polynomial time.

Discussion

§2 · Why this matters

Theorem 1 and its corollary give strong evidence that it is not easy to determine whether a given proposition formula is a tautology, even if the formula is in normal disjunctive form. Theorems 1 and 2 together suggest that it is fruitless to search for a polynomial decision procedure for the subgraph problem, since success would bring polynomial decision procedures to many other apparently intractible problems.

Furthermore, the theorems suggest that {tautologies} is a good candidate for an interesting set not in L*, and I feel it is worth spending considerable effort trying to prove this conjecture. Such a proof would be a major breakthrough in complexity theory.

Stephen A. Cook · University of Toronto · 1971