"Problem solving" has a rather restricted meaning in the artificial intelligence lexicon. A problem is said to exist when one is given a present state, a description of the characteristics of a desired state, and the operations that can be used to go from one state to another. There may be constraints specifying that certain states are not to be visited at any point. This formulation is known as the state-space approach. A few examples will show how generally applicable it is.
In the "missionaries and cannibals" problem, three missionaries and three cannibals wish to cross a river from the left bank to the right. They have available a boat which can take two people on a trip. Al can row. The problem is to get all six to the right bank, subject to the culinary constraint that at no time may the number of missionaries on either side of the river be exceeded by the number of cannibals on that side. To translate the puzzle into a formal problem, let a state be defined by the number of missionaries and cannibals on the left bank and the position of the boat. The starting position is (3, 3, L) and the goal state (0, 0, R). The permissible moves of the boat define the operators.
In theorem proving, one is given a set of true statements (premises) and a statement whose truth is not known (the theorem to be proved). By applying a sequence of allowable operations, such as the inference rules of algebra or trigonometry, one expands the set of true statements until it includes the theorem. The states are defined by the statements proven thus far; the inference rules define the operators.
Given a chess position, change it into a position in which the opponent's king is checkmated. En route to this position, avoid any position in which your own king is checkmated or in which a stalemate occurs. The board positions define the states, and the piece moves define the operators.
It has been said that modern artificial intelligence research began with the efforts of Allen Newell, Herbert Simon, and J. C. Shaw to write problem-solving programs (Feigenbaum, 1968). Their studies were conducted jointly at the RAND Corporation and the Carnegie Institute of Technology (now Carnegie-Mellon University). Intellectually, the RAND-Carnegie work was important both in its own right and because it set the tone for many other efforts. Technologically, the programming methods developed in the course of the research have become widely used throughout computer science. Newell et al. (1957) first produced the Logic Theorist (LT), a program designed to prove theorems in the sentential calculus. The LT successfully proved 38 of the 52 theorems of Chapter 2 of Whitehead and Russell's Principia Mathematica. (Twelve of the 14 unsolved theorems were not completed because of the physical limitations of the computer then available; the others were beyond the capacity of the algorithm for logical reasons. A modified LT operating on a larger computer later proved all 52 theorems; see Stefferud, 1963.) As the Principia is regarded as one of the basic works establishing the logical foundations of mathematics, LT's feats were bound to attract interest. It is easy either to overstate or to unduly minimize the significance of LT's success in reproducing the Principia's results (references are purposely omitted here), so one wants to be aware of precisely what was done.
The proofs of the theorems in Chapter 2 of Whitehead and Russell would not be considered deep by a mathematician. Bright university students can generally produce most of them. The brilliance of Whitehead and Russell's accomplishment was not in proving the theorems, but in realizing that their proofs could be used as the basis for a development which leads from logic to mathematics. This realization was an act of creative insight which Newell, Shaw, and Simon made no attempt to mimic on a machine. On the other hand, the problems of Chapter 2 are not exactly trivial. These proofs probably are beyond the grasp of 60% of humanity, a fact one is likely to overlook if one's circle of acquaintances is restricted to present and future Ph.D.'s. Newell et al. have repeatedly stated that the real significance of LT lies in how it produced proofs, not in the proofs it produced. Theorem proving is an example of a large class of problems for which there are solution techniques which are known to work, but which are not practical to execute. For example, the following exhaustive technique is guaranteed to produce a proof for any provable theorem - but it may take a while ...
Beginning with the premises, write down all inferences that can be made by combining two or more known true statements in various ways. Examine the set of statements so produced, to see if it contains either the theorem or its negation. In the former case, the theorem is proven; in the latter, it is disproven. If neither case occurs, add the new set of statements to the premises and repeat the procedure. There will be some number, n, such that a proof (or disproof) will be produced on the nth step, but there is no guarantee what n will be. (Indeed, the procedure may not terminate at all if the "theorem" to be proven is not derivable from the axioms.)
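For the flavor of the thing, here is a sketch of such an exhaustive procedure in Python, drastically simplified to propositional statements closed under modus ponens alone (the encoding and the names are ours; the Principia's actual rule set is much richer):

    # British Museum sketch: close a set of statements under modus ponens, one
    # "generation" at a time, until the target or its negation appears.
    # An implication "p implies q" is encoded as the tuple ('->', 'p', 'q').

    def modus_ponens(known):
        # Everything derivable in one step: from p and ('->', p, q), infer q.
        return {s[2] for s in known
                if isinstance(s, tuple) and s[0] == '->' and s[1] in known}

    def british_museum(premises, theorem, negation, max_steps=100):
        known = set(premises)
        for step in range(1, max_steps + 1):
            new = modus_ponens(known) - known
            if theorem in known | new:
                return ('proved', step)
            if negation in known | new:
                return ('disproved', step)
            if not new:                        # closure reached, no decision
                return ('not derivable', step)
            known |= new
        return ('gave up', max_steps)

    premises = {'p', ('->', 'p', 'q'), ('->', 'q', 'r')}
    print(british_museum(premises, 'r', ('not', 'r')))    # ('proved', 2)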
Newell, Shaw, and Simon called this procedure the "British Museum algorithm," since it seemed to them about as sensible as placing monkeys in front of typewriters in order to reproduce all the books in the British Museum. They suggested instead a heuristic approach, a term they took from Polya (1954, 1957), who believed that most mathematical proofs are achieved by guessing the nature of a solution and then proving that the guess is correct. Polya contrasted this with the algorithmic technique of mechanically going through steps which are bound, eventually, to result in the correct answer. The British Museum algorithm is clearly algorithmic. (Polya did not give a formal definition of either algorithm or heuristic, nor did Newell et al. Currently within computer science the term algorithm is used to mean a procedure for operating on strings of input sentences from a specified set of legal strings (Glushkov, 1967). By this definition any program is an algorithm, and the Newell et al. programs should be thought of as algorithms for generating guesses.) What Newell and his colleagues set out to do was to write a set of rules (i.e., a program) for generating guesses, then proving that the guesses were correct. The idea caught on quickly, and today "heuristic programming" is spoken of respectfully, and somewhat mysteriously, as an advanced programming technique, even though it is not a programming technique at all, but rather a way of thinking about what a program is supposed to do when it is used to attack a certain problem.
In writing the Logic Theorist, Newell, Shaw, and Simon encountered problems that forced them to develop an important new programming tool, list processing. This is a method of organizing a computer's memory so that ordered sets (lists), instead of variables, are the basic operands. List-processing techniques have found wide application throughout computer science (Knuth, 1969). Indeed, since list processing has proven so important in applications involving symbolic rather than numeric processing, it can be argued that a language for the manipulation of lists is the appropriate language in which to think about complex information processing.
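The flavor of list processing can be conveyed in a few lines of Python, using nested pairs as the "cons cells" out of which the early list-processing languages built their lists (the function names are our own):

    # A list is either None (empty) or a cell (head, rest-of-list).  Ordered,
    # arbitrarily nested structures, not numeric variables, are the operands.

    def cons(head, tail):
        return (head, tail)

    def to_linked(items):
        lst = None
        for x in reversed(items):
            lst = cons(x, lst)
        return lst

    def walk(lst):
        while lst is not None:
            head, lst = lst
            yield head

    # A symbolic expression: the third element is itself a linked list.
    expr = to_linked(['+', 'x', to_linked(['*', 'y', 'z'])])
    print(list(walk(expr)))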
The RAND-Carnegie approach to problem solving was characterized by a strong interest in how humans solve problems. A program is a precisely defined set of rules for attacking a given class of problems. Suppose a reasonable correspondence were to be established between the input and output of a program and the stimuli and responses observed in the psychological laboratory. Then one could say that, at the level of information processing, the program was a model of the man (Newell, Shaw & Simon, 1958; Newell & Simon, 1961, 1972). Simulating human thought, however, is a different goal from creating a good problem solver, since in one case the criterion of success is fidelity to human behavior, while in the other it is that good problem solving be produced. Newell and Simon (1961) saw no incompatibility in the joint pursuit of these goals, and appear to have used knowledge of human problem solving to suggest the structure of heuristic programs, and vice versa. Intuitively, this can be justified on the grounds that people are the most flexible problem solvers of which we have knowledge; so if we want to construct artificial intelligences, we should first study how the natural ones work.
The argument that programming should mimic human intelligence is weak if one is interested in solving problems within a specialized area in which "inhuman" methods may work well. The Newell et al. work on mathematical theorem proving was criticized on the grounds that more efficient exhaustive search methods than the British Museum algorithm existed, and that by using better exhaustive methods, better results could have been obtained (Wang, 1960). This criticism somewhat misses the point, since Newell and his co-workers were more interested in the generality of their methods than in substantive results in theorem proving. By the same reasoning, it was appropriate for Newell et al. to investigate the theorems of the Principia as examples of a general class of problems, even though group theory might have provided problems that would be more interesting to a mathematician.
The skeptics' retort is that general problem solving skills, divorced from content areas, may not exist. Newell and his associates' next project, the General Problem Solver (GPS) program, attempted to show (a) that such skills do exist and (b) that they can be discussed at the very concrete level of computer programming. Whereas the LT had built into it the operations used in Whitehead and Russell's formalization of the propositional calculus, the GPS was a program for manipulating states and operators in the abstract. To use the GPS on a specific problem, one first had to define the structure of specific states and operators (e.g., the locations of missionaries and cannibals and the moves of the boat) to the program. The act of specification was called describing the task environment. The GPS program was capable of attacking any problem that could be translated into a suitable task environment. The goal of GPS research has been to show that well-specified, generally applicable procedures (i.e., programs) lead to the sort of solutions which, when they are produced by humans, are applauded as clever. The list of problems attacked by the GPS and similar programs includes elementary logic, chess, high school algebra word problems, and the answering of questions phrased in somewhat ambiguous English, but confined to a very small data base. In one of the most extensive investigations the GPS was used to attack ten different small problems in fields ranging from symbolic integration to solution of the missionaries and cannibals puzzle (Ernst & Newell, 1969).
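The division of labor the GPS enforced, a general solver that knows nothing about any particular problem plus a user-supplied task environment, can be sketched as follows. The interface below is our own illustration, not the actual GPS design, which used means-ends analysis rather than the simple breadth-first search shown here:

    # A problem-independent solver: it manipulates states and operators in
    # the abstract; everything problem-specific lives in the task environment.

    class TaskEnvironment:
        def initial(self): ...
        def is_goal(self, state): ...
        def operators(self, state): ...      # yields (name, successor) pairs

    def general_solve(env, limit=100000):
        frontier = [(env.initial(), [])]
        seen = {env.initial()}
        while frontier and limit > 0:
            limit -= 1
            state, plan = frontier.pop(0)
            if env.is_goal(state):
                return plan                  # the sequence of operator names
            for name, nxt in env.operators(state):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, plan + [name]))

    class CountTo(TaskEnvironment):
        # A deliberately trivial task environment: reach 10 from 0 by +3 or +5.
        def initial(self): return 0
        def is_goal(self, s): return s == 10
        def operators(self, s):
            yield ('+3', s + 3)
            yield ('+5', s + 5)

    print(general_solve(CountTo()))          # ['+5', '+5']

The point of the sketch is only the separation: general_solve never mentions counting, missionaries, or logic; a new problem is attacked by writing a new task environment, not a new solver.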
On several occasions Newell and Simon have offered comparisons between the traces of a theorem-proving program's progress toward solution and the comments recorded as a person "thinks out loud," as evidence in support of their contention that their program modeled human thought. In 1972, they published a large body of such data and proposed a general technique for designing programs intended as general simulations (Newell & Simon, 1972).
About the same time that the RAND-Carnegie group began to attract attention, a closely related and similarly active group formed at the Massachusetts Institute of Technology, under the leadership of Marvin Minsky and John McCarthy. Subsequently McCarthy, and Feigenbaum of the Carnegie group, both moved to Stanford University, which, together with the Stanford Research Institute, has also become a major center of artificial intelligence studies. Both directly through the research of members of the groups, and indirectly because of the prestige of the institutions involved, the M.I.T. and Stanford groups have heavily influenced the American study of computer problem solving.
In contrast to the early Carnegie-RAND studies, the M.I.T. research was more concerned with formal mathematical representations. Typical problems studied included symbolic integration (Slagle, 1963), question answering using trivial data bases (Raphael, 1964), the combining of deductive arguments and information retrieval (McCarthy, 1959; Slagle, 1965; Green, 1969), and that hardy perennial, chess playing by machine (Greenblatt et al., 1967). McCarthy, Minsky, and their co-workers seem to see artificial intelligence as an extension of mathematics and symbolic logic, rather than as a parallel discipline to psychology. They have been extremely careful to formalize their computing procedures and to relate them to more conventional mathematics, in particular to recursive function theory (McCarthy, 1961; Berkeley & Bobrow, 1964). Attempts by Amarel (1968) and Banerji (1969) to formalize problem-solving processes are also worth examination, as they represent independent efforts directed to the same goal. An emphasis on formalization is of great potential value, since an adequate formalization will be necessary if we are to have a precise, viable theory of the problem-solving process. On the other hand, such a theory does not now exist, and it is our feeling that in many specific artificial intelligence projects the informal approach of the Newell-Simon variety would have been just as satisfactory as the often forbidding formalisms that have been presented.
Since 1965, the M.I.T. and Stanford groups have devoted a good deal of effort to the design and construction of robots, a problem which does not appear to have involved Newell, Simon, and their colleagues. Similarly, since 1965 the Newell-Simon group seems to have moved more toward the study of psychological problems. Naturally, no such general statement can cover every member of a group.
A complementary approach to problem solving, based on the ideas of the mathematician J. A. Robinson (1965), has also greatly influenced the study of mechanical thought. Narrowly conceived, Robinson dealt only with a new approach to theorem proving in formal logic; but his methods can be applied to virtually any situation we normally call problem solving. As we shall see, the technical details of the method can be formidable, but the basic logic is quite simple. Any statement can be proven true by showing that its negation is false. In theorem proving one must show that the statement

A ⊃ B   (1)

is true, where A is the hypothesis of the problem and B the desired conclusion. Now suppose that A and B are sets of statements. If we were to use the Newell and Simon approach to proving that (1) is true, we would begin with the set A₀ of axioms, and then apply some selected rules of inference to produce a new set, A₁, of axioms and inferred statements. Rules of inference would be applied to A₁ to produce a second set, A₂, and then a third, and a fourth, until some set Aᵢ was found that contained B, the set of statements to be proven, as a subset. An alternative way to prove (1) would be to show that its negation
¬(A ⊃ B) = ¬(¬A ∨ B) = (A ∧ ¬B)   (2)

was false. (Throughout the text "¬" will indicate negation, "∨" disjunction, "∧" conjunction, and "⊃" implication. Note that if B is a set of statements which, taken together, are to be interpreted as a single statement, then ¬B is the disjunction of the negations of the statements in B.) This can be done by showing that the simultaneous assertion of A and ¬B leads to a contradiction. Robinson proposed that this be done by the use of a single rule of inference, called resolution. We introduce the idea with a trivial example. Consider the musical comedy problem:
If the lady is Charley's aunt, then she will visit with Charley. But the lady is never seen with Charley.
Using a rather obvious translation into the formalisms of logic, we have the clauses

C1: (Aunt(lady, Charley) ⊃ Present(lady, Charley))
C2: (¬Present(lady, Charley))   (3)
Throughout the text "7" will indicate negation, "c," disjunction, and "•," conjunction. Note that if B is a set of statements which, taken together are, to be interpreted as a single statement, then 7B is the disjunction of the negation of each statement in B. Let A stand for "Aunt(lady, Charley)" and P for "Present(lady, Charley)." The clauses of (3) are the axioms of the problem. The hypothesis is that the lady is not Charley's aunt. 6 this would be written (7A). To prove this conclusion by the resolution method we want to negate it [producing (A)], and add the resulting clauses to the axioms. Doing this, and rewriting clause C1 in disjunctive form, the set of clauses
C1: (¬A ∨ P),   C2: (¬P),   C3: (A)   (4)

is produced.
We now introduce resolution, the single inference rule of Robinson's method. Suppose two clauses are of the form (A ∨ B) and (¬A ∨ C). It follows that
(A ∨ B) ∧ (¬A ∨ C) ⊃ (B ∨ C)   (5)
An inference of this sort is called a resolution of the clauses on the left of the ⊃ to produce the resolvent clause on the right. A contradiction exists if two clauses can be resolved to produce the empty clause, ( ). Inference of the empty clause is an indication that the original set of clauses contained a (perhaps implied) contradiction.
In view of this background, we now have two ways of proving by resolution that the lady is not Charley's aunt. In one line of proof clauses C1 and C2 are resolved to produce the clause (¬A). This resolves with clause C3 to produce the contradiction. Discovery of the other proof is left to the curious.
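The whole refutation can be carried out mechanically. The following is a bare-bones resolution procedure for propositional clauses, with clauses represented as sets of signed literals (the representation and the names are ours); applied to the clauses of (4), it finds the empty clause and reports the contradiction:

    from itertools import combinations

    # A literal is a (name, sign) pair: ('A', True) for A, ('A', False) for
    # not-A.  A clause is a frozenset of literals; the empty clause signals
    # a contradiction.

    def resolvents(c1, c2):
        # Resolve c1 and c2 on every complementary pair of literals.
        out = []
        for name, sign in c1:
            if (name, not sign) in c2:
                out.append((c1 - {(name, sign)}) | (c2 - {(name, not sign)}))
        return out

    def refute(clauses):
        clauses = set(clauses)
        while True:
            new = set()
            for c1, c2 in combinations(clauses, 2):
                for r in resolvents(c1, c2):
                    if not r:
                        return True          # empty clause: contradiction
                    new.add(r)
            if new <= clauses:
                return False                 # saturation: no refutation exists
            clauses |= new

    # The clauses of (4): C1 = (not-A or P), C2 = (not-P), C3 = (A).
    C1 = frozenset({('A', False), ('P', True)})
    C2 = frozenset({('P', False)})
    C3 = frozenset({('A', True)})
    print(refute({C1, C2, C3}))   # True: the lady is not Charley's aunt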
In practice there are many complications which have not been illustrated. Nevertheless, it can be shown that resolution is a complete proof method. This means that if a contradiction can be obtained from a set of clauses by any valid proof method, then it can be obtained by resolution. In Polya's terminology, therefore, resolution is an algorithm rather than a heuristic. We shall see that heuristic techniques do play a prominent part in practical applications of the resolution method.
There have been a great many elaborations of the basic resolution method. Quite apart from the specific results, one effect of Robinson's work was to shift the emphasis in artificial intelligence from attempts to mimic human solutions on machines to concern for machine-oriented problem-solving methods. A second effect of Robinson's work has been a renewal of interest in practical applications of theorem-proving techniques, especially to mechanized information retrieval.