In 1959, John McCarthy noted that while interesting work was being done to solve problems requiring a high level of human intelligence, many simpler verbal reasoning processes had not yet been implemented using machines.
Taking inspiration from the field of formal logic, which dates back to Aristotle (384–322 BC), McCarthy sought to design a machine with “common sense”. He proposed a program, named the Advice Taker, that could draw conclusions from a set of premises (“advice”) defined in a formal language and improve its behaviour accordingly. Unlike previous research on the subject, McCarthy wished to describe the program’s procedures and heuristics in rich detail. The motivation behind this approach was to create a machine that could learn from experience as effectively as humans do and discover abstract concepts through relatively simple representations.
McCarthy briefly mentioned that one known way to make machines capable of intelligent behaviour is for them to simulate all possible actions and test the results. Behaviours can be represented using nerve nets, Turing machines or calculator programs. He criticised these representations on two grounds: interesting behaviour is encountered only rarely, and small changes in behaviour expressed at a high level of abstraction do not have simple representations.
This led him to define a set of features he deemed essential for the evolution of human-level intelligence:
McCarthy’s paper focused mainly on the second point. He began by stating that for a program to be capable of learning something, it must first be capable of being told it. He then distinguished between the imperative commands an engineer would use to instruct a computer program and the declarative sentences we would use to instruct a human. Declarative sentences have several advantages: they can draw on previous knowledge, they have logical consequences, their order matters less (which allows afterthoughts), and they depend less on the previous state of the system, so the instructor needs to know less about that state.
The Advice Taker program possesses the following key features:
As an example, McCarthy described a scenario where you are at your desk and wish to go to the airport. Before any deduction could take place and a solution to the problem obtained, the following a priori premises would be input to the Advice Taker:
at(x, y) and its transitivity
at(x, y), at(y, z) → at(x, z)
did(go(x, y, z)) → at(I, y)
walkable(x), at(y, x), at(z, x), at(I, y) → can(go(y, z, walking))
drivable(x), at(y, x), at(z, x), at(car, y), at(I, car) → can(go(y, z, driving))
(x → can(y)),(did(y) → z) → canachult(x, y, z)
canachult(x, y, z), canachult(z, u, v) → canachult(x, prog(y, u), v)
prog(y, u) represents the execution of those actions to obtain v.
x, canachult(x, prog(y, z), w), want(w) → do(y)
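To make the rule notation concrete, here is a minimal Python sketch of how premises like these could be matched against stored facts. It is not McCarthy's implementation: the tuple encoding, the "?"-prefixed variables, the helper names (`match`, `substitute`, `apply_rule`) and the ground facts such as walkable(home) and at(desk, home) are all assumptions made for illustration, and at(I, desk) is treated as an already known fact.

```python
# Terms are nested tuples; strings starting with "?" are variables (an assumed encoding).
def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def match(pattern, fact, binding):
    """Return binding extended so that pattern equals fact, or None on failure."""
    if is_var(pattern):
        if pattern in binding:
            return binding if binding[pattern] == fact else None
        return {**binding, pattern: fact}
    if isinstance(pattern, tuple) and isinstance(fact, tuple):
        if len(pattern) != len(fact):
            return None
        for sub_pattern, sub_fact in zip(pattern, fact):
            binding = match(sub_pattern, sub_fact, binding)
            if binding is None:
                return None
        return binding
    return binding if pattern == fact else None

def substitute(term, binding):
    """Replace every bound variable in term with its value."""
    if is_var(term):
        return binding.get(term, term)
    if isinstance(term, tuple):
        return tuple(substitute(part, binding) for part in term)
    return term

def apply_rule(premises, conclusion, facts):
    """Instantiate the conclusion for every way all premises match the facts."""
    bindings = [{}]
    for premise in premises:
        extended = []
        for binding in bindings:
            for fact in facts:
                result = match(premise, fact, binding)
                if result is not None:
                    extended.append(result)
        bindings = extended
    return {substitute(conclusion, binding) for binding in bindings}

# Assumed ground facts for the desk-to-airport scenario (illustrative only).
facts = {
    ("at", "I", "desk"), ("at", "desk", "home"), ("at", "car", "home"),
    ("at", "home", "county"), ("at", "airport", "county"),
    ("walkable", "home"), ("drivable", "county"),
}

# at(x, y), at(y, z) → at(x, z)
transitivity = ([("at", "?x", "?y"), ("at", "?y", "?z")], ("at", "?x", "?z"))

# walkable(x), at(y, x), at(z, x), at(I, y) → can(go(y, z, walking))
walking = (
    [("walkable", "?x"), ("at", "?y", "?x"), ("at", "?z", "?x"), ("at", "I", "?y")],
    ("can", ("go", "?y", "?z", "walking")),
)

derived_locations = apply_rule(*transitivity, facts)
walking_options = apply_rule(*walking, facts)
```

Under these assumed facts, `walking_options` contains can(go(desk, car, walking)). It also contains the trivial can(go(desk, desk, walking)), because nothing in the rule requires y and z to differ.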
Given the above rules, facts and a goal, the Advice Taker should deduce the argument below. The final proposition would initiate action to achieve the goal:
at(I, desk) → can(go(desk, car, walking))
at(I, car) → can(go(home, airport, driving))
did(go(desk, car, walking)) → at(I, car)
did(go(home, airport, driving)) → at(I, airport)
canachult(at(I, desk), go(desk, car, walking), at(I, car))
canachult(at(I, car), go(home, airport, driving), at(I, airport))
canachult(at(I, desk), prog(go(desk, car, walking), go(home, airport, driving)), at(I, airport))
do(go(desk, car, walking))
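The last steps of this argument can be sketched in Python as follows. This is an illustrative reading of the two canachult rules, not code from the paper: the tuple encoding and the function names `chain` and `next_action` are hypothetical.

```python
# Helpers that build the terms used in the argument (an assumed tuple encoding).
def at(who, where):
    return ("at", who, where)

def go(origin, destination, means):
    return ("go", origin, destination, means)

def chain(first, second):
    """canachult(x, y, z), canachult(z, u, v) → canachult(x, prog(y, u), v)"""
    _, x, y, z = first
    _, z_again, u, v = second
    if z != z_again:
        raise ValueError("the first plan must end where the second begins")
    return ("canachult", x, ("prog", y, u), v)

def next_action(current_state, plan, wants):
    """x, canachult(x, prog(y, z), w), want(w) → do(y)"""
    _, x, program, w = plan
    if x != current_state or w not in wants or program[0] != "prog":
        return None  # the rule's premises are not satisfied
    return ("do", program[1])

# The two single-step canachult statements from the argument.
walk_to_car = ("canachult", at("I", "desk"), go("desk", "car", "walking"), at("I", "car"))
drive_to_airport = ("canachult", at("I", "car"), go("home", "airport", "driving"), at("I", "airport"))

# Chain them into one plan, then pick the first action given the goal.
full_plan = chain(walk_to_car, drive_to_airport)
action = next_action(at("I", "desk"), full_plan, wants={at("I", "airport")})
```

Here `action` is the term do(go(desk, car, walking)), matching the final proposition of the argument. Only the first action of the composed program is selected.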
But how would the initial premises be collected, and how would the deduction routine operate? McCarthy conceded that he could not yet provide a full explanation, but he explored some high-level ideas in the remainder of the paper:
the only statement on M has the form want(u(x)).
If want had arisen before, the observation routine may earlier have added it from M to O as an object with properties of statements that are relevant to building an argument, thereby generalising from past experience. In this case we assume that it has not been encountered before and does not take the status of an object.
want(at(I, airport)) is related to getting somewhere and search for appropriate stored rules and facts that could assist in the deductive argument.
walkable(x), at(y, x), at(z, x), at(I, y) → can(go(y, z, walking)) and canachult premises referenced above that relate to doing something to obtain a new state.
want(at(I, x)) → do(observe(whereamI)) should be found, causing the Advice Taker to invoke a general whereami routine to obtain the first premise
want(at(I, x)). One property may be a rule that begins with the premises
want(I, x) and the conclusion to search for the property list of go(y, x, z).
drivable definition to be found with one of its premises
walkable rule, which completes the set of premises since the other at premises would have been found as by-products of previous searches, enabling the argument above to be derived.
McCarthy hoped that the heuristic rules mentioned on the property lists were plausible to the reader. He concluded with the observation that many of the statements encountered were of stimulus-response format, and that obeying these rules could be likened to unconscious human thought. Conscious thought, on the other hand, could be viewed as the process of identifying and deducing logical conclusions from a set of premises.
The final section presented criticism of the paper from Prof. Y. Bar-Hillel, who claimed that the work belonged in the Journal of “Half-Baked Ideas” and was careless in its specification. McCarthy remarked that he was not proposing a practical real-world problem for the program to solve but rather an example intended to help us think about the kinds of reasoning involved and how a machine might be made to perform them.
References & Sidenotes
You can read the original paper here. I’d also recommend reading John McCarthy’s legacy, which provides further analysis of the paper and considers the impact of McCarthy’s work on formal knowledge systems and artificial intelligence as a whole.
Newell, A., Shaw, J. C. and Simon, H. A. (1957). Empirical Explorations of the Logic Theory Machine. A Case Study in Heuristic. Proceedings of the Western Joint Computer Conference, Institute of Radio Engineers, New York, pp. 218–230.
Minsky, M. L. (1956). Heuristic Aspects of the Artificial Intelligence Problem. Lincoln Laboratory Report, pp. 34–55.
McCarthy, J. (1956). The Inversion of Functions Defined by Turing Machines. In Automata Studies, Annals of Mathematics Studies No. 34, Princeton, pp. 177–181.
Friedberg, R. (1958). A Learning Machine, Part I. IBM Journal of Research and Development 2, No. 1.