Artificial Intelligence 1958

The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain

Frank Rosenblatt

Let a network of cells tune its own connections, and a machine can learn to recognize patterns from examples.



In 1958 a machine the size of a room learned to tell shapes apart — not by being programmed, but by being shown examples and corrected when it got them wrong.

The idea, unpacked

Every computer of the 1950s did exactly what its program told it, one step at a time. Frank Rosenblatt asked a different question: could a machine learn the way a brain seems to — by strengthening some connections and weakening others until it gets things right? His perceptron was a network of simple, cell-like units wired together, with the strength of each connection adjustable.

You don't tell it the rule. You show it an example, let it guess, and tell it whether the guess was right. When it's wrong, it nudges its connections a little toward the right answer. Show it enough examples and those connection strengths settle into a setting that does the job. The machine has, in a real sense, learned.
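The guess, check, adjust loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Rosenblatt's own procedure: the names `predict` and `train`, the `step` size, and the toy grid of points are all invented for the example.

```python
def predict(weights, bias, x):
    """Guess: answer 1 if the weighted sum of the inputs exceeds zero, else 0."""
    total = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if total > 0 else 0

def train(examples, epochs=100, step=1.0):
    """Show each example, check the guess, and adjust only when it is wrong."""
    n = len(examples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in examples:
            error = target - predict(weights, bias, x)  # -1, 0, or +1
            if error != 0:
                mistakes += 1
                # Nudge every connection strength toward the right answer.
                weights = [w + step * error * xi for w, xi in zip(weights, x)]
                bias += step * error
        if mistakes == 0:  # a full pass with no wrong guesses: settled
            break
    return weights, bias

# Hypothetical toy data: grid points labelled 1 when they lie above the
# diagonal y = x and 0 when below it (points on the diagonal are left out).
examples = [((i, j), 1 if j > i else 0)
            for i in range(5) for j in range(5) if i != j]

weights, bias = train(examples)
accuracy = sum(predict(weights, bias, x) == t for x, t in examples) / len(examples)
print(weights, bias, accuracy)
```

Nothing in the code states the rule "above the diagonal". The connection strengths arrive at it only through corrections, which is the sense in which the machine has learned.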

Where it came from

Rosenblatt was a psychologist at the Cornell Aeronautical Laboratory in Buffalo, New York. Inspired by how neurons connect in the brain, he built the perceptron first as a theory and then as a real machine — the Mark I — funded by the U.S. Navy. Its eye was a grid of 400 light sensors; its memory was a bank of dials that little motors turned as it learned.

When the Navy unveiled the perceptron in 1958 the press went wild, reporting a machine that would soon walk, talk, see, and be conscious of itself, predictions Rosenblatt himself had encouraged. The reality was more modest and more important: a machine really had learned to recognize patterns from examples. But that gap between promise and result set up a backlash. In 1969 two MIT researchers, Marvin Minsky and Seymour Papert, proved that a simple perceptron had a hard mathematical limit: a single layer of adjustable connections can only separate patterns that one weighted sum of its inputs can tell apart, which rules out even the exclusive-or of two inputs. Interest in the whole idea collapsed for years.

Why it mattered

The perceptron was the first working answer to a question that now runs the world: can a machine learn from data instead of from rules written out by hand? Almost everything we call AI today — recognizing faces, translating signs, the chatbots people talk to — works on exactly this principle, scaled up enormously. The perceptron is where it began, and the basic move it introduced — guess, check, adjust — is still the engine inside.

Like tuning by ear

Picture tuning a guitar string without a tuner. You pluck it, hear that it's a little flat, and turn the peg a touch tighter; pluck again, adjust again, until it sounds right. You never calculate the exact tension; you just keep moving in the direction that reduces the error. The perceptron learns the same way: every wrong guess turns its "pegs", the connection strengths, a little toward the right answer.

What came before and after

The perceptron borrowed its all-or-none cells from a 1943 model of the neuron by McCulloch and Pitts, and it grew up alongside the founding ideas of the computer age — Turing's machines and von Neumann's stored-program design, both in this Library. But where those followed explicit instructions, the perceptron learned. After a long winter the idea returned, now with a way to train many layers at once, and grew into the deep networks behind AlexNet (2012) and the Transformer (2017), also here. Read together, they trace one unbroken thread, from a room-sized machine squinting at shapes to the assistant you may be reading this through.

The original document

Original source text

Frank Rosenblatt · Cornell Aeronautical Laboratory · Psychological Review 65(6), 386–408 · November 1958

The question

If we are eventually to understand the capability of higher organisms for perceptual recognition, generalization, recall, and thinking, we must first have answers to three fundamental questions:

Rosenblatt then states the three questions, quoted verbatim below.

How is information about the physical world sensed, or detected, by the biological system?

In what form is information stored, or remembered?

How does information contained in storage, or in memory, influence recognition and behavior?

He hands the first question to sensory physiology and concentrates on the other two. The paper contrasts two views of memory: a "coded representation" in which the stored image keeps a one-to-one correspondence with the stimulus, against the connectionist view that what is stored is a pattern of new or altered connections. The perceptron is built on the second view.

[ … ]

The perceptron

The paper describes a network of all-or-none units in three roles: sensory (S) units that respond to the stimulus, association (A) units that sum their inputs and fire past a threshold, and response (R) units that deliver the classification. The S-to-A wiring is random — the reason Rosenblatt calls the model "probabilistic" — while the A-to-R connections carry adjustable values. The system learns not by being reprogrammed but by reinforcement: when its response is wrong, the values are corrected toward the right answer, so that with experience the network comes to separate the patterns it is shown.
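The three roles can be made concrete with a small sketch. Everything specific here is an assumption chosen for illustration: the 4x4 "retina", the number of A units, the fan-in, the threshold of 1, and the function names. The update used is a plain error-correction rule standing in for the paper's own reinforcement schemes, which are more elaborate.

```python
import random

def make_wiring(n_s, n_a, fan_in, rng):
    """Fixed random S-to-A wiring: each A unit samples `fan_in` S units, and
    each link is excitatory (+1) or inhibitory (-1). Learning never changes it."""
    return [[(s, rng.choice((1, -1))) for s in rng.sample(range(n_s), fan_in)]
            for _ in range(n_a)]

def a_activity(wiring, stimulus, threshold=1):
    """All-or-none A units: fire (1) when the summed input reaches the threshold."""
    return [1 if sum(sign * stimulus[s] for s, sign in links) >= threshold else 0
            for links in wiring]

def respond(values, active):
    """R unit: classify by the summed values of the A units that fired."""
    return 1 if sum(v * a for v, a in zip(values, active)) > 0 else 0

def reinforce(wiring, values, examples, epochs=1000):
    """Correct the adjustable A-to-R values whenever the response is wrong."""
    for _ in range(epochs):
        mistakes = 0
        for stimulus, target in examples:
            active = a_activity(wiring, stimulus)
            error = target - respond(values, active)
            if error:
                mistakes += 1
                # Only the A units that fired have their values corrected.
                for i, a in enumerate(active):
                    values[i] += error * a
        if mistakes == 0:
            break
    return values

# Hypothetical stimuli on a 4x4 grid of S units: a horizontal and a vertical bar.
horizontal = [1 if i // 4 == 1 else 0 for i in range(16)]
vertical = [1 if i % 4 == 1 else 0 for i in range(16)]

rng = random.Random(1958)  # seeded so the random wiring is repeatable
wiring = make_wiring(n_s=16, n_a=32, fan_in=4, rng=rng)
values = reinforce(wiring, [0] * 32, [(horizontal, 1), (vertical, 0)])

print(respond(values, a_activity(wiring, horizontal)),
      respond(values, a_activity(wiring, vertical)))
```

Whether the two bars can be told apart depends on the random draw: the S-to-A wiring must make at least one A unit respond differently to them. That dependence on chance wiring is the "probabilistic" element of the model, and it is why only the A-to-R values are trained here.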

[ … ]

What it claims

Rosenblatt argues, with theorems and with simulations run on an IBM 704, that such a network can learn to recognize and generalize about patterns from examples alone: a statistical, brain-inspired alternative to programming every rule by hand. The full 23-page argument, its definitions, its convergence theorems, and the figures, are at the source below.

Cornell Aeronautical Laboratory · 1958