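Artificial Intelligence 1958

The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain

Frank Rosenblatt

Let a web of cells adjust its own connections, and a machine can learn to recognize patterns from examples.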

In depth · the introduction

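In 1958, a machine the size of a room learned to tell shapes apart, not by being programmed, but by being shown examples and corrected when it got them wrong.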

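Taking the idea apart

Every computer of the 1950s did exactly what its program said, one step after another. Frank Rosenblatt asked a different question: could a machine learn the way the brain appears to, by strengthening some connections and weakening others until it gets things right? His perceptron was a network of simple, cell-like units in which the strength of every connection could be adjusted.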

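You do not tell it the rules. You show it an example, let it guess, and tell it whether the guess was right. When it guesses wrong, it nudges its connections a little toward the correct answer. Show it enough examples and the connection strengths settle on a set of values that gets the job done. In a real sense, the machine has learned.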
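The guess, check, adjust loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Rosenblatt's own procedure: the function names, the step size, and the toy task (the logical AND of two inputs) are assumptions chosen only to keep the example small.

```python
def predict(weights, bias, x):
    """Guess: answer 1 if the weighted sum of the inputs is positive, else 0."""
    total = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if total > 0 else 0


def train(examples, epochs=20, step=1.0):
    """Show the examples repeatedly, nudging the weights after each wrong guess."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in examples:
            error = target - predict(weights, bias, x)  # check: 0 means correct
            if error != 0:
                mistakes += 1
                # adjust: move each connection a little toward the right answer
                weights = [w + step * error * xi for w, xi in zip(weights, x)]
                bias += step * error
        if mistakes == 0:  # a full pass with no errors: the weights have settled
            break
    return weights, bias


# Hypothetical toy task: answer 1 only when both inputs are on.
AND_EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(AND_EXAMPLES)
for x, target in AND_EXAMPLES:
    print(x, "->", predict(weights, bias, x), "(expected", target, ")")
```

No rule for AND appears anywhere in the code. The behavior comes entirely from the corrections, which is the point of the paragraph above.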

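Where it came from

Rosenblatt was a psychologist at the Cornell Aeronautical Laboratory in Buffalo, New York. Inspired by how neurons connect in the brain, he developed the perceptron first as a theory and then as a real machine, the Mark I, funded by the U.S. Navy. Its "eye" was a grid of 400 light sensors; its "memory" was a bank of knobs that small motors turned as it learned.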

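When the Navy unveiled it in 1958, the press went wild, reporting that a machine would soon walk, talk, see, and be conscious of its own existence; Rosenblatt himself helped fuel those predictions. The reality was more modest and more important: a machine really had learned to recognize patterns from examples. But the gap between promise and result sowed the seeds of a backlash. In 1969, two MIT researchers, Marvin Minsky and Seymour Papert, proved that simple perceptrons face a hard mathematical limit, and interest in the whole idea collapsed for many years.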

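Why it matters

The perceptron was the first working answer to a question that now dominates the world: can a machine learn from data rather than from hand-written rules? Almost everything we call AI today (recognizing faces, translating street signs, the chatbots people talk with) rests on the same principle, scaled up enormously. The perceptron is where all of it began, and the basic move it introduced, guess, check, adjust, is still the engine inside.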

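Like tuning by ear

Imagine tuning a guitar string without a tuner. You pluck it, hear that it is a little flat, and tighten the peg a touch; you pluck again and adjust again until it sounds right. You never calculate the exact tension; you just keep moving in the direction that reduces the error. The perceptron learns the same way: every wrong guess turns its "pegs", the connection strengths, a little toward the right answer.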

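Before and after

The perceptron's all-or-none unit was borrowed from McCulloch and Pitts's 1943 model of the neuron, and it grew up alongside the founding ideas of the computer age: Turing's machine and von Neumann's stored-program design, both of which are in this gallery. But those follow explicitly written instructions, whereas the perceptron learns. After a long winter the idea returned, this time with a method that could train many layers at once, and grew into the deep networks behind AlexNet (2012) and the Transformer (2017), which are here too. Read them together and you see an unbroken thread, from a room-sized machine squinting at shapes to the assistant you may be using to read these words.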

The original document

Original source text

Frank Rosenblatt · Cornell Aeronautical Laboratory · Psychological Review 65(6), 386–408 · November 1958

The question

If we are eventually to understand the capability of higher organisms for perceptual recognition, generalization, recall, and thinking, we must first have answers to three fundamental questions:

Rosenblatt then lists the three questions:

How is information about the physical world sensed, or detected, by the biological system?

In what form is information stored, or remembered?

How does information contained in storage, or in memory, influence recognition and behavior?

He hands the first question to sensory physiology and concentrates on the other two. The paper contrasts two views of memory: a "coded representation" in which the stored image keeps a one-to-one correspondence with the stimulus, against the connectionist view that what is stored is a pattern of new or altered connections. The perceptron is built on the second view.

[ … ]

The perceptron

The paper describes a network of all-or-none units in three roles: sensory (S) units that respond to the stimulus, association (A) units that sum their inputs and fire past a threshold, and response (R) units that deliver the classification. The S-to-A wiring is random — the reason Rosenblatt calls the model "probabilistic" — while the A-to-R connections carry adjustable values. The system learns not by being reprogrammed but by reinforcement: when its response is wrong, the values are corrected toward the right answer, so that with experience the network comes to separate the patterns it is shown.
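The three roles described above can be sketched as a small Python program. It is an illustration under stated assumptions, not a reproduction of the paper's model: the grid size, the number of A units, the fan-in, the threshold, and the simple error-correction rule are all hypothetical choices, and the paper's own reinforcement systems are more elaborate.

```python
import random


def make_wiring(num_s, num_a, fan_in, rng):
    """Randomly connect each A unit to `fan_in` S units, each link +1 or -1."""
    wiring = []
    for _ in range(num_a):
        sources = rng.sample(range(num_s), fan_in)
        wiring.append([(s, rng.choice((1, -1))) for s in sources])
    return wiring


def a_activity(wiring, stimulus, threshold):
    """All-or-none A units: fire (1) when the summed input reaches threshold."""
    return [
        1 if sum(sign * stimulus[s] for s, sign in unit) >= threshold else 0
        for unit in wiring
    ]


def respond(values, activity):
    """R unit: answer 1 if the value-weighted sum of active A units is positive."""
    return 1 if sum(v * a for v, a in zip(values, activity)) > 0 else 0


def train(wiring, examples, threshold=1, epochs=50):
    """Adjust only the A-to-R values; the random S-to-A wiring never changes."""
    values = [0.0] * len(wiring)
    for _ in range(epochs):
        mistakes = 0
        for stimulus, target in examples:
            activity = a_activity(wiring, stimulus, threshold)
            error = target - respond(values, activity)
            if error:
                mistakes += 1
                # correct the values of the A units that were active
                values = [v + error * a for v, a in zip(values, activity)]
        if mistakes == 0:
            break
    return values


# Hypothetical 4x4 "retina" of S units showing a vertical and a horizontal bar.
vertical = [1 if i % 4 == 1 else 0 for i in range(16)]
horizontal = [1 if i // 4 == 1 else 0 for i in range(16)]
examples = [(vertical, 1), (horizontal, 0)]

wiring = make_wiring(num_s=16, num_a=12, fan_in=4, rng=random.Random(0))
values = train(wiring, examples)
for stimulus, target in examples:
    answer = respond(values, a_activity(wiring, stimulus, 1))
    print("target", target, "response", answer)
```

Because the S-to-A wiring is drawn at random, whether a given pair of patterns can be separated depends on which A units happen to respond to each; that dependence on chance is what the word "probabilistic" in the model's name refers to.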

[ … ]

What it claims

Rosenblatt argues, with theorems and with simulations run on an IBM 704, that such a network can learn to recognize and generalize about patterns from examples alone: a statistical, brain-inspired alternative to programming every rule by hand. The full 23-page argument, with its definitions, its convergence theorems, and its figures, is at the source below.

Cornell Aeronautical Laboratory · 1958