Artificial Intelligence 1958

The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain

Frank Rosenblatt

Let a web of cells adjust its own connections, and a machine can learn to recognize patterns from examples.

In depth · the introduction

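In 1958, a machine the size of a room learned to tell shapes apart, not by being programmed, but by being shown examples and corrected when it was wrong.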

Taking the idea apart

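Every computer of the 1950s did exactly what its program said, one step after another. Frank Rosenblatt asked a different question: could a machine learn the way the brain appears to, by strengthening some connections and weakening others until it gets things right? His perceptron is a network of simple, cell-like units in which the strength of every connection can be adjusted.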

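You do not tell it the rules. You show it an example, let it guess, and tell it whether the guess was right. When it guesses wrong, it nudges its connections a little toward the correct answer. Show it enough examples and the connection strengths settle on a setting that gets the job done. The machine has, in a real sense, learned.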
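The guess, check, adjust loop described above can be sketched in a few lines of Python. This is a minimal illustration using the error-correction rule as it is usually taught today, not a transcription of Rosenblatt's 1958 formulation. The function names, the learning rate, the epoch limit, and the tiny logical-OR data set are all assumptions chosen for the example.

```python
def predict(weights, bias, x):
    """All-or-none guess: 1 if the weighted sum is positive, else 0."""
    total = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if total > 0 else 0


def train(examples, n_inputs, rate=1.0, max_epochs=100):
    """Show every example repeatedly; adjust only when the guess is wrong."""
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in examples:
            error = target - predict(weights, bias, x)  # -1, 0, or +1
            if error != 0:
                mistakes += 1
                # Nudge each connection toward the correct answer.
                weights = [w + rate * error * xi for w, xi in zip(weights, x)]
                bias += rate * error
        if mistakes == 0:  # a full pass with no errors: the weights have settled
            break
    return weights, bias


# Hypothetical toy task (an assumption for this sketch): logical OR of two inputs.
OR_EXAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train(OR_EXAMPLES, n_inputs=2)
```

Nothing in `train` encodes the OR rule itself; the weights arrive at a working setting purely through corrections, which is the point the paragraph above makes.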

Where it came from

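Rosenblatt was a psychologist at the Cornell Aeronautical Laboratory in Buffalo, New York. Inspired by how neurons in the brain connect, he developed the perceptron first as a theory and then as a real machine, the Mark I, funded by the U.S. Navy. Its "eye" was a grid of 400 light sensors; its "memory" was a bank of knobs turned by small motors as it learned.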

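When the Navy made it public in 1958, the press went wild, reporting that a machine would soon walk, talk, see, and be conscious of its own existence. Rosenblatt himself helped fuel these predictions. The reality was more modest and more important: a machine really had learned to recognize patterns from examples. But the gap between promise and result planted the seeds of a backlash. In 1969 two MIT researchers, Marvin Minsky and Seymour Papert, proved that simple perceptrons face a hard mathematical limit, and interest in the whole idea collapsed for many years.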

Why it matters

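The perceptron was the first working answer to a question that now dominates the world: can a machine learn from data rather than from hand-written rules? Almost everything we call AI today, from recognizing faces and translating street signs to the chatbots people talk with, relies on this same principle, scaled up enormously. The perceptron is where all of it began, and the basic move it introduced (guess, check, adjust) is still the engine inside.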

Like tuning by ear

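Imagine tuning a guitar string without a tuner. You pluck it, hear that it is a little flat, and tighten the peg slightly; you pluck again and adjust again until the note sounds right. You never calculate the exact tension. You simply keep moving in the direction that reduces the error. The perceptron learns the same way: after every wrong guess it turns its "pegs", the connection strengths, a little toward the correct answer.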

Before and after

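The perceptron's all-or-none unit was borrowed from a 1943 model of the neuron by McCulloch and Pitts, and it grew up alongside the founding ideas of the computer age: Turing's machine and von Neumann's stored-program design, both of which are in this gallery. Those followed instructions that were explicitly written down, whereas the perceptron learned. After a long winter the idea returned, this time with a method for training many layers at once, and grew into the deep networks behind AlexNet (2012) and the Transformer (2017), which are here too. Read them together and you see an unbroken thread, from a room-sized machine squinting at shapes to the assistant through which you may be reading these words.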

The original document

Original source text

Frank Rosenblatt · Cornell Aeronautical Laboratory · Psychological Review 65(6), 386–408 · November 1958

The question

If we are eventually to understand the capability of higher organisms for perceptual recognition, generalization, recall, and thinking, we must first have answers to three fundamental questions:

Rosenblatt then lists the three questions, quoted here:

How is information about the physical world sensed, or detected, by the biological system?

In what form is information stored, or remembered?

How does information contained in storage, or in memory, influence recognition and behavior?

He hands the first question to sensory physiology and concentrates on the other two. The paper contrasts two views of memory: the "coded representation" view, in which the stored image keeps a one-to-one correspondence with the stimulus, and the connectionist view, in which what is stored is a pattern of new or altered connections. The perceptron is built on the second view.

[ … ]

The perceptron

The paper describes a network of all-or-none units in three roles: sensory (S) units that respond to the stimulus, association (A) units that sum their inputs and fire past a threshold, and response (R) units that deliver the classification. The S-to-A wiring is random — the reason Rosenblatt calls the model "probabilistic" — while the A-to-R connections carry adjustable values. The system learns not by being reprogrammed but by reinforcement: when its response is wrong, the values are corrected toward the right answer, so that with experience the network comes to separate the patterns it is shown.
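To make the three roles concrete, here is a hedged Python sketch of an S-A-R arrangement: fixed random S-to-A wiring, thresholded A-units, and adjustable A-to-R values corrected after a wrong response. It is a simplification for illustration, not the paper's model. The function names, the +1/-1 connection signs, the fan-in, the threshold, and the specific correction rule are assumptions introduced here.

```python
import random


def build_association_layer(n_sensory, n_assoc, fan_in, rng):
    """Wire each A-unit to a few randomly chosen S-units with fixed +1/-1 signs."""
    layer = []
    for _ in range(n_assoc):
        sources = rng.sample(range(n_sensory), fan_in)
        signs = [rng.choice((-1, 1)) for _ in sources]
        layer.append(list(zip(sources, signs)))
    return layer


def association_activity(layer, stimulus, threshold):
    """Each A-unit sums its inputs and fires (1) only if the sum reaches the threshold."""
    return [
        1 if sum(sign * stimulus[i] for i, sign in unit) >= threshold else 0
        for unit in layer
    ]


def respond(values, activity):
    """R-unit: all-or-none response from the value-weighted sum of active A-units."""
    return 1 if sum(v * a for v, a in zip(values, activity)) > 0 else 0


def reinforce(values, activity, target, rate=1.0):
    """Correct the A-to-R values toward the target; only active A-units change."""
    error = target - respond(values, activity)
    return [v + rate * error * a for v, a in zip(values, activity)]


# Illustrative sizes only (assumptions): 16 S-units, 32 A-units, 4 inputs per A-unit.
rng = random.Random(0)
layer = build_association_layer(n_sensory=16, n_assoc=32, fan_in=4, rng=rng)
values = [0.0] * len(layer)
stimulus = [1, 0] * 8
activity = association_activity(layer, stimulus, threshold=1)
values = reinforce(values, activity, target=1)
```

The random wiring in `build_association_layer` is never changed after construction; learning touches only `values`, mirroring the split between the random S-to-A connections and the adjustable A-to-R connections described above.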

[ … ]

What it claims

Rosenblatt argues, with theorems and with simulations run on an IBM 704, that such a network can learn to recognize and generalize about patterns from examples alone, offering a statistical, brain-inspired alternative to programming every rule by hand. The full 23-page argument (pages 386–408), with its definitions, its convergence theorems, and the figures, is at the source below.

Cornell Aeronautical Laboratory · 1958