JOVANA
Library Glossary Getting Started Three Levels Fields How it works Mission
Join the mission
Back to the library
編碼理論 1950

檢錯碼與糾錯碼

理查·漢明

幾個安插得當的冗餘位,讓一段資訊自己發現錯誤——並把它改對。

Choose your version
In depth · the introduction

一串 0 和 1,怎麼能察覺自己當中有一位被翻了——還在沒人盯著的時候,悄悄把它改回來?

把這個想法拆開看

電腦和電話線都說著「位元」——0 和 1——而每隔一陣,就有一位會翻轉:一個 0 變成了 1。光禿禿地把位元發出去,接收方無從知曉。經典的補救,是一個奇偶校驗位:多加一位,讓 1 的個數湊成偶數;這樣單個翻轉會讓它變奇,從而拉響警報。可警報只說「出事了」——不說是哪一位,也沒法把它修好。

漢明的飛躍,是不止多加一位,而是加上幾位,並安排得使每一位各自看守一組「彼此重疊」的位置。當某一位翻轉時,那幾個被觸發的警報,其特定的組合,會用二進位「拼」出出錯位的確切位置。知道了位置,接收方只需把它翻回去。資訊便自癒了——不必重發,也不必停機。

它從哪裡來

理查·漢明在貝爾實驗室工作,與克勞德·香農同處一隅。實驗室那些機械式的繼電器電腦,會在週末無人看管時跑他的作業——而只要有一個零件讀錯一位,整批運算便告中止,他星期一回來,便一無所獲。他想:機器既然能看出「發生了錯誤」,為什麼不能算出「錯誤在哪裡」並糾正它?正是那份週末的懊惱,催生了大約 1947 年的第一批糾錯碼;論文因專利審查,被壓到了 1950 年。

它為何重要

1948 年,香農剛剛證明:在有雜訊的通道上,無差錯通信在原理上是可能的——但他的證明沒說該怎麼造出來。漢明給出了「怎麼造」:一套具體、能跑的碼,所需的不過是數一數奇偶。這把可靠的數位儲存與通信,從一條定理,變成了一個能用的零件;此後每一套糾錯系統,都能追溯回這一步。

一個可以這樣想的畫面

想像在一條很差的電話線上,用「北約字母表」拼一個詞——「Alpha、Bravo、Charlie」。哪怕有一個音節被攪糊了,這些代號彼此差得太遠,聽者仍能聽出原本是哪一個。漢明碼對位元模式做的是同一件事:合法碼字被刻意拉得彼此「很遠」,於是翻轉任意一位,結果仍然離那唯一正確的資訊最近——接收方便能把它啪地撥回去。

一個七位的漢明碼字,畫成上下兩行格子——上面是發送的碼字,下面是接收到的——其中位於第 1、2、4 位的三個校驗位帶有底色。切換四個資訊位,再拖動滑桿翻轉通道中的任意一位;被翻轉的格子變紅,三個校驗重新計算,而「症狀」——那幾個失敗的校驗讀作一個二進位數——會圈出出錯的確切位置,隨後把它翻回,復原資訊。

它在何處落座

這是香農資訊論的「構造版」搭檔,本館中亦有收錄。香農(1948)量出了一條通道能承載多少可靠通信;漢明(1950)則造出了第一批真正去承載它的碼。這條線,一路延伸到里德—所羅門碼,再到如今在你手機裡緊貼著香農極限的 LDPC 碼與極化碼。

The original document
Original source text
R. W. Hamming · The Bell System Technical Journal 29(2): 147–160 · April 1950
Introduction
The author was led to the study given in this paper from a consideration of large scale computing machines in which a large number of operations must be performed without a single error in the end result.
The paper opens on this practical worry: a long machine computation is worthless if a single bit goes wrong, and a plain parity check can only sound an alarm and stop. Hamming sets out to do better — to build codes that not only detect an error but locate and correct it automatically, so the machine can keep running.
A geometric model
He recasts the problem geometrically. Each possible message of n binary digits is a point; the 'distance' between two such points is the number of positions in which they differ. An error displaces a point a small distance from the one that was sent. Choosing a code therefore means scattering the legal codewords so that the distance between any two of them is large enough that a corrupted word still sits unambiguously nearest to the one intended.
Single-error detecting and correcting codes
A minimum distance of 2 lets you detect a single error (a single flip can never reach another codeword) but not place it; a minimum distance of 3 lets you correct any single error. Hamming derives how much redundancy this costs — for a single-error-correcting code the number of check digits must satisfy 2^k ≥ n + 1 — and notes that the redundancy grows only slowly as the block lengthens.
[ … ]
A worked example: the (7,4) code
His showcase carries four message digits in a block of seven, with three check digits placed at the power-of-two positions 1, 2 and 4. Each check digit enforces parity over a fixed group of positions; on reception, the pattern of failed checks, read as a binary number, is exactly the position of the wrong digit, which is then corrected. The paper closes by adding one further check digit to also detect double errors — the construction now built into computer memory.
Bell Telephone Laboratories · 1950