編碼理論 1950

檢錯碼與糾錯碼

理查·漢明

幾個安插得當的冗餘位，讓一段資訊自己發現錯誤——並把它改對。

Choose your version

In depth · the introduction

一串 0 和 1，怎麼能察覺自己當中有一位被翻了——還在沒人盯著的時候，悄悄把它改回來？

把這個想法拆開看

電腦和電話線都說著「位元」——0 和 1——而每隔一陣，就有一位會翻轉：一個 0 變成了 1。光禿禿地把位元發出去，接收方無從知曉。經典的補救，是一個奇偶校驗位：多加一位，讓 1 的個數湊成偶數；這樣單個翻轉會讓它變奇，從而拉響警報。可警報只說「出事了」——不說是哪一位，也沒法把它修好。

漢明的飛躍，是不止多加一位，而是加上幾位，並安排得使每一位各自看守一組「彼此重疊」的位置。當某一位翻轉時，那幾個被觸發的警報，其特定的組合，會用二進位「拼」出出錯位的確切位置。知道了位置，接收方只需把它翻回去。資訊便自癒了——不必重發，也不必停機。

它從哪裡來

理查·漢明在貝爾實驗室工作，與克勞德·香農同處一隅。實驗室那些機械式的繼電器電腦，會在週末無人看管時跑他的作業——而只要有一個零件讀錯一位，整批運算便告中止，他星期一回來，便一無所獲。他想：機器既然能看出「發生了錯誤」，為什麼不能算出「錯誤在哪裡」並糾正它？正是那份週末的懊惱，催生了大約 1947 年的第一批糾錯碼；論文因專利審查，被壓到了 1950 年。

它為何重要

1948 年，香農剛剛證明：在有雜訊的通道上，無差錯通信在原理上是可能的——但他的證明沒說該怎麼造出來。漢明給出了「怎麼造」：一套具體、能跑的碼，所需的不過是數一數奇偶。這把可靠的數位儲存與通信，從一條定理，變成了一個能用的零件；此後每一套糾錯系統，都能追溯回這一步。

一個可以這樣想的畫面

想像在一條很差的電話線上，用「北約字母表」拼一個詞——「Alpha、Bravo、Charlie」。哪怕有一個音節被攪糊了，這些代號彼此差得太遠，聽者仍能聽出原本是哪一個。漢明碼對位元模式做的是同一件事：合法碼字被刻意拉得彼此「很遠」，於是翻轉任意一位，結果仍然離那唯一正確的資訊最近——接收方便能把它啪地撥回去。

它在何處落座

這是香農資訊論的「構造版」搭檔，本館中亦有收錄。香農（1948）量出了一條通道能承載多少可靠通信；漢明（1950）則造出了第一批真正去承載它的碼。這條線，一路延伸到里德—所羅門碼，再到如今在你手機裡緊貼著香農極限的 LDPC 碼與極化碼。

The original document

Original source text

R. W. Hamming · The Bell System Technical Journal 29(2): 147–160 · April 1950

Introduction

The author was led to the study given in this paper from a consideration of large scale computing machines in which a large number of operations must be performed without a single error in the end result.

The paper opens on this practical worry: a long machine computation is worthless if a single bit goes wrong, and a plain parity check can only sound an alarm and stop. Hamming sets out to do better — to build codes that not only detect an error but locate and correct it automatically, so the machine can keep running.

A geometric model

He recasts the problem geometrically. Each possible message of n binary digits is a point; the 'distance' between two such points is the number of positions in which they differ. An error displaces a point a small distance from the one that was sent. Choosing a code therefore means scattering the legal codewords so that the distance between any two of them is large enough that a corrupted word still sits unambiguously nearest to the one intended.

Single-error detecting and correcting codes

A minimum distance of 2 lets you detect a single error (a single flip can never reach another codeword) but not place it; a minimum distance of 3 lets you correct any single error. Hamming derives how much redundancy this costs — for a single-error-correcting code the number of check digits must satisfy 2^k ≥ n + 1 — and notes that the redundancy grows only slowly as the block lengthens.

[ … ]

A worked example: the (7,4) code

His showcase carries four message digits in a block of seven, with three check digits placed at the power-of-two positions 1, 2 and 4. Each check digit enforces parity over a fixed group of positions; on reception, the pattern of failed checks, read as a binary number, is exactly the position of the wrong digit, which is then corrected. The paper closes by adding one further check digit to also detect double errors — the construction now built into computer memory.

Bell Telephone Laboratories · 1950