JOVANA
Library Glossary Getting Started Three Levels Fields How it works Mission
Join the mission
Back to the library
编码理论 1950

检错码与纠错码

理查德·汉明

几个安插得当的冗余位,让一段信息自己发现错误——并把它改对。

Choose your version
In depth · the introduction

一串 0 和 1,怎么能察觉自己当中有一位被翻了——还在没人盯着的时候,悄悄把它改回来?

把这个想法拆开看

计算机和电话线都说着「比特」——0 和 1——而每隔一阵,就有一位会翻转:一个 0 变成了 1。光秃秃地把比特发出去,接收方无从知晓。经典的补救,是一个奇偶校验位:多加一位,让 1 的个数凑成偶数;这样单个翻转会让它变奇,从而拉响警报。可警报只说「出事了」——不说是哪一位,也没法把它修好。

汉明的飞跃,是不止多加一位,而是加上几位,并安排得使每一位各自看守一组「彼此重叠」的位置。当某一位翻转时,那几个被触发的警报,其特定的组合,会用二进制「拼」出出错位的确切位置。知道了位置,接收方只需把它翻回去。信息便自愈了——不必重发,也不必停机。

它从哪里来

理查德·汉明在贝尔实验室工作,与克劳德·香农同处一隅。实验室那些机械式的继电器计算机,会在周末无人看管时跑他的作业——而只要有一个零件读错一位,整批运算便告中止,他星期一回来,便一无所获。他想:机器既然能看出「发生了错误」,为什么不能算出「错误在哪里」并纠正它?正是那份周末的懊恼,催生了大约 1947 年的第一批纠错码;论文因专利审查,被压到了 1950 年。

它为何重要

1948 年,香农刚刚证明:在有噪声的信道上,无差错通信在原理上是可能的——但他的证明没说该怎么造出来。汉明给出了「怎么造」:一套具体、能跑的码,所需的不过是数一数奇偶。这把可靠的数字存储与通信,从一条定理,变成了一个能用的零件;此后每一套纠错系统,都能追溯回这一步。

一个可以这样想的画面

想象在一条很差的电话线上,用「北约字母表」拼一个词——「Alpha、Bravo、Charlie」。哪怕有一个音节被搅糊了,这些代号彼此差得太远,听者仍能听出原本是哪一个。汉明码对比特模式做的是同一件事:合法码字被刻意拉得彼此「很远」,于是翻转任意一位,结果仍然离那唯一正确的信息最近——接收方便能把它啪地拨回去。

一个七位的汉明码字,画成上下两行格子——上面是发送的码字,下面是接收到的——其中位于第 1、2、4 位的三个校验位带有底色。切换四个信息位,再拖动滑块翻转信道中的任意一位;被翻转的格子变红,三个校验重新计算,而「症状」——那几个失败的校验读作一个二进制数——会圈出出错的确切位置,随后把它翻回,复原信息。

它在何处落座

这是香农信息论的「构造版」搭档,本馆中亦有收录。香农(1948)量出了一条信道能承载多少可靠通信;汉明(1950)则造出了第一批真正去承载它的码。这条线,一路延伸到里德—所罗门码,再到如今在你手机里紧贴着香农极限的 LDPC 码与极化码。

The original document
Original source text
R. W. Hamming · The Bell System Technical Journal 29(2): 147–160 · April 1950
Introduction
The author was led to the study given in this paper from a consideration of large scale computing machines in which a large number of operations must be performed without a single error in the end result.
The paper opens on this practical worry: a long machine computation is worthless if a single bit goes wrong, and a plain parity check can only sound an alarm and stop. Hamming sets out to do better — to build codes that not only detect an error but locate and correct it automatically, so the machine can keep running.
A geometric model
He recasts the problem geometrically. Each possible message of n binary digits is a point; the 'distance' between two such points is the number of positions in which they differ. An error displaces a point a small distance from the one that was sent. Choosing a code therefore means scattering the legal codewords so that the distance between any two of them is large enough that a corrupted word still sits unambiguously nearest to the one intended.
Single-error detecting and correcting codes
A minimum distance of 2 lets you detect a single error (a single flip can never reach another codeword) but not place it; a minimum distance of 3 lets you correct any single error. Hamming derives how much redundancy this costs — for a single-error-correcting code the number of check digits must satisfy 2^k ≥ n + 1 — and notes that the redundancy grows only slowly as the block lengthens.
[ … ]
A worked example: the (7,4) code
His showcase carries four message digits in a block of seven, with three check digits placed at the power-of-two positions 1, 2 and 4. Each check digit enforces parity over a fixed group of positions; on reception, the pattern of failed checks, read as a binary number, is exactly the position of the wrong digit, which is then corrected. The paper closes by adding one further check digit to also detect double errors — the construction now built into computer memory.
Bell Telephone Laboratories · 1950