A Mathematical Theory of Communication
Information can be measured in bits — and sent flawlessly, even through a noisy channel.
Shannon found a way to measure information itself — in bits — and proved you can send a message perfectly even down a noisy, crackling line.
The idea, unpacked
Shannon's first insight sounds strange: to handle communication with mathematics, ignore what a message means. What counts is how surprising it is. A symbol you could have guessed tells you almost nothing; a surprising one tells you a lot. He turned that into a unit for measuring information: the bit, the amount of information in the answer to a single yes-or-no question whose two answers are equally likely. Every text, photo, song, and video is, deep down, just a pile of bits.
Then he proved two remarkable things. First, there's a hard floor on how much you can shrink a message without losing anything: its true information content, and no smaller. Second, and more amazing: every channel has a maximum reliable speed, called its capacity, and as long as you stay under it, clever extra coding can make the error rate as small as you like, even through noise. That's why a scratched disc still plays and a photo from a distant spacecraft arrives crystal clear.
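To make the "clever extra coding" idea concrete, here is a minimal Python sketch using the simplest possible scheme: repeat every bit several times and take a majority vote at the far end. The channel model (each bit flipped independently with a fixed probability), the 10% flip rate, and all function names are assumptions chosen for illustration. This is not Shannon's construction: repetition buys its reliability by slowing transmission to a fifth of the raw speed, whereas his theorem says far more efficient codes exist.

```python
import random

def encode_repetition(bits: list[int], n: int) -> list[int]:
    """Repeat every bit n times (n should be odd so votes cannot tie)."""
    return [b for b in bits for _ in range(n)]

def decode_repetition(received: list[int], n: int) -> list[int]:
    """Recover each bit by majority vote over its block of n copies."""
    return [int(sum(received[i:i + n]) > n // 2)
            for i in range(0, len(received), n)]

def send_through_noisy_channel(bits: list[int], flip_prob: float,
                               rng: random.Random) -> list[int]:
    """Flip each bit independently with probability flip_prob."""
    return [b ^ int(rng.random() < flip_prob) for b in bits]

def error_rate(sent: list[int], got: list[int]) -> float:
    """Fraction of positions where the received bit differs from the sent bit."""
    return sum(s != g for s, g in zip(sent, got)) / len(sent)

rng = random.Random(0)                      # fixed seed so runs are repeatable
message = [rng.randint(0, 1) for _ in range(10_000)]
flip_prob = 0.1                             # hypothetical noise level
copies = 5                                  # hypothetical repetition factor

# Send the message as-is, then send it again with redundancy added.
raw = send_through_noisy_channel(message, flip_prob, rng)
coded = decode_repetition(
    send_through_noisy_channel(encode_repetition(message, copies), flip_prob, rng),
    copies,
)

raw_error = error_rate(message, raw)
coded_error = error_rate(message, coded)
print(f"error rate without coding: {raw_error:.4f}")
print(f"error rate with {copies}x repetition: {coded_error:.4f}")
```

With these settings the uncoded error rate should land near the flip probability, while the majority vote fails only when three or more of the five copies are flipped, which is much rarer.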
Where it came from
In 1948, Claude Shannon, a 32-year-old engineer-mathematician at Bell Labs, published a long paper in the company's technical journal, the Bell System Technical Journal. Working amid the practical problems of telephone and telegraph lines, he produced something far larger: a complete mathematical theory of communication, arriving almost fully formed. It even introduced the word “bit” (binary digit), a term he credited to his colleague John Tukey. The paper founded an entire field overnight.
Why it mattered
Before Shannon, “information” was a vague idea; after him, it was a measurable quantity with hard limits. He gave engineers two North Stars — the smallest a message can be squeezed (its entropy) and the fastest a channel can carry it (its capacity) — and proved that reliable communication through noise is always possible below that limit. Every digital device since has been chasing those two numbers.
Surprise is information
Imagine a friend texts you the weather. In a desert where it's sunny every single day, “sunny” tells you nothing; you already knew. But “snow!” would be a shock, packed with information. Shannon made this precise: the rarer and more surprising a message, the more bits it carries; the more predictable, the fewer.
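The desert example can be put into numbers. The standard measure of a single outcome's surprise is log2(1/p) bits, where p is the outcome's probability. The short Python sketch below applies it; the function name `surprisal_bits` and the weather probabilities are made up for illustration.

```python
import math

def surprisal_bits(p: float) -> float:
    """Information carried by an outcome of probability p, in bits: log2(1/p)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return math.log2(1.0 / p)

# Hypothetical desert forecast: "sunny" is nearly certain, "snow" is very rare.
print(f"sunny (p = 0.999): {surprisal_bits(0.999):.4f} bits")
print(f"snow  (p = 0.001): {surprisal_bits(0.001):.4f} bits")

# A fair coin flip is the reference point: exactly one bit.
print(f"coin  (p = 0.5):   {surprisal_bits(0.5):.4f} bits")
```

Under these assumed odds, "sunny" carries a tiny fraction of a bit while "snow" carries close to ten bits, and an outcome that is certain (p = 1) carries none at all.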
Where you meet it
You rely on Shannon's theory dozens of times a day without noticing. It's in every file you compress, every photo and song squeezed smaller, every call or video that holds together over a weak signal, every QR code that still scans when smudged. The same ideas now power parts of machine learning and cryptography too — anywhere uncertainty needs measuring.
The fundamental problem of communication
The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point.
Choice, and the unit called the bit
Entropy — H = −Σ pᵢ log pᵢ
Quantities of the form H = − Σ pᵢ log pᵢ play a central role in information theory as measures of information, choice and uncertainty.
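As a rough illustration of the formula, the Python sketch below computes H for a few probability distributions, using base-2 logarithms so the result comes out in bits. The function name `entropy_bits` and the example distributions are assumptions for illustration, and outcomes with zero probability are treated as contributing nothing, which is the usual convention.

```python
import math
from typing import Sequence

def entropy_bits(probs: Sequence[float]) -> float:
    """Shannon entropy H = -sum(p_i * log2(p_i)), in bits.

    Written as sum(p_i * log2(1 / p_i)), which is the same quantity.
    Outcomes with p_i == 0 are skipped because they contribute nothing.
    """
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probs), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

# Hypothetical sources, from fully unpredictable to fully predictable.
print(entropy_bits([0.5, 0.5]))     # fair coin: 1 bit per flip
print(entropy_bits([0.25] * 4))     # four equally likely symbols: 2 bits
print(entropy_bits([0.9, 0.1]))     # biased coin: less than 1 bit
print(entropy_bits([1.0]))          # a certain outcome: 0 bits
```

The biased coin is the interesting case: because one side is much more likely, each flip is less uncertain on average than a fair flip, so its entropy drops below one bit.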
The fundamental theorem for a noisy channel
Theorem: Let a discrete channel have the capacity C and a discrete source the entropy per second H. If H ≤ C there exists a coding system such that the output of the source can be transmitted over the channel with an arbitrarily small frequency of errors.
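To put a number on the capacity C in the theorem, here is a minimal Python sketch for one simple channel model: the binary symmetric channel, in which each transmitted bit is flipped independently with some fixed probability. That model and the names `binary_entropy` and `bsc_capacity` are assumptions for illustration; the theorem itself is not tied to any one channel. For this model the capacity is 1 − H₂(p) bits per channel use, where H₂(p) is the entropy of a coin that lands heads with probability p.

```python
import math

def binary_entropy(p: float) -> float:
    """Entropy in bits of a two-outcome source with probabilities p and 1 - p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no uncertainty
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def bsc_capacity(flip_prob: float) -> float:
    """Capacity of a binary symmetric channel, in bits per channel use."""
    return 1.0 - binary_entropy(flip_prob)

# Hypothetical noise levels, from a clean line to pure static.
for flip_prob in (0.0, 0.01, 0.1, 0.5):
    print(f"flip probability {flip_prob}: "
          f"capacity {bsc_capacity(flip_prob):.3f} bits per use")
```

A channel that never flips a bit carries one full bit per use, and one that flips bits half the time carries nothing, because its output is unrelated to its input. Multiplying the per-use figure by the number of channel uses per second gives a capacity in bits per second.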