
A Mathematical Theory of Communication

Claude Shannon

Information can be measured in bits — and sent flawlessly, even through a noisy channel.


Shannon found a way to measure information itself — in bits — and proved you can send a message perfectly even down a noisy, crackling line.

The idea, unpacked

Shannon's first insight sounds strange: to handle communication with mathematics, ignore what a message means. What counts is how surprising it is. A symbol you could have guessed tells you almost nothing; a surprising one tells you a lot. He turned that into a unit for measuring information: the bit, the information in the answer to a single yes-or-no question when both answers are equally likely. Every text, photo, song, and video can be stored and sent as a sequence of bits.

Then he proved two remarkable things. First, there's a hard floor on how far you can shrink a message without losing anything: its true information content (its entropy), and no smaller. Second, and more amazing: every channel has a maximum rate, its capacity, and as long as you send below it, clever extra coding can make the error rate as small as you like, even through noise. That's why a scratched disc still plays and a photo from a distant spacecraft arrives crystal clear.
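The compression floor can be checked on a toy source. The sketch below assumes a four-symbol source with made-up probabilities, and it uses Huffman coding as the compressor (a later technique, not one taken from Shannon's paper). The entropy H = −Σ pᵢ log₂ pᵢ is the floor, and the code's average length per symbol cannot drop below it.

```python
import heapq
from math import log2

def entropy_bits(probs):
    """Shannon entropy in bits per symbol; zero-probability terms contribute 0."""
    return sum(-p * log2(p) for p in probs if p > 0)

def huffman_lengths(probs):
    """Return the Huffman codeword length (in bits) for each symbol."""
    # Heap items: (subtree probability, tie-breaker, symbol indices in subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie_breaker = len(probs)
    while len(heap) > 1:
        p1, _, symbols1 = heapq.heappop(heap)
        p2, _, symbols2 = heapq.heappop(heap)
        # Merging two subtrees puts every symbol in them one level deeper.
        for s in symbols1 + symbols2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, tie_breaker, symbols1 + symbols2))
        tie_breaker += 1
    return lengths

# Hypothetical source: illustrative probabilities only.
probs = [0.5, 0.25, 0.125, 0.125]
lengths = huffman_lengths(probs)
avg_len = sum(p * n for p, n in zip(probs, lengths))

print("entropy (floor):", entropy_bits(probs), "bits/symbol")
print("Huffman average:", avg_len, "bits/symbol")
print("fixed-length   :", log2(len(probs)), "bits/symbol")
```

For these probabilities the Huffman average equals the entropy, 1.75 bits per symbol, against 2 bits for a fixed-length code. When the probabilities are not powers of 1/2, the average sits somewhat above the entropy but never below it.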

Where it came from

In 1948, Claude Shannon, a 32-year-old engineer-mathematician at Bell Labs, published a long two-part paper in the company's technical journal. Working amid the practical problems of telephone and telegraph lines, he produced something far larger: a complete mathematical theory of communication, arriving almost fully formed. It also put the word “bit” (binary digit) into print, a term he credited to his colleague John Tukey. The paper effectively founded the field of information theory.

Why it mattered

Before Shannon, “information” was a vague idea; after him, it was a measurable quantity with hard limits. He gave engineers two North Stars: the smallest a message can be squeezed (its entropy) and the fastest a channel can carry it (its capacity). He also proved that reliable communication through noise is always possible at rates below the channel's capacity. Every digital device since has been chasing those two numbers.

Surprise is information

Imagine a friend texts you the weather. In a desert where it's sunny every single day, “sunny” tells you nothing — you already knew. But “snow!” would be a shock, packed with information. Shannon made this precise: the rarer and more surprising a message, the more bits it carries; the more predictable, the fewer. Shift the odds yourself below and watch the information meter respond.
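The weather example can be put in numbers. This is a minimal sketch that assumes made-up forecast probabilities (99.9% sunny, 0.1% snow; neither figure comes from Shannon's paper) and uses the surprisal, −log₂ p, as the information carried by one message.

```python
from math import log2

def surprisal_bits(p: float) -> float:
    """Information, in bits, carried by an outcome that has probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    # log2(1/p) equals -log2(p); this form avoids returning -0.0 when p == 1.
    return log2(1.0 / p)

# Hypothetical desert forecast: illustrative probabilities only.
forecast = {"sunny": 0.999, "snow": 0.001}

for message, p in forecast.items():
    print(f"{message!r}: p = {p}, surprisal = {surprisal_bits(p):.4f} bits")
```

Under these assumed odds, “snow!” carries roughly 10 bits while “sunny” carries about a thousandth of a bit. A fair coin flip (p = 0.5) carries exactly 1 bit.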

An interactive entropy meter: four sliders set how likely each symbol is; bars show each probability and its surprisal (−log₂ p), and a readout gives the entropy H = −Σ pᵢ log₂ pᵢ in bits. The entropy is highest (2 bits) when all four are equally likely and near zero when one dominates. The Expert panel adds the maximum log₂ n, redundancy and efficiency.
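The meter's readouts can be approximated offline. This is a sketch, not the page's own code: it assumes efficiency means H / log₂ n and redundancy means 1 − H / log₂ n, the usual definitions, which the description above does not spell out. The function names and the example probabilities are illustrative.

```python
from math import log2

def entropy_bits(probs):
    """Shannon entropy H = -sum(p_i * log2(p_i)) in bits; p_i = 0 terms add 0."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(-p * log2(p) for p in probs if p > 0)

def meter(probs):
    """Return the readouts for one slider setting (assumed definitions)."""
    h = entropy_bits(probs)
    h_max = log2(len(probs))  # reached when every symbol is equally likely
    efficiency = h / h_max if h_max > 0 else 1.0
    return {
        "entropy": h,
        "max": h_max,
        "efficiency": efficiency,
        "redundancy": 1.0 - efficiency,
    }

uniform = meter([0.25, 0.25, 0.25, 0.25])  # all four symbols equally likely
skewed = meter([0.97, 0.01, 0.01, 0.01])   # one symbol dominates

print("uniform:", uniform)
print("skewed :", skewed)
```

The uniform setting gives the full 2 bits with zero redundancy. The skewed setting falls well below 1 bit, so most of each symbol is redundant.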

Where you meet it

You rely on Shannon's theory dozens of times a day without noticing. It's in every file you compress, every photo and song squeezed smaller, every call or video that holds together over a weak signal, every QR code that still scans when smudged. The same ideas now power parts of machine learning and cryptography too — anywhere uncertainty needs measuring.

The original document
Original source text

The fundamental problem of communication

C. E. Shannon · The Bell System Technical Journal 27 (1948): 379–423, 623–656
The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point.
Frequently the messages have meaning … These semantic aspects of communication are irrelevant to the engineering problem. The significant aspect is that the actual message is one selected from a set of possible messages. The system must be designed to operate for each possible selection, not just the one which will actually be chosen since this is unknown at the time of design.

Choice, and the unit called the bit

If the number of messages in the set is finite then this number or any monotonic function of this number can be regarded as a measure of the information produced when one message is chosen from the set, all choices being equally likely. … The logarithmic measure is more convenient. … The resulting units may be called binary digits, or more briefly bits, a word suggested by J. W. Tukey.
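A short sketch of why the logarithm is the convenient choice: with base 2, a choice among N equally likely messages carries log₂ N bits, so the bits of independent choices add while the message counts multiply. The function name is illustrative, not Shannon's.

```python
from math import log2

def bits_for_equally_likely(n_messages: int) -> float:
    """Information in bits when one of n equally likely messages is chosen."""
    if n_messages < 1:
        raise ValueError("need at least one possible message")
    return log2(n_messages)

one_switch = bits_for_equally_likely(2)           # two states: on or off
three_switches = bits_for_equally_likely(2 ** 3)  # 8 joint states

print("one switch    :", one_switch, "bit")
print("three switches:", three_switches, "bits")
```

Three two-state devices have 2 × 2 × 2 = 8 joint states but carry 1 + 1 + 1 = 3 bits.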

Entropy — H = −Σ pᵢ log pᵢ

Quantities of the form H = − Σ pᵢ log pᵢ play a central role in information theory as measures of information, choice and uncertainty.
H is largest when the choices are most uncertain — all equally likely — and falls to zero when one outcome is certain. It is, in Shannon's words, a measure of how much “choice” is involved in the selection of a message.

The fundamental theorem for a noisy channel

Theorem 11: Let a discrete channel have the capacity C and a discrete source the entropy per second H. If H ≤ C there exists a coding system such that the output of the source can be transmitted over the channel with an arbitrarily small frequency of errors …
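To make the capacity C in the theorem concrete, here is a sketch for one specific channel model. It assumes a binary symmetric channel, which flips each transmitted bit independently with a fixed probability; the excerpt above names no particular channel. For that model the capacity is 1 − H₂(p) bits per channel use, where H₂ is the entropy of a two-outcome source. The flip probability, the rates, and the function names are all illustrative.

```python
from math import log2

def binary_entropy(p: float) -> float:
    """H2(p) = -p*log2(p) - (1-p)*log2(1-p) in bits; taken as 0 at p = 0 or 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1.0 - p) * log2(1.0 - p)

def bsc_capacity(flip_prob: float) -> float:
    """Capacity, in bits per channel use, of a binary symmetric channel."""
    return 1.0 - binary_entropy(flip_prob)

def reliable_transmission_possible(source_bits_per_sec: float,
                                   flip_prob: float,
                                   channel_uses_per_sec: float) -> bool:
    """Check only the theorem's condition: source entropy rate <= capacity."""
    capacity_bits_per_sec = bsc_capacity(flip_prob) * channel_uses_per_sec
    return source_bits_per_sec <= capacity_bits_per_sec

# Hypothetical line: 1000 channel uses per second, 11% of bits flipped.
print("capacity per use:", bsc_capacity(0.11))
print("400 bits/s ok?  :", reliable_transmission_possible(400, 0.11, 1000))
print("600 bits/s ok?  :", reliable_transmission_possible(600, 0.11, 1000))
```

With 11% of bits flipped, the capacity is about half a bit per use, so this assumed line supports roughly 500 bits per second. The check says only whether a suitable code exists; it does not construct one.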
Bell Telephone Laboratories · 1948