Cryptography 1979

Secrecy, Authentication, and Public Key Systems

Ralph C. Merkle

A single fingerprint that vouches for a whole tree of data — and catches any tamper.

Choose your version

In depth · the introduction

A way to take a whole library of data, boil it down to one short fingerprint — and still prove any single page is really inside.

The big idea

Suppose you want to vouch for a million records without handing anyone all million. Merkle's trick: hash each record into a short fingerprint, then hash fingerprints together in pairs, and the pairs of pairs, and so on, until everything funnels into a single fingerprint at the top — the root. That one value silently depends on every record below it.

Now anyone who knows the root can check a single record by following just the path from that record up to the top, collecting one “partner” fingerprint at each step. A handful of fingerprints — not the whole pile — is enough to prove the record belongs and hasn't been touched. Change even one character anywhere, and the root no longer matches.

How it came about

In the mid-1970s Ralph Merkle was an undergraduate at Berkeley when he proposed a then-heretical idea: two people could agree on a secret over a line everyone else could hear. A journal rejected the paper as too far outside the mainstream. Around the same time Whitfield Diffie and Martin Hellman were reaching related conclusions, and public-key cryptography was born of all three.

Merkle carried the work into his Stanford PhD, finished in 1979. There, to make digital signatures practical, he invented the hash tree that now bears his name — a quiet piece of plumbing that would wait three decades for its moment.

Why it mattered

Trust on a network usually means trusting a big institution to keep the records straight. The Merkle tree offers a different kind of trust: you keep one tiny fingerprint, and the math lets you check any piece of a vast dataset against it — cheaply, even if you got the data from someone you don't trust. That single move, verifying a part without re-downloading the whole, is what lets a phone check Bitcoin payments, lets Git tell whether a file changed, and lets browsers catch forged security certificates.

A way to picture it

Think of a knockout tournament bracket, but for fingerprints. Each record plays in the first round and earns a fingerprint; winners' fingerprints are combined into the next round's, and so on, up to a single champion at the top. To prove your record made it into the tournament you don't replay every match — you just show the results along your own branch up to the champion. If anyone had swapped your record, that trail of results would no longer lead to the same champion.

Where it sits

Merkle trees are one leg of the cryptography that quietly runs the internet, alongside the Diffie–Hellman key exchange and RSA elsewhere in this Library. For thirty years the hash tree was a quietly useful tool; then Satoshi Nakamoto's Bitcoin (also here) put a Merkle root in the header of every block, and overnight Merkle's 1979 construction was being computed billions of times a day. It is a clean case of basic research waiting decades for its moment.

The original document

Original source text

Ralph C. Merkle · “Secrecy, Authentication, and Public Key Systems” · Ph.D. dissertation, Stanford University · 1979

The problem

Merkle's thesis gathers the work that, alongside Diffie and Hellman, opened public-key cryptography: how two strangers can agree on a shared secret over a channel an eavesdropper can hear, and how a message can be signed so that anyone may verify it yet no one can forge it.

One-time signatures

One building block is the Lamport–Diffie one-time signature, which signs a single message by revealing pre-images of a one-way function. It is secure but wasteful: each public key can be used only once, so signing many messages means publishing — and somehow trusting — a long list of public values.

Tree authentication

Merkle's remedy is to place the many public values at the leaves of a binary tree and label every internal node with the hash of its two children. The single value at the top — the root — then commits to all the leaves at once. To prove one leaf is genuine, you reveal only that leaf together with the sibling hash at each level on the way up: about log₂n values, however many leaves there are. Recomputing the hashes along that path must reproduce the published root, or the leaf is rejected.

This is the structure now called a Merkle tree (or hash tree). It turns the authentication of arbitrarily many items into the storage of one root hash plus a short, checkable path per item.

[ … ]

The dissertation develops these ideas in full — one-way functions, the puzzle method for public-key distribution, the one-time signature, and the tree of one-time signatures — across its chapters; the complete text is at the source below.

Ralph C. Merkle · Stanford University · 1979