Martingales: The Mathematics of a Fair Game

A fair game, made exact

Picture a gambler who, on each round, bets one dollar on a fair coin: heads she gains a dollar, tails she loses one. Let M_n be her total fortune after n rounds. What is her expected fortune after the next round, given everything that has happened so far? The coin is fair, so on average she neither gains nor loses: her expected fortune tomorrow equals her fortune today. That sentence — best guess for tomorrow equals today's value — is the entire idea of a martingale, and this rung is devoted to squeezing astonishing consequences out of it.

You arrive at this rung already carrying two tools that make the definition possible. The first is the stochastic process: a whole sequence of random variables M_0, M_1, M_2, ... indexed by time, exactly the fortune-over-time object above. The second, and the more important here, is conditional expectation — the machinery for asking "what is the average of this random variable, given everything we currently know?" A martingale is simply a process whose conditional expectation of the next value, given the past, lands back on the present value. No new objects; a single elegant constraint linking two you already own.

What "the past" means: filtrations

To say "given everything that has happened so far" precisely, we need a bookkeeping device for accumulating information. That device is the [[prob-filtration|filtration]], which you met in the stochastic-processes rung. Think of F_n as the folder of everything observable up to and including time n: every coin flip, every fortune, every quantity you could in principle have written down by step n. As time advances the folder only grows: F_0 is contained in F_1, which is contained in F_2, and so on. Information is never thrown away. Technically each F_n is a sigma-algebra and the filtration is the whole nested family of them, but the intuition you need is just "the history known by time n."

Two requirements bolt the process to its filtration. First, M_n must be adapted: by time n the value M_n is known — it lives inside the folder F_n, with no peeking ahead. Your fortune after round n is certainly observable by round n. Second, each M_n must be integrable, meaning E[|M_n|] is finite, so that the conditional expectations we are about to take genuinely exist. With these in place, the martingale condition is stated relative to the filtration, which is why the formal object is called a [[martingale-relative-to-filtration|martingale relative to a filtration]] — a process and its information stream are a package deal.

The defining equation, and its two cousins

Here is the heart of it. A process M_n, adapted to a filtration F_n and integrable, is a martingale when the conditional expectation of tomorrow given today's information equals today's value. Two equivalent ways to write it appear below. They agree because M_n is already known given F_n, so E[M_(n+1) - M_n given F_n] = E[M_(n+1) given F_n] - M_n. The second form, in terms of the increment M_(n+1) - M_n, says the expected *change* is zero, which is the precise meaning of "on average you neither gain nor lose."

Martingale       :  E[ M_(n+1) given F_n ]  =  M_n        (fair)
  equivalently   :  E[ M_(n+1) - M_n  given F_n ]  =  0

Submartingale    :  E[ M_(n+1) given F_n ]  >=  M_n       (drifts up)
Supermartingale  :  E[ M_(n+1) given F_n ]  <=  M_n       (drifts down)

The three conditions side by side. Replace = with >= and you get a process that, on average, climbs; replace it with <= and you get one that, on average, sinks.

Loosen the equals sign and you meet the two cousins, the submartingale and the supermartingale. A submartingale has E[M_(n+1) given F_n] >= M_n: on average it drifts upward, a favourable game for the gambler. A supermartingale has E[M_(n+1) given F_n] <= M_n: on average it drifts downward, an unfavourable game. Both inequalities allow equality, so every martingale is at once a submartingale and a supermartingale. The naming is famously back-to-front, and it trips up everyone at first: "super" sinks and "sub" rises. A reliable mnemonic is to read the prefix as describing today's value relative to the expected future: in a supermartingale the present sits at or above tomorrow's expectation, so from here the process can only be expected to stay level or fall.

Canonical examples you can hold in your hand

The cleanest example is a running sum of fair, independent bets. Let X_1, X_2, ... be independent steps each with mean zero — say +1 or -1 on a fair coin — and set M_n = X_1 + ... + X_n with M_0 = 0. Given the history F_n, the next value is M_(n+1) = M_n + X_(n+1); since X_(n+1) is independent of the past and averages zero, E[M_(n+1) given F_n] = M_n + 0 = M_n. This is the [[sum-of-mean-zero-martingale|sum of independent mean-zero variables]], and it is the prototype: the symmetric simple random walk is a martingale. If instead each step has positive mean (a biased coin favouring you), the same calculation gives a submartingale; negative mean gives a supermartingale.
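The calculation above can be checked by brute force. The sketch below is illustrative only: the function names and the parameter p_up (the assumed probability of a +1 step) are not from the text. It enumerates every history of a short +1/-1 walk, computes E[M_(n+1) given F_n] exactly with fractions, and compares it with M_n.

```python
from fractions import Fraction
from itertools import product


def conditional_expectation_next(history, p_up=Fraction(1, 2)):
    """E[M_(n+1) given the observed steps], for independent +1/-1 steps.

    history: tuple of past steps, each +1 or -1 (this plays the role of F_n).
    p_up: assumed probability of a +1 step; 1/2 is the fair coin.
    """
    m_n = sum(history)  # current value M_n
    # Average the two possible next values, weighted by their probabilities.
    return p_up * (m_n + 1) + (1 - p_up) * (m_n - 1)


def classify(p_up, n=3):
    """Compare E[M_(n+1) given F_n] with M_n over every history of length n."""
    diffs = {
        conditional_expectation_next(h, p_up) - sum(h)
        for h in product((+1, -1), repeat=n)
    }
    if diffs == {0}:
        return "martingale"
    if all(d >= 0 for d in diffs):
        return "submartingale"
    if all(d <= 0 for d in diffs):
        return "supermartingale"
    return "none"


print(classify(Fraction(1, 2)))  # fair coin
print(classify(Fraction(3, 5)))  # coin biased in the gambler's favour
print(classify(Fraction(2, 5)))  # coin biased against the gambler
```

Because the steps are independent of the past, the expected increment is 2 * p_up - 1 whatever the history, which is why a single number decides which of the three cases applies.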

A second flavour is multiplicative rather than additive. Suppose your wealth is multiplied each round by an independent factor R_(n+1) with E[R_(n+1)] = 1 — for instance, double-or-nothing on a fair coin, where R is 2 or 0 with equal chance, averaging 1. Then W_n = W_0 R_1 R_2 ... R_n satisfies E[W_(n+1) given F_n] = W_n E[R_(n+1)] = W_n, so it too is a martingale. This product martingale is the natural model for compounding gains, and it shows that the fair-game idea reaches far beyond simple sums.
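As a small illustration of the double-or-nothing case, the sketch below computes the exact distribution of W_n by enumerating all sequences of factors. The helper names wealth_distribution and mean are hypothetical, and the starting wealth of 1 and horizon of 5 rounds are arbitrary choices.

```python
from fractions import Fraction
from itertools import product


def wealth_distribution(w0, n):
    """Exact distribution of W_n = w0 * R_1 * ... * R_n for double-or-nothing.

    Each factor R is 2 or 0 with probability 1/2, so E[R] = 1.
    Returns a dict mapping wealth -> probability.
    """
    dist = {}
    for factors in product((2, 0), repeat=n):
        wealth = w0
        for r in factors:
            wealth *= r
        dist[wealth] = dist.get(wealth, 0) + Fraction(1, 2 ** n)
    return dist


def mean(dist):
    """Expected value of a finite distribution given as {value: probability}."""
    return sum(w * p for w, p in dist.items())


dist = wealth_distribution(1, 5)
print(mean(dist))  # prints 1: expected wealth is still the starting wealth
print(dist[0])     # prints 31/32: probability of having lost everything
```

The expectation stays at W_0 because one rare path (five wins in a row, worth 32) balances the many paths that end at zero. The average is fair even though most individual outcomes are not pleasant.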

The third example feels like cheating but is the secret engine of the whole subject. Fix any single random variable Y with E[|Y|] finite, and define M_n = E[Y given F_n]: your best current estimate of Y as information accumulates. This is a [[doob-martingale|Doob martingale]], and it is automatically a martingale by the tower property of conditional expectation: estimating Y from F_n, then re-estimating from the larger F_(n+1) and averaging back over F_n, returns the F_n estimate. Concretely, a poll's projected vote share, updated each day as returns trickle in, is such a process, provided each day's projection really is the conditional expectation of the final share given the returns so far. We unpack this next.

Why the tower property makes it work

The Doob martingale deserves a slow look because it reveals where the martingale property really comes from. The tower property says that estimating with finer information and then coarsening back equals estimating coarsely from the start: for nested folders F_n inside F_(n+1), E[ E[Y given F_(n+1)] given F_n ] = E[Y given F_n]. Read the inner expectation as M_(n+1) and the outer as conditioning on F_n, and the equation literally becomes E[M_(n+1) given F_n] = M_n — the martingale condition, handed to you for free.

Pick the target you want to learn: a fixed random variable Y with E[|Y|] finite (the final vote count, say).
At each time n, form your best estimate given what you know: M_n = E[Y given F_n].
Apply the tower property: re-estimating from F_(n+1) and averaging back over F_n returns the F_n estimate, so E[M_(n+1) given F_n] = M_n.
Conclude M_n is a martingale: your sequence of best guesses for a fixed quantity is a fair game, with no systematic drift up or down.
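The four steps above can be sketched on a toy target. In the example below, Y is the total number of heads in N = 4 fair coin flips, which is an illustrative choice rather than anything from the text, and the names target, doob and check_martingale are hypothetical. The estimate M_n is computed by averaging Y over every possible completion of the flips seen so far, and the martingale condition is then checked history by history.

```python
from fractions import Fraction
from itertools import product

N = 4  # total number of fair coin flips (illustrative choice)


def target(flips):
    """Y: the quantity being forecast; here, the total number of heads."""
    return sum(flips)


def doob(prefix):
    """M_n = E[Y given the first n flips], averaging over all completions."""
    rest = N - len(prefix)
    completions = list(product((0, 1), repeat=rest))
    total = sum(target(prefix + c) for c in completions)
    return Fraction(total, len(completions))


def check_martingale():
    """Verify E[M_(n+1) given F_n] = M_n for every history of every length."""
    for n in range(N):
        for prefix in product((0, 1), repeat=n):
            # The next flip is 0 or 1 with probability 1/2 each.
            avg_next = (doob(prefix + (0,)) + doob(prefix + (1,))) / 2
            if avg_next != doob(prefix):
                return False
    return True


print(doob(()))            # estimate before any flip is seen
print(doob((1,)))          # estimate after one head
print(check_martingale())  # True if the tower property holds everywhere
```

Nothing in check_martingale depends on Y being a head count. Swapping in any other function of the four flips for target should leave the check passing, which is the sense in which the martingale property comes for free.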

There is a vivid moral here. Your forecast of a fixed unknown should itself be a martingale: today's best guess must equal the expected value of tomorrow's best guess. If you could predict that your estimate will rise tomorrow, you should simply revise it upward today — the predictability is information you already had. A forecast that is not a martingale is leaving information on the table. This is the same intuition that, in the next guide, becomes the formal statement that no betting strategy can beat a fair game.

Two consequences, and the road ahead

Two immediate consequences are worth banking before we move on. First, a martingale has constant expectation: taking the unconditional expectation of E[M_(n+1) given F_n] = M_n and using the law of total expectation gives E[M_(n+1)] = E[M_n], so E[M_n] = E[M_0] for every n. A fair game keeps your average fortune pinned at its starting value forever: fairness in one number. Second, a caution against a tempting misreading: constant *average* fortune does not mean your actual fortune stays near M_0. Individual paths can wander enormously; the symmetric random walk strays arbitrarily far from zero, with a typical distance of order sqrt(n) after n steps, even though its mean is always zero.
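Both points can be seen in exact numbers for the symmetric +1/-1 walk started at 0. The sketch below uses a hypothetical helper, walk_moments, to enumerate every path of length n and compute the mean E[M_n] and the second moment E[M_n^2]. The horizons 1, 4 and 8 are arbitrary.

```python
from fractions import Fraction
from itertools import product


def walk_moments(n):
    """Exact E[M_n] and E[M_n^2] for the symmetric +1/-1 walk started at 0."""
    paths = list(product((+1, -1), repeat=n))
    weight = Fraction(1, len(paths))  # every path is equally likely
    mean = sum(weight * sum(p) for p in paths)
    second_moment = sum(weight * sum(p) ** 2 for p in paths)
    return mean, second_moment


for n in (1, 4, 8):
    print(n, *walk_moments(n))
# prints:
# 1 0 1
# 4 0 4
# 8 0 8
```

The mean stays at 0 for every n, while the second moment equals n. The spread therefore keeps growing even though the average never moves.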

It is also worth being honest about what a martingale is not. It need not be a Markov process: a Doob martingale's next step can depend on the whole history through F_n, not just on the present value, so the filtration is genuinely doing work that a single current state could not. Some martingales, such as the symmetric random walk, happen to be Markov as well, but neither property implies the other. And the equals sign is an assumption about your model of the real world, not a magic wand: calling a process a martingale requires actually checking E[M_(n+1) given F_n] = M_n, increment by increment, against how the system behaves.

With the definition and its examples in hand, the rest of the rung is a tour of what fairness forces. Guide 2 proves you cannot tilt a fair game in your favour by clever betting, via the martingale transform and the no-strategy theorem. Guide 3 introduces stopping times, rules for quitting that may not peek into the future, and the optional stopping theorem, which says that under suitable conditions fairness survives even a cleverly chosen quitting time. Guide 4 cashes that out on the classic gambler's-ruin problem. Guide 5 closes with the maximal inequalities and the convergence theorem, which guarantees that a martingale whose expected absolute value stays bounded must, with probability one, settle down to a limit. Everything ahead is a consequence of the one equation you now hold.