Stopping Times and Optional Stopping

From a fixed clock to a chosen moment

The previous two guides set the stage. A martingale M_0, M_1, M_2, ... is a fair game: given everything you have seen up to time n, your expected fortune next step is exactly what you hold now, E[M_(n+1) given the past] = M_n. And the martingale transform guide proved that betting more or less on each round — any strategy that uses only past information — leaves the game fair: E[M_n] = E[M_0] for every fixed time n. But there was one knob we never turned. We always evaluated at a *fixed* clock time n. Real gamblers do not stop at a predetermined ring of the bell; they stop when something happens — when they double their money, when they hit zero, when they get bored. This guide is about stopping at a chosen moment, and whether 'fair at every fixed time' survives the upgrade to 'fair at the moment you actually walk away'.

To even ask the question precisely, we need to pin down what counts as a *legal* stopping rule. The honest constraint is causality: at any moment you must be able to decide 'stop now or keep going' using only what has already happened, never a peek at the future. 'Cash out the first time my fortune reaches 100' is legal — you know your fortune as it unfolds. 'Cash out one step before my biggest peak' is illegal — you would need tomorrow's news today. The machinery from the filtration guide makes this exact. Recall that the filtration F_0 ⊆ F_1 ⊆ F_2 ⊆ ... is the growing record of information: F_n is everything observable by time n. A legal rule is one whose decision at time n depends only on F_n.

What a stopping time really is

A stopping time T is a random time — itself a random variable, taking values 0, 1, 2, ... (or possibly infinity) — with one defining property: for every n, the event {T = n} is decidable from F_n. In words, the instant the bell rings, you can tell that it just rang, using only information available at that instant. You never have to wait for later evidence to confirm that the moment of stopping has arrived. That single rule is the whole definition of a stopping time, and it is exactly the causality constraint dressed in measure-theoretic clothes.

The cleanest examples are hitting times. Let T be the first time the process enters some set A — say, the first n with M_n >= 100. Is {T = n} decidable from F_n? Yes: T = n means M_0, ..., M_(n-1) all stayed below 100 and M_n finally reached it, and every one of those facts is known by time n. So 'first time I hit my target', 'first time I go broke', 'first time the random walk reaches +5' are all bona fide stopping times. Now contrast a last-exit time: 'the last time the walk visits 0 before drifting away forever'. To know that a given visit to 0 was the *last* one, you must know the entire future — the walk never returns. That is not decidable from F_n, so it is not a stopping time. The litmus test is always the same: could a person living forward in time, with no crystal ball, recognize this moment when it arrives?
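The litmus test can be acted out on a recorded path. The sketch below is only an illustration: the function names and the sample path are made up for this example. It computes a first hitting time and a last visit time, first from a short prefix of the path and then from a longer one. The hitting time, once found, never changes as more of the path is revealed; the "last visit" answer keeps moving.

```python
from typing import Optional, Sequence

def first_hitting_time(path: Sequence[int], target: int) -> Optional[int]:
    """First index n with path[n] >= target, found by scanning forward only."""
    for n, value in enumerate(path):
        if value >= target:
            return n  # decided from path[0..n] alone, no look-ahead
    return None  # target not reached within what has been observed

def last_visit_time(path: Sequence[int], level: int) -> Optional[int]:
    """Last index n with path[n] == level; this needs the whole path."""
    last = None
    for n, value in enumerate(path):
        if value == level:
            last = n  # may be overwritten by a later visit
    return last

# Hypothetical sample path of a walk started at 0.
path = [0, 1, 0, 1, 2, 3, 2, 3]

# Hitting time of level 2: same answer from the prefix up to time 4
# as from the full path.
print(first_hitting_time(path[:5], 2), first_hitting_time(path, 2))

# Last visit to 0: the answer from a short prefix is revised later.
print(last_visit_time(path[:1], 0), last_visit_time(path, 0))
```

A finite recorded path is of course only a stand-in for "the entire future", but it shows the asymmetry: the first function can return the moment it sees the event, the second cannot return until the data runs out.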

The optional stopping theorem

Now the payoff. Stopping a process M at a stopping time T gives a new random number M_T: your fortune at the moment you actually walk away. The natural hope, given that the game is fair at every fixed time, is that it is fair at this self-chosen time too: E[M_T] = E[M_0]. The optional stopping theorem (also called the optional sampling theorem) says exactly this — *under conditions*. For a martingale M and a stopping time T, E[M_T] = E[M_0], provided one extra safeguard holds to rule out pathological never-ending strategies. This is the crown jewel of the rung: it formalizes the intuition that you cannot, by cleverly choosing *when* to quit, squeeze a profit out of a fair game.

Why is the safeguard necessary? Because without it the theorem is simply false, and the counterexample is the famous doubling strategy from the transform guide. Bet 1 on a fair coin; if you lose, double to 2; lose again, double to 4; keep doubling until your first win, then stop. Define T as that first winning round. When you finally win you recoup all past losses plus 1, so M_T = M_0 + 1 with probability 1, giving E[M_T] = E[M_0] + 1: a guaranteed dollar out of a fair game! The theorem is not broken; its hypotheses are. This T is finite with probability 1 (indeed E[T] = 2), but it has no fixed deadline, and the *interim debts* before you win are unbounded: after k straight losses you owe 2^k - 1 and your next stake is 2^k, so neither the process nor its steps stay below any constant, and the expected debt on the eve of the winning round is infinite. The safeguard exists precisely to forbid this kind of bet-the-farm martingale.
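The doubling strategy is easy to simulate. The following is a minimal sketch under stated assumptions: the starting fortune is taken to be 0, the helper name play_doubling is invented for this example, and the seed and number of games are arbitrary. It records the final fortune, the number of rounds, and the worst interim fortune of each game.

```python
import random
from typing import Tuple

def play_doubling(rng: random.Random) -> Tuple[int, int, int]:
    """Play one doubling game from fortune 0 (an assumed starting point).

    Returns (final fortune, rounds played, worst interim fortune).
    """
    fortune, stake, rounds, worst = 0, 1, 0, 0
    while True:
        rounds += 1
        if rng.random() < 0.5:       # fair coin: win this round
            fortune += stake
            return fortune, rounds, worst
        fortune -= stake             # lose the stake
        worst = min(worst, fortune)  # track the deepest debt so far
        stake *= 2                   # double and try again

rng = random.Random(0)  # fixed seed so the run is repeatable
results = [play_doubling(rng) for _ in range(10_000)]

print("every game ends at +1:", all(f == 1 for f, _, _ in results))
print("average number of rounds:", sum(r for _, r, _ in results) / len(results))
print("deepest interim debt seen:", -min(w for _, _, w in results))
```

Every game ends one unit ahead, and the games are short on average, but the deepest debt observed grows as more games are played. That growing tail is what the safeguards rule out.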

So what are the safeguards? Any one of these three sufficient conditions is enough to license E[M_T] = E[M_0]. (1) T is bounded: there is a fixed number N with T <= N always, so the game is forced to end by some deadline. (2) T is finite with probability 1 and the process is bounded up to time T — there is a constant c with |M_n| <= c for all n <= T. (3) E[T] is finite and the step sizes are bounded — the increments |M_(n+1) - M_n| stay below some constant. Each clause closes the loophole the doubling strategy exploited (unbounded debts, unbounded waits with unbounded stakes). The deepest, most flexible version replaces all of these with a single condition called uniform integrability, which controls the tail mass of the whole family {M_(T∧n)} at once; but the three concrete clauses cover almost every problem you will meet.

OPTIONAL STOPPING THEOREM (discrete time)

  M = martingale,  T = stopping time   =>   E[M_T] = E[M_0]

  ...PROVIDED at least one safeguard holds:

    (1) T <= N   for a fixed deadline N            (bounded time)
    (2) T < infinity a.s. AND |M_n| <= c, n <= T   (bounded process)
    (3) E[T] < infinity AND |M_(n+1)-M_n| <= c     (bounded steps)

  Submartingale:  E[M_T] >= E[M_0]   (favorable game, >= )
  Supermartingale: E[M_T] <= E[M_0]  (unfavorable game, <= )

The theorem and its three everyday safeguards, plus the directional versions for sub- and supermartingales. In the surrounding text, the symbol T∧n means the smaller of T and n; 'a.s.' means with probability 1.

Why it works: the stopped process is still a martingale

The proof is shorter and more elegant than the statement suggests, and it reuses everything from the transform guide. The key move is to build the stopped process, written M_(T∧n): freeze the value the instant T fires and hold it constant forever after. Concretely, M_(T∧n) equals M_n while n < T, and equals M_T once n >= T. Picture a gambler who plays the real game until the bell rings, then sits on his exact pile of chips for every remaining round, betting zero.

Here is the punchline: 'bet 1 until T, then bet 0' is itself a legal betting strategy. It is predictable, because the decision to keep playing on round n is exactly the event {T >= n}, and that event is decidable from F_(n-1): it is the complement of {T <= n-1}, which is built from the events {T = k} for k <= n-1, each decidable by time n-1 (you know whether the bell has already rung). The transform guide proved that any predictable bet applied to a martingale yields another martingale. Therefore the stopped process M_(T∧n) is a martingale all on its own. Martingales are fair at every fixed time, so E[M_(T∧n)] = E[M_0] for every n. That already says: no matter how long you watch, the stopped game's expected value never budges from its start. We are most of the way home.
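For a small case the claim E[M_(T∧n)] = E[M_0] can be checked by brute force. The sketch below assumes a symmetric ±1 walk started at 0 and takes T to be the first time the walk reaches a chosen target level; the function names and the target value are illustrative choices. It enumerates all 2^n equally likely paths and averages the stopped value exactly, using fractions.

```python
from fractions import Fraction
from itertools import product
from typing import Sequence

def stopped_value(steps: Sequence[int], target: int) -> int:
    """Value of S_(T∧n) on one path, where T = first time the walk hits target."""
    position = 0
    for step in steps:
        if position == target:  # bell already rang: bet 0 from here on
            break
        position += step        # still playing: bet 1 on this round
    return position

def expected_stopped_value(n: int, target: int) -> Fraction:
    """Exact E[S_(T∧n)], averaging over all 2^n equally likely paths."""
    paths = list(product((-1, 1), repeat=n))
    total = sum(stopped_value(p, target) for p in paths)
    return Fraction(total, len(paths))

# The stopped walk should have mean 0 at every fixed time n.
for n in range(0, 11):
    print(n, expected_stopped_value(n, target=2))
```

The check inside stopped_value looks only at the current position before adding the next step, which mirrors predictability: the decision to bet on a round uses nothing from that round's outcome.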

The last step is taking n to infinity. We want E[M_(T∧n)] -> E[M_T], because M_(T∧n) -> M_T whenever T is finite. The whole subtlety is that convergence of the *random variables* does not automatically give convergence of their *expectations*; you can lose mass off to infinity in the limit, which is precisely the leak the doubling strategy exploits. Each of the three safeguards is a sufficient hypothesis for swapping the limit and the expectation. Under (1) no limit is needed at all, since T∧N = T. Under (2) the stopped values are bounded by c, so bounded convergence applies. Under (3) we have |M_(T∧n)| <= |M_0| + c·T, an integrable bound, so dominated convergence applies (both are tools from the measure-theory rung). Under any one of them, E[M_T] = lim E[M_(T∧n)] = E[M_0], and the theorem is proved.
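The leak can be made visible with exact arithmetic on the doubling strategy. The sketch below assumes the strategy starts from M_0 = 0, so a win leaves the fortune at +1 and n straight losses leave it at -(2^n - 1); the function names are invented for this example. It computes E[M_(T∧n)] by splitting on when the first win occurs.

```python
from fractions import Fraction

def unfinished_branch(n: int) -> Fraction:
    """Contribution to E[M_(T∧n)] from paths with no win in the first n rounds.

    Probability 2^-n, fortune -(2^n - 1).
    """
    return Fraction(1, 2 ** n) * -(2 ** n - 1)

def expected_stopped_doubling(n: int) -> Fraction:
    """Exact E[M_(T∧n)] for the doubling strategy, assuming M_0 = 0."""
    total = Fraction(0)
    # First win on round k <= n: probability 2^-k, fortune +1.
    for k in range(1, n + 1):
        total += Fraction(1, 2 ** k)
    return total + unfinished_branch(n)

for n in (1, 5, 10, 20):
    print(n, expected_stopped_doubling(n), float(unfinished_branch(n)))
```

The stopped expectation is 0 for every n, as the martingale property demands. The unfinished branch has vanishing probability yet contributes close to -1, so in the limit a full unit of expectation escapes and M_T = 1 is left with E[M_T] = 1 rather than 0.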

Reading the inequalities, and a first taste of the payoff

The theorem also has directional cousins that are just as useful. If M is a submartingale — a favorable game where E[M_(n+1) given the past] >= M_n, the tide gently in your favor — then stopping can only help or hold even: E[M_T] >= E[M_0]. If M is a supermartingale — an unfavorable game with >= flipped to <= , like a real casino where the house edge pulls you down — then E[M_T] <= E[M_0], and no stopping rule can rescue you. The mnemonic is clean: sub goes *up*, super goes *down*, and the equality of the martingale is the knife-edge between them. These inequalities need the same finiteness safeguards as the equality.

Here is a tiny calculation to see the machine turn. Let S_n be a symmetric simple random walk starting at S_0 = 0: each step is +1 or -1 with probability 1/2, independent of the past, so every step has mean zero and S_n is a martingale. Put two walls at -a and +b (a and b positive integers) and let T be the first time the walk touches either wall. E[T] is finite, so T is finite with probability 1, and up to time T the walk stays within max(a, b) of the origin; safeguard (2) therefore applies (so does (3), since every step has size 1), and optional stopping holds: E[S_T] = E[S_0] = 0. But S_T is either -a (with probability p of hitting the left wall) or +b (with probability 1 - p). So 0 = (-a)·p + b·(1 - p), which rearranges to p = b / (a + b). One line of algebra, and we have read off the probability of going broke before reaching a target, the heart of the gambler's ruin problem.
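The formula p = b / (a + b) can be cross-checked numerically without using optional stopping at all. The sketch below is one way to do it, not the only one: it pushes the walk's probability mass forward one step at a time, absorbing whatever reaches a wall, until almost nothing is left in the interior. The function name, the tolerance, and the sample values of a and b are assumptions made for this illustration.

```python
def left_wall_probability(a: int, b: int, tol: float = 1e-12) -> float:
    """Probability that a symmetric walk from 0 hits -a before +b.

    Assumes a >= 1 and b >= 1. Iterates the one-step dynamics until the
    probability of still being strictly between the walls drops below tol.
    """
    mass = {0: 1.0}   # mass[x] = P(walk is at interior x, not yet absorbed)
    hit_left = 0.0
    while sum(mass.values()) > tol:
        nxt = {}
        for x, m in mass.items():
            for y in (x - 1, x + 1):      # each neighbour gets half the mass
                if y == -a:
                    hit_left += m / 2     # absorbed at the left wall
                elif y != b:              # mass reaching +b is absorbed too
                    nxt[y] = nxt.get(y, 0.0) + m / 2
        mass = nxt
    return hit_left

a, b = 3, 5
print(left_wall_probability(a, b), "vs formula", b / (a + b))
```

The two numbers should agree up to the tolerance. The loop also terminates only because the interior mass decays, which is a numerical echo of the fact that T is finite with probability 1.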

That little victory is a preview: the very next guide does the gambler's ruin problem in full, including the *expected duration* of the game, by squeezing the same trick out of a second, cleverly chosen martingale. The lesson to carry forward is the method, not just the answer. Whenever you can spot a martingale and a legal stopping time, optional stopping converts an awkward question about a whole random process into a single, tidy equation E[something at the end] = E[something at the start] — and the bounded-debt safeguard is the price of admission you must always check before you cash the cheque.