Brownian Motion: The Random Walk's Continuous Limit

From a jagged staircase to a continuous curve

You already know the simple random walk: start at 0, and every second add +1 for heads or -1 for tails. Its path is a jagged staircase that jumps by a whole unit each tick. Brownian motion is what you get when you refuse to accept that the clock has to tick once per second and the step has to be a whole unit. Suppose instead you take a step every tiny interval of time of length dt, and make each step tiny too. The question is: shrink them how? Take steps too big and the walk flies off to infinity in any finite time; take them too small and it freezes and never moves. There is exactly one balance that gives something alive.

The right rule is to make the step size proportional not to dt but to the square root of dt: each step is roughly of size sqrt(dt), for instance +sqrt(dt) or -sqrt(dt) on a fair coin flip. Why the square root? Here is the one idea that powers everything that follows. A step of size sqrt(dt) has mean 0 and variance (sqrt(dt))^2 = dt. Over a fixed amount of time T you take about N = T / dt independent steps. Because the steps are independent, their variances add up, not their standard deviations. So the variance of the total after time T is about N times dt, which equals (T / dt) times dt, which equals T. The dt cancels perfectly. The total displacement therefore has variance T no matter how fine you make the time grid: finite and stable, neither exploding nor vanishing.
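The cancellation of dt can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the function names, the fair-coin step of +sqrt(dt) or -sqrt(dt), the number of paths, and the seed are all choices made here, not part of the argument above.

```python
import random
import statistics

def walk_endpoint(T, dt, rng):
    """Position at time T of a walk that moves +sqrt(dt) or -sqrt(dt) every dt."""
    n_steps = round(T / dt)          # about N = T / dt steps
    step = dt ** 0.5                 # square-root scaling of the step size
    return sum(step if rng.random() < 0.5 else -step for _ in range(n_steps))

def endpoint_variance(T, dt, n_paths=2000, seed=0):
    """Sample variance of the endpoint over many independent walks."""
    rng = random.Random(seed)
    endpoints = [walk_endpoint(T, dt, rng) for _ in range(n_paths)]
    return statistics.pvariance(endpoints)

if __name__ == "__main__":
    # The estimate should hover near T = 1 for every grid size.
    for dt in (0.1, 0.01, 0.001):
        print(dt, endpoint_variance(T=1.0, dt=dt))
```

If you change the step to dt itself, or to a fixed size, the same experiment should show the variance collapsing toward zero or growing without bound as dt shrinks.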

The definition: what makes a curve Brownian

Rather than describe the limiting procedure forever, mathematicians simply list the properties the limit has and call any process with them a Brownian motion. A process { B_t : t >= 0 } is a (standard) Brownian motion if four things hold. First, B_0 = 0 — it starts at the origin. Second, it has independent increments: for disjoint time stretches, the changes are independent random variables. Third, it has stationary, normally distributed increments: for any s < t, the increment B_t - B_s follows Normal(0, t - s) — mean zero, and variance equal to the elapsed time. Fourth, its sample paths are continuous: you can draw each one without lifting the pen.

A standard Brownian motion { B_t , t >= 0 } satisfies:

  (1) B_0 = 0                         start at the origin
  (2) independent increments          disjoint stretches are independent
  (3) B_t - B_s ~ Normal(0, t - s)    for s < t   (stationary, Gaussian)
  (4) t -> B_t is continuous          paths drawn without lifting the pen

Consequences of (1)-(3):
  E[B_t]   = 0
  Var(B_t) = t           (so standard deviation = sqrt(t))
  Cov(B_s, B_t) = min(s, t)

The four defining properties, with the mean, variance, and covariance that drop straight out of them.

Two quick consequences make the definition concrete. Since B_t = B_t - B_0 is itself an increment over [0, t], we get B_t ~ Normal(0, t) directly: E[B_t] = 0 and Var(B_t) = t, matching the scaling we just argued. And for the covariance, write the later value B_t as B_s plus the disjoint increment B_t - B_s; that increment is independent of B_s and has mean zero, so Cov(B_s, B_t) = Var(B_s) = s when s < t — that is, Cov(B_s, B_t) = min(s, t). These are not extra assumptions; they are forced by the four rules. Notice too that nothing here required the random-walk limit at all — that limit is one honest construction, but it is the four properties that define the object.
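The definition also tells you how to simulate the process on a grid: start at 0 and add independent Normal(0, dt) increments. The sketch below does that and estimates Cov(B_s, B_t) by averaging the product B_s * B_t, which is valid because both means are zero. The function names, grid, sample size, and seed are illustrative assumptions.

```python
import random

def brownian_path(n_steps, dt, rng):
    """Grid values B_0, B_dt, ..., B_(n_steps*dt) from independent Normal(0, dt) increments."""
    path = [0.0]                     # property (1): start at the origin
    sd = dt ** 0.5                   # an increment over dt has standard deviation sqrt(dt)
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, sd))
    return path

def empirical_cov(s_idx, t_idx, n_steps, dt, n_paths=20000, seed=1):
    """Monte Carlo estimate of Cov(B_s, B_t) with s = s_idx*dt and t = t_idx*dt."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        p = brownian_path(n_steps, dt, rng)
        total += p[s_idx] * p[t_idx]  # E[B_s] = E[B_t] = 0, so E[B_s B_t] is the covariance
    return total / n_paths

if __name__ == "__main__":
    # s = 0.3, t = 0.8 on a grid with dt = 0.1: the estimate should sit near min(s, t) = 0.3
    print(empirical_cov(3, 8, n_steps=10, dt=0.1))
```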

Why the limit really works: Donsker's theorem

It is one thing to wave hands about shrinking steps; it is another to prove the staircase converges to something genuine. The guarantee is Donsker's theorem, often called the functional central limit theorem or the invariance principle. Take the simple random walk S_1, S_2, ..., S_n, rescale space by dividing by sqrt(n) and rescale time by dividing by n, and linearly connect the dots into a continuous curve. Donsker's theorem says that as n grows, this rescaled walk converges — as a whole random curve, not just at one instant — to standard Brownian motion on [0, 1].

Read Donsker's theorem as the ordinary central limit theorem grown up. The plain CLT says one snapshot of the walk, the rescaled value at the final time, is approximately normal. Donsker upgrades "one snapshot is normal" to "the entire path is approximately a Brownian path". The word invariance captures the punchline: the limit does not care whether your steps were +-1 coin flips, +-7 jumps, or any other independent, identically distributed jumps with mean 0 and finite, nonzero variance. Once you divide by the step's standard deviation sigma (so space is rescaled by sigma times sqrt(n) rather than sqrt(n)), they all wash out to the same standard Brownian motion. This is why Brownian motion shows up everywhere, just as the bell curve does: it is the universal limit of accumulated small independent shocks.
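A rough numerical illustration of invariance: take three different step distributions with mean 0, rescale each walk by sigma * sqrt(n), and compare two path-level statistics with their Brownian values, Var(B_1) = 1 and Cov(B_1/2, B_1) = 1/2. This is a sketch, not a proof, and the step distributions, function names, n, sample size, and seed are all assumptions made for the example.

```python
import random

def coin(rng):            # +-1 with equal probability, sigma = 1
    return 1.0 if rng.random() < 0.5 else -1.0

def big_coin(rng):        # +-7 with equal probability, sigma = 7
    return 7.0 if rng.random() < 0.5 else -7.0

def uniform_step(rng):    # Uniform(-1, 1), sigma = sqrt(1/3)
    return rng.uniform(-1.0, 1.0)

def rescaled_walk(n, draw_step, sigma, rng):
    """Rescaled walk S_k / (sigma * sqrt(n)) read off at times 1/2 and 1."""
    scale = sigma * n ** 0.5
    s, half = 0.0, 0.0
    for k in range(1, n + 1):
        s += draw_step(rng)
        if k == n // 2:
            half = s / scale         # value at rescaled time 1/2
    return half, s / scale           # value at rescaled time 1

def path_stats(draw_step, sigma, n=200, n_paths=4000, seed=2):
    """Estimates of Var(X_1) and Cov(X_1/2, X_1) for the rescaled walk X (means are zero)."""
    rng = random.Random(seed)
    var1 = cov = 0.0
    for _ in range(n_paths):
        half, end = rescaled_walk(n, draw_step, sigma, rng)
        var1 += end * end
        cov += half * end
    return var1 / n_paths, cov / n_paths

if __name__ == "__main__":
    for name, step, sigma in (("coin", coin, 1.0),
                              ("big coin", big_coin, 7.0),
                              ("uniform", uniform_step, (1.0 / 3.0) ** 0.5)):
        print(name, path_stats(step, sigma))   # each pair should be near (1, 0.5)
```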

A Gaussian process you can compute with

There is a second, very clean way to see Brownian motion: as a Gaussian process. A Gaussian process is one whose finite-dimensional distributions are all multivariate normal. Pick any finite set of times t_1, ..., t_k and the vector (B_{t_1}, ..., B_{t_k}) is jointly normal, because each entry is a sum of independent normal increments. A multivariate normal is pinned down entirely by its mean vector and covariance matrix, so a Gaussian process is pinned down by its mean function and its covariance function, and we just computed both: mean E[B_t] = 0 everywhere, and covariance Cov(B_s, B_t) = min(s, t). So Brownian motion is the mean-zero Gaussian process with covariance min(s, t) and continuous paths. The mean and covariance fix every finite-dimensional distribution; path continuity, property (4), is the one ingredient they do not supply. That single line is, remarkably, a complete description.
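To compute with the Gaussian-process view, you can build the covariance matrix min(t_i, t_j) for a few times and sample the vector directly. The sketch below uses a hand-written Cholesky factorization so it needs only the standard library; the function names are illustrative, and it assumes the times are distinct and strictly positive (including t = 0 would make the matrix singular).

```python
import math
import random

def min_cov(times):
    """Covariance matrix with entries Cov(B_s, B_t) = min(s, t)."""
    return [[min(s, t) for t in times] for s in times]

def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive-definite matrix a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(acc) if i == j else acc / L[j][j]
    return L

def sample_gaussian_vector(L, rng):
    """One draw of a mean-zero normal vector with covariance L L^T."""
    z = [rng.gauss(0.0, 1.0) for _ in L]          # independent standard normals
    return [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(len(L))]

if __name__ == "__main__":
    times = [0.25, 0.5, 0.75, 1.0]
    L = cholesky(min_cov(times))
    print(L)                                       # every entry on or below the diagonal is 0.5
    print(sample_gaussian_vector(L, random.Random(0)))
```

On an evenly spaced grid every nonzero entry of L equals sqrt(dt), so multiplying by L is just a running sum of independent Normal(0, dt) increments. The Gaussian-process description and the independent-increments description are the same recipe.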

Let us use it on a tiny number. What is the chance B_4 lands above 3? Since B_4 ~ Normal(0, 4), its standard deviation is sqrt(4) = 2, so a value of 3 is 3 / 2 = 1.5 standard deviations above the mean. From a standard normal table, P(B_4 > 3) = P(Z > 1.5) is about 0.067, roughly a 1-in-15 chance. Notice the whole computation reduced to a z-score and the normal you already know — that is the payoff of property (3). One honest caution carried over from the random-variable rung: a number like the density of B_4 at the point 3 is not a probability, and any single exact value such as P(B_4 = 3) is exactly 0; only intervals carry positive probability for a continuous variable.
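The table lookup above can be reproduced in a couple of lines. This sketch uses `statistics.NormalDist` from the Python standard library; the helper name `prob_above` is an illustrative choice.

```python
from statistics import NormalDist

def prob_above(t, level):
    """P(B_t > level) for standard Brownian motion, using B_t ~ Normal(0, t)."""
    # NormalDist takes the standard deviation, which is sqrt(t), not the variance t.
    return 1.0 - NormalDist(mu=0.0, sigma=t ** 0.5).cdf(level)

p = prob_above(4.0, 3.0)
print(round(p, 4))   # 0.0668
```

The only trap is the parameterization: Normal(0, 4) in the text means variance 4, so the standard deviation passed to the library is 2.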

Markov and martingale: the two structural gifts

Brownian motion inherits the two structural patterns that made the random walk tractable, now in continuous time. First, the Markov property: given the present value B_t, the future of the process is conditionally independent of its entire past. Because increments after time t are independent of everything up to t, the path that brought the particle to its current spot tells you nothing extra about where it goes next — only its present position matters. This is the same forgetting you saw in chains, simply transplanted to a continuous clock and a continuous state space.

Second, Brownian motion is a martingale: it models a perfectly fair game. The martingale condition is that the best forecast of a future value, given everything known so far, is simply today's value: E[B_t given the past up to time s] = B_s for s < t. This follows in one line — the future increment B_t - B_s has mean zero and is independent of the past, so it adds nothing in expectation. A martingale does not predict that the path will drift back toward zero or anywhere else; it predicts no systematic drift at all, in either direction. That is the precise mathematical content of "fair", and it is the foundation the next rung's optional-stopping and pricing arguments will stand on.
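Both properties show up in a small experiment. By the Markov property, simulating the future needs only the current value B_s, not the earlier path; by the martingale property, the average of the simulated B_t should come back to that same value. The sketch below is a Monte Carlo illustration, with the function name, sample size, and seed chosen here as assumptions.

```python
import random
import statistics

def forecast_from(b_s, s, t, n_paths=20000, seed=3):
    """Monte Carlo estimate of E[B_t | B_s = b_s] for s < t."""
    rng = random.Random(seed)
    sd = (t - s) ** 0.5              # the increment B_t - B_s is Normal(0, t - s)
    # Only the present value b_s enters: the path before time s is never needed.
    futures = [b_s + rng.gauss(0.0, sd) for _ in range(n_paths)]
    return statistics.fmean(futures)

if __name__ == "__main__":
    # Starting from B_1 = 1.3, the forecast of B_3 should be close to 1.3, not pulled toward 0.
    print(forecast_from(1.3, s=1.0, t=3.0))
```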

Where this rung is heading

We now have a continuous random curve that is Markov, a martingale, and Gaussian: a beautifully structured object. But the same square-root scaling that kept it alive has a sting in its tail. Over a short interval dt the typical wiggle is about sqrt(dt), which is far larger than dt itself when dt is small. The would-be slope, wiggle divided by elapsed time, is therefore about sqrt(dt) / dt = 1 / sqrt(dt), which blows up as dt shrinks. Pile up these oversized wiggles and the path turns out, with probability 1, to be continuous everywhere yet differentiable nowhere: it has no well-defined slope at any single instant. The next guide examines this strange geometry of Brownian paths directly.

That non-differentiability is exactly why ordinary calculus cannot integrate against a Brownian path, and it is the problem the rest of the rung solves. The cumulative roughness is measured by the quadratic variation, the limit of the summed squared increments over ever finer grids. For a smooth curve it is zero; for Brownian motion over [0, t] it equals t, the signature that the usual rules of calculus must change. Out of that fact grow the Ito integral and Ito's lemma, the corrected chain rule of stochastic calculus, and finally stochastic differential equations: geometric Brownian motion as a model of a stock price, and the Black-Scholes equation it produces. Today's four properties are the seed of all of it.
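The quadratic-variation claim is easy to probe numerically. The sketch below draws the increments of a Brownian path over [0, t] on a grid and returns two sums: the squared increments, which should settle near t, and the absolute increments, which keep growing as the grid is refined. The function name and seed are illustrative assumptions, and each call draws a fresh set of increments instead of refining one fixed path.

```python
import random

def variations(t, n_steps, seed=4):
    """Sum of squared increments and sum of absolute increments on a grid over [0, t]."""
    rng = random.Random(seed)
    dt = t / n_steps
    sd = dt ** 0.5                                 # each increment is Normal(0, dt)
    incs = [rng.gauss(0.0, sd) for _ in range(n_steps)]
    quadratic = sum(d * d for d in incs)           # approximates the quadratic variation
    total = sum(abs(d) for d in incs)              # approximates the total variation
    return quadratic, total

if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, variations(2.0, n))               # first entry near 2, second keeps growing
```

The arithmetic behind the pattern: each squared increment is about dt and there are t / dt of them, giving t; each absolute increment is about sqrt(dt), giving roughly t / sqrt(dt), which diverges as dt shrinks.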