One coin, one step: building the walk
In the last two guides you met the idea of a stochastic process — a whole family of random variables indexed by time — and you learned to read its sample path, the single zig-zag curve you actually observe when you let the randomness run once. The [[simple-random-walk|simple random walk]] is the cleanest possible example, and it is the one to hold in your mind for the rest of this rung. The recipe is almost insultingly simple: start at position 0, and at every tick of the clock flip a fair coin. Heads, step +1; tails, step -1. Your position after n steps is where you have ended up.
Let us write it down precisely so we can compute with it. Let X_1, X_2, X_3, ... be the individual steps, each one a coin flip that takes the value +1 or -1 with probability 1/2 each, all independent of one another. The position after n steps is just the running total S_n = X_1 + X_2 + ... + X_n, with S_0 = 0 by convention. The whole process is the sequence S_0, S_1, S_2, ... — a partial-sum process. Every property of the walk that follows is really a property of these accumulating sums.
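The definition above can be sketched in a few lines of Python. This is only an illustration using the standard library; the function name `simple_random_walk` and its `seed` parameter are choices made here, not something the guide prescribes.

```python
import random
from itertools import accumulate

def simple_random_walk(n, seed=None):
    """Return one sample path [S_0, S_1, ..., S_n] of the simple random walk."""
    rng = random.Random(seed)                        # private generator, reproducible with a seed
    steps = [rng.choice((1, -1)) for _ in range(n)]  # the coin flips X_1, ..., X_n
    # Running totals of the steps, starting from S_0 = 0.
    return list(accumulate(steps, initial=0))

path = simple_random_walk(10, seed=0)
print(path)
```

Each call produces one zig-zag sample path; changing the seed gives a different run of the same process.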
step X_k :  +1 with prob 1/2 (heads)
            -1 with prob 1/2 (tails)
position after n steps:
    S_0 = 0,  S_n = X_1 + X_2 + ... + X_n
E[X_k] = 0,  Var(X_k) = 1
Why it has independent, stationary increments
In guide 2 you met two structural gifts a process can have: [[independent-stationary-increments|independent and stationary increments]]. The random walk has both, and seeing exactly why is the heart of understanding it. An increment is just the displacement over a stretch of time: from step m to step n it is S_n - S_m = X_(m+1) + ... + X_n. Because that sum involves only the coin flips strictly after time m, it shares no flips at all with an earlier increment like S_m - S_0. Disjoint groups of independent flips are independent of each other — so increments over non-overlapping time windows are independent. The future jump knows nothing about the past jump.
Stationary increments means the distribution of a jump depends only on how long the jump is, not on when it starts. The increment S_(m+k) - S_m is always a sum of exactly k fair +/-1 flips, whether m is 0 or 1000 — and since all the flips are identically distributed, that sum has the same distribution regardless of m. So a 10-step jump looks statistically identical whether it happens at the start or after a million steps. These two properties together are exactly what makes the walk a building block: long stretches can be glued from short, interchangeable, independent pieces.
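For a short walk both properties can be checked exactly rather than taken on trust, by listing every equally likely flip sequence. The sketch below does this for 6 steps; the helper name `increment_law` and the choice N = 6 are illustrative assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

N = 6  # total number of steps; small enough to list all 2**N flip sequences

def increment_law(m, k):
    """Exact distribution of S_(m+k) - S_m over all equally likely flip sequences."""
    counts = Counter(sum(flips[m:m + k]) for flips in product((1, -1), repeat=N))
    return {value: Fraction(c, 2 ** N) for value, c in counts.items()}

# Stationarity: a 3-step increment has the same law whether it starts at time 0 or 3.
print(increment_law(0, 3) == increment_law(3, 3))   # prints True

# Independence: the joint law of (S_3 - S_0, S_6 - S_3) equals the product
# of the two marginal laws.
joint = Counter((sum(f[:3]), sum(f[3:])) for f in product((1, -1), repeat=N))
early, late = increment_law(0, 3), increment_law(3, 3)
factorises = all(
    Fraction(joint[(a, b)], 2 ** N) == early[a] * late[b]
    for a in early
    for b in late
)
print(factorises)                                   # prints True
```

Exact fractions are used instead of simulation so that the comparison is an equality, not an approximation.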
Where does the walk go? Mean zero, spreading variance
Now for the two numbers that govern the walk's behaviour. Because each step is symmetric, E[X_k] = 0, and by linearity of expectation the position has E[S_n] = E[X_1] + ... + E[X_n] = 0 for every n. On average the walk goes nowhere — its expected position is forever pinned at the origin. That is a real and useful fact, but it is also a famous trap: it does not mean the walk stays near 0. The average over many imagined repetitions is 0; any single path can wander astonishingly far.
The spreading is captured by the variance. Each step has Var(X_k) = E[X_k^2] - (E[X_k])^2 = 1 - 0 = 1. Because the steps are independent, variances simply add: Var(S_n) = Var(X_1) + ... + Var(X_n) = n. So the standard deviation of the position is the square root of n. After 100 steps the walk's root-mean-square distance from home is sqrt(100) = 10 units; after 10000 steps it is 100 units. This sqrt(n) growth is the signature rhythm of diffusion: the walk's typical distance grows, but only as fast as the square root of time, far slower than the n steps it has taken.
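The mean and variance can be confirmed exactly from the binomial count of heads: k heads in n flips puts the walk at 2k - n. The following is a sketch under that observation; the function name `position_moments` is an illustrative choice.

```python
from fractions import Fraction
from math import comb, sqrt

def position_moments(n):
    """Exact mean, variance and mean absolute value of S_n."""
    total = 2 ** n
    # comb(n, k) flip sequences have k heads, and each one ends at position 2k - n.
    ends = [(2 * k - n, comb(n, k)) for k in range(n + 1)]
    mean = Fraction(sum(s * w for s, w in ends), total)
    second = Fraction(sum(s * s * w for s, w in ends), total)
    mean_abs = Fraction(sum(abs(s) * w for s, w in ends), total)
    return mean, second - mean ** 2, mean_abs

for n in (100, 10_000):
    mean, var, mean_abs = position_moments(n)
    print(n, mean, var, sqrt(var), float(mean_abs))
```

The mean comes out as exactly 0 and the variance as exactly n. The last column, E|S_n|, is a little smaller than sqrt(n) (close to 0.8 sqrt(n)), which is why sqrt(n) is best read as a root-mean-square distance.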
There is a beautiful payoff hiding in that sum-of-independent-steps structure. Rescale the position by its standard deviation, forming S_n / sqrt(n). Since S_n is a sum of many independent, identically distributed pieces with finite variance, the central limit theorem applies: for large n the distribution of S_n / sqrt(n) approaches Normal(0, 1). So even though each step is a crude +/-1, the accumulated position smooths into the familiar bell curve. (The CLT needs that finite variance. Here it equals 1, so we are safe; for a walk with heavy-tailed, infinite-variance steps, S_n / sqrt(n) would not converge to a normal.)
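One way to see the convergence numerically is to compare the exact distribution function of S_n / sqrt(n) with the standard normal one at a few points. This is a rough sketch, not a proof; the function names and the chosen test points are assumptions made for illustration.

```python
from math import comb, erf, sqrt

def walk_cdf(n, x):
    """Exact P(S_n / sqrt(n) <= x), using S_n = 2 * heads - n."""
    favourable = sum(comb(n, k) for k in range(n + 1) if 2 * k - n <= x * sqrt(n))
    return favourable / 2 ** n

def normal_cdf(x):
    """Standard normal distribution function, written with the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

points = (-2.0, -1.0, -0.5, 0.5, 1.0, 2.0)
for n in (10, 100, 1000):
    gap = max(abs(walk_cdf(n, x) - normal_cdf(x)) for x in points)
    print(n, round(gap, 4))
```

The printed gap shrinks as n grows. Part of what remains is a lattice effect: S_n only takes values of one parity, so its distribution function moves in jumps while the normal curve is smooth.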
Memoryless: the walk forgets how it got here
Here is the property that makes the walk the gateway to the rest of this rung: it has the [[markov-property|Markov property]]. Suppose I tell you the walk is currently at position S_n = 3. To predict where it goes next, do you need to know the entire history — whether it crept up there slowly or rocketed there and fell back? No. The next step is just one more fresh, independent coin flip, so the future depends on the past only through the present value 3. The path that brought you to 3 is irrelevant; only the fact that you are standing at 3 matters.
Notice this is the increment structure paying off again. Because future steps are independent of past steps, conditioning on the whole past gives exactly the same prediction as conditioning on just the current position. In symbols, the conditional distribution of S_(n+1) given the entire history (S_0, ..., S_n) equals its distribution given S_n alone. This is the formal statement of "the future is independent of the past given the present", and it is the defining feature of a Markov chain — the subject of the very next rung. The humble random walk is your first, and friendliest, Markov chain.
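The equality of the two conditional distributions can be verified exhaustively for a short walk. The sketch below enumerates every 6-step flip sequence and compares the chance of an up-step given the full history with the chance given only the current position; the variable names and n = 5 are illustrative choices.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import accumulate, product

n = 5  # condition on the first n steps, then look at step n + 1

# Group every equally likely (n + 1)-step flip sequence by its history (S_0, ..., S_n).
up_by_history = defaultdict(lambda: [0, 0])   # history -> [up-moves seen, sequences seen]
for flips in product((1, -1), repeat=n + 1):
    path = tuple(accumulate(flips, initial=0))
    history = path[: n + 1]
    up_by_history[history][0] += path[n + 1] == path[n] + 1
    up_by_history[history][1] += 1

# P(next step is up | full history)
given_history = {h: Fraction(u, t) for h, (u, t) in up_by_history.items()}

# P(next step is up | current position only), pooling histories that end at the same place.
by_position = defaultdict(lambda: [0, 0])
for h, (u, t) in up_by_history.items():
    by_position[h[-1]][0] += u
    by_position[h[-1]][1] += t
given_position = {s: Fraction(u, t) for s, (u, t) in by_position.items()}

markov = all(given_history[h] == given_position[h[-1]] for h in given_history)
print(markov)   # prints True
```

For this walk both conditional probabilities are 1/2 everywhere, so the check is almost trivial; the same comparison becomes informative for processes whose next step really can depend on the position.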
What the walk opens up next
This one walk is a launchpad for almost everything ahead. Ask whether the walk is certain to return to 0, and you are asking about recurrence versus transience — the topic of guide 4. Remarkably, the answer depends on dimension: a one-dimensional or two-dimensional simple symmetric walk returns to its start with probability 1 (it is recurrent), but in three or more dimensions it has positive probability of wandering off forever (it is transient). "A drunk man will find his way home, but a drunk bird may be lost forever," as the saying goes.
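For the one-dimensional case, the probability of having revisited 0 within a given number of steps can be computed exactly by counting paths. The sketch below is a small dynamic program written for illustration (the name `prob_returned_by` is hypothetical); it says nothing about two or three dimensions.

```python
from collections import defaultdict
from fractions import Fraction

def prob_returned_by(n_steps):
    """Exact P(the 1-D walk revisits 0 at some time 1..n_steps)."""
    alive = {0: 1}   # paths so far that have not revisited 0, counted by endpoint
    returned = 0     # paths so far that have revisited 0 at least once
    for _ in range(n_steps):
        returned *= 2                      # every returned path extends in two ways
        nxt = defaultdict(int)
        for pos, count in alive.items():
            for step in (1, -1):
                if pos + step == 0:
                    returned += count      # first return happens on this step
                else:
                    nxt[pos + step] += count
        alive = nxt
    return Fraction(returned, 2 ** n_steps)

for n in (10, 100, 1000):
    print(n, float(prob_returned_by(n)))
```

The values climb toward 1, but slowly: the probability of not yet having returned shrinks only like 1/sqrt(n).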
Cap the walk between two walls (say, lose if you hit -a, win if you hit +b) and you have the classic gambler's ruin problem, also coming in guide 4: it gives a gambler the exact probability of going broke before reaching the target, and, by letting the winning wall recede to infinity, it covers a gambler with finite money facing an endless casino. And zoom out instead of in: speed up the clock and shrink the step size so that the step size scales like the square root of the time between steps, and the jagged walk smooths into the continuous, nowhere-differentiable path of Brownian motion, the star of a later rung. The simple random walk is, quite literally, Brownian motion seen through a coarse pixel grid.
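The two-wall setup can be solved exactly with a first-step argument: if h(x) is the probability of winning from position x, then h(x) is the average of h(x-1) and h(x+1), with h(-a) = 0 and h(+b) = 1. The sketch below solves that recurrence directly; the function name `win_probability` is an illustrative choice, and it assumes a and b are positive integers.

```python
from fractions import Fraction

def win_probability(a, b):
    """P(fair walk started at 0 hits +b before -a), for positive integers a and b."""
    # h(x) = (h(x-1) + h(x+1)) / 2 rearranges to h(x+1) = 2*h(x) - h(x-1).
    # Start at the losing wall with a trial slope of 1, run the recurrence up to
    # the winning wall, then rescale so that the value there equals 1.
    h = [Fraction(0), Fraction(1)]     # trial values at positions -a and -a + 1
    for _ in range(a + b - 1):
        h.append(2 * h[-1] - h[-2])
    scale = h[-1]                      # trial value at position +b
    return h[a] / scale                # list index a corresponds to position 0

print(win_probability(3, 7))       # prints 3/10
print(1 - win_probability(3, 7))   # ruin probability, prints 7/10
```

The recurrence forces h to be a straight line between the walls, so the answer is a / (a + b): the closer wall is the more likely one to be hit first.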
So although the recipe took one sentence, this walk quietly contains the increments, the diffusion scaling, the Markov property, and the seed of Brownian motion all at once. Guide 4 puts the walk to work on recurrence and gambler's ruin; guide 5 will dress it up with the formal language of filtrations and stopping times that lets us reason rigorously about "the first time the walk hits +5". Keep the picture of the coin-by-coin path in mind — almost every abstraction that follows is just this walk, viewed from a new angle.