Turning the process on its side: from counts to gaps
Guide 1 of this rung built the Poisson process three equivalent ways, and one of them was the counting view: N(t) is the number of events by time t, it has independent increments, and over any window of length s the count is Poisson with mean lambda·s, where lambda is the rate. That description watches the *vertical* axis — how high the staircase of counts has climbed. This guide turns the picture on its side and watches the *horizontal* axis instead: not how many events have happened, but how long we wait between them.
Picture a quiet help desk where calls arrive at random. Mark each call as a dot on a timeline. The first call lands at time T_1; the second at T_2; and so on. The numbers we will care about are the *gaps* between consecutive dots: X_1 = T_1 (the wait until the first call from time zero), X_2 = T_2 - T_1, X_3 = T_3 - T_2, and in general X_n = T_n - T_{n-1}. These gaps are the interarrival times, and they are random variables in their own right. The claim of this guide — the headline result — is that for a Poisson process every interarrival time is exponentially distributed with the same rate lambda, and the gaps are independent of one another.
Why the first gap is exponential
Start with the wait until the very first event, X_1. The trick is to ask not for its density directly but for its survival probability P(X_1 > t): the chance you are still waiting at time t. In count language, "still waiting at time t" means exactly one thing: *zero events have happened in [0, t]*, that is, N(t) = 0. We already know that count is Poisson with mean lambda·t, and a Poisson variable with mean lambda·t equals 0 with probability e^(-lambda·t). So P(X_1 > t) = e^(-lambda·t). Subtracting from 1 gives the cdf F(t) = 1 - e^(-lambda·t), and differentiating gives the density f(t) = lambda·e^(-lambda·t) for t >= 0. That is precisely the exponential distribution with rate lambda.
P(X_1 > t) = P(no events in [0, t])
= P(N(t) = 0)
= e^(-lambda*t) (Poisson, mean lambda*t, at 0)
=> F(t) = 1 - e^(-lambda*t) (the cdf)
=> f(t) = lambda * e^(-lambda*t) (the density, t >= 0)
E[X_1] = 1 / lambda (mean gap)
Var(X_1) = 1 / lambda^2 (variance of the gap)
The mean gap E[X_1] = 1/lambda is a sanity check worth pausing on. If calls arrive at lambda = 3 per hour, the average wait between calls is 1/3 of an hour, i.e. 20 minutes, exactly what you'd guess by sharing one hour among three calls. Notice also that the spread equals the mean here: the standard deviation is also 1/lambda, since Var = 1/lambda^2. Exponential waits are surprisingly variable; many short gaps are punctuated by the occasional long one, which is the visual signature of true randomness in time.
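As a quick numerical check of the derivation above, the following Python sketch compares the Poisson zero-count probability with the exponential survival function, and confirms by a finite difference that the density is the slope of the cdf. The function names and the values lambda = 3 and t = 0.5 are illustrative choices, not part of the guide.

```python
import math

def poisson_pmf(k, mean):
    """P(K = k) for a Poisson variable with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def exp_survival(t, lam):
    """P(X > t) for an exponential wait with rate lam."""
    return math.exp(-lam * t)

def exp_cdf(t, lam):
    """F(t) = 1 - P(X > t)."""
    return 1.0 - exp_survival(t, lam)

def exp_pdf(t, lam):
    """f(t) = lam * exp(-lam * t) for t >= 0."""
    return lam * math.exp(-lam * t)

lam = 3.0   # illustrative rate: 3 calls per hour
t = 0.5     # illustrative time: half an hour

# "No events in [0, t]" and "still waiting at time t" are the same event.
p_no_events = poisson_pmf(0, lam * t)
p_still_waiting = exp_survival(t, lam)

# Central finite difference of the cdf should match the density.
h = 1e-6
slope = (exp_cdf(t + h, lam) - exp_cdf(t - h, lam)) / (2 * h)

mean_gap_minutes = 60.0 / lam

print("P(N(t) = 0)      =", p_no_events)
print("P(X_1 > t)       =", p_still_waiting)
print("cdf slope at t   =", slope)
print("density at t     =", exp_pdf(t, lam))
print("mean gap (min)   =", mean_gap_minutes)
```

The first two printed values should agree to floating-point precision, and the slope should match the density up to the small error of the finite difference.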
Memorylessness: the wait that forgets
The exponential carries a property no other continuous distribution on [0, infinity) has: it is memoryless. In symbols, P(X > s + t given X > s) = P(X > t) for all s, t >= 0. Read it out loud: given that you have already waited s minutes with nothing happening, the chance you wait at least t more is the same as if you had just walked up. The clock does not 'warm up'. Plugging in the survival function makes the check immediate. Since X > s + t already implies X > s, the joint event {X > s + t and X > s} is just {X > s + t}, so P(X > s + t and X > s) / P(X > s) = e^(-lambda(s+t)) / e^(-lambda·s) = e^(-lambda·t), which is P(X > t).
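The memoryless identity can also be checked numerically. The sketch below evaluates the conditional survival probability from the formula and then estimates it from simulated exponential waits; the helper names, the seed, and the sample size are assumptions made for illustration only.

```python
import math
import random

def survival(t, lam):
    """P(X > t) for an exponential wait with rate lam."""
    return math.exp(-lam * t)

def conditional_survival(s, t, lam):
    """P(X > s + t | X > s), computed as a ratio of survival probabilities."""
    return survival(s + t, lam) / survival(s, lam)

def simulated_conditional_survival(s, t, lam, n_samples, rng):
    """Estimate P(X > s + t | X > s) from simulated exponential waits."""
    waits = [rng.expovariate(lam) for _ in range(n_samples)]
    survivors = [x for x in waits if x > s]      # condition on having waited s
    return sum(x > s + t for x in survivors) / len(survivors)

lam, t = 3.0, 0.5          # illustrative rate and extra wait
rng = random.Random(1)     # fixed seed so the run is repeatable

for s in (0.0, 0.25, 1.0):
    exact = conditional_survival(s, t, lam)
    approx = simulated_conditional_survival(s, t, lam, 200_000, rng)
    print(f"s = {s}: formula {exact:.4f}, simulated {approx:.4f}, "
          f"unconditional {survival(t, lam):.4f}")
```

Whatever value of s is chosen, the formula column should equal the unconditional survival probability, and the simulated column should land close to it, with more noise for larger s because fewer samples survive the conditioning.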
This is the continuous-time twin of a fact you already met among the discrete distributions: the geometric distribution is the only memoryless distribution on the positive integers, and its memorylessness says past failures don't make the next success any more 'due'. Indeed, if you imagine the Poisson process as the limit of many tiny coin-flip time-slots, each of width h with success chance lambda·h, the geometric wait measured in slots becomes the exponential wait in continuous time as h shrinks to zero. The memorylessness survives the limit; the exponential is simply the geometric's continuous shadow.
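The geometric-to-exponential limit is easy to watch numerically. In the sketch below, time is cut into slots of width 1/m with success chance lambda/m per slot, and the probability of no success by time t is compared with e^(-lambda·t). The function name and the slot counts are illustrative assumptions.

```python
import math

def geometric_survival(t, lam, slots_per_unit):
    """P(no success in the slots covering [0, t]) for coin-flip time-slots.

    Each slot has width 1 / slots_per_unit and success chance
    lam / slots_per_unit, so the wait in slots is geometric.
    """
    p = lam / slots_per_unit
    n_slots = math.floor(t * slots_per_unit)
    return (1.0 - p) ** n_slots

lam, t = 3.0, 0.5                      # illustrative rate and time
target = math.exp(-lam * t)            # exponential survival P(X > t)

slot_counts = [10, 100, 1000, 10**6]
errors = []
for m in slot_counts:
    approx = geometric_survival(t, lam, m)
    errors.append(abs(approx - target))
    print(f"{m:>8} slots per unit time: {approx:.6f}  (target {target:.6f})")
```

As the slots get finer, the geometric survival probability should settle onto the exponential one.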
Independent gaps build the whole process
Memorylessness does more than describe one gap: it generates the entire process. Here is the chain of reasoning. Once the first event happens at T_1, the process restarted at T_1 is a fresh Poisson process with the same rate that knows nothing of the past. (Independent, stationary increments give this restart property at any fixed time; that it also holds at the random time T_1 is the strong Markov property, which we take on trust here.) So the wait until the *next* event, X_2, is again exponential(lambda), and independent of X_1. Repeat: every interarrival time X_1, X_2, X_3, ... is independent, and each is exponential with the same rate lambda. This gives a clean, constructive recipe for building a Poisson process directly from its gaps, to set beside the constructions from guide 1.
- Draw independent waits X_1, X_2, X_3, ... each from an exponential(lambda) distribution — for example with the inverse-transform trick, X = -(1/lambda)·ln(U) for a uniform U on (0,1).
- Place the arrival times by cumulative sums: T_1 = X_1, T_2 = X_1 + X_2, T_3 = X_1 + X_2 + X_3, and so on.
- Define the count N(t) as the number of these T_n that are <= t — the staircase that jumps up by one at each arrival.
- The N(t) you just built provably has independent increments and Poisson(lambda·t) counts: it is a genuine Poisson process, assembled purely from exponential gaps.
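The recipe above can be sketched in a few lines of Python. This is a minimal illustration, not a tuned simulator: the function names, the time horizon, and the seed are assumptions, and the sanity check at the end only looks at the mean and variance of the count rather than proving anything.

```python
import bisect
import math
import random

def simulate_arrivals(lam, horizon, rng):
    """Arrival times T_1 < T_2 < ... in [0, horizon], built from exponential gaps."""
    arrivals = []
    t = 0.0
    while True:
        u = 1.0 - rng.random()          # uniform on (0, 1], so log(u) is finite
        t += -math.log(u) / lam         # inverse-transform exponential gap
        if t > horizon:
            return arrivals
        arrivals.append(t)              # cumulative sum of the gaps so far

def count_by(arrivals, t):
    """N(t): how many arrival times are <= t (arrivals must be sorted)."""
    return bisect.bisect_right(arrivals, t)

def count_mean_and_variance(lam, horizon, n_runs, rng):
    """Sample mean and variance of N(horizon) over many independent runs."""
    counts = [len(simulate_arrivals(lam, horizon, rng)) for _ in range(n_runs)]
    mean = sum(counts) / n_runs
    var = sum((c - mean) ** 2 for c in counts) / (n_runs - 1)
    return mean, var

lam, horizon = 3.0, 2.0                 # illustrative: 3 per hour over 2 hours
rng = random.Random(0)

one_path = simulate_arrivals(lam, horizon, rng)
print("arrival times:", [round(x, 3) for x in one_path])
print("N(1.0) on this path:", count_by(one_path, 1.0))

mean, var = count_mean_and_variance(lam, horizon, 5000, rng)
print("sample mean of N(2):", mean, " sample variance:", var,
      " (Poisson predicts both equal to", lam * horizon, ")")
```

A Poisson count has mean and variance both equal to lambda·t, so the two sample statistics should each come out near 6 here, up to simulation noise.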
This construction is also the standard way to *simulate* a Poisson process on a computer, and it makes a subtle modeling point explicit. The gaps being independent and identically exponential is a strong assumption: it says arrivals neither cluster (one event making another more likely soon) nor space themselves out (a refractory pause after each event). Real arrivals often do one or the other, in which case the observed gaps will not look like independent exponential draws, and a richer model is needed. The Poisson process is the clean baseline of 'no structure', not a universal law.
Adding gaps: the Erlang and gamma arrival times
Now ask a slightly bigger question: how long until the n-th event arrives? That is the arrival time T_n = X_1 + X_2 + ... + X_n, a sum of n independent exponential(lambda) waits. A sum of independent random variables has a distribution of its own, and this particular one has a name: T_n follows the Erlang distribution with shape n and rate lambda. Its density is f(t) = lambda^n · t^(n-1) · e^(-lambda·t) / (n-1)! for t >= 0; setting n = 1 recovers the exponential density lambda·e^(-lambda·t). The Erlang is just the integer-shape member of the gamma family; allowing a non-integer shape gives the full gamma, but waiting for a whole number of events keeps us in the Erlang.
Two checks make this trustworthy. By linearity of expectation the mean wait for n events is E[T_n] = n·E[X_1] = n/lambda: n gaps of average length 1/lambda each, exactly as it should be. And there is a beautiful bridge back to counts: the event {T_n <= t}, 'the n-th arrival has happened by time t', is the very same event as {N(t) >= n}, 'at least n events by time t'. That single identity writes the Erlang cdf as a sum of Poisson probabilities, P(T_n <= t) = 1 - [sum over k = 0, ..., n-1 of e^(-lambda·t) · (lambda·t)^k / k!], and it is the count–gap dictionary from the first section put to work.
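The identity {T_n <= t} = {N(t) >= n} can be tested numerically by computing the Erlang cdf two ways: once from Poisson probabilities, and once by integrating the Erlang density directly. The sketch below uses a simple midpoint rule; the function names, the step count, and the test values are illustrative assumptions.

```python
import math

def erlang_pdf(t, n, lam):
    """Erlang density with shape n and rate lam, for t > 0."""
    return lam**n * t**(n - 1) * math.exp(-lam * t) / math.factorial(n - 1)

def erlang_cdf_from_counts(t, n, lam):
    """P(T_n <= t) = P(N(t) >= n) = 1 - sum_{k < n} Poisson(lam * t) pmf at k."""
    mean = lam * t
    below_n = sum(math.exp(-mean) * mean**k / math.factorial(k) for k in range(n))
    return 1.0 - below_n

def erlang_cdf_by_integration(t, n, lam, steps=20000):
    """Midpoint-rule integral of the Erlang density over [0, t]."""
    h = t / steps
    return h * sum(erlang_pdf((i + 0.5) * h, n, lam) for i in range(steps))

lam, n, t = 3.0, 5, 2.0     # illustrative: 5th call within 2 hours at 3 per hour

via_counts = erlang_cdf_from_counts(t, n, lam)
via_density = erlang_cdf_by_integration(t, n, lam)
print("P(T_5 <= 2) via Poisson counts:", via_counts)
print("P(T_5 <= 2) via Erlang density:", via_density)
```

The two numbers should agree up to the small error of the numerical integration, and with n = 1 the count-based formula collapses to the exponential cdf 1 - e^(-lambda·t).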
A concrete number ties it together. With lambda = 3 calls per hour, the expected wait for the 5th call is 5/3 hours, exactly 100 minutes. The single-gap mean was 20 minutes, and five such gaps add up to 100 minutes on average. The variance also adds because the gaps are independent: Var(T_5) = 5/lambda^2 = 5/9 hours squared, so the standard deviation is sqrt(5)/3 hours, about 45 minutes. Each new layer of this rung will keep leaning on this same move: translate between counts and gaps, and use independence to add things up.
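The help-desk numbers can be reproduced with a short simulation that adds up five exponential gaps many times over. This is a sketch under the guide's own assumptions (independent exponential gaps at rate 3 per hour); the seed and sample size are arbitrary choices.

```python
import math
import random

lam = 3.0   # calls per hour, as in the example
n = 5       # waiting for the 5th call

# Exact values from adding means and variances of independent gaps.
mean_hours = n / lam
var_hours2 = n / lam**2
mean_minutes = 60.0 * mean_hours
sd_minutes = 60.0 * math.sqrt(var_hours2)

# Simulated T_5 = X_1 + ... + X_5, in hours.
rng = random.Random(7)
samples = [sum(rng.expovariate(lam) for _ in range(n)) for _ in range(100_000)]
sim_mean = sum(samples) / len(samples)
sim_var = sum((x - sim_mean) ** 2 for x in samples) / (len(samples) - 1)

print(f"exact mean: {mean_minutes:.1f} min, exact sd: {sd_minutes:.1f} min")
print(f"simulated mean: {60 * sim_mean:.1f} min, "
      f"simulated sd: {60 * math.sqrt(sim_var):.1f} min")
```

The simulated mean and standard deviation should sit close to the exact values of 100 minutes and roughly 45 minutes.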