What we are trying to model
Picture the calls landing at a help desk, raindrops striking one paving slab, clicks of a Geiger counter, or the moments customers walk through a shop door. The events arrive one at a time, at random instants spread along a continuous timeline, with no inherent clock telling them when. We do not want to track *what* each event is yet — only *when* it happens. The object that records this is a counting process N(t): the number of events that have occurred up to and including time t. N(0) = 0, N(t) only ever goes up, and it jumps by exactly 1 each time an event lands.
Now add the modeling assumption that makes things clean: arrivals are *completely structureless in time*. Each instant is, in a precise sense, as good as any other for an arrival, and what happens in one stretch of time tells you nothing about another. This is the natural continuous-time cousin of the independent, stationary increments idea you met when we first defined stochastic processes. A counting process built on exactly this assumption is the Poisson process, and the single number controlling how briskly events come is the rate lambda, in events per unit time.
View 1 — the counting definition
The first view states outright how many events fall in any interval. We call N(t) a Poisson process of rate lambda if three things hold. (i) N(0) = 0 — we start the clock with nothing yet counted. (ii) The process has independent increments: for disjoint time intervals, the numbers of arrivals are independent of one another. (iii) The number of arrivals in any interval of length s has a Poisson distribution with mean lambda·s; that is, N(t + s) - N(t) ~ Poisson(lambda·s), depending only on the length s, not on where the interval sits.
Two consequences fall out immediately. Because the mean of a Poisson(lambda·s) variable is lambda·s, we get E[N(t)] = lambda·t — on average lambda events per unit time, which is exactly what calling lambda the "rate" should mean. And because a Poisson variable has variance equal to its mean, Var(N(t)) = lambda·t as well: a fingerprint of the Poisson process is that the count's variance equals its mean. The dependence on length s alone, not position, is the *stationary increments* property — the process has no preferred clock time.
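The mean-equals-variance fingerprint can be checked numerically straight from the Poisson probabilities. The sketch below is only an illustration: the helper names `poisson_pmf` and `count_mean_and_variance`, the cutoff `k_max`, and the values lambda = 3 and t = 2 are assumptions chosen for the example, not part of the definition.

```python
import math

def poisson_pmf(k: int, mu: float) -> float:
    """P(X = k) for X ~ Poisson(mu)."""
    return math.exp(-mu) * mu**k / math.factorial(k)

def count_mean_and_variance(lam: float, t: float, k_max: int = 100):
    """Mean and variance of N(t), summed from the pmf for k = 0..k_max.

    k_max is a truncation point; the neglected tail is tiny when
    lam * t is far below k_max.
    """
    mu = lam * t
    probs = [poisson_pmf(k, mu) for k in range(k_max + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    second_moment = sum(k * k * p for k, p in enumerate(probs))
    return mean, second_moment - mean**2

# Illustrative values: rate 3 per unit time, window of length 2.
mean, var = count_mean_and_variance(lam=3.0, t=2.0)
print(mean, var)  # both should sit very close to lam * t = 6
```

If observed counts from a real data stream show a variance well above or below the mean, that is an early hint that the Poisson model may not fit.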
View 2 — the infinitesimal definition
The second view never mentions the Poisson distribution at all. Instead it describes the *local* behavior of the process over a vanishingly short window of length h, again assuming independent and stationary increments. The recipe: in a tiny interval of width h, the chance of exactly one arrival is lambda·h plus terms negligible compared to h; the chance of two or more arrivals is itself negligible compared to h; and so the chance of no arrival is 1 - lambda·h, again up to negligible terms. In shorthand, P(one arrival in h) ≈ lambda·h, P(two or more in h) ≈ 0, P(none in h) ≈ 1 - lambda·h.
The crucial second clause, that double arrivals are vanishingly rare, is the formal statement of orderliness: in a true Poisson process, events never coincide (with probability 1); they arrive one at a time. This view is the one physicists and queueing theorists reach for, because it reads like a law of nature: in each blink of time the system either does nothing or registers a single event, with a constant per-instant tendency lambda. From these local rates you can set up a differential equation for P(N(t) = k) and solve it; the solution that emerges is precisely the Poisson(lambda·t) law of View 1. Same process, derived bottom-up.
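To see the bottom-up derivation in action without doing the calculus by hand, here is a rough numerical sketch. Writing P_k(t) for P(N(t) = k), the local rates give P_0'(t) = -lambda·P_0(t) and P_k'(t) = -lambda·P_k(t) + lambda·P_{k-1}(t) for k >= 1. The code steps these equations forward with a plain forward-Euler scheme and compares the result with the Poisson formula. The function names, the step count, the truncation at `k_max`, and the values lambda = 3, t = 2 are all assumptions made for illustration.

```python
import math

def integrate_count_probabilities(lam, t_end, k_max=40, steps=20000):
    """Forward-Euler solution of
        P_0'(t) = -lam * P_0(t)
        P_k'(t) = -lam * P_k(t) + lam * P_{k-1}(t),   k >= 1
    starting from P_0(0) = 1 (nothing counted at time 0).
    States above k_max are dropped, which is a truncation."""
    dt = t_end / steps
    p = [1.0] + [0.0] * k_max
    for _ in range(steps):
        new_p = [p[0] - lam * dt * p[0]]
        for k in range(1, k_max + 1):
            new_p.append(p[k] + lam * dt * (p[k - 1] - p[k]))
        p = new_p
    return p

def poisson_pmf(k, mu):
    """P(X = k) for X ~ Poisson(mu), the closed form from View 1."""
    return math.exp(-mu) * mu**k / math.factorial(k)

lam, t_end = 3.0, 2.0  # illustrative values
numeric = integrate_count_probabilities(lam, t_end)
for k in (0, 3, 5, 8):
    # Left: numerical solution of the equations; right: Poisson formula.
    print(k, round(numeric[k], 4), round(poisson_pmf(k, lam * t_end), 4))
```

The two columns should agree to several decimal places, and the agreement improves as `steps` grows, which is the numerical shadow of letting h go to 0.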
"Negligible compared to h" is the honest version of the hand-wave, and it is worth pinning down because it is so often skipped. The precise statement uses little-o notation: the error terms are o(h), meaning they shrink faster than h as h goes to 0, so dividing them by h sends them to zero. This is not a probability being literally zero — it is a probability that becomes irrelevant once the window is small enough. Skipping this caveat is the usual way the infinitesimal view gets taught sloppily.
View 3 — the interarrival definition
The third view builds the process by saying how long you wait between events rather than counting them. Let T_1 be the time until the first arrival, T_2 the gap from the first to the second, T_3 the gap from the second to the third, and so on; these are the interarrival times. The definition: a Poisson process of rate lambda is what you get when T_1, T_2, T_3, ... are independent random variables, each with an exponential distribution of rate lambda. Lay these gaps end to end and place a dot at each running total — those dots are the arrival times, and the count of dots up to time t is N(t).
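The "lay the gaps end to end" construction translates almost word for word into code. The following is a minimal sketch using only Python's standard library; the function names `arrival_times` and `count_up_to`, the seed, and the values lambda = 3 and horizon = 2 are assumptions chosen for the example.

```python
import random

def arrival_times(lam, horizon, rng):
    """Arrival times in [0, horizon], built as running totals of
    independent Exponential(lam) gaps T_1, T_2, T_3, ..."""
    times = []
    t = rng.expovariate(lam)          # T_1: wait until the first arrival
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(lam)     # add the next gap
    return times

def count_up_to(times, t):
    """N(t): number of arrivals at or before time t."""
    return sum(1 for s in times if s <= t)

rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
path = arrival_times(lam=3.0, horizon=2.0, rng=rng)
print(len(path), [round(s, 3) for s in path])

# Averaging the count over many simulated paths should land near lam * horizon.
counts = [len(arrival_times(3.0, 2.0, rng)) for _ in range(2000)]
print(sum(counts) / len(counts))
```

The average printed on the last line should be close to 6, which is the View 1 mean lambda·t recovered purely from exponential gaps.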
Why exactly the exponential? Because it is the *only* continuous waiting-time law with the memoryless property, and memorylessness is precisely the local-structurelessness of View 2 seen from the waiting side: if no event has arrived in the last 30 seconds, your remaining wait has the same distribution as a fresh wait — the process does not "build up" toward a due arrival. The next guide in this rung is devoted to this exponential link, so we only flag it here. Notice too that the time until the n-th arrival is a sum of n independent exponentials, which gives the Erlang (Gamma) distribution — the bridge between waiting and counting.
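Memorylessness is easy to verify numerically from the exponential survival function P(T > t) = e^(-lambda·t). The sketch below is illustrative only: the helper names and the specific values (rate 3 per hour, 30 seconds already waited, 20 more minutes) are assumptions made for the example.

```python
import math

def survival(t, lam):
    """P(T > t) for T ~ Exponential(lam)."""
    return math.exp(-lam * t)

def conditional_survival(extra, waited, lam):
    """P(T > waited + extra | T > waited)."""
    return survival(waited + extra, lam) / survival(waited, lam)

lam = 3.0            # arrivals per hour (illustrative)
waited = 30 / 3600   # 30 seconds already waited, in hours
extra = 1 / 3        # a further 20 minutes, in hours

# The two printed numbers should match: having waited changes nothing.
print(conditional_survival(extra, waited, lam), survival(extra, lam))
```

Algebraically this is just e^(-lambda·(s + t)) / e^(-lambda·s) = e^(-lambda·t): the time already waited cancels out.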
View 1  COUNTING:       N(t+s) - N(t) ~ Poisson(lambda*s),
                        independent over disjoint intervals
View 2  INFINITESIMAL:  P(1 arrival in h) = lambda*h + o(h)
                        P(>=2 in h)       = o(h)
                        P(0 in h)         = 1 - lambda*h + o(h)
View 3  INTERARRIVAL:   T_1, T_2, T_3, ... i.i.d. Exponential(lambda)
---- all three define the SAME process ----
Why three views, and a worked number
The deep fact of this guide is a theorem: these three definitions are logically equivalent — any process satisfying one satisfies the other two. The proofs run both ways. From the interarrival view you can show counts are Poisson; from the infinitesimal rates you can derive both the counts and the exponential gaps; and so on around the circle. You do not have to follow every proof now; the payoff is practical. When a problem hands you a count ("how many calls in an hour?"), reach for View 1. When it hands you a wait ("how long until the next call?"), reach for View 3. When you must justify *why* a phenomenon is Poisson at all, argue from the local independence of View 2.
Let us make it concrete with calls arriving at lambda = 3 per hour. Using View 1, the number of calls in the next two hours is Poisson(3·2) = Poisson(6), so the chance of exactly five calls is P(N = 5) = e^(-6)·6^5 / 5!, which works out to about 0.161. The expected number is E[N] = 6 and, as for any Poisson variable, Var(N) = 6 too. Using View 3 on the same setup, the wait for the next call is Exponential(3), so the chance of waiting more than 20 minutes (one-third of an hour) is e^(-3·(1/3)) = e^(-1) ≈ 0.368. Two questions, two views, one underlying process, and the answers are consistent by the equivalence theorem.
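The arithmetic of this worked example can be reproduced in a few lines. This is a small sketch using only the standard library; the variable names are assumptions for illustration.

```python
import math

lam = 3.0          # calls per hour
window = 2.0       # hours, for the counting question
wait = 20 / 60     # hours, for the waiting question

# View 1: P(exactly 5 calls in 2 hours), with N ~ Poisson(lam * window).
mu = lam * window
p_five_calls = math.exp(-mu) * mu**5 / math.factorial(5)

# View 3: P(next call takes longer than 20 minutes), with T ~ Exponential(lam).
p_long_wait = math.exp(-lam * wait)

print(round(p_five_calls, 3), round(p_long_wait, 3))  # 0.161 0.368
```

Changing `lam`, `window`, or `wait` lets you rerun the same two questions for a different help desk.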
One honest caveat to carry forward. Real arrival streams are often *not* Poisson: rush hours break stationarity, calls come in correlated bursts breaking independence, and two events sometimes genuinely coincide breaking orderliness. The Poisson process is the clean baseline against which such structure is measured, not a universal truth. Later guides relax each assumption in turn — letting the rate vary in time, attaching sizes or marks to each event, spreading points over space rather than a line, and replacing exponential gaps with arbitrary ones — so the equivalence you have just seen is the foundation those generalizations are built upon.