Spatial and Marked Poisson Processes

Letting the rate breathe: the non-homogeneous process

Everything you have built in this rung so far rested on one quiet assumption: the rate lambda never changes. Buses arrive at the same average rate of 4 per hour at 3am and at 3pm alike. Real life rarely obeys. A coffee shop is dead mid-afternoon and slammed at the morning rush; emergency-room visits surge on weekends; web traffic follows daylight around the globe. The [[non-homogeneous-poisson-process|non-homogeneous Poisson process]] keeps everything good about the Poisson process but lets the rate rise and fall, so the single number lambda becomes a function lambda(t) of time.

The right bookkeeping device is the area under the rate curve, called the mean function: m(t) = integral from 0 to t of lambda(u) du, the expected number of events accumulated by time t. The count in any interval (s, t] is then Poisson with mean m(t) - m(s), and counts in disjoint intervals are still independent. In one phrase: replace "lambda times length" with "area under the rate curve", and everything else carries over. If a shop's rate climbs linearly as lambda(t) = 2t customers per hour, then over the first 3 hours you expect the integral of 2t from 0 to 3, which is t^2 evaluated at 3, namely 9 customers. A flat rate of 2 per hour over the same 3 hours would give only 2 * 3 = 6.
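The shop example can be checked numerically. The sketch below is illustrative only: the helper names (mean_function, simulate_by_thinning, shop_rate) are hypothetical, and the simulation uses the thinning idea (generate candidates at a flat cap rate, keep each with probability lambda(t)/cap), which assumes the rate never exceeds the chosen cap on the window.

```python
import random

def mean_function(rate, t, steps=10_000):
    """Approximate m(t) = integral from 0 to t of rate(u) du with the midpoint rule."""
    h = t / steps
    return sum(rate((i + 0.5) * h) for i in range(steps)) * h

def simulate_by_thinning(rate, horizon, rate_cap, rng):
    """Event times on (0, horizon], assuming rate(t) <= rate_cap on that window."""
    times = []
    t = 0.0
    while True:
        t += rng.expovariate(rate_cap)          # next candidate from a flat-rate process
        if t > horizon:
            return times
        if rng.random() < rate(t) / rate_cap:   # keep with probability rate(t) / rate_cap
            times.append(t)

def shop_rate(t):
    # The linearly climbing rate from the text: lambda(t) = 2t customers per hour.
    return 2.0 * t

rng = random.Random(0)
runs = 5_000
average_count = sum(
    len(simulate_by_thinning(shop_rate, 3.0, 6.0, rng)) for _ in range(runs)
) / runs

print("m(3) by numerical integration:", mean_function(shop_rate, 3.0))
print("average simulated count over 3 hours:", average_count)
```

Both printed numbers should sit close to 9, the area under the rate curve, rather than the flat-rate figure of 6.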

Events that carry weight: the compound process

Often you do not care how many events happened, but about the running total of something they each bring. An insurer cares about total claim money, not the count of claims; a website cares about total bytes served, not the number of requests; a casino cares about total winnings paid, not how many jackpots hit. The [[compound-poisson-process|compound Poisson process]] tracks exactly this: events arrive as a Poisson process, and each event drags along an independent random size, which you keep adding up.

Formally, let N(t) be a Poisson process of rate lambda, and let Y_1, Y_2, Y_3, ... be independent, identically distributed jump sizes, independent of N. The compound process is the random sum X(t) = Y_1 + Y_2 + ... + Y_(N(t)), with X(t) = 0 when no events have happened. Both the number of terms and the terms themselves are random, and that is what makes it richer than a fixed-length sum. The mean falls out beautifully by conditioning on the count: E[X(t)] = E[N(t)] * E[Y] = lambda*t*E[Y]. The variance follows from the law of total variance: Var(X(t)) = E[N(t)]*Var(Y) + Var(N(t))*(E[Y])^2, and because a Poisson count has E[N(t)] = Var(N(t)) = lambda*t, this collapses to Var(X(t)) = lambda*t*(Var(Y) + (E[Y])^2) = lambda*t*E[Y^2].
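A quick simulation can confirm the two moment formulas. This is a minimal sketch under assumed numbers: the rate (3 per day), the window (2 days), and the uniform jump-size law are all hypothetical choices made only so that E[Y] and E[Y^2] are easy to compute by hand.

```python
import random

def poisson_count(rate, horizon, rng):
    """Number of Poisson arrivals in (0, horizon], found by adding exponential gaps."""
    count = 0
    t = rng.expovariate(rate)
    while t <= horizon:
        count += 1
        t += rng.expovariate(rate)
    return count

def compound_total(rate, horizon, draw_size, rng):
    """X(horizon): the sum of one independent jump size per arrival (0 if no arrivals)."""
    n = poisson_count(rate, horizon, rng)
    return sum(draw_size(rng) for _ in range(n))

def draw_claim(rng):
    # Hypothetical jump-size law: uniform on (0, 10), so E[Y] = 5 and E[Y^2] = 100/3.
    return rng.uniform(0.0, 10.0)

rng = random.Random(42)
rate, horizon = 3.0, 2.0
totals = [compound_total(rate, horizon, draw_claim, rng) for _ in range(20_000)]

sample_mean = sum(totals) / len(totals)
sample_var = sum((x - sample_mean) ** 2 for x in totals) / len(totals)

print("mean:     simulated", sample_mean, " formula", rate * horizon * 5.0)
print("variance: simulated", sample_var, " formula", rate * horizon * 100.0 / 3.0)
```

Note that the variance formula uses the second moment E[Y^2], not Var(Y): even jump sizes with no spread at all would leave the total random, because the count itself is random.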

A tag on every point: marked Poisson processes

Sometimes an event is not just a moment but a moment with extra information stapled to it. A car passing a sensor has a speed; an earthquake has a magnitude; a customer has a purchase amount; a falling raindrop has a size. A [[marked-poisson-process|marked Poisson process]] records both the arrival time and the mark, so every event becomes a pair (time, mark) — a point on the clock decorated with data. Start with a Poisson process of arrival times, and to each arrival independently attach a mark drawn from a fixed mark distribution. The marks can be numbers, categories, or whole vectors.

The defining requirement is that, given the times, the marks are independent of one another and of the timing. The payoff is the marking theorem: the collection of (time, mark) pairs is itself a Poisson process, now living on the larger space of times-crossed-with-marks. This single statement quietly unifies the last two sections and a chunk of guide 3. If the mark is a keep-or-discard coin, you recover thinning. If the mark is a numeric size and you sum the marks, you recover the compound Poisson process. If the mark is a spatial position, you get a scatter on the plane — exactly the next section.

This makes marking the flexible, all-purpose tool for "random events that each carry data". Want the rate of earthquakes above magnitude 6? Thin the process by the probability that a mark exceeds 6 — and by the thinning rule, those big quakes form their own Poisson process, independent of the small ones. Want the total energy released over a year? Sum the energy-marks, a compound sum. The one assumption to honor, as always, is independence: the clean conclusions need the marks to be independent of the timing and of each other. If big events cluster in time — if one large quake makes the next more likely — the simple marked-Poisson model is the wrong tool, and you reach for a self-exciting model instead.
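The earthquake question can be sketched in a few lines. Everything numeric here is an assumption for illustration, not seismology: a base rate of 5 recorded quakes per year, and a made-up mark law in which the magnitude is 4 plus an exponential excess with mean 1, so that P(magnitude > 6) = exp(-2).

```python
import math
import random

def marked_process(rate, horizon, draw_mark, rng):
    """List of (time, mark) pairs: Poisson arrival times, each with an independent mark."""
    events = []
    t = rng.expovariate(rate)
    while t <= horizon:
        events.append((t, draw_mark(rng)))
        t += rng.expovariate(rate)
    return events

def draw_magnitude(rng):
    # Hypothetical mark law: magnitude 4 plus an exponential excess with mean 1.
    return 4.0 + rng.expovariate(1.0)

rng = random.Random(7)
rate, years = 5.0, 10_000.0
events = marked_process(rate, years, draw_magnitude, rng)

# Thinning by the mark: keep only the events whose magnitude exceeds 6.
big = [(t, m) for t, m in events if m > 6.0]
p_big = math.exp(-2.0)   # P(magnitude > 6) under the assumed mark law

print("observed rate of big quakes per year:", len(big) / years)
print("predicted rate, lambda * P(mark > 6):", rate * p_big)
```

Summing a numeric function of the marks over the same events list would give the compound total instead, which is the other use the paragraph describes.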

Off the timeline, onto the map: spatial point processes

Now take the idea of "completely random events" and scatter it across a map instead of along a clock. Stars in a patch of sky, trees in a forest, typos on a page, defects on a silicon wafer, cell-phone users across a city — when points fall with no preference and no interaction, the [[spatial-poisson-point-process|spatial Poisson point process]] is the model. It is the two- and three-dimensional cousin of the Poisson process, and it is the benchmark statisticians mean by "complete spatial randomness".

Its dial is now an intensity lambda, the expected number of points per unit area (or volume). The two defining rules mirror the time case exactly: the number of points in any region A is Poisson with mean lambda times the area of A, and counts in non-overlapping regions are independent. The familiar conditional miracle reappears too — given that exactly n points land in a region, those n points are placed independently and uniformly over it, completely without pattern. And just as in time, the intensity may vary across space as lambda(x), giving a non-homogeneous spatial process whose mean count in A is the integral of lambda over A.

TIME version                      <-->   SPACE version
rate lambda (per time)            <-->   intensity lambda (per area)
count in (s,t]                    <-->   count in region A
  ~ Poisson(lambda*(t-s))                  ~ Poisson(lambda*area(A))
disjoint intervals independent    <-->   disjoint regions independent
given n in window:                <-->   given n in region:
  uniform on the window                    uniform over the region

The spatial Poisson process is the time process with "length" replaced by "area".
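The two defining rules also give a recipe for simulating the homogeneous spatial process on a rectangle: draw a Poisson total, then scatter that many points uniformly. The sketch below follows that recipe; the intensity, the rectangle, and the two test regions are arbitrary illustrative choices, and poisson_draw is a hypothetical helper that builds a Poisson sample from unit-rate exponential gaps.

```python
import random

def poisson_draw(mean, rng):
    """Poisson(mean) sample: count the unit-rate exponential gaps that fit in [0, mean]."""
    count = 0
    t = rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count

def spatial_poisson(intensity, width, height, rng):
    """Points of a homogeneous Poisson process on the rectangle [0, width] x [0, height]."""
    n = poisson_draw(intensity * width * height, rng)
    return [(rng.uniform(0.0, width), rng.uniform(0.0, height)) for _ in range(n)]

rng = random.Random(11)
intensity, width, height = 2.0, 10.0, 5.0
runs = 4_000

left_counts, right_counts = [], []
for _ in range(runs):
    points = spatial_poisson(intensity, width, height, rng)
    left_counts.append(sum(1 for x, _ in points if x < 3.0))     # strip of area 15
    right_counts.append(sum(1 for x, _ in points if x >= 7.0))   # disjoint strip of area 15

mean_left = sum(left_counts) / runs
mean_right = sum(right_counts) / runs
var_left = sum((c - mean_left) ** 2 for c in left_counts) / runs
cov = sum((a - mean_left) * (b - mean_right)
          for a, b in zip(left_counts, right_counts)) / runs

print("mean count in left strip:", mean_left, " expected", intensity * 15.0)
print("variance in left strip:  ", var_left, " expected", intensity * 15.0)
print("covariance between disjoint strips:", cov, " expected 0")
```

A mean and variance that agree is the Poisson signature, and a covariance near zero reflects the independence of counts in non-overlapping regions.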

Random scatter clumps — and that fools the eye

Here is the most important honest caveat of the whole spatial picture, and it surprises almost everyone. Under complete spatial randomness, the points still LOOK clustered to the eye. Throw n uniform darts at a map and you will inevitably see clumps here and bald patches there. People expect "random" to mean "evenly spread", but evenly spread is the opposite of random — it is what you get from a regular grid, which is highly ordered. Genuine randomness produces clumps and voids; a pattern with no clumps at all would be evidence of repulsion, not randomness.

This is why the spatial Poisson process is treated as a null hypothesis rather than a discovery. An ecologist seeing apparent clumps of seedlings cannot conclude "the trees attract each other" — random scatter would clump too. The honest question is comparative: are the trees MORE clustered than spatial-Poisson randomness predicts (seedlings sprouting near parents) or MORE regular than it predicts (mature trees competing and spacing themselves out)? You quantify the deviation from the Poisson benchmark; the benchmark itself is the yardstick, not the conclusion. This is the same discipline as everywhere in statistics: a pattern is only meaningful against what pure chance would have produced.
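One common way to put a number on "more clustered or more regular than Poisson" is the variance-to-mean ratio of quadrat counts: lay a grid of equal cells over the map, count the points in each, and divide the variance of those counts by their mean. It is only one of several possible summaries, and the sketch below (with hypothetical helper names and arbitrary sizes) is meant as an illustration rather than a full test. Values near 1 match the Poisson benchmark, values well above 1 suggest clustering, and values well below 1 suggest regularity.

```python
import random

def quadrat_counts(points, side, cells):
    """Count points in each cell of a cells-by-cells grid over the square [0, side]^2."""
    counts = [0] * (cells * cells)
    for x, y in points:
        col = min(int(x / side * cells), cells - 1)   # clamp points on the far edge
        row = min(int(y / side * cells), cells - 1)
        counts[row * cells + col] += 1
    return counts

def dispersion_index(counts):
    """Variance-to-mean ratio of the quadrat counts (sample variance, n - 1 divisor)."""
    mean = sum(counts) / len(counts)
    var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
    return var / mean

side, cells = 10.0, 20
rng = random.Random(5)

# Pattern 1: uniform random darts, the spatial-Poisson benchmark given the total.
random_points = [(rng.uniform(0.0, side), rng.uniform(0.0, side)) for _ in range(4_000)]

# Pattern 2: a perfectly regular 60-by-60 lattice, which is ordered rather than random.
step = side / 60
grid_points = [((i + 0.5) * step, (j + 0.5) * step) for i in range(60) for j in range(60)]

random_index = dispersion_index(quadrat_counts(random_points, side, cells))
grid_index = dispersion_index(quadrat_counts(grid_points, side, cells))

print("random scatter index:", random_index)   # should be near 1
print("regular grid index:  ", grid_index)     # every cell holds 9 points, so this is 0
```

The random scatter still shows visible clumps and gaps cell by cell, yet its index stays near 1; it is the evenly spread lattice that departs from the benchmark.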

Step back and admire how much one idea has stretched. The same kernel — points sprinkled completely at random, with independent counts in disjoint pieces and Poisson totals — runs through every variant in this guide: a breathing rate in time, a random weight per event, a tag on each point, a scatter across space. The marking theorem ties them into one family, the non-homogeneous version swaps length for area-under-a-curve, and the spatial version swaps time for area. Guide 5 lifts the last constraint of all — that the gaps be exponential — to reach renewal processes, and there we meet the delightfully counterintuitive inspection paradox.