Sample Paths, Stationarity, and Increments

One run draws a whole curve

In the previous guide a stochastic process was introduced as a family of random variables {X_t} indexed by time t, all living on one sample space. The single most important mental shift now is this: when you run the experiment once — pick one underlying outcome omega — you do not get a single number, you get an entire function of time. The recipe t -> X_t(omega) traces out a curve, and that curve is called a sample path (also a realization or trajectory). One omega, one whole path.

Hold two complementary views in your head at once. Freeze the time t and look across all outcomes: X_t is an ordinary random variable, with its own mean E[X_t] and its own distribution, exactly the objects from the earlier rungs. Now instead freeze the outcome omega and let time run: you see one sample path, one concrete history. The process is the cloud of all such paths, weighted by their probabilities. A daily stock-price chart, the wiggling voltage on a noisy wire, the count of buses that have arrived by time t — each picture you have ever seen of "something random over time" is a single sample path drawn from that cloud.
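The two views can be made concrete with a small sketch. Everything named here is an assumption for illustration: the outcome omega is modeled as an integer seed, the toy process is i.i.d. standard normal noise, and `sample_path` is a hypothetical helper, not something from the earlier guides.

```python
import random

def sample_path(omega, n_steps=10):
    """Return one whole path t -> X_t(omega) for t = 0..n_steps-1.

    Assumption for illustration: omega is an integer seed and the toy
    process is i.i.d. standard normal noise.
    """
    rng = random.Random(omega)  # the outcome omega fixes all the randomness
    return [rng.gauss(0.0, 1.0) for _ in range(n_steps)]

# Freeze omega, let time run: one concrete history.
path = sample_path(omega=7)

# Freeze time t = 3, look across outcomes: a sample from the random variable X_3.
t = 3
x_t_across_outcomes = [sample_path(omega)[t] for omega in range(2000)]
mean_x_t = sum(x_t_across_outcomes) / len(x_t_across_outcomes)

print(len(path))  # number of time points in the single path
print(mean_x_t)   # estimate of E[X_3]; near 0 for this toy process
```

Calling `sample_path` twice with the same omega returns the same curve, which is the point: the randomness lives in the choice of omega, not in the path once omega is fixed.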

How to specify a process: finite-dimensional distributions

A path is an object with infinitely many coordinates, one for every instant. How could we ever pin down the law of such a thing? The answer is wonderfully economical: we never look at all instants at once. We only ever specify the joint distribution of finitely many snapshots. Pick any finite set of times t_1 < t_2 < ... < t_n and ask for the joint law of the vector (X_{t_1}, X_{t_2}, ..., X_{t_n}). That family of joint laws, one for every finite choice of times, is the collection of finite-dimensional distributions, and it is what truly defines the process.

These pieces cannot be chosen arbitrarily; they must agree with one another wherever they overlap. The finite-dimensional distributions obey a consistency condition: if you take the joint law of (X_{t_1}, X_{t_2}, X_{t_3}) and integrate out (marginalize over) the middle coordinate, you must recover exactly the joint law of (X_{t_1}, X_{t_3}), and likewise for any coordinate dropped from any finite set of times. It is the same demand we met for ordinary joint distributions, now applied across time: the marginal computed from a bigger picture has to match the smaller picture. The deep payoff is Kolmogorov's extension theorem: for real-valued processes, any family of finite-dimensional laws satisfying this consistency condition really does come from an honest process on path space. So consistency is exactly the price of admission, and once paid you may build the whole infinite object out of its finite shadows.
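Here is a minimal sketch of the consistency check on a case small enough to enumerate. The choice of process is an assumption made for illustration: a fair plus-or-minus-one walk started at 0, observed at times 1, 2, 3. The three-time law is built by brute force, the middle coordinate is summed out, and the result is compared with the two-time law written down independently.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Joint law of (X_1, X_2, X_3): enumerate the 8 equally likely step sequences.
law_123 = defaultdict(Fraction)
for steps in product((-1, 1), repeat=3):
    x1 = steps[0]
    x2 = x1 + steps[1]
    x3 = x2 + steps[2]
    law_123[(x1, x2, x3)] += Fraction(1, 8)

# Marginalize: sum out the middle coordinate X_2.
marginal_13 = defaultdict(Fraction)
for (x1, _x2, x3), p in law_123.items():
    marginal_13[(x1, x3)] += p

# Law of (X_1, X_3) specified on its own: X_1 is -1 or +1 with probability 1/2,
# and X_3 - X_1 is a sum of two fair steps, so it equals -2, 0, 2
# with probabilities 1/4, 1/2, 1/4.
two_step = {-2: Fraction(1, 4), 0: Fraction(1, 2), 2: Fraction(1, 4)}
law_13 = {(x1, x1 + d): Fraction(1, 2) * p
          for x1 in (-1, 1) for d, p in two_step.items()}

consistent = dict(marginal_13) == law_13
print(consistent)  # True
```

Exact fractions are used rather than floats so the comparison is an equality, not an approximation.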

One honest limit is worth flagging now. Finite-dimensional distributions fix what happens at any finite set of times, but they do not by themselves pin down properties that depend on a continuum of times all at once — such as whether the sample paths are continuous, or where a path hits its maximum. Two processes can share every finite-dimensional distribution yet have wildly different path behavior. That is why, for continuous-time processes, we separately insist on path properties (like continuity) on top of the finite-dimensional laws; the snapshots alone are not the whole story.
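A standard textbook illustration of this gap, stated here as an aside rather than as part of this guide's development: take a single uniform random time U and compare the process that is identically zero with the process that spikes to 1 only at the instant U.

```latex
U \sim \mathrm{Uniform}[0,1], \qquad X_t = 0, \qquad Y_t = \mathbf{1}\{t = U\}, \qquad t \in [0,1].

\Pr\bigl(Y_{t_1} = \dots = Y_{t_n} = 0\bigr)
  = 1 - \Pr\bigl(U \in \{t_1, \dots, t_n\}\bigr) = 1
  \quad \text{for every finite set of times } t_1 < \dots < t_n,

\text{yet} \quad \sup_{t \in [0,1]} X_t = 0
  \quad \text{while} \quad \sup_{t \in [0,1]} Y_t = 1 .
```

Any finite set of snapshots misses the spike with probability one, so X and Y share every finite-dimensional distribution. Still, every path of X is continuous and every path of Y is not.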

Stationarity: the statistics that refuse to age

Many processes look statistically the same no matter when you start watching. Background hiss in an audio recording, the daily temperature anomaly once you remove the seasonal trend, a queue at steady state — slide the clock forward by an hour and the probabilistic behavior is unchanged. This invariance under time-shift is stationarity, and it is one of the most useful simplifying structures a process can have. The strong version, strict stationarity, says the entire law is shift-invariant: for every set of times and every lag h, the vector (X_{t_1}, ..., X_{t_n}) has the same joint distribution as the shifted vector (X_{t_1 + h}, ..., X_{t_n + h}).

Strict stationarity is demanding — it constrains every finite-dimensional distribution — so in practice we lean on a weaker, checkable cousin. A process is weakly stationary (or wide-sense stationary) if just its first two moments are time-invariant: the mean E[X_t] is the same constant for all t, and the covariance between two times depends only on the gap between them, not on where they sit. The gap-only covariance is the autocovariance function, gamma(h) = Cov(X_t, X_{t+h}). For a stationary process gamma(h) is a function of the lag h alone. Note the relationship is one-directional: strict stationarity (plus a finite variance) implies weak stationarity, but a weakly stationary process need not be strictly stationary, because matching two moments does not match the whole distribution.

The autocovariance function is the workhorse you actually compute. Three facts make it readable. At lag zero it is just the variance, gamma(0) = Var(X_t), which for a stationary process is one fixed number. It is symmetric, gamma(-h) = gamma(h), since covariance does not care about order. And dividing by the variance gives the autocorrelation, rho(h) = gamma(h)/gamma(0), a number in [-1, 1] that measures how strongly the value now linearly predicts the value h steps later. A tiny example: if successive daily anomalies have variance 4 and Cov(X_t, X_{t+1}) = 2, then rho(1) = 2/4 = 0.5. In words, the best linear forecast of tomorrow's deviation from the mean is half of today's deviation.
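The arithmetic above, plus one common way to estimate gamma(h) from data, can be sketched as follows. The helper names `sample_autocov` and `sample_autocorr` and the short `anomalies` list are hypothetical, made up for illustration. The estimator divides by n rather than n - h, which is one convention among several.

```python
def sample_autocov(xs, h):
    """Sample autocovariance at lag h >= 0 (divides by n, one common convention)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((xs[t] - mean) * (xs[t + h] - mean) for t in range(n - h)) / n

def sample_autocorr(xs, h):
    """Sample autocorrelation rho(h) = gamma(h) / gamma(0)."""
    return sample_autocov(xs, h) / sample_autocov(xs, 0)

# The worked numbers from the text: gamma(0) = 4 and gamma(1) = 2.
gamma0, gamma1 = 4.0, 2.0
rho1 = gamma1 / gamma0
print(rho1)  # 0.5

# Hypothetical anomaly readings, invented purely to exercise the functions.
anomalies = [1.0, 2.0, 0.5, -1.0, -2.0, -0.5, 1.5, 0.0]
print(sample_autocorr(anomalies, 0))  # 1.0, since gamma(0)/gamma(0) = 1
print(sample_autocorr(anomalies, 1))  # lag-one estimate, a number in [-1, 1]
```

Dividing by n keeps every estimated autocorrelation inside [-1, 1], which is why that convention is used in this sketch.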

Increments: the other way a process can be regular

Stationarity tames the levels of a process; the other great structural assumption tames its changes. An increment is the change over an interval, X_t - X_s for s < t — how much the process moved between two times. A huge family of important processes is built by demanding that these increments be well-behaved in two specific ways, and the combination is named independent stationary increments.

Two clauses, kept carefully apart. Independent increments: changes over non-overlapping time intervals are independent random variables — what the process does between 1pm and 2pm tells you nothing about what it does between 5pm and 6pm. Stationary increments: the distribution of an increment depends only on the length of the interval, not on where it sits — the change over any one-hour window has the same law, whether that window starts at noon or at midnight. Mind the subtlety: "stationary increments" is a statement about the increments' distribution and is a quite different thing from the process itself being stationary. A process can have stationary increments while drifting steadily upward, so it is plainly not stationary in level.

Disjoint intervals on the time line:

  ---[ s1 .... t1 ]--------[ s2 .... t2 ]--->  time
        change A                change B

Independent increments:  A and B are independent.
Stationary increments:   law of (X_t - X_s) depends only on (t - s).

Do NOT confuse with the process being stationary
(that is about X_t's levels, not its changes).

Independence is a statement about changes over disjoint intervals; stationarity of increments is a statement about interval length alone. Neither one is the same as the process itself being stationary.
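A simulation sketch can make the distinction between stationary increments and a stationary process visible. The setup is an assumption chosen for illustration: a walk whose steps are i.i.d. Normal(DRIFT, 1), with `DRIFT`, `drifting_walk`, and the window choices all hypothetical.

```python
import random

DRIFT = 0.5  # hypothetical upward drift per step

def drifting_walk(rng, n_steps):
    """Path X_0..X_n with i.i.d. steps ~ Normal(DRIFT, 1), starting at X_0 = 0."""
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(DRIFT, 1.0))
    return path

def mean(values):
    values = list(values)
    return sum(values) / len(values)

rng = random.Random(0)
paths = [drifting_walk(rng, 50) for _ in range(4000)]

# Levels: the mean of X_t grows with t, so the process itself is not stationary.
mean_level_10 = mean(p[10] for p in paths)
mean_level_40 = mean(p[40] for p in paths)

# Increments over two disjoint windows of equal length 5: same law, so same mean.
mean_incr_early = mean(p[15] - p[10] for p in paths)
mean_incr_late = mean(p[45] - p[40] for p in paths)

# Independent increments: the sample covariance of the two changes is near 0.
cov_incr = mean((p[15] - p[10] - mean_incr_early) * (p[45] - p[40] - mean_incr_late)
                for p in paths)

print(mean_level_10, mean_level_40)     # levels drift apart
print(mean_incr_early, mean_incr_late)  # increments agree up to sampling noise
print(cov_incr)                         # close to zero
```

Matching means and a near-zero covariance are only symptoms of the two clauses, not proofs of them. A full check would compare whole distributions.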

These two clauses are the engine room of the rest of this rung. Add them to a process that takes integer steps of plus or minus one each tick and you get the simple random walk, the star of the next guide: each step is an independent, identically distributed jump, so the position after n steps is just a running sum of independent moves. Add the same clauses to a counting process and you get the Poisson process; add them with normally distributed increments and a continuity requirement and you get Brownian motion, which arrives much later. The mathematics is uniform because the structure is the same: independent stationary increments turn a process into a tidy accumulation of fresh, interchangeable surprises.
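The "running sum of independent moves" reading of the simple random walk fits in a few lines. This is a minimal sketch, assuming fair steps and a start at 0; the function name `simple_random_walk` and its `seed` parameter are hypothetical.

```python
import random
from itertools import accumulate

def simple_random_walk(n_steps, seed=None):
    """Positions S_0..S_n of a walk with i.i.d. fair steps of -1 or +1, S_0 = 0."""
    rng = random.Random(seed)
    steps = [rng.choice((-1, 1)) for _ in range(n_steps)]  # the fresh surprises
    return [0] + list(accumulate(steps))                   # their running sum

walk = simple_random_walk(20, seed=1)
print(walk)
```

The steps are the independent, identically distributed increments, and `accumulate` does nothing more than add them up.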

Reading a process well

Put the pieces together and you have a working toolkit for any process you meet. First locate the sample path: it is one realization, one history, not the law. Then remember the law itself is captured by the finite-dimensional distributions, which must satisfy the consistency condition to be legitimate. Then check the two big structural questions, which are independent of each other. Is it stationary — do the level statistics, summarized by the mean and the autocovariance function, refuse to depend on absolute time? And does it have independent and/or stationary increments — are its changes a clean accumulation of interchangeable, non-interfering pieces?

1. Identify the index set and state space, then sketch a single sample path so you remember you are looking at one realization, not the whole law.
2. Write the law as finite-dimensional distributions over chosen time-snapshots, and confirm they marginalize consistently (the consistency condition).
3. Test stationarity: is E[X_t] constant, and does Cov(X_t, X_{t+h}) depend only on the lag h? If yes, summarize the second-order behavior by the autocovariance function gamma(h).
4. Test the increments separately: are changes on disjoint intervals independent, and does the law of X_t - X_s depend only on t - s? Independent stationary increments point you toward random walks and Poisson-type processes.

Carry one honest caution forward. Real data hands you a single sample path, yet the questions above are about the whole ensemble of paths, that is, about the law. We can only assess stationarity or estimate the autocovariance from one path when the process is well-behaved enough that time-averages along the path stand in for ensemble-averages across paths. That is the ergodic idea, and it is an extra assumption, not a free gift. Keep the path and the law distinct in your mind, and the next guide's random walk will read as exactly what it is: independent stationary increments, summed.
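To see why ergodicity is a genuine extra assumption, here is a sketch contrasting two stationary toy processes, both with ensemble mean 0. Both are assumptions chosen for illustration: process A is i.i.d. Normal(0, 1) noise, and process B draws one level Z from Normal(0, 1) and holds it forever.

```python
import random

def time_average(path):
    """Average of the values along one sample path."""
    return sum(path) / len(path)

rng = random.Random(42)
N = 20000

# Process A: i.i.d. Normal(0, 1) noise. The time average along one path
# settles near the ensemble mean 0.
path_a = [rng.gauss(0.0, 1.0) for _ in range(N)]

# Process B: draw one level Z ~ Normal(0, 1), then hold X_t = Z for all t.
# Stationary with ensemble mean 0, but one path only ever reveals its own Z.
z = rng.gauss(0.0, 1.0)
path_b = [z] * N

print(time_average(path_a))  # near 0, the ensemble mean
print(time_average(path_b))  # equals z, whatever z happened to be
```

No amount of extra data along path B moves its time average toward 0. For that process a single history simply does not carry enough information about the law.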