What Is a Stochastic Process?

From one gamble to a whole story

Everything in the earlier rungs revolved around a single random variable X, or a fixed bundle of a few of them. X was a snapshot: one number drawn by chance, with a distribution describing the spread of possible values. But the world rarely hands you a single snapshot. A stock price wanders all day long; a queue grows and shrinks minute by minute; a gambler's purse rises and falls bet after bet. To model these we need not one random number but a whole sequence — indeed a whole *family* — of random variables, indexed by time. That family is a stochastic process.

Formally, a stochastic process is a collection of random variables { X_t } — one variable X_t for each value of an index t, all defined on the same underlying probability space. The index t almost always plays the role of time, so we read X_t as "the state of the system at time t". The whole point is that these X_t are *not* assumed independent: how the process behaves at later times is tangled up with what it did earlier, and capturing that dependence-through-time is exactly what makes a process richer than a mere list of separate random variables.

Three things every process needs: index, state, and path

To pin a process down you must say three things. First, the index set: the set of times t you allow. The index set can be discrete — t = 0, 1, 2, 3, ... for bets or daily closing prices — giving a *discrete-time* process; or it can be a continuous interval like t >= 0 for a price ticking in real time, giving a *continuous-time* process. Second, the state space: the set of values each X_t may take. The state space might be a finite set (sunny / rainy), the integers (your purse in whole dollars), or all the real numbers (a temperature). The pair (discrete-or-continuous time, discrete-or-continuous state) already sorts processes into broad families before you write a single formula.

Third — and this is the picture that makes processes click — comes the sample path. Fix one outcome omega, that is, imagine the entire experiment has been run once to the end. Then for that single omega every X_t(omega) becomes an actual number, and as t sweeps through the index set you trace out a whole curve. That curve is a sample path (also called a realization or a trajectory): one possible complete history of the process. Where a random variable hands you one number per outcome, a stochastic process hands you an entire *function of time* per outcome.

A tiny concrete process you can hold in your hand

Make it real with the smallest interesting example. Toss a fair coin once per second forever. Let S_0 = 0, and at each step add +1 for heads, -1 for tails, so S_n is your running total after n tosses. The index set is n = 0, 1, 2, ...; the state space is the integers; and the X_t of the general notation is here S_n. One sample path might read 0, +1, 0, -1, 0, +1, +2, ..., a single zig-zag history determined once the infinite sequence of coin tosses (the outcome omega) is fixed. This is the process we will name and study next: the simple random walk.
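To see a sample path as an actual list of numbers, here is a minimal Python sketch. The function names `path_from_tosses` and `sample_path` are made up for this illustration, and fixing the random seed is only a stand-in for "fixing omega": a seeded generator gives the same finite stretch of tosses every time, not a genuinely infinite sequence.

```python
import random

def path_from_tosses(tosses):
    """Turn an explicit toss string such as "HTTHHH" into the path S_0, S_1, ..."""
    path = [0]  # S_0 = 0
    for toss in tosses:
        path.append(path[-1] + (1 if toss == "H" else -1))
    return path

def sample_path(n_steps, seed=None):
    """Draw one sample path of length n_steps; the seed stands in for omega."""
    rng = random.Random(seed)
    tosses = "".join("H" if rng.random() < 0.5 else "T" for _ in range(n_steps))
    return path_from_tosses(tosses)

# The toss sequence H, T, T, H, H, H reproduces the example path in the text.
print(path_from_tosses("HTTHHH"))  # [0, 1, 0, -1, 0, 1, 2]

# Same seed, same omega, same path; a different seed gives another history.
print(sample_path(10, seed=1) == sample_path(10, seed=1))  # True
```

Each call with a new seed returns a different zig-zag, which is the sense in which the process hands you one whole function of time per outcome.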

Now look at the walk in two ways. Fix the time n = 2 and let the coin tosses vary: S_2 is a plain random variable taking the value +2 with probability 1/4 (two heads), 0 with probability 1/2 (one of each), and -2 with probability 1/4 (two tails). That is the vertical slice: a distribution at a fixed time. Fix instead one omega, say the toss sequence H, T, H, H, T, ...: now the path is the fully determined zig-zag 0, +1, 0, +1, +2, +1, .... That is the horizontal slice: a single sample path. Same process, two faces.

time  n :   0    1    2    3    4    5   ...
toss     :        H    T    H    H    T  ...
path S_n :   0   +1    0   +1   +2   +1  ...

VERTICAL slice  (fix n=2, vary the coins):
   S_2 = +2  with prob 1/4
   S_2 =  0  with prob 1/2
   S_2 = -2  with prob 1/4

HORIZONTAL slice (fix this omega, vary n):
   the one zig-zag path  0, +1, 0, +1, +2, +1, ...

One coin-flip walk, seen as a vertical slice (a distribution at a fixed time) and a horizontal slice (a single sample path).
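The vertical slice can be computed exactly rather than taken on trust. The sketch below enumerates all 2^n equally likely toss sequences and tallies the value of S_n; the helper name `law_of_S` is an illustrative choice, and brute-force enumeration is only sensible for small n.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def law_of_S(n):
    """Exact distribution of S_n, found by listing every toss sequence of length n."""
    # Each sequence of +1/-1 steps is equally likely, with probability 1 / 2**n.
    counts = Counter(sum(steps) for steps in product((1, -1), repeat=n))
    return {value: Fraction(c, 2 ** n) for value, c in sorted(counts.items())}

print(law_of_S(2))
# {-2: Fraction(1, 4), 0: Fraction(1, 2), 2: Fraction(1, 4)}
```

Calling `law_of_S` with other values of n gives the vertical slice at other times, one distribution per fixed time.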

How a process is fully described: finite-dimensional distributions

A process has infinitely many random variables in it, so how could we ever write down its full law? The answer is the central organizing idea of the subject. We never need the whole infinite object at once; it is enough to know, for every finite list of times t_1, t_2, ..., t_k, the joint distribution of (X_{t_1}, X_{t_2}, ..., X_{t_k}). These joint laws are the finite-dimensional distributions, and together they determine the probability law of the process. Knowing all the finite-dimensional distributions is, for almost every purpose, knowing the process; the main exceptions are fine path properties in continuous time, such as continuity of the paths, which need extra care.

These finite-dimensional distributions cannot be chosen arbitrarily; they must agree where they overlap. If you know the joint law of (X_1, X_2, X_3) and then ask only about (X_1, X_3), you must get the same answer by marginalizing out X_2 as you would by computing the (X_1, X_3) law directly. That overlap-agreement requirement is the consistency condition. Kolmogorov's extension theorem then makes the beautiful promise: for state spaces like the ones above (finite sets, the integers, the real numbers), any family of finite-dimensional distributions that is consistent really does come from a genuine stochastic process living on a single probability space. So you may build a process simply by specifying its finite-window behavior, as long as you stay consistent.
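The consistency condition can be checked by hand on the coin-flip walk. This sketch computes the joint law of (S_1, S_2, S_3) by enumeration, sums out S_2, and compares the result with the law of (S_1, S_3) computed directly. The helper names `joint_law` and `marginalize` are hypothetical, chosen for this example only.

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate, product

def joint_law(times):
    """Exact joint law of (S_t for t in times) for the fair coin-flip walk."""
    horizon = max(times)
    counts = Counter()
    for steps in product((1, -1), repeat=horizon):
        path = [0, *accumulate(steps)]  # path[t] is S_t
        counts[tuple(path[t] for t in times)] += 1
    return {values: Fraction(c, 2 ** horizon) for values, c in counts.items()}

def marginalize(law, keep):
    """Sum a joint law over every coordinate whose position is not in `keep`."""
    out = Counter()
    for values, p in law.items():
        out[tuple(values[i] for i in keep)] += p
    return dict(out)

law_123 = joint_law([1, 2, 3])
law_13_direct = joint_law([1, 3])
law_13_marginal = marginalize(law_123, keep=[0, 2])  # drop the S_2 coordinate

print(law_13_marginal == law_13_direct)  # True: the two routes agree
```

Here the agreement is automatic, because both laws come from one underlying process. The condition has real force in the other direction, when you start from a proposed family of finite-dimensional laws and ask whether any process could produce them.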

Three structural patterns to listen for

Processes become tractable when their finite-dimensional distributions have extra structure. Three patterns, all studied across this rung, are worth naming now. The first is stationarity: the process looks statistically the same no matter when you start watching. Strictly, stationarity means the joint law of (X_{t_1}, ..., X_{t_k}) is unchanged if you shift every time by the same amount h, giving (X_{t_1+h}, ..., X_{t_k+h}). A weaker, very common cousin asks only that the mean stay constant and that the autocovariance Cov(X_s, X_t) depend only on the gap t - s, not on s and t separately — the heart of time-series modeling.
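It helps to test the weaker definition on the coin-flip walk, which turns out not to pass. The sketch below computes Cov(S_s, S_t) exactly by enumeration; the function name `autocov` is an illustrative choice. The standard fact it illustrates, Cov(S_s, S_t) = min(s, t), is not derived in this guide and is offered here only as a check you can run.

```python
from fractions import Fraction
from itertools import accumulate, product

def autocov(s, t):
    """Exact Cov(S_s, S_t) for the fair coin-flip walk, by enumerating all paths."""
    horizon = max(s, t)
    weight = Fraction(1, 2 ** horizon)  # probability of each toss sequence
    e_s = e_t = e_st = Fraction(0)
    for steps in product((1, -1), repeat=horizon):
        path = [0, *accumulate(steps)]  # path[n] is S_n
        e_s += weight * path[s]
        e_t += weight * path[t]
        e_st += weight * path[s] * path[t]
    return e_st - e_s * e_t

# Both pairs of times are 2 steps apart, yet the covariances differ.
print(autocov(1, 3), autocov(4, 6))  # 1 4
```

The mean of S_n is 0 at every time, so the first requirement holds, but the autocovariance depends on where the pair of times sits and not only on the gap between them. The walk is therefore not stationary, even in the weak sense.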

The second pattern concerns *increments*, the jumps X_t - X_s between two times. A process has independent and stationary increments when disjoint stretches of time produce independent jumps, and when the law of a jump depends only on the length of the time interval, not on where it sits. Our coin-flip walk has exactly this flavor: the change over steps 5 to 8 is independent of the change over steps 1 to 3, and the law of each change depends only on how many steps elapsed. This single assumption is the seed of the most important processes you will meet: the random walk, the Poisson process, and Brownian motion all grow from it.
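Both halves of that claim about the walk can be verified exactly for the two stretches mentioned. This is a minimal sketch: it lists all 2^8 paths up to time 8, and the names `PATHS` and `law` are assumptions of the example rather than standard notation.

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate, product

HORIZON = 8
# Every equally likely path (S_0, ..., S_8) of the fair coin-flip walk.
PATHS = [[0, *accumulate(steps)] for steps in product((1, -1), repeat=HORIZON)]

def law(f):
    """Exact law of the quantity f(path) over all paths up to HORIZON."""
    counts = Counter(f(p) for p in PATHS)
    return {value: Fraction(c, len(PATHS)) for value, c in counts.items()}

early = law(lambda p: p[3] - p[1])                  # change over steps 1 to 3
late = law(lambda p: p[8] - p[5])                   # change over steps 5 to 8
joint = law(lambda p: (p[3] - p[1], p[8] - p[5]))   # the two changes together

# Independent increments: the joint law is the product of the two marginals.
independent = all(joint.get((a, b), 0) == early[a] * late[b]
                  for a in early for b in late)

# Stationary increments: a 3-step change has the same law wherever it sits.
stationary = late == law(lambda p: p[3] - p[0])

print(independent, stationary)  # True True
```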

The third and most far-reaching pattern is the Markov property: a kind of forgetting. A process is Markov when, given its present value, its future is conditionally independent of its entire past — the present state summarizes everything you need to predict what comes next. The Markov property does not say the future is independent of the past; it says the past influences the future *only through* the present. Our walk obeys it: once you are told you are standing at +3 right now, the path that brought you there tells you nothing extra about your next step. This is the engine behind the chains and processes of the coming rungs, and it deserves — and will get — a guide of its own.
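The forgetting can be checked by brute force on the walk. The sketch below groups paths by their entire history up to time N = 5 (an arbitrary choice for the illustration) and computes the conditional probability that the next step goes up. For the walk this is almost immediate, because the next toss is independent of all earlier ones, but the same bookkeeping shows what the Markov property asserts in general.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import accumulate, product

N = 5  # condition on the history up to time N, then look at step N + 1

up = defaultdict(int)     # paths with this history whose next step is +1
total = defaultdict(int)  # all paths with this history
for steps in product((1, -1), repeat=N + 1):
    path = (0, *accumulate(steps))  # path[n] is S_n
    history = path[: N + 1]         # (S_0, ..., S_N)
    total[history] += 1
    up[history] += path[N + 1] == history[-1] + 1

# P(S_{N+1} = S_N + 1 | whole history), one entry per distinct history.
cond_up = {h: Fraction(up[h], total[h]) for h in total}

# Histories that end at +3 took different routes but share one forecast.
at_three = {h: p for h, p in cond_up.items() if h[-1] == 3}
print(len(at_three), set(at_three.values()))  # 5 {Fraction(1, 2)}
```

Five different routes lead to +3 at time 5, and each gives the same probability of stepping up next, so knowing the route adds nothing once the present position is known.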