The Cumulative Distribution Function

One function for every random variable

In the last three guides you met a random variable as a function on the sample space, then two ways to describe its distribution: the probability mass function for discrete values, and the density for continuous ones. Those two tools are powerful but parochial — each works only for its own kind of variable, and neither can speak about a variable that is part-discrete, part-continuous. The cumulative distribution function, or cdf, fixes this. It is defined for every real-valued random variable, full stop, by a single accumulating question.

The question is: how much probability has piled up at or below the level x? Formally, F(x) = P(X <= x). Read it as a running total. As you slide a threshold x from far left to far right along the number line, F(x) sweeps up all the probability the variable has accumulated so far. At the extreme left, before any of X's possible values, the total is 0; at the extreme right, after all of them, the total is 1. The cdf is that sweep, recorded as a function.

What every cdf must look like

Not every function can be a cdf. The defining properties follow directly from F being an accumulated probability, and they are worth knowing because together they completely characterize cdfs: any function with all of them is the cdf of some random variable, and any cdf has all of them. They also give you a fast sanity check on any formula someone hands you.

1.  Bounded:        0 <= F(x) <= 1   for every x
2.  Non-decreasing: if a <= b  then  F(a) <= F(b)
3.  Limits:         F(x) -> 0  as x -> -infinity
                    F(x) -> 1  as x -> +infinity
4.  Right-continuous: F(x) = lim of F(x + h) as h -> 0 from above

Useful consequence:
    P(a < X <= b) = F(b) - F(a)

The four defining properties, plus the interval formula they make possible.

Each property has a plain meaning. Non-decreasing is just "a running total can never go backward": you cannot un-accumulate probability. The limits 0 and 1 say that the total probability is exactly 1 and that none of it escapes to infinity in either direction; this is the continuity of probability applied to shrinking and growing half-lines. Right-continuity is the subtle one: because we used X <= x, the value of F at a jump point already includes the probability sitting exactly there, so F takes its higher value at the point itself, and approaching the point from the right gives that same value. Approaching from the left gives the lower value, which is why F is generally not left-continuous at a jump.
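
The four properties can be turned into a rough numerical sanity check. The sketch below is only that: it samples a candidate function on a finite grid, so passing is evidence rather than proof, and a jump that falls between grid points would go unnoticed. The names looks_like_cdf and exponential_cdf, the grid range, and the tolerances are all assumptions made for this illustration.

```python
import math

def exponential_cdf(x, rate=1.0):
    """Cdf of an exponential variable; the rate is an illustrative assumption."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def looks_like_cdf(F, lo=-50.0, hi=50.0, steps=2001, tol=1e-6):
    """Grid-based check of the four cdf properties (evidence, not proof)."""
    xs = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    values = [F(x) for x in xs]
    # 1. Bounded between 0 and 1.
    bounded = all(-tol <= v <= 1.0 + tol for v in values)
    # 2. Non-decreasing from each grid point to the next.
    non_decreasing = all(a <= b + tol for a, b in zip(values, values[1:]))
    # 3. Close to 0 at the far left of the grid and close to 1 at the far right.
    limits = abs(values[0]) <= tol and abs(values[-1] - 1.0) <= tol
    # 4. Right-continuous: a tiny step to the right must not change the value.
    right_continuous = all(abs(F(x + 1e-9) - F(x)) <= tol for x in xs)
    return bounded and non_decreasing and limits and right_continuous

print(looks_like_cdf(exponential_cdf))                   # True
print(looks_like_cdf(lambda x: 1.0 if x > 0 else 0.0))   # False: jump at 0 uses <, not <=
print(looks_like_cdf(math.sin))                          # False: not bounded in [0, 1]
```

The second candidate fails only the right-continuity test: it is what you would get from accumulating P(X < x) instead of P(X <= x) for a variable that always equals 0.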

That single interval formula, P(a < X <= b) = F(b) - F(a), is the cdf's everyday workhorse. Want the chance X lands in a window? Subtract the running total at the bottom of the window from the running total at the top. You never need the pmf or pdf to answer an interval question — the cdf alone does it, which is part of why the cdf is the most universal description of a distribution.
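
As a small illustration of the interval formula, the sketch below uses the standard normal cdf, written with math.erf from the Python standard library. The choice of the normal distribution and the helper names normal_cdf and interval_probability are assumptions for this example; any cdf could be passed in.

```python
import math

def normal_cdf(x, mean=0.0, sd=1.0):
    """Cdf of a normal variable, expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def interval_probability(F, a, b):
    """P(a < X <= b) = F(b) - F(a) for any cdf F and any a <= b."""
    if a > b:
        raise ValueError("need a <= b")
    return F(b) - F(a)

# Chance that a standard normal variable lands within one unit of 0.
p = interval_probability(normal_cdf, -1.0, 1.0)
print(round(p, 4))  # 0.6827
```

No density appears anywhere in the calculation: two evaluations of the running total and one subtraction are enough.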

How the cdf carries the pmf and the pdf inside it

The cdf does not throw away the information in the pmf or pdf; it stores it in its shape, and you can recover either one. For a discrete variable the cdf is a staircase: flat on stretches where X has no mass, then jumping up at each value X can take. And here is the lovely part: the height of the jump at a point equals the probability mass there. So the pmf is literally the size of the cdf's jumps: P(X = x) = F(x) - F(x-), where F(x-) is the limit of F from the left at x, which equals P(X < x).
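
The jump rule can be tried on a concrete staircase. The sketch below assumes a fair six-sided die as the example and approximates the left limit F(x-) by evaluating F a small step eps to the left of x, which is exact for a staircase whose jumps are more than eps apart. The names die_cdf and jump are hypothetical helpers, not part of any library.

```python
import math
from fractions import Fraction

def die_cdf(x):
    """Cdf of a fair six-sided die: floor(x)/6, clipped to the range [0, 1]."""
    k = min(max(math.floor(x), 0), 6)
    return Fraction(k, 6)

def jump(F, x, eps=Fraction(1, 1000)):
    """Height of the jump of F at x, using F(x - eps) in place of F(x-)."""
    return F(x) - F(x - eps)

# Recover the pmf by measuring the jump at each face value.
pmf = {face: jump(die_cdf, face) for face in range(1, 7)}
print(pmf[3])                           # 1/6
print(jump(die_cdf, Fraction(5, 2)))    # 0: no mass at 2.5, so no jump
print(sum(pmf.values()))                # 1
```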

For a continuous variable the cdf has no jumps at all — it rises smoothly. Where the density is tall, the cdf climbs steeply; where the density is near zero, the cdf is nearly flat. That is no accident: the cdf is the running integral of the density, F(x) = integral of f(t) dt from -infinity to x, so its slope is the density. Differentiate the cdf and you get the pdf back: f(x) = F'(x) wherever F is differentiable. The density is the rate at which probability accumulates, which is exactly the cdf's slope.
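
To see "slope of the cdf equals the density" numerically, the sketch below compares a central-difference slope of a cdf with the known density. The logistic distribution is an assumed example, chosen because its cdf 1 / (1 + e^(-x)) and density F(x) * (1 - F(x)) both have simple closed forms; the step size h is also an assumption.

```python
import math

def logistic_cdf(x):
    """Cdf of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-x))

def logistic_pdf(x):
    """Density of the standard logistic distribution, F(x) * (1 - F(x))."""
    F = logistic_cdf(x)
    return F * (1.0 - F)

def slope(F, x, h=1e-5):
    """Central-difference estimate of F'(x)."""
    return (F(x + h) - F(x - h)) / (2.0 * h)

# The numerical slope of the cdf should agree closely with the density.
for x in (-2.0, 0.0, 1.5):
    print(x, slope(logistic_cdf, x), logistic_pdf(x))
```

The two printed columns agree to many decimal places, and both are largest near x = 0, where the cdf climbs most steeply.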

Points, jumps, and the part nobody warns you about

Here is a place beginners trip. For a continuous variable, P(X = c) = 0 for any single point c — the cdf has no jump there, and F(c) - F(c-) = 0. This is the same fact you met for densities: a single point holds no probability, only intervals do. A consequence that feels strange but is correct: for a continuous X it does not matter whether endpoints are included, so P(a < X < b) = P(a <= X <= b) = F(b) - F(a). The four versions with and without equals signs all agree, because the boundary points contribute nothing.

For a discrete variable the opposite holds: endpoints matter enormously, because the mass at a point is a real jump. There, P(X = c) > 0 at every value c the variable can take, and you must be careful whether such a point is included. This is exactly why the interval formula is stated as P(a < X <= b) = F(b) - F(a): with the <= convention, F(b) includes the mass at b and F(a) includes the mass at a, so subtracting F(a) removes the mass at a while keeping the mass at b, which leaves the half-open interval (a, b]. Change the inclusion of an endpoint for a discrete variable and the answer can change by a whole lump of probability.
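
The effect of endpoint inclusion is easy to see with exact arithmetic. The sketch below assumes a fair six-sided die and computes all four interval variants between 2 and 4, using F(x) = P(X <= x) for an included right endpoint or excluded left endpoint, and the left limit F(x-) = P(X < x) otherwise. The names F and F_left are illustrative.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def F(x):
    """Cdf: P(X <= x), which includes any mass at x."""
    return sum((p for v, p in pmf.items() if v <= x), Fraction(0))

def F_left(x):
    """Left limit of the cdf: P(X < x), which excludes any mass at x."""
    return sum((p for v, p in pmf.items() if v < x), Fraction(0))

a, b = 2, 4
print(F(b) - F(a))            # P(a <  X <= b) = 1/3
print(F(b) - F_left(a))       # P(a <= X <= b) = 1/2
print(F_left(b) - F(a))       # P(a <  X <  b) = 1/6
print(F_left(b) - F_left(a))  # P(a <= X <  b) = 1/3
```

Each included endpoint adds back one lump of 1/6, so the four answers range from 1/6 to 1/2.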

And now the unifying payoff. A mixed distribution is one that is continuous in some places and has point mass in others — and its cdf simply has both behaviors at once: smooth rising stretches plus a few isolated jumps. Think of a rainfall variable that is 0 on dry days (a real lump of probability at exactly 0) but takes a continuous spread of positive values on wet days. No pmf describes it, no pdf describes it, but a single cdf does — climbing smoothly above 0 yet jumping at 0. That generality is the cdf's quiet superpower.
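
A mixed cdf of this kind can be written down directly. The sketch below models the rainfall example under made-up numbers: a 0.7 chance of a dry day and exponentially distributed rainfall with mean 5 mm on wet days. Both parameters, and the choice of an exponential shape, are assumptions for illustration only.

```python
import math

P_DRY = 0.7         # hypothetical probability of a dry day (point mass at 0)
MEAN_WET_MM = 5.0   # hypothetical mean rainfall on wet days, in millimetres

def rainfall_cdf(x):
    """Cdf of daily rainfall: a jump of P_DRY at 0, then a smooth climb to 1."""
    if x < 0:
        return 0.0
    wet_part = 1.0 - math.exp(-x / MEAN_WET_MM)   # exponential cdf for wet days
    return P_DRY + (1.0 - P_DRY) * wet_part

# The jump at 0 is the probability of exactly zero rainfall.
jump_at_zero = rainfall_cdf(0.0) - rainfall_cdf(-1e-12)
print(jump_at_zero)                            # 0.7
# The interval formula works unchanged on the smooth part.
print(rainfall_cdf(5.0) - rainfall_cdf(0.0))   # P(0 < X <= 5), about 0.19
```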

Reading a cdf, and a peek at what it unlocks

Let us read a tiny discrete cdf to make all this concrete. Suppose X is the number of heads in two fair coin tosses, so X takes 0, 1, 2 with masses 1/4, 1/2, 1/4. Build its cdf by accumulating left to right, then read probabilities straight off the staircase.

Below 0 nothing has accumulated: F(x) = 0 for x < 0.
At 0 a mass of 1/4 lands, so the cdf jumps to F(0) = 1/4 and stays there up to (but not including) 1.
At 1 add the mass 1/2: F(1) = 1/4 + 1/2 = 3/4, holding until just before 2.
At 2 add the last 1/4: F(2) = 3/4 + 1/4 = 1, and it stays 1 forever after.

Check a jump: P(X = 1) = F(1) - F(1-) = 3/4 - 1/4 = 1/2. Correct.
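
The same accumulation can be sketched in a few lines of code. The version below stores the running totals once and looks up F(x) with bisect_right, which counts how many possible values are at or below x. The variable names are illustrative.

```python
from bisect import bisect_right
from fractions import Fraction
from itertools import accumulate

# Number of heads in two fair coin tosses.
values = [0, 1, 2]
masses = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
totals = list(accumulate(masses))   # running totals: 1/4, 3/4, 1

def F(x):
    """Staircase cdf: the running total over all values at or below x."""
    count = bisect_right(values, x)   # how many possible values are <= x
    return totals[count - 1] if count > 0 else Fraction(0)

for x in (-1, 0, 0.5, 1, 1.99, 2, 10):
    print(x, F(x))

# The jump at 1 recovers the mass at 1.
print(F(1) - F(Fraction(999, 1000)))   # 1/2
```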

Because the cdf is the universal description, it is also the gateway to ideas waiting just ahead. Reading the staircase or curve backwards, asking "at which x does F first reach 0.5?", gives the quantile function, the inverse of the cdf (suitably generalized where F jumps or is flat), which delivers the median and percentiles you will meet in the next guide. And flipping the question to "how much is above x?" gives 1 - F(x) = P(X > x), the survival function. Both are nothing more than the cdf viewed from a new angle, which is precisely why this one function is worth mastering before the others.
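
Both views can be previewed on the two-coin example. The sketch below assumes one common convention for inverting a staircase: the quantile at level p is the smallest value x with F(x) >= p. That convention is an assumption here, needed because a staircase may never equal p exactly. The function names are illustrative.

```python
from bisect import bisect_left
from fractions import Fraction
from itertools import accumulate

# Number of heads in two fair coin tosses.
values = [0, 1, 2]
masses = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
totals = list(accumulate(masses))   # F at each value: 1/4, 3/4, 1

def cdf(x):
    """F(x) = P(X <= x)."""
    return sum((m for v, m in zip(values, masses) if v <= x), Fraction(0))

def quantile(p):
    """Smallest value x with F(x) >= p, for 0 < p <= 1 (assumed convention)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return values[bisect_left(totals, p)]

def survival(x):
    """P(X > x) = 1 - F(x)."""
    return 1 - cdf(x)

print(quantile(Fraction(1, 2)))   # 1: the median number of heads
print(quantile(Fraction(1, 4)))   # 0: F(0) already reaches 1/4
print(survival(1))                # 1/4: chance of more than one head
```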