Reading the cdf backwards
The previous guide built the cumulative distribution function F(x) = P(X <= x): feed it a value, and it returns the probability of landing at or below that value. It runs left to right — value in, probability out. This final guide asks the natural reverse question. Suppose you want the value below which 90% of the probability sits. Now you are handing in a probability and asking for a value. Reading the cdf in this backwards direction is the whole idea of a quantile, and it turns out to be just as useful as the forward reading.
Concretely, the p-th quantile is the value q such that P(X <= q) = p: the point that cuts off probability p to its left. For a continuous, strictly increasing cdf this is exactly the inverse function: q = F^(-1)(p). Picture the S-shaped graph of F. The forward reading goes up from x on the x-axis to the curve and across to the y-axis; the quantile reading goes the other way, in from p on the y-axis to the curve and down to the x-axis. Same curve, read in two directions. The quantile function Q(p) = F^(-1)(p) is the official name for that across-and-down reading.
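To make the backwards reading concrete, here is a minimal sketch that inverts a cdf numerically by bisection. The standard logistic cdf is an assumption chosen only because it is smooth and strictly increasing and has a known closed-form inverse to compare against; the function names and the bracket [-50, 50] are illustrative, not part of the guide.

```python
import math

def logistic_cdf(x: float) -> float:
    """Cdf of the standard logistic distribution: smooth and strictly increasing."""
    return 1.0 / (1.0 + math.exp(-x))

def quantile_by_bisection(cdf, p, lo=-50.0, hi=50.0, tol=1e-10):
    """Find q with cdf(q) = p by repeatedly halving the bracket [lo, hi].

    Assumes cdf is continuous and strictly increasing and that
    cdf(lo) < p < cdf(hi).
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(mid) < p:
            lo = mid   # not enough probability yet: move right
        else:
            hi = mid   # reached p or overshot: move left
    return (lo + hi) / 2.0

q90 = quantile_by_bisection(logistic_cdf, 0.9)
print(q90)                   # value with 90% of the probability below it
print(logistic_cdf(q90))     # forward reading recovers p
print(math.log(0.9 / 0.1))   # closed-form inverse ln(p / (1 - p)), for comparison
```

Bisection only needs the forward reading F(x), which is why it is a common fallback when no formula for the inverse is available.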
The median and the percentiles
The most famous quantile has its own name: the median is the 0.5-quantile, Q(0.5), the value with half the probability on each side. It is the great rival of the mean E[X] you met earlier as a measure of the centre. The two often disagree, and the reason is one of the honest cautions of this subject: the median depends only on the ordering of the values, not on how far out the extreme ones lie, while the mean is pulled toward them. Skew a distribution with one enormous value and the mean lurches toward it while the median barely moves, which is exactly the situation flagged in "when the mean misleads".
A tiny example makes it vivid. Take five salaries: 30, 32, 35, 38, and 800 (thousands). The mean is (30+32+35+38+800)/5 = 187, a figure nobody in the list actually earns — dragged up by the single 800. The median is the middle value, 35, which honestly represents a typical person. This is exactly why house prices and incomes are almost always reported by median, not mean: the median is robust, shrugging off a handful of extreme outliers that would wreck the average.
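The salary arithmetic can be checked in a few lines with Python's standard statistics module. The second list, with the outlier inflated to 8000, is a hypothetical variation added here to show how differently the two summaries react.

```python
import statistics

salaries = [30, 32, 35, 38, 800]  # thousands, from the example above

mean_salary = statistics.mean(salaries)      # 187
median_salary = statistics.median(salaries)  # 35
print(mean_salary, median_salary)

# Hypothetical variation: make the outlier ten times larger.
# The mean follows it; the median does not move.
salaries_extreme = [30, 32, 35, 38, 8000]
print(statistics.mean(salaries_extreme), statistics.median(salaries_extreme))
```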
Other percentiles generalize the same cut. The 90th percentile is Q(0.9), the value 90% of the probability falls below; quartiles split the distribution into quarters at Q(0.25), Q(0.5), Q(0.75). The gap between the first and third quartiles, the interquartile range Q(0.75) - Q(0.25), is a robust measure of spread — the median's natural partner, just as standard deviation partners the mean. When a child is at the 60th percentile for height, that is a quantile statement: 60% of children that age are shorter.
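As a small illustration, the quartiles, interquartile range, and 90th percentile can be read off a quantile function directly. The sketch below assumes a standard normal distribution purely as a convenient example, and uses NormalDist.inv_cdf from Python's statistics module as its quantile function.

```python
from statistics import NormalDist

# Standard normal, chosen only as a convenient example distribution.
dist = NormalDist(mu=0.0, sigma=1.0)

q1 = dist.inv_cdf(0.25)   # first quartile, Q(0.25)
q2 = dist.inv_cdf(0.50)   # median, Q(0.5)
q3 = dist.inv_cdf(0.75)   # third quartile, Q(0.75)
iqr = q3 - q1             # interquartile range

p90 = dist.inv_cdf(0.90)  # 90th percentile, Q(0.9)
print(q1, q2, q3, iqr, p90)
```

For the standard normal the interquartile range comes out near 1.35, noticeably wider than one standard deviation, a reminder that the two measures of spread are on different scales.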
When the inverse misbehaves
Saying q = F^(-1)(p) is clean only when F is strictly increasing and continuous. From the previous guide you know F need not be either. A cdf for a discrete variable climbs in jumps and stays flat between them, and even a continuous one can have flat stretches where the variable places no probability. Both cases break the simple inverse, so we need a definition of the quantile that always works — never crashing, never ambiguous.
The standard fix is the generalized inverse: Q(p) = the smallest value x for which F(x) >= p. In words, slide rightward until the accumulated probability first reaches p, and stop. On a flat stretch, where many x-values share the same F, this rule picks the leftmost one, removing the ambiguity. At a jump, where F leaps over the level p with no x exactly hitting it, the rule picks the x at which the jump happens: F is right-continuous, so F(x) already sits at the top of the jump there and is the first value to reach p. The quantile is therefore always defined. This is the careful version of the quantile function that statisticians actually use.
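For a discrete variable, the slide-rightward rule can be sketched as a running sum over the possible values. The function name and the choice of a fair six-sided die as test case are illustrative assumptions; exact fractions are used so that comparisons such as F(3) >= 0.5 are not disturbed by floating-point rounding.

```python
from fractions import Fraction

def generalized_inverse(values, probs, p):
    """Return the smallest x in `values` whose cumulative probability is >= p.

    `values` must be sorted in increasing order, `probs` are the matching
    point masses, and 0 < p < 1.
    """
    cumulative = Fraction(0)
    for x, mass in zip(values, probs):
        cumulative += mass          # F(x) after the jump at x
        if cumulative >= p:
            return x                # first x at which F reaches p
    raise ValueError("probabilities must sum to at least p")

faces = [1, 2, 3, 4, 5, 6]
masses = [Fraction(1, 6)] * 6

print(generalized_inverse(faces, masses, Fraction(1, 2)))   # 3
print(generalized_inverse(faces, masses, Fraction(9, 10)))  # 6
```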
Generalized inverse (works for every cdf):
    Q(p) = min { x : F(x) >= p }   for 0 < p < 1
Die example, X = roll of a fair 6-sided die:
    F jumps by 1/6 at 1, 2, 3, 4, 5, 6
    Q(0.5) = smallest x with F(x) >= 0.5
           = 3   (since F(3) = 3/6 = 0.5)
    Q(0.9) = smallest x with F(x) >= 0.9
           = 6   (since F(5) = 5/6 ~ 0.833 < 0.9, F(6) = 1)
    Median is not unique here: any value in [3,4] splits 50/50,
    but the rule pins it to 3 by taking the leftmost.
The survival function: flipping the cdf over
The cdf answers "how much probability is at or below x?" Sometimes you want the opposite tail — "how much is above x?" That is the survival function, S(x) = P(X > x) = 1 - F(x). It is the cdf flipped over: where F starts at 0 and rises to 1, S starts at 1 and falls to 0. The name comes from its birthplace. If X is how long a machine, a patient, or a lightbulb lasts, then S(t) = P(X > t) is the probability it is still going strong past time t — the probability it survives. Reliability engineers call the very same object the reliability function.
Why bother, when S is just 1 - F? Because the tail is where the action is in lifetime problems, and S states it directly. "What fraction of components outlast their 5-year warranty?" is simply S(5). Quantiles read just as naturally off the survival curve: the median lifetime is the time t with S(t) = 0.5, the moment half the population has failed. And tail questions like P(X > a) read straight off as S(a), no subtraction needed in your head. The cdf and survival function carry identical information — they are two faces of one distribution — but each makes a different question effortless.
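A short sketch shows both questions answered from the survival function. It assumes, purely for illustration, an exponential lifetime with cdf F(t) = 1 - e^(-rate t) and a hypothetical failure rate of 0.1 per year; neither the model nor the rate is prescribed by the warranty example.

```python
import math

def exp_cdf(t: float, rate: float) -> float:
    """F(t) = P(X <= t) for an exponential lifetime with the given rate."""
    return 1.0 - math.exp(-rate * t) if t >= 0 else 0.0

def exp_survival(t: float, rate: float) -> float:
    """S(t) = P(X > t) = 1 - F(t)."""
    return 1.0 - exp_cdf(t, rate)

rate = 0.1  # hypothetical failure rate per year, chosen for illustration

print(exp_survival(5.0, rate))        # fraction outlasting a 5-year warranty
median_life = math.log(2.0) / rate    # solves S(t) = 0.5 for this model
print(median_life, exp_survival(median_life, rate))
```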
Putting the quantile function to work
The quantile function is not only for reporting — it is a genuine tool, and one beautiful use ties this whole rung together. Suppose you can generate uniform random numbers between 0 and 1 (every computer can) but you want samples from some other distribution with cdf F. Feed a uniform U into the quantile function: the value Q(U) = F^(-1)(U) has exactly the distribution F. This is the inverse-transform method, and it falls out of the probability integral transform from earlier in your studies.
- Pin down the target. Say you want samples from the exponential with rate lambda, whose cdf is F(x) = 1 - e^(-lambda x) for x >= 0.
- Invert the cdf by hand. Set p = 1 - e^(-lambda x) and solve for x: e^(-lambda x) = 1 - p, so x = -ln(1 - p) / lambda. That is the quantile function Q(p).
- Draw a uniform U on (0, 1) from your random-number generator.
- Return X = -ln(1 - U) / lambda. This X is a genuine exponential sample; repeat to get as many as you like.
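The four steps above can be sketched in a few lines. The rate of 2.0, the seed, and the sample size are hypothetical values chosen for illustration; the check at the end relies on the known fact that an exponential with rate lambda has mean 1/lambda.

```python
import math
import random

def sample_exponential(rate: float, rng: random.Random) -> float:
    """Draw one exponential sample via Q(p) = -ln(1 - p) / rate."""
    u = rng.random()                   # uniform on [0, 1)
    return -math.log(1.0 - u) / rate   # 1 - u lies in (0, 1], so the log is defined

rng = random.Random(0)   # fixed seed so the run is repeatable
rate = 2.0               # hypothetical rate, chosen for illustration
samples = [sample_exponential(rate, rng) for _ in range(100_000)]

sample_mean = sum(samples) / len(samples)
print(sample_mean)       # should land near 1 / rate
```

Python's random.random() returns values in [0, 1), so 1 - U is never zero and the logarithm is always defined, which is one reason to keep the 1 - U form rather than simplifying it to U.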
Why does this magic work? Intuitively, the quantile function stretches and squeezes the flat uniform until its accumulated probability matches F: regions where F rises steeply (high density) get more of the uniform's length mapped into them, so samples cluster there. That is the same backwards reading of the cdf you have used all guide, now driving a simulation. And it closes the loop on the rung's big themes — the random variable as a function, the cdf as its forward summary, and the quantile and survival function as the two ways of reading that summary in reverse.