Frequency & Severity: Splitting the Problem

Why not just model the total?

On the life side of this ladder you mostly modelled *whether* and *when* one event happened — death, survival to a date — and the payout was usually a fixed sum you had agreed in advance. Non-life insurance is a different animal. A car policy might generate zero claims this year, or one fender-bender, or a fender-bender plus a stolen laptop plus a total write-off. And the cost of any one claim is itself uncertain: a scratch is a few hundred dollars, a totalled car is tens of thousands. So the thing you ultimately care about — the aggregate loss, the total a policy or a portfolio costs over the year — is built from two separate kinds of randomness stacked on top of each other.

You could, in principle, try to model the aggregate directly: collect the total cost of every policy last year, and fit one distribution to that pile of numbers. The trouble is that this pile is a horror to describe. It has a fat spike at exactly zero (most policies never claim), then a smooth hump for policies with a single modest claim, then a long, thin, frightening tail for the rare policies that had a catastrophe or several claims at once. No tidy textbook distribution looks like that, and worse, the moment your business changes — you raise the deductible, you write a different mix of customers — the whole misshapen pile shifts and you must start over.

The fix is the single most important idea in this whole rung, and it is almost embarrassingly simple: stop asking "how much will this policy cost?" as one question. Ask two. How many claims will it have, and given a claim, how large will it be? This is the frequency–severity decomposition, and once you see it you cannot unsee it: it organises property and casualty (P&C) pricing, reserving, and risk theory all at once.

Two clean questions instead of one ugly one

Frequency is the count of claims in a period — a whole number: 0, 1, 2, 3, … Because it is a count of fairly rare, roughly independent events, it lives naturally in the family of discrete distributions you already met. The default workhorse is the Poisson, which has the lovely property that its mean and variance are equal; when real data show more spread than that — more years of zero and more years of three than Poisson allows — actuaries reach for the negative binomial, which adds exactly that extra wobble. The whole object is a claim-frequency distribution.
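To make the "extra wobble" concrete, here is a minimal sketch comparing a Poisson and a negative binomial that share the same mean. The mean of 0.20 claims and the shape parameter r = 0.5 are illustrative assumptions, not fitted values, and the helper names are hypothetical.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(N = k) for a Poisson count with mean lam (computed in log space)."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def neg_binomial_pmf(k: int, r: float, p: float) -> float:
    """P(N = k) for a negative binomial with mean r(1-p)/p and variance r(1-p)/p^2."""
    log_coeff = math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
    return math.exp(log_coeff + r * math.log(p) + k * math.log(1 - p))

def mean_and_variance(pmf, max_k: int = 100):
    """Mean and variance from a pmf, truncating the sum at max_k (the tail is negligible here)."""
    mean = sum(k * pmf(k) for k in range(max_k))
    second_moment = sum(k * k * pmf(k) for k in range(max_k))
    return mean, second_moment - mean ** 2

lam = 0.20          # assumed mean claims per policy per year
r = 0.5             # assumed shape; smaller r means more overdispersion
p = r / (r + lam)   # chosen so the negative binomial mean equals lam

pois_mean, pois_var = mean_and_variance(lambda k: poisson_pmf(k, lam))
nb_mean, nb_var = mean_and_variance(lambda k: neg_binomial_pmf(k, r, p))

print(f"Poisson: mean={pois_mean:.3f} var={pois_var:.3f} P(0)={poisson_pmf(0, lam):.3f}")
print(f"NegBin : mean={nb_mean:.3f} var={nb_var:.3f} P(0)={neg_binomial_pmf(0, r, p):.3f}")
```

With these assumed parameters both distributions have the same mean, but the negative binomial has a larger variance and puts more probability on both zero claims and three claims, which is the pattern described above.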

Severity is the size of one claim, *given that a claim happened* — a positive amount that can be anything from a token sum to a ruinous one. So it lives in the family of continuous distributions on the positive numbers. For moderate, well-behaved costs a lognormal or gamma fits nicely; for lines where the occasional claim is monstrous — liability, property catastrophe — you need a heavy-tailed shape such as the Pareto, whose tail decays so slowly that a single claim can dwarf the sum of all the others. This object is the claim-severity distribution. Crucially, frequency and severity are usually modelled as *independent*: how many claims you have does not tell you how big each will be. That independence is an assumption, not a law of nature — but it is a very useful and usually defensible one.
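A quick way to see what "heavy-tailed" means is to compare the chance of exceeding a large amount under two severity curves with the same mean. The sketch below assumes a mean of $4,000, a lognormal with sigma = 1, and a two-parameter (Lomax-type) Pareto with alpha = 2; all of these are illustrative choices, not recommendations.

```python
import math

def lognormal_sf(x: float, mu: float, sigma: float) -> float:
    """P(X > x) for a lognormal severity."""
    z = (math.log(x) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))

def pareto_sf(x: float, alpha: float, theta: float) -> float:
    """P(X > x) for a two-parameter (Lomax-type) Pareto with mean theta / (alpha - 1)."""
    return (theta / (x + theta)) ** alpha

mean_severity = 4_000.0                         # assumed common mean
sigma = 1.0                                     # assumed lognormal shape
mu = math.log(mean_severity) - sigma ** 2 / 2   # lognormal mean = exp(mu + sigma^2 / 2)
alpha = 2.0                                     # assumed Pareto tail index
theta = mean_severity * (alpha - 1)             # so the Pareto mean is also 4,000

for x in (10_000, 100_000, 1_000_000):
    print(f"{x:>9,}: lognormal {lognormal_sf(x, mu, sigma):.2e}  "
          f"Pareto {pareto_sf(x, alpha, theta):.2e}")
```

Under these assumptions the two curves give exceedance probabilities of the same order at moderate sizes, but the Pareto's probability falls away far more slowly as the threshold grows. That slow decay is what lets a single claim dominate a year.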

Stitching them back together

Splitting the problem only helps if you can reassemble it. The total cost of a policy is built like this: take the random number of claims N, draw that many independent severities X₁, X₂, … from the same severity distribution, and add them up. A sum of a *random number* of random amounts is called a compound distribution, and when the count N is Poisson it is the celebrated compound Poisson, the workhorse of the collective risk model. You will spend the next guides learning to compute its mean, its variance, and even its full shape; here the point is only that the two halves recombine into the one quantity that matters.
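The recipe above (draw a count, draw that many severities, add them up) can be sketched as a small simulation. Everything numeric here is an assumption for illustration: a Poisson count with mean 0.20, a lognormal severity with mean $4,000 and sigma = 1, and 100,000 simulated policy-years. The helper names are hypothetical.

```python
import math
import random

def draw_poisson(lam: float, rng: random.Random) -> int:
    """Draw a Poisson count by multiplying uniforms (Knuth's method; fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def draw_aggregate_loss(lam: float, mu: float, sigma: float, rng: random.Random) -> float:
    """One policy-year: a random number of claims, each with a random lognormal size."""
    n_claims = draw_poisson(lam, rng)
    return sum((rng.lognormvariate(mu, sigma) for _ in range(n_claims)), 0.0)

rng = random.Random(2024)                       # fixed seed so the run is repeatable
lam = 0.20                                      # assumed expected claims per policy-year
mean_severity, sigma = 4_000.0, 1.0             # assumed severity mean and shape
mu = math.log(mean_severity) - sigma ** 2 / 2   # lognormal mean = exp(mu + sigma^2 / 2)

losses = [draw_aggregate_loss(lam, mu, sigma, rng) for _ in range(100_000)]
share_zero = sum(1 for s in losses if s == 0) / len(losses)
sample_mean = sum(losses) / len(losses)

print(f"share of policy-years with no loss: {share_zero:.3f}")
print(f"average aggregate loss per policy-year: {sample_mean:,.0f}")
```

The simulated totals show the shape described earlier: most policy-years cost exactly zero, and the average is driven by a minority of years with claims.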

Aggregate loss  S = X₁ + X₂ + … + X_N   (N is itself random)

Expected frequency  E[N] = 0.20 claims/policy/year
Expected severity   E[X] = 4,000 dollars/claim

Pure premium  E[S] = E[N] × E[X]
                   = 0.20 × 4,000 = 800 dollars/policy/year

A toy auto book: one claim every five years on average, $4,000 a claim, so the expected pure cost is $800 per policy per year — before expenses, profit, or any safety margin.

Notice what the split bought us in that little calculation. The $800 came from estimating two things separately — a rate of about one claim every five years, and a typical claim of about $4,000 — each of which we can study with its own data and its own distribution. If next year regulators force a 10% rise in repair costs, only the severity number moves; the frequency stays put. If a new safety law cuts crashes by a fifth, only the frequency moves. We can update one half without disturbing the other, which is exactly the flexibility the ugly all-in-one model could never give us.
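The two "what if" scenarios in this paragraph can be worked through in a few lines. This is only the expected-value arithmetic from the toy book; the function name is hypothetical.

```python
def pure_premium(expected_frequency: float, expected_severity: float) -> float:
    """Expected cost per policy per year: E[N] times E[X]."""
    return expected_frequency * expected_severity

base_frequency = 0.20     # claims per policy per year
base_severity = 4_000.0   # dollars per claim

base = pure_premium(base_frequency, base_severity)

# Repair costs rise 10%: only the severity input changes.
dearer_repairs = pure_premium(base_frequency, base_severity * 1.10)

# Crashes fall by a fifth: only the frequency input changes.
fewer_crashes = pure_premium(base_frequency * (1 - 0.20), base_severity)

print(f"base:           {base:,.0f}")            # about 800
print(f"dearer repairs: {dearer_repairs:,.0f}")  # about 880
print(f"fewer crashes:  {fewer_crashes:,.0f}")   # about 640
```

Each scenario touches exactly one argument, which is the practical payoff of keeping the two halves separate.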

Per-loss versus per-payment: whose point of view?

Real policies almost never pay the whole claim. A deductible makes the customer absorb the first slice; a policy limit caps what the insurer will pay at the top. This forces a question that trips up almost everyone at first: when you talk about "the severity distribution," do you mean the size of the *loss the customer suffered*, or the size of the *payment the insurer actually made*? Those are two genuinely different distributions, and confusing them quietly poisons a pricing model. The distinction has a name — the per-loss versus per-payment viewpoint.

The per-loss view stands beside the policyholder and looks at *every* loss event, including the ones too small to ever reach the insurer. Under a $500 deductible, a $300 windscreen chip is a real loss but produces a payment of zero — it still counts as a loss in the per-loss picture, just a zero-payment one. The per-payment view stands at the insurer's claims desk and looks only at events that actually generated a cheque: it has *already* dropped every loss below the deductible, so its frequency is lower and the amounts you see are conditioned on "big enough to pay." Same underlying reality, two different lenses — and the lens you pick must match the question you are answering.

Here is the subtlety that makes this worth a section of its own. Raising a deductible does not just shave a fixed amount off each payment — it also changes the *frequency* you observe, because losses that used to clear the threshold now vanish from the per-payment data. Both halves of the decomposition move at once. That is why a careless analyst who fits a severity curve to the cheques actually written, and then applies it as if it described all losses, will badly misprice a policy with a different deductible. Keeping the two viewpoints straight is not pedantry; it is the difference between a model that travels across deductibles and limits and one that silently breaks the moment the contract terms change.
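A tiny worked example shows both halves moving at once. The eight loss amounts below are made up for illustration, and the helper name is hypothetical; the $500 deductible matches the windscreen example, and $1,000 is an assumed higher alternative.

```python
def summarise(losses, deductible):
    """Per-loss and per-payment views of the same losses under one deductible."""
    payments = [max(x - deductible, 0) for x in losses]   # one entry per loss, zeros kept
    paid = [p for p in payments if p > 0]                 # only losses that produced a cheque
    return {
        "loss_count": len(losses),
        "payment_count": len(paid),
        "mean_per_loss": sum(payments) / len(losses),
        "mean_per_payment": sum(paid) / len(paid),
    }

# Hypothetical ground-up losses in dollars, including ones too small to reach the insurer.
losses = [300, 450, 800, 1_200, 2_500, 6_000, 15_000, 40_000]

low = summarise(losses, deductible=500)
high = summarise(losses, deductible=1_000)

for label, view in (("deductible   500", low), ("deductible 1,000", high)):
    print(f"{label}: {view['payment_count']} payments from {view['loss_count']} losses, "
          f"mean per loss {view['mean_per_loss']:,.2f}, "
          f"mean per payment {view['mean_per_payment']:,.2f}")
```

Raising the deductible here removes one payment from the per-payment data and raises the average cheque, even though the underlying losses have not changed. In both cases the loss count times the per-loss mean equals the payment count times the per-payment mean, which is a handy consistency check when switching lenses.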

Honest limits of the split

The decomposition is powerful precisely because it makes strong simplifying assumptions, and an honest modeller keeps a finger on each one. We assume claims are *roughly independent of each other*, but a hailstorm or a hurricane shatters that overnight, smashing a thousand roofs in one afternoon: claim counts across the portfolio spike together, often with severities in tow, and a model built on independence understates the danger, above all in the tail. We assume frequency and severity are *independent of each other*, but in inflationary times the same forces that raise repair costs can also nudge how often small claims get reported. And we assume the distributions we fitted *keep holding*, yet a curve fitted to last year's data is a description of the past, not a guarantee about the future.
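To get a feel for the first caveat, here is a rough sketch of what a shared storm does to the claim count of a whole portfolio. It assumes a made-up two-state world: in a storm year (probability 5%) every policy's claim rate jumps to 1.0, otherwise it sits at a calm rate chosen so the long-run average stays at 0.20. Given the kind of year, policies are treated as independent Poisson, and the variance of the total then follows from the law of total variance.

```python
import math

n_policies = 1_000
lam = 0.20          # long-run average claims per policy per year
storm_prob = 0.05   # assumed chance of a storm year
lam_storm = 1.00    # assumed per-policy claim rate in a storm year
lam_calm = (lam - storm_prob * lam_storm) / (1 - storm_prob)  # keeps the average at lam

mean_total = n_policies * lam

# Fully independent policies: the total is Poisson, so variance equals the mean.
var_independent = n_policies * lam

# Shared storm: Var(total) = E[Var(total | year)] + Var(E[total | year]).
var_of_rate = storm_prob * (1 - storm_prob) * (lam_storm - lam_calm) ** 2
var_clustered = n_policies * lam + n_policies ** 2 * var_of_rate

print(f"expected claims either way: {mean_total:.0f}")
print(f"std dev, independent: {math.sqrt(var_independent):.1f}")
print(f"std dev, shared storm: {math.sqrt(var_clustered):.1f}")
```

The expected number of claims is the same in both worlds, but the spread of the total is many times larger once policies share a storm. An average-based view alone would not reveal that.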

Whenever you face a total cost, refuse to model it head-on — first split it into how many (frequency) and how big (severity).
Tag every number with its units: a count for frequency, dollars for severity — and remember their product is only the *expected* cost, not the whole risk.
Before quoting a severity figure, ask "per loss or per payment?" — and check that the deductibles and limits in the data match the policy you are pricing.