Bühlmann-Straub & Varying Exposures

The hidden assumption in plain Bühlmann

In the previous guide you met Bühlmann credibility and its tidy verdict: trust your own data with weight Z = n / (n + k), where n is how many observations you have and k is the ratio of within-risk noise to between-risk signal. There was a quiet assumption hiding in that little n, though. It counted observations as if each one were the same size — one driver, one year, one identical unit of risk. That is fine for a textbook fleet of identical drivers, but it is almost never true of real business.

Picture an insurer's book of group health plans. One employer covers 5,000 lives; the corner bakery next door covers 8. One year you wrote a plan for nine months, the next for a full twelve. A claims average built on 5,000 lives over a full year is a steady, trustworthy thing; the same statistic from 8 lives over nine months is a coin-flip. Plain Bühlmann has no way to say so — to it, both are simply 'one observation' and both would push Z the same way. We need a model that knows that bigger observations are quieter and ought to count for more.

Exposure: the right unit of size

The cure is to measure size honestly, and the actuary's word for size is exposure: the count of risk units actually at risk during a period — life-years, car-years, payroll dollars, number of policies. A plan covering 5,000 lives for a full year carries 5,000 life-years of exposure; the 8-life bakery for nine months carries 8 × 0.75 = 6 life-years. Exposure is the natural ruler because the random scatter of an average shrinks as exposure grows: more lives, less luck.
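As a minimal sketch of that arithmetic, assuming exposure is measured in life-years and coverage is tracked in whole months (the function name is illustrative, not part of any library):

```python
def life_years(lives: float, months_covered: float) -> float:
    """Exposure in life-years: lives at risk times the fraction of a year covered."""
    if lives < 0 or months_covered < 0:
        raise ValueError("lives and months_covered must be non-negative")
    return lives * months_covered / 12.0


# The two plans from the text.
large_plan = life_years(5000, 12)  # full year of cover
bakery = life_years(8, 9)          # nine months of cover

print(large_plan, bakery)
```

Other exposure bases (car-years, payroll dollars, policy counts) would need their own conversion, but the idea is the same: record how much risk was actually on the books in each period.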

The Bühlmann-Straub model is simply Bühlmann rebuilt on exposure instead of a naive count. It does two things. First, when it summarises a risk's own history into a single number, it uses an exposure-weighted average: a year with 5,000 life-years pulls that average far harder than a year with 6. Second, it replaces n in the credibility weight with the total exposure m, giving Z = m / (m + k), with the very same k = EPV / VHM you already know (expected process variance over variance of hypothetical means), now expressed in units of exposure. Everything you learned about k carries over; only the measure of volume in the formula has changed, from a count of observations to a total of exposure.
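To make the first of those two changes concrete, here is a small sketch contrasting the exposure-weighted average with a naive one. The yearly claim figures are invented for illustration, and the function name is an assumption of this example.

```python
from typing import Sequence


def exposure_weighted_mean(claims_per_unit: Sequence[float],
                           exposures: Sequence[float]) -> float:
    """Average claims per unit of exposure, weighting each year by its exposure."""
    if len(claims_per_unit) != len(exposures):
        raise ValueError("need one exposure figure per observation")
    total_exposure = sum(exposures)
    if total_exposure <= 0:
        raise ValueError("total exposure must be positive")
    weighted_sum = sum(x * m for x, m in zip(claims_per_unit, exposures))
    return weighted_sum / total_exposure


# Hypothetical history: a big year at 1,000 per life and a tiny year at 3,000 per life.
claims = [1000.0, 3000.0]
exposure = [5000.0, 6.0]

weighted = exposure_weighted_mean(claims, exposure)
naive = sum(claims) / len(claims)

print(f"exposure-weighted: {weighted:.2f}, naive: {naive:.2f}")
```

The naive mean treats the two years as equals and lands halfway between them; the weighted mean stays close to the large year, which is the behaviour the model wants.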

Working a credibility premium

Let us price next year for one commercial group, the way an actuary actually does it. Suppose the empirical-Bayes step has already given us k = 800 (in life-years) and a collective mean of 1,000 per life — the rate for a brand-new account with no record of its own. Our group has accumulated 3,200 life-years of exposure, and over that history its own exposure-weighted average claim has run at 1,150 per life. The question is the same as ever: how much of that 1,150 do we trust?

Z = m / (m + k) = 3200 / (3200 + 800) = 0.80
credibility premium = Z * own + (1 - Z) * collective
                    = 0.80 * 1150 + 0.20 * 1000 = 1120 per life

Exposure m = 3,200 against k = 800 earns Z = 0.80, so the account's own experience drives four-fifths of its rate.

That 1,120 is the credibility premium: the Z-weighted blend of the group's own 1,150 and the collective 1,000. Now run the same arithmetic for a much smaller account with only 200 life-years: its Z is 200 / (200 + 800) = 0.20, so 80 percent of its rate comes from the collective and only a fifth from its own thin, noisy record. Exposure has done exactly what we wanted. The large, information-rich account speaks mostly for itself, while the small one is gently but firmly pulled toward the crowd.
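The same arithmetic can be sketched in a few lines of Python. The function name is illustrative. The text gives no own average for the 200-life-year account, so the 1,150 used for it below is a hypothetical figure, chosen only so that the two accounts differ in exposure alone.

```python
from typing import Tuple


def credibility_premium(own_mean: float, collective_mean: float,
                        exposure: float, k: float) -> Tuple[float, float]:
    """Return (Z, premium) with Z = m / (m + k) and premium = Z*own + (1 - Z)*collective."""
    z = exposure / (exposure + k)
    return z, z * own_mean + (1.0 - z) * collective_mean


K = 800.0            # structural parameter from the text, in life-years
COLLECTIVE = 1000.0  # collective mean per life from the text

# Commercial group from the text: 3,200 life-years, own average 1,150.
z_group, premium_group = credibility_premium(1150.0, COLLECTIVE, 3200.0, K)

# Small account: 200 life-years; own average of 1,150 is an assumed value.
z_small, premium_small = credibility_premium(1150.0, COLLECTIVE, 200.0, K)

print(f"group: Z={z_group:.2f}, premium={premium_group:.0f}")
print(f"small: Z={z_small:.2f}, premium={premium_small:.0f}")
```

With identical own experience, the only thing separating the two premiums is exposure: the group's rate sits near its own record, the small account's near the collective mean.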

Where it earns its keep: experience rating

Bühlmann-Straub is the workhorse behind experience rating of commercial accounts. When a group's policy comes up for renewal, the underwriter does not invent a number; a credibility engine blends the account's own loss history with the manual rate for its class, and the weight is set by how much exposure the account brings. A 5,000-life employer with three clean years sees that good record reflected in a discount it genuinely earned. A small shop with one freak claim is not punished for a single unlucky year, because its low Z keeps it tethered to the class rate.

The same machine shows up wherever exposures vary widely from risk to risk and year to year: workers' compensation experience modifications, group life and health renewals, and reinsurance, where a treaty's exposure can swing enormously between accounts. In each case the parameters k and the collective mean are estimated once across the whole portfolio — the empirical-Bayes step you met earlier — and then every individual risk is rated by its own exposure. One structural model, many tailored premiums.

1. Estimate the structural parameters (EPV, VHM, hence k) and the collective mean once, across the whole portfolio.
2. For each account, total its exposure m and form its exposure-weighted average claim.
3. Set Z = m / (m + k); a bigger account automatically earns a higher Z.
4. Blend: credibility premium = Z × own average + (1 − Z) × collective mean.
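The four steps above can be sketched end to end. This is one possible implementation, not the only one: it assumes the standard nonparametric Bühlmann-Straub estimators for EPV and VHM, takes the exposure-weighted portfolio average as the collective mean, and treats a non-positive VHM estimate as "no detectable between-risk signal" (Z = 0). The portfolio data and all names are invented for illustration.

```python
from typing import Dict, List, Tuple

# account -> list of (exposure in life-years, claims per life), one pair per year
History = Dict[str, List[Tuple[float, float]]]


def fit_buhlmann_straub(history: History) -> Tuple[float, float, float, float]:
    """Step 1: estimate (collective_mean, epv, vhm, k) across the whole portfolio.

    Needs at least one account with two or more years, else EPV is undefined.
    """
    totals = {a: sum(m for m, _ in yrs) for a, yrs in history.items()}
    means = {a: sum(m * x for m, x in yrs) / totals[a] for a, yrs in history.items()}
    m_all = sum(totals.values())
    collective = sum(totals[a] * means[a] for a in history) / m_all

    # EPV: exposure-weighted scatter of each year around its own account's mean.
    within = sum(m * (x - means[a]) ** 2 for a, yrs in history.items() for m, x in yrs)
    dof = sum(len(yrs) - 1 for yrs in history.values())
    epv = within / dof

    # VHM: scatter of account means around the collective mean, net of process noise.
    between = sum(totals[a] * (means[a] - collective) ** 2 for a in history)
    r = len(history)
    vhm = (between - (r - 1) * epv) / (m_all - sum(t * t for t in totals.values()) / m_all)

    # Assumed convention: if vhm <= 0, k is infinite and every Z becomes 0.
    k = epv / vhm if vhm > 0 else float("inf")
    return collective, epv, vhm, k


def rate_accounts(history: History) -> Dict[str, Tuple[float, float]]:
    """Steps 2 to 4: return account -> (Z, credibility premium)."""
    collective, _, _, k = fit_buhlmann_straub(history)
    rated = {}
    for account, years in history.items():
        m = sum(e for e, _ in years)             # step 2: total exposure
        own = sum(e * x for e, x in years) / m   # step 2: exposure-weighted average
        z = m / (m + k)                          # step 3
        rated[account] = (z, z * own + (1.0 - z) * collective)  # step 4
    return rated


# Hypothetical three-account portfolio, three years each.
portfolio: History = {
    "large": [(1000, 1100), (1200, 1150), (1300, 1180)],
    "medium": [(300, 900), (320, 1000), (280, 950)],
    "small": [(10, 400), (8, 2500), (12, 1000)],
}

rated = rate_accounts(portfolio)
for name, (z, premium) in rated.items():
    print(f"{name}: Z={z:.3f}, premium={premium:.1f}")
```

A real portfolio would have far more than three accounts; with so few, the VHM estimate is itself very noisy, so the numbers here illustrate the mechanics rather than a credible fit.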

Honest limits

Bühlmann-Straub leans on one structural assumption worth stating plainly: each risk's process variance is inversely proportional to its exposure, the clean Poisson-style scaling where doubling the lives halves the variance of the average. That holds beautifully for independent, homogeneous risk units. It frays when the units are not independent. A single factory fire can hit hundreds of 'lives' at once, so a 5,000-life account is not really 5,000 independent coin flips, and its true volatility shrinks more slowly than the model assumes. When that happens, the model hands a large account more credibility than it has truly earned.
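A rough way to see the size of the problem is to compare the variance of an average under independence with the variance when every pair of units shares a small common correlation. The equicorrelation setup and the value rho = 0.01 are assumptions of this sketch, chosen for illustration only; they are not part of the Bühlmann-Straub model.

```python
def variance_of_average(unit_variance: float, m: int, rho: float = 0.0) -> float:
    """Variance of the mean of m units with equal variance and pairwise correlation rho.

    With rho = 0 this is the model's assumption: unit_variance / m.
    With rho > 0 it never drops below rho * unit_variance, however large m gets.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    return unit_variance * (1.0 + (m - 1) * rho) / m


SIGMA2 = 1.0   # per-unit process variance, arbitrary units
RHO = 0.01     # hypothetical pairwise correlation between lives

for m in (8, 200, 5000):
    independent = variance_of_average(SIGMA2, m)
    correlated = variance_of_average(SIGMA2, m, RHO)
    print(f"m={m}: independent={independent:.6f}, correlated={correlated:.6f}")
```

Under independence, doubling m halves the variance. With even slight correlation the variance levels off, so the gap between what the model assumes and what is true grows with account size.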

Two more honest cautions. First, the exposure measure must be the right one — payroll, headcount, sales — and measured correctly; if an account is 'large' only on paper, its high Z will confidently mislead. Second, what comes out is still a credibility premium: a best estimate of expected loss, not a finished price. It must still be loaded for expenses, profit and risk margin before it becomes a gross premium anyone can sell. The credibility blend answers 'what will this cost?', not 'what should we charge?' — those are different questions.