Probability Basics: Events, Conditioning & Bayes

From pooled risk to measured chance

In the Foundations rung you saw why an insurer can pool many small, unpredictable losses into one large, predictable whole — the heart of insurance and the reason an insurable risk must be reasonably independent across policyholders. But "predictable" is a promise we have to keep with numbers. To price a policy, set a premium, or hold a reserve, an actuary must say how *likely* a loss is and how *big* it tends to be. That single word — likely — is what probability makes precise.

Probability is the grammar of uncertainty. We will not memorise it as a wall of formulas; instead we will build the intuition first and let each formula arrive as a sentence in that grammar. By the end you will own three tools that recur through the whole actuarial curriculum: the rules that any sensible probability must obey, the art of *conditioning* on what you know, and Bayes' theorem — the machine for changing your mind when the world hands you a fresh clue.

Sample spaces and events: the map of what could happen

Before we can measure chance we must list what chance is choosing among. The full list of every outcome that could occur in some experiment is the [[sample-space|sample space]]. Roll one die and it is the six faces {1,2,3,4,5,6}; ask whether a policyholder files a claim this year and it is just {claim, no claim}. The sample space is the territory; everything else is drawn on top of it.

An event is simply any collection of outcomes we care about: a region on that map. "The die shows an even number" is the event {2,4,6}. Because events are sets, we combine them with the operations of sets: the *union* A∪B ("A or B happens"), the *intersection* A∩B ("both happen"), and the *complement* Aᶜ, also written "not A" ("A does not happen"). An actuary lives in this language: "a fire claim *and* a theft claim in the same year" is an intersection; "any claim at all" is a union of every loss type.
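The set language above maps directly onto Python's built-in `set` type. The following is a minimal sketch rather than a prescribed method; the second event, `four_or_more`, is a hypothetical one introduced only for illustration.

```python
# Sample space for one roll of a die.
sample_space = {1, 2, 3, 4, 5, 6}

# Events are subsets of the sample space.
even = {2, 4, 6}          # "the die shows an even number"
four_or_more = {4, 5, 6}  # hypothetical second event, for illustration only

union = even | four_or_more         # "even OR four-or-more"
intersection = even & four_or_more  # "even AND four-or-more"
not_even = sample_space - even      # complement: "NOT even"

print(sorted(union))         # [2, 4, 5, 6]
print(sorted(intersection))  # [4, 6]
print(sorted(not_even))      # [1, 3, 5]
```

Note that the complement is only meaningful relative to a sample space, which is why it is computed as a set difference from `sample_space`.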

The three rules every probability must obey

Probability assigns each event a number measuring how strongly we expect it. To stop us assigning nonsense, mathematicians pinned down the smallest set of common-sense rules — the [[probability-axioms|probability axioms]] — from which everything else follows. First, no probability is negative. Second, the whole sample space has probability 1: *something* in the list must happen. Third, for events that cannot overlap, probabilities simply add: P(A or B) = P(A) + P(B) when A and B are mutually exclusive.

These three lines are deceptively powerful. From them alone we get the complement rule — P(not A) = 1 − P(A), the actuary's favourite shortcut: if the chance of *no* claim is 0.92, the chance of *at least one* claim is 0.08, no extra work needed. And when events *can* overlap, the addition rule corrects itself so we do not double-count the overlap: P(A∪B) = P(A) + P(B) − P(A∩B).
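Both consequences can be checked by counting. The sketch below assumes a fair die, so every outcome is equally likely (that assumption is ours, not a requirement of the axioms); the helper `prob` and the event `four_or_more` are hypothetical names used for illustration. Exact fractions avoid floating-point rounding.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event, ASSUMING all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

even = {2, 4, 6}
four_or_more = {4, 5, 6}  # hypothetical second event

# Complement rule: P(not A) = 1 - P(A)
p_not_even = 1 - prob(even)

# Addition rule with overlap: P(A or B) = P(A) + P(B) - P(A and B)
p_union = prob(even) + prob(four_or_more) - prob(even & four_or_more)

# The claim example from the text: P(no claim) = 0.92
p_at_least_one_claim = 1 - Fraction(92, 100)

print(p_not_even)                   # 1/2
print(p_union)                      # 2/3
print(float(p_at_least_one_claim))  # 0.08
```

Adding the two probabilities without subtracting the overlap would give 1/2 + 1/2 = 1, which wrongly claims that "even or four-or-more" is certain; the outcomes 4 and 6 would have been counted twice.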

Conditioning and independence: what new information does

Real underwriting is never done in a vacuum: you always know *something* about the risk in front of you. [[conditional-probability|Conditional probability]], written P(A | B) and read "the probability of A given B", captures exactly that: how likely A becomes once we have learned that B is true. The mechanics are intuitive. We shrink the sample space down to only the worlds where B happened, then ask what fraction of *those* worlds also contain A: P(A | B) = P(A∩B) / P(B), defined whenever P(B) > 0.

Picture 1,000 drivers, of whom 100 are under 25. Suppose 60 drivers crash this year, and 30 of those are the young ones. The unconditional crash rate is 60/1000 = 6%. But *given* a driver is under 25, the rate jumps to 30/100 = 30%. The age told us something — that is conditioning at work, and it is precisely why insurers rate by risk class rather than charge everyone the same.
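The driver example can be reproduced from the four counts alone. This is a small sketch using the figures in the paragraph above; the variable names are illustrative, and the rate for drivers aged 25 and over is derived from the same counts (60 − 30 crashes among 1,000 − 100 drivers).

```python
from fractions import Fraction

total_drivers = 1000
under_25 = 100
crashes = 60
crashes_under_25 = 30

# Unconditional probabilities, as fractions of the whole portfolio.
p_crash = Fraction(crashes, total_drivers)
p_young = Fraction(under_25, total_drivers)
p_crash_and_young = Fraction(crashes_under_25, total_drivers)

# Definition: P(crash | young) = P(crash and young) / P(young)
p_crash_given_young = p_crash_and_young / p_young

# Same idea for the remaining drivers (derived from the counts above).
p_crash_given_older = Fraction(crashes - crashes_under_25,
                               total_drivers - under_25)

print(float(p_crash))                        # 0.06
print(float(p_crash_given_young))            # 0.3
print(round(float(p_crash_given_older), 4))  # 0.0333
```

Dividing by P(young) is what "shrinking the sample space" means in practice: the 1,000 in each numerator and denominator cancels, leaving 30/100.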

Two events are [[statistical-independence|independent]] when learning one tells you nothing about the other — formally, P(A | B) = P(A), equivalently P(A∩B) = P(A)·P(B). Independence is the quiet assumption holding up the whole insurance edifice: pooling tames risk only because policyholders' losses are *roughly* independent. When that fails — a hurricane, a pandemic, a market crash hitting thousands of policies at once — losses arrive together, the comforting averaging breaks down, and the pool can be overwhelmed. Honest modelling means asking, every time, whether independence truly holds.
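The product test P(A∩B) = P(A)·P(B) is easy to apply to numbers you already have. The sketch below uses a hypothetical helper, `is_independent`, and two illustrations: the driver counts from the conditioning example (1,000 drivers, 100 under 25, 60 crashes, 30 of them among the young) and a pair of die events chosen, as an assumption for illustration, so that the test passes on a fair die.

```python
from fractions import Fraction

def is_independent(p_a, p_b, p_a_and_b):
    """Product test: A and B are independent iff P(A and B) == P(A) * P(B)."""
    return p_a_and_b == p_a * p_b

# Driver portfolio: crash vs. being under 25.
p_crash = Fraction(60, 1000)
p_young = Fraction(100, 1000)
p_crash_and_young = Fraction(30, 1000)
drivers_independent = is_independent(p_crash, p_young, p_crash_and_young)

# Fair die (assumed): "even" = {2, 4, 6} vs. "at most 4" = {1, 2, 3, 4}.
p_even = Fraction(3, 6)
p_at_most_4 = Fraction(4, 6)
p_both = Fraction(2, 6)  # outcomes {2, 4}
dice_independent = is_independent(p_even, p_at_most_4, p_both)

print(drivers_independent)  # False: 0.03 is five times 0.06 * 0.10 = 0.006
print(dice_independent)     # True: 1/3 equals 1/2 * 2/3
```

Exact fractions are used because the test is an equality; with estimated probabilities from real data one would compare within a tolerance instead.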

Bayes' theorem: changing your mind, honestly

Conditioning usually runs in the direction the data flow: a high-risk driver is more likely to crash. But an insurer often needs to run it *backwards* — having seen a crash (or a claim, or a positive medical test), how should we revise our belief about the hidden cause? Flipping that arrow is exactly what [[act-bayes-theorem|Bayes' theorem]] does. It is less a formula to fear than a disciplined recipe for updating: start with what you believed *before* the evidence (the prior), weigh it by how well each hypothesis explains the evidence (the likelihood), and renormalise.

Here is the insurance flavour. Suppose 10% of new motor applicants are genuinely high-risk. A high-risk driver files a claim in the first year with probability 50%; a standard driver, only 10%. A new policyholder files a claim. What is the chance they were high-risk all along? Of every 1,000 applicants, 100 are high-risk and 900 standard. High-risk claimants: 100 × 0.5 = 50. Standard claimants: 900 × 0.1 = 90. So 50 + 90 = 140 claims arise, and 50 of them came from high-risk drivers.

P(high-risk | claim) = (0.10 × 0.50) / (0.10 × 0.50 + 0.90 × 0.10)
                     = 0.050 / (0.050 + 0.090)
                     = 0.050 / 0.140
                     = 0.357  (about 36%)

The prior belief of 10% high-risk rises to about 36% once a claim is observed — but it is far from certainty.
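The same calculation can be wrapped in a small function for the two-hypothesis case. This is a sketch under the numbers given in the example; the function name `posterior` and its parameter names are illustrative, not part of any library.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' theorem for one hypothesis against its complement.

    prior               -- P(H) before seeing the evidence
    likelihood_if_true  -- P(evidence | H)
    likelihood_if_false -- P(evidence | not H)
    """
    numerator = prior * likelihood_if_true
    evidence = numerator + (1 - prior) * likelihood_if_false
    return numerator / evidence

# Motor example: 10% high-risk; claim probability 50% vs. 10%.
p_high_given_claim = posterior(0.10, 0.50, 0.10)
print(round(p_high_given_claim, 3))  # 0.357

# Cross-check with the head-count version: 50 of 140 claimants.
print(round(50 / 140, 3))            # 0.357
```

The denominator is just the total probability of a claim (0.14), which is why the head-count argument and the formula must agree.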

Notice the honest lesson buried in that 36%. One claim is real evidence (our belief more than tripled, from 10% to 36%), yet most claimants are still standard drivers who simply had a bad year. Punishing every claimant as definitely high-risk would be both unfair and statistically wrong. This Bayesian habit of updating *gradually*, in proportion to the evidence, is the seed of credibility theory, where an insurer blends a policyholder's own experience with that of the broader pool; you will meet it much later in the ladder. The next guides take a different step: they describe the same loss as a random variable and build on that view.

Putting the toolkit to work

Everything in this guide fits into a single reasoning loop that the actuary repeats endlessly. It runs in four steps, and the rest of the Probability rung simply equips each step with sharper tools.

1. Name the sample space: list, even loosely, every outcome the risk could produce.
2. Define the events that matter (a claim, a loss above a deductible, a death within the year) as sets of those outcomes.
3. Assign probabilities that obey the axioms, then condition on whatever you genuinely know about the risk.
4. When new evidence arrives, use Bayes to update your belief in proportion, and never overreact to a single data point.
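The updating step can be repeated as each year of experience arrives, with yesterday's posterior becoming today's prior. The sketch below reuses the motor example's figures (10% prior, claim probability 50% for high-risk and 10% for standard drivers) and makes two loud assumptions that the text does not state: claim probabilities are the same every year, and years are independent of one another once the driver's true risk class is fixed. The three-year claim history is invented for illustration.

```python
def update(prior, p_claim_high, p_claim_std, claimed):
    """One Bayesian update of P(high-risk) after a year with or without a claim."""
    like_high = p_claim_high if claimed else 1 - p_claim_high
    like_std = p_claim_std if claimed else 1 - p_claim_std
    numerator = prior * like_high
    return numerator / (numerator + (1 - prior) * like_std)

belief = 0.10                  # prior share of high-risk applicants
history = [True, True, False]  # hypothetical record: claim, claim, no claim

beliefs = [belief]
for claimed in history:
    # ASSUMPTION: years are conditionally independent given the risk class.
    belief = update(belief, 0.50, 0.10, claimed)
    beliefs.append(belief)

print([round(b, 3) for b in beliefs])  # [0.1, 0.357, 0.735, 0.607]
```

Belief rises with each claim and falls back after a claim-free year, and even two claims in a row leave it well short of certainty.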

Hold on to one humility throughout: every probability we write is itself an estimate, drawn from data and judgement, not a law of nature. A model that assigns crisp numbers can feel more certain than it deserves. The skilled actuary uses these tools precisely *and* remembers they are tools — the map, never quite the territory. Carry that double discipline into the distributions and expectation that come next.