Sampling With and Without Replacement

Two questions that classify every counting problem

Picture an urn holding n distinct balls, and you reach in to draw k of them. That single picture — drawing k items from a pool of n — is the hidden skeleton inside most counting questions you will ever face. The earlier guides gave you the parts: the multiplication principle, permutations, and combinations. Sampling is the frame that snaps them together, and it does so by asking just two yes/no questions about how you draw.

The first question is whether order matters. If drawing ball 3 then ball 7 is a different result from ball 7 then ball 3, order matters and you are in permutation territory; if those two draws count as the same handful, order does not matter and you are counting subsets. This is exactly the ordered versus unordered split you already met. The second question is whether you put each ball back before the next draw. Replace it and the same ball can be drawn again — sampling with replacement; keep it out and every draw shrinks the pool — sampling without replacement. These two independent yes/no answers give four combinations, and that is the whole subject.

The 2x2 table of sampling formulas

Let us fill the four cells one at a time, each from a picture rather than memorized symbols. Ordered, with replacement is the easiest: each of the k draws independently has all n choices available, so by the multiplication principle the count is n^k. Think of a 3-digit PIN from 10 digits: 10 * 10 * 10 = 10^3 = 1000. Ordered, without replacement removes one option per draw: n choices, then n - 1, then n - 2, down k steps. That falling product is the permutation count P(n, k) = n! / (n - k)! from the previous guide.

Unordered, without replacement is the combination: you pick a set of k distinct items and do not care about the order you grabbed them in, which is C(n, k) = n! / (k! (n - k)!). It is exactly the ordered-without-replacement count with the k! internal orderings divided out, the move you made in the last guide. The fourth cell — unordered, with replacement — is the only genuinely new one, and it is the subtle one, so it gets its own section below. For now, just note that it counts multisets of size k drawn from n types, and its formula is C(n + k - 1, k).

Draw k items from n distinct items:

                      |  WITH replacement   |  WITHOUT replacement
  --------------------+---------------------+----------------------
  ORDER matters       |       n^k           |   n! / (n - k)!   = P(n,k)
  ORDER ignored       |   C(n+k-1, k)       |   n! / (k!(n-k)!) = C(n,k)

With n = 5, k = 3:
  ordered, with    : 5^3                 = 125
  ordered, without : 5*4*3               = 60
  unordered, w/o   : C(5,3)              = 10
  unordered, with  : C(5+3-1, 3) = C(7,3)= 35

The four sampling cells, with one worked count each for n = 5, k = 3.

The tricky cell: unordered with replacement, and stars-and-bars

Why does unordered-with-replacement need a special formula, and why is it C(n + k - 1, k) rather than something tidier? The trap is to think you can just divide n^k by k! the way combinations divided out orderings. That fails, because the k! correction only works when all k drawn items are distinct. When repeats are allowed, a draw like (red, red, blue) has fewer than 3! distinct orderings, while (red, blue, green) has the full 3! — so different selections get over-counted by different amounts, and a single clean division cannot undo it.

The honest fix is to recount the right objects directly. An unordered sample with replacement is really just a tally: how many of type 1, how many of type 2, on up to type n, with the counts adding to k. The elegant way to count those tallies is stars and bars. Lay down k identical stars for the items you drew, and n - 1 bars to slice them into n bins (one bin per type). For example, with n = 3 types {A, B, C} and k = 4 draws, the pattern "* * | | * *" reads as 2 A's, 0 B's, 2 C's. Every arrangement of the k stars and n - 1 bars is exactly one valid tally, and vice versa.

Now the count is easy, because we are back to a plain combination. We have k + (n - 1) symbols in a row and must choose which k of those positions are stars (the rest are bars): that is C(n + k - 1, k). This stars-and-bars argument is one of the most reusable tricks in counting — it answers "how many ways to write k as an ordered sum of n non-negative whole numbers", which is the same question dressed in algebra rather than urns. For our n = 3, k = 4 example it gives C(6, 4) = 15 distinct tallies.

From counts to probabilities, and why replacement changes everything

Sampling is not just a counting exercise; the with/without choice is the dividing line between two whole families of probability models, so it is worth feeling the difference. Sampling with replacement makes the draws independent and identically distributed: the urn looks the same before every draw, so the chance of red is the same every time and past draws carry no information. Sampling without replacement makes the draws dependent: drawing a red ball lowers the chance the next is red, because the urn has changed. That single distinction is the seed of the binomial (with replacement) versus the hypergeometric (without) distributions you will meet later.

A vivid reminder that with-replacement intuition can mislead is the birthday problem. Assigning birthdays to people is sampling 365 days *with* replacement (two people can share a day), and the surprise is how fast a repeat becomes likely: with just 23 people the chance that some pair shares a birthday already exceeds one half. The cleanest route is complementary counting — compute the probability that all 23 birthdays are distinct, which is the *without*-replacement count P(365, 23) divided by the *with*-replacement total 365^23, and subtract from 1. That single problem uses three of our four cells at once, which is exactly why the 2x2 frame is worth internalizing.

A worked walkthrough: choosing the right cell

Suppose a frozen-yogurt shop has 6 flavors and you fill a cup with 3 scoops. How many different cups are possible? The whole answer hinges on the two questions — and the wording forces specific answers. A cup does not record the order you added scoops, so order does *not* matter; and you are allowed to repeat a flavor (three scoops of mango is fine), so this is *with* replacement. That lands us squarely in the unordered-with-replacement cell.

Answer question one: does order matter? No — a cup of {vanilla, mango, mango} is the same cup however you scooped it.
Answer question two: can a flavor repeat? Yes — you may take the same flavor more than once. So: unordered, with replacement.
Apply the matching formula with n = 6 types and k = 3 scoops: C(n + k - 1, k) = C(6 + 3 - 1, 3) = C(8, 3) = 56 possible cups.

Now feel how the answer would change if the wording changed, since this is where the discipline pays off. If the scoops were stacked on a cone and the order top-to-bottom mattered but repeats were still allowed, you would switch to the ordered-with-replacement cell: 6^3 = 216. If instead the shop forbade repeating a flavor and order still did not matter, you would use the unordered-without-replacement cell: C(6, 3) = 20. Same six flavors, same three scoops — but four different real-world rules give four different counts. The formula is the easy part; correctly reading which cell you are in is the skill, and it is exactly the toolkit the next guide builds on when counting gets genuinely hard.