Base rates

Numeracy

How often the outcome occurs across cases like this is the first number; the particulars of any one case can adjust it, not replace it.

Near-universal rule

Explanation

Most cabs in the city are Green; very few are Blue. A witness says the cab that fled the accident was Blue, and tests show the witness is right about four times out of five. Most people conclude the cab is probably Blue. It is considerably less certain than that: because the fleet is mostly Green, even a fairly reliable "Blue" call is more likely to be a misidentified Green cab than a real Blue one. (With a witness who is right four times in five, that holds whenever fewer than one cab in five is Blue.) The number most people leave out is how many Blue cabs there are.

That frequency, how often the outcome occurs across the whole class of similar cases, is the base rate. The witness's reliability adjusts it; it does not replace it. In odds terms, the base rate supplies the starting odds and the strength of the evidence multiplies them; the evidence is never a substitute for the base rate.

This is where judgment consistently goes wrong: a vivid, matching detail feels like evidence that the outcome is probable. On its own, it is not. The rare thing stays rare no matter how well this case resembles it. Feeling representative and being frequent are not the same. Setting the specifics aside and asking "how often does this outcome happen across all similar cases?" is the outside view. Much of the error in judgment comes from skipping that question.
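The cab example can be worked through numerically. The sketch below applies the odds form of Bayes' rule: starting odds from the base rate, multiplied by how much more likely a "Blue" report is for a Blue cab than for a Green one. The 15% share of Blue cabs is an assumed figure chosen for illustration (the text says only "very few"), the witness is assumed to be equally accurate for both colours, and the function name `posterior_from_odds` is hypothetical.

```python
def posterior_from_odds(base_rate: float, hit_rate: float, false_alarm_rate: float) -> float:
    """Posterior probability via the odds form of Bayes' rule."""
    # Starting odds come from the base rate alone.
    prior_odds = base_rate / (1.0 - base_rate)
    # How much more often the evidence appears when the outcome is true than when it is false.
    likelihood_ratio = hit_rate / false_alarm_rate
    posterior_odds = prior_odds * likelihood_ratio
    # Convert odds back to a probability.
    return posterior_odds / (1.0 + posterior_odds)


# ASSUMPTION: share of Blue cabs in the fleet (not given in the text).
blue_share = 0.15
# From the text: the witness is right about four times out of five.
witness_accuracy = 0.80

p_blue = posterior_from_odds(
    base_rate=blue_share,
    hit_rate=witness_accuracy,                # says "Blue" when the cab is Blue
    false_alarm_rate=1.0 - witness_accuracy,  # says "Blue" when the cab is Green
)
print(f"P(cab is Blue | witness says Blue) = {p_blue:.2f}")
```

Under these assumptions the posterior comes out at about 0.41, so the cab is still more likely Green than Blue despite the witness. Changing `blue_share` shows how strongly the answer depends on the one number people tend to leave out.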

When it applies

**You are forecasting from a narrative.** The most reliable forecasts start from the outside view, meaning how often outcomes like this have occurred across similar cases, and then adjust for what genuinely distinguishes this one. Building from the case's own specifics first, without anchoring on the class frequency, produces overconfident forecasts.

**Your expertise could be working against you.** Research on expert forecasts in uncertain domains has found accuracy close to chance once predictions go beyond narrow technical claims. Domain knowledge generates a confident inside view; the base rate disciplines it.

**You are reading a test result or diagnostic signal.** A positive result tells you little until you know how common the condition was before the test. Establish that frequency first. When the condition is rare enough, false positives outnumber true ones even for a test that is accurate by ordinary standards; what matters is how the false-positive rate compares with the prevalence.

**One coherent story is doing all the work.** When a single internally consistent scenario makes an outcome feel inevitable, that narrative is crowding out the base rate of how such situations actually resolve.

**You are pricing or projecting from a group.** Population frequencies, such as historical loss rates or default rates, set the starting point. The individual case adjusts from there; the group frequency does not disappear because the case feels distinctive.
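For the test-result case above, the balance between false and true positives can be sketched by sweeping the prevalence while holding the test fixed. The test characteristics here (99% sensitivity, 1% false-positive rate) and the prevalence values are hypothetical, chosen only to show the pattern; the function name is likewise an illustration.

```python
def false_per_true_positive(prevalence: float, sensitivity: float, false_positive_rate: float) -> float:
    """Expected number of false positives for each true positive."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * false_positive_rate
    return false_positives / true_positives


# ASSUMPTION: a hypothetical test that catches 99% of cases
# and wrongly flags 1% of healthy people.
SENSITIVITY = 0.99
FALSE_POSITIVE_RATE = 0.01

# ASSUMPTION: illustrative prevalence values, from very rare to common.
for prevalence in (0.0001, 0.001, 0.01, 0.1):
    ratio = false_per_true_positive(prevalence, SENSITIVITY, FALSE_POSITIVE_RATE)
    print(f"prevalence {prevalence:>7.2%}: {ratio:6.1f} false positives per true positive")
```

With these assumed figures the two kinds of positive are roughly balanced when the prevalence is about 1%, matching the false-positive rate. Below that, false alarms dominate; above it, true positives do.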

Where it stops

**The error is more conditional than it appears.** People discount base rates that seem to have no causal connection to the outcome in question and weight those that do. When a frequency is merely background, statistically real but not part of what produces the result, treating it as binding may be an overcorrection. The obligation to anchor is strongest when the base rate is causally connected to what you are measuring.

**The format may be driving the error.** Restate the same problem in whole headcounts ("of every thousand people who take this test, how many test positive without being sick?") and correct reasoning improves substantially. Much of what appears to be a stable human tendency to neglect base rates turns out to depend heavily on how the numbers are expressed. The practical fix is often changing the format, not the thinker.

**There is no single objective base rate.** Every case belongs to many overlapping classes, each of which produces a different frequency. Nothing in the method picks the "correct" class: that selection is always a judgment. The algebra is sound; the number it operates on is not fixed until someone decides which class to count.

The misuse

The characteristic trap for someone who already knows the concept lies at the one step the method leaves open: choosing the reference class. The procedure says anchor on the base rate, then adjust. What it does not specify is which class to draw that rate from. Someone who knows the outside view can follow every step (find a frequency, anchor, adjust) while quietly selecting the class that returns the rate they were already inclined toward. The narrower, kinder class makes the unflattering statistics disappear. The procedure is followed. The discipline it was meant to supply is not.

A second failure is subtler: treating the act of invoking a base rate as the whole job rather than the start of it. Saying "I've done the outside view" ends scrutiny. The class goes undefended; whether the rate has any causal bearing on the case goes unasked. The structure produces the feeling of rigour without the substance of it.

A worked example

A research team gave a short diagnostic problem to physicians, residents, and medical students, all people trained to interpret test results. The problem: a disease occurs in 1 in every 1,000 people. A test for it gives a false positive in 5% of healthy people and never misses someone who is actually sick. A patient's test comes back positive. What is the probability the patient has the disease? The most common answer was 95%. The correct answer is roughly 2%.

Run the numbers on 1,000 people. One has the disease, and the test catches that one. Of the 999 who are healthy, about 50 will test positive anyway: false positives from the 5% error rate. So roughly 51 people receive a positive result, and only one of them is actually sick: 1 in 51, or close to 2%.

The respondents fixed on the test's accuracy and read the positive result as near-certainty. The number that decided the correct answer was sitting in the problem the whole time: the disease occurs in only 1 in 1,000 people. When the condition is that rare, a 5% false-positive rate produces about fifty false alarms for every true case. In a clinic, acting on the 95% figure rather than the 2% one is the difference between watchful waiting and ordering an invasive procedure, on the strength of a probability that was wrong by a factor of nearly fifty.
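The headcount arithmetic in this example can be reproduced in a few lines. The sketch below uses only the figures stated in the problem (1 in 1,000 prevalence, 5% false-positive rate, no missed cases); the function and variable names are illustrative.

```python
def positives_by_headcount(population: int, prevalence: float,
                           sensitivity: float, false_positive_rate: float):
    """Split positive results into true and false, counted in people."""
    sick = population * prevalence
    healthy = population - sick
    true_positives = sick * sensitivity
    false_positives = healthy * false_positive_rate
    # Share of positive results that belong to people who are actually sick.
    share_sick = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, share_sick


true_pos, false_pos, share_sick = positives_by_headcount(
    population=1000,
    prevalence=1 / 1000,       # the disease occurs in 1 in 1,000 people
    sensitivity=1.0,           # the test never misses a sick person
    false_positive_rate=0.05,  # 5% of healthy people test positive
)
print(f"true positives:  {true_pos:.2f}")
print(f"false positives: {false_pos:.2f}")
print(f"P(sick | positive) = {share_sick:.1%}")
```

The unrounded figures are 1 true positive against 49.95 false ones, which gives a probability just under 2%. Working in headcounts rather than percentages keeps the 999 healthy people visible, and they are where almost all the positive results come from.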

Push

Start from how often the outcome has occurred across cases like this; only then let the specifics of your case move you, and only as far as they genuinely warrant.

Veto

Don't treat a confirming result as settling the probability until you have established how often the outcome occurs across cases like this.

Go deeper

Mastery question

**Question:** You are given a description of a person, and the more details you hear, the more perfectly the description matches your picture of some rare profession. Does each additional matching detail make it more reasonable to conclude they hold that profession, or less? Why?

**The answer:** Less, or at best no more. Each added detail can only narrow the set of people it describes; it tells you how closely the person resembles the stereotype, not how many such people exist. The rare profession stays rare no matter how vivid the fit. Accumulating matching detail raises the *feeling* of probability while leaving the actual probability governed by the base rate. The disciplined move is to ask "of everyone who fits this description, how many are in the rare profession versus the common one?" A detail that merely fits the stereotype, without being any more common inside the rare profession than outside it, does nothing to improve that ratio.

**The answer that misses it:** "More reasonable: the tighter the description fits, the more confident the conclusion." The person who has only memorised the concept may even add "and of course you should consider base rates" while still treating each fitting detail as another unit of evidence for the conclusion.

**Why the difference matters:** A person who has only memorised the concept holds it as a reminder: factor in base rates alongside the evidence. That framing still treats representativeness and frequency as the same currency, just with an extra term added in. A person who has internalised it holds the two apart: vividness of fit is not weak evidence of frequency; it is not evidence of frequency at all. The question can't be cleared by reciting the term, because the trap is precisely the instinct that more fit means more likely, which is the instinct the term is supposed to have dislodged.