How to make good guesses
What’s the likelihood that the British economy will fall into recession this year? Well, I’ve no idea, but I have a new way to guess. First, though, a different question.
‘Would you say that someone reading the FT is more likely to have a PhD or to have no college degree at all?’
The obvious response is that the FT reader has a PhD. Surely people with PhDs better exemplify the FT reader than people with no degree at all, at least on average — they tend to read more and to be more prosperous.
But the obvious response is too hasty. First, we should ask: how many people have PhDs, and how many have no college degree at all? In the UK, more than 75 per cent of adults have no degree, but the chance that a randomly chosen person has a PhD is probably less than 1 per cent.
It only takes a small proportion of non-graduates to read the FT before they’ll outnumber the PhD readers. This fact should loom large in our guess, but it does not.
Logically, one should combine the two pieces of information: the fact that PhDs are rare and the fact that FT readers tend to be well educated. There is a mathematical rule for doing this perfectly (it’s called Bayes’ rule) but numerous psychological experiments suggest that it never occurs to most of us to try. It’s not that we combine the two pieces of information imperfectly; it’s that we ignore one of them completely.
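To make the combination concrete, here is a minimal Python sketch of Bayes’ rule applied to the FT reader. The population shares are the rough figures quoted above (taken here as 1 per cent and 75 per cent); the two readership rates are purely hypothetical numbers chosen for illustration, not real circulation data.

```python
def posterior_odds(prior_a, prior_b, likelihood_a, likelihood_b):
    """Odds of group A versus group B once the evidence is seen (Bayes' rule in odds form)."""
    return (prior_a * likelihood_a) / (prior_b * likelihood_b)

# Base rates: the rough UK figures quoted in the text.
share_phd = 0.01         # about 1 per cent of adults (probably fewer)
share_no_degree = 0.75   # about 75 per cent of adults (probably more)

# ASSUMPTION: hypothetical readership rates, for illustration only.
p_reads_ft_given_phd = 0.10          # 10 per cent of PhDs read the FT
p_reads_ft_given_no_degree = 0.005   # 0.5 per cent of non-graduates read the FT

odds = posterior_odds(share_phd, share_no_degree,
                      p_reads_ft_given_phd, p_reads_ft_given_no_degree)
print(f"PhD readers per non-graduate reader: {odds:.2f}")
```

Under these assumed rates a PhD is 20 times as likely as a non-graduate to read the FT, yet non-graduate readers still outnumber PhD readers by almost four to one.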
The number that gets ignored (in this example, the rarity of PhDs) is called the “base rate”, and the fallacy I’ve described, base rate neglect, has been known to psychologists since the 1950s.
Why does it happen? The fathers of behavioural economics, Daniel Kahneman and Amos Tversky, argued that people judge such questions by their representativeness: the FT reader seems more representative of PhDs than of non-graduates. Tversky’s student, Maya Bar-Hillel, hypothesised that people seize on the most relevant piece of information: the sighting of the FT seems relevant, the base rate does not. Social psychologists Richard Nisbett and Eugene Borgida have suggested that the base rate seems “pallid and abstract”, and is discarded in favour of the vivid image of a person reading the pink ’un. But whether the explanation is representativeness, relevance, vividness or something else, we often ignore base rates, and we shouldn’t.
At a recent Financial Times event, psychologist and forecasting expert Philip Tetlock explained that good forecasters pay close attention to base rates. Whether the question is how long a marriage will last, whether a dictator will be toppled or whether a company will go bankrupt, Tetlock argues that it’s a good idea to start with the base rate. How many marriages last? How many dictators are toppled? How many companies go bankrupt? Of course, one may have excellent reasons to depart from the base rate as a forecast, but the base rate should be the beginning of the journey.
On this basis, my guess is that there is a 10 per cent chance that the UK will begin a recession in 2016. How so? Simple: in the past 70 years there have been seven recessions, an average of one starting every 10 years, so the base rate is 10 per cent a year.
Base rates are not just a forecasting aid. They’re vital in clearly understanding and communicating all manner of risks. We routinely hear claims of the form that eating two rashers of bacon a day raises the risk of bowel cancer by 18 per cent. But without a base rate (how common is bowel cancer?) this information is not very useful. As it happens, in the UK, bowel cancer affects six out of 100 people. An 18 per cent increase on six is roughly one, so a bacon-rich diet would cause about one additional case of bowel cancer per 100 people.
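The step from a relative risk to an absolute one is simple arithmetic, sketched below in Python using only the two figures quoted above. The function and variable names are my own, chosen for illustration.

```python
def risk_with_exposure(baseline_risk, relative_increase):
    """Absolute risk after applying a relative increase to the baseline."""
    return baseline_risk * (1 + relative_increase)

baseline_risk = 6 / 100    # bowel cancer affects six in 100 people in the UK
relative_increase = 0.18   # the headline "18 per cent" figure

new_risk = risk_with_exposure(baseline_risk, relative_increase)
extra_cases_per_100 = (new_risk - baseline_risk) * 100

print(f"Risk with a bacon-rich diet: {new_risk * 100:.2f} in 100")
print(f"Additional cases per 100 people: {extra_cases_per_100:.2f}")
```

The same 18 per cent would mean something very different for a disease that affected 60 in 100 people, or six in 100,000; the headline number alone cannot tell us which.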
Thinking about base rates is particularly important when we’re considering screening programmes or other diagnostic tests, including DNA tests for criminal cases.
Imagine a blood test for a dangerous disease that is 75 per cent accurate: if an infected person takes the test, it will detect the infection 75 per cent of the time but it will also give a false positive 25 per cent of the time for an uninfected person. Now, let’s say that a random person takes the test and seems to be infected. What is the chance that he really does have the disease?
The intuitive answer is 75 per cent. But the correct answer is: we don’t know, because we don’t know the base rate.
Once we know the base rate we can express the problem intuitively and solve it. Let’s say 100 people are tested and four of them are actually infected. Then three will have a (correct) positive test, but of the 96 uninfected people, 24 (25 per cent) will have a false positive test. Most of the positive test results, then, are false: only three of the 27 people who test positive, or about 11 per cent, really have the disease.
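The same calculation can be written as a short Python sketch, which makes it easy to try other base rates. The 4 per cent base rate is the one assumed in the example above; the function name is my own.

```python
def chance_infected_given_positive(base_rate, detection_rate, false_positive_rate):
    """Share of positive results that are genuine, by Bayes' rule."""
    true_positives = base_rate * detection_rate
    false_positives = (1 - base_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)

# The example from the text: four infected people in every 100 tested.
rare = chance_infected_given_positive(
    base_rate=0.04, detection_rate=0.75, false_positive_rate=0.25)

# For comparison: the same test when half of those tested are infected.
common = chance_infected_given_positive(
    base_rate=0.50, detection_rate=0.75, false_positive_rate=0.25)

print(f"Base rate 4 per cent:  {rare:.0%} of positives are genuine")
print(f"Base rate 50 per cent: {common:.0%} of positives are genuine")
```

The intuitive answer of 75 per cent turns out to be right only in the special case where half the people tested are infected. With a base rate of 4 per cent, the same test gives about 11 per cent.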
It’s easy to leap to conclusions about probability, but we should all form the habit of taking a step back instead. We should try to find out the base rate, or at least to guess what it might be. Without it, we’re building our analysis on empty foundations.