(Incidentally, hard evidence is not the same as scientific evidence: a great deal of empirical observation is soft. Astronomers do not run experiments with stars, nor meteorologists with clouds.)
Evidence comes in different shapes. Any sign that can be related to a hypothesis is a form of evidence about the hypothesis. A sign may be a simple description, such as:
Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.
Question: Is Linda:
- A bank employee
- A Greenpeace supporter.
This is a slightly modified version of another well-known Kahneman and Tversky experiment (see Chapter 15 of Thinking, Fast and Slow). The description is not accurate enough to answer the question with certainty. So we have to go with the most probable choice: is Linda more likely to be a bank employee or a Greenpeace supporter? Let’s see:
Hypothesis: Linda is a bank employee/Greenpeace supporter. Evidence: Linda’s description (let’s call it E).
The problem is best looked at in odds form:
PO1 = LR1×BO1 and PO2 = LR2×BO2
where 1 is ‘bank employee’ and 2 is ‘Greenpeace supporter’.
- What are the prior odds that Linda is a bank employee/Greenpeace supporter? It is not obvious in absolute terms, but surely BO1>BO2: there are many more bank employees than Greenpeace supporters. Let’s call K the ratio of the prior odds of Greenpeace supporters and bank employees: K=BO2/BO1. Without a description, Linda would clearly be 1/K times more likely to be a bank employee than a Greenpeace supporter: PO1=PO2/K.
- How accurate a portrait of a bank employee/Greenpeace supporter is E? Again, it is not easy to say in absolute terms, but surely Linda looks much more like a Greenpeace supporter than a bank employee: LR2>LR1. Let’s also say that E is totally unconfirmative as a description of a bank employee: LR1=1. Finally, for simplicity let’s assume symmetry, so that accuracy A=TPR and LR=A/(1-A). Then we have:
PO2/PO1 = (LR2×BO2)/(LR1×BO1) = K×A2/(1-A2)
Therefore, given E, the odds that Linda is a Greenpeace supporter are greater than the odds that she is a bank employee if K×A2/(1-A2)>1, that is, if A2>1/(1+K).
For example, if K=10%, then Linda is more likely to be a Greenpeace supporter if A2>91%. If K=1% the required A2 is 99% and if K=20% it is 83%. In any case, the required level of accuracy is very high. The accuracy of a Greenpeace supporter description can go from 0 (“Linda is an avid game hunter and ivory collector”) to 1 (“Linda is the captain of the Rainbow Warrior”), passing through the totally unconfirmative 0.5 (“Linda is blonde and likes chocolate”). E is plausibly more than 50% accurate as a description of a Greenpeace supporter, but it is unlikely to be as high as 80%. Hence the right conclusion, according to Bayes’ Theorem, is that Linda is more likely to be a bank employee than a Greenpeace supporter.
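The arithmetic above can be checked with a few lines of Python. This is only a sketch under the assumptions already made in the text (LR1=1 and the symmetric form LR2=A2/(1-A2)); the function names `required_accuracy` and `posterior_odds_ratio` are mine, introduced for illustration.

```python
def required_accuracy(k: float) -> float:
    """Minimum accuracy A2 at which hypothesis 2 overtakes hypothesis 1.

    Assumes LR1 = 1 and LR2 = A2 / (1 - A2), so that
    PO2/PO1 = k * A2 / (1 - A2) exceeds 1 exactly when A2 > 1 / (1 + k).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return 1.0 / (1.0 + k)


def posterior_odds_ratio(k: float, a2: float) -> float:
    """PO2 / PO1 for a base-odds ratio k and an accuracy a2 in [0, 1)."""
    if not 0.0 <= a2 < 1.0:
        raise ValueError("a2 must be in [0, 1)")
    return k * a2 / (1.0 - a2)


# The three values of K used in the text.
for k in (0.01, 0.10, 0.20):
    print(f"K = {k:.0%}: Greenpeace supporter is more likely only if A2 > {required_accuracy(k):.0%}")
```

With K=10% and a description that is 80% accurate, `posterior_odds_ratio(0.10, 0.80)` comes out below 1, so the bank employee hypothesis still wins.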
But this is not what most people think. The most common answer is that, given E, Linda is more likely to be a Greenpeace supporter. The reason, once again, is the Prior Indifference Fallacy. Under prior indifference, K=1, hence the required A2 falls to 50%: Linda is more likely to be a Greenpeace supporter than a bank employee if she is simply more likely than not to be a Greenpeace supporter.
Consider now a slight variation. Question: Given description E, is Linda:
- A bank employee
- A bank employee who is also a Greenpeace supporter.
The problem is essentially the same. Again K<1, this time not only statistically but logically: 2 must be a subset of 1. Also, LR2>1: Linda looks more like a bank employee and Greenpeace supporter than like a simple bank employee. Hence we can draw the same conclusion: according to Bayes’ Theorem, Linda is more likely to be a bank employee, unless E is a very accurate description of a bank employee cum Greenpeace supporter.
Again, experimental evidence shows that most people think 2 is more likely than 1. Kahneman and Tversky call it the Conjunction Fallacy, referring to the impossibility that K>1 and implying that, therefore, PO1 must be bigger than PO2. However, as we have seen, that is not necessarily the case: there can be sufficiently accurate descriptions of Linda, such that it is rational to conclude that 2 is more likely than 1, despite a lower Base Rate (for example: “Linda is a bond trader who devotes her entire annual bonus to environmental causes”).
Kahneman and Tversky called this cognitive bias Representativeness. Linda is judged to be more likely a Greenpeace supporter than a bank employee because her description is more representative of the former than of the latter. In simpler words, Linda looks more like a typical Greenpeace supporter than a typical bank employee. Such evidence obscures the prevalence of bank employees over Greenpeace supporters in the general population which, in the absence of a description, would naturally imply the opposite probability ranking.
I call this prior indifference because it gets to the crux of the matter: the Inverse Fallacy. People confuse the probability of the hypothesis, given the evidence, with the probability of the evidence, given the hypothesis. And they do so because they assume the hypothesis is equally likely to be true or false.
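A small numerical sketch may make the link between prior indifference and the Inverse Fallacy concrete. It assumes the same symmetry as before (TPR equal to accuracy, FPR equal to one minus accuracy); the 80% accuracy and the 5% base rate are hypothetical figures chosen only for illustration, and `posterior_probability` is an illustrative name.

```python
def posterior_probability(base_rate: float, accuracy: float) -> float:
    """P(H|E) from Bayes' Theorem, assuming symmetry: TPR = accuracy, FPR = 1 - accuracy."""
    true_positive = accuracy * base_rate
    false_positive = (1.0 - accuracy) * (1.0 - base_rate)
    return true_positive / (true_positive + false_positive)


accuracy = 0.80  # P(E|H), hypothetical

# Prior indifference: the hypothesis is assumed as likely to be true as false.
indifferent = posterior_probability(0.50, accuracy)

# A low base rate (hypothetical 5%) with exactly the same evidence.
informed = posterior_probability(0.05, accuracy)

print(f"P(E|H)                        = {accuracy:.0%}")
print(f"P(H|E) under a 50% prior      = {indifferent:.0%}")
print(f"P(H|E) under a 5% base rate   = {informed:.0%}")
```

Under a 50% prior the posterior probability coincides with the accuracy, which is exactly the confusion of P(H|E) with P(E|H). With a low base rate the two come apart sharply.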
Prior indifference also explains probability judgements in response to neutral, unconfirmative evidence. For instance, faced with a totally unrepresentative description of Linda (e.g. “Linda is blonde and likes chocolate”), the right conclusion, according to Bayes’ Theorem, would be to stick to the Base Rate. LR=1 implies PO=BO: neutral evidence is the same as no evidence. But this is not what happens empirically. Given an irrelevant description, people tend to assign the same probability to Linda being a bank employee or a Greenpeace supporter, just as they assign 50% support to the predictions of a useless coin-tossing expert. They are prey to the Prior Indifference Fallacy.
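The neutral-evidence case can be sketched in odds form as well. The base odds of 0.05 are a hypothetical figure and the helper names are mine; the only point is that LR=1 leaves the base odds untouched, whereas starting from even odds yields the 50% answer people tend to give.

```python
def posterior_odds(base_odds: float, likelihood_ratio: float) -> float:
    """Odds form of Bayes' Theorem: PO = LR x BO."""
    return likelihood_ratio * base_odds


def odds_to_probability(odds: float) -> float:
    """Convert odds to a probability: p = odds / (1 + odds)."""
    return odds / (1.0 + odds)


NEUTRAL_LR = 1.0         # unconfirmative evidence, e.g. "blonde and likes chocolate"
true_base_odds = 0.05    # hypothetical base odds, for illustration only
indifferent_odds = 1.0   # prior indifference: even odds

for label, bo in (("base odds 0.05", true_base_odds),
                  ("prior indifference", indifferent_odds)):
    po = posterior_odds(bo, NEUTRAL_LR)
    print(f"{label}: PO = {po:.2f}, probability = {odds_to_probability(po):.1%}")
```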
Why do I go on and on about this? Because I think it is hugely important. We use Linda, cab drivers, child footballers and other simple examples to describe and analyse it, but the Prior Indifference Fallacy is much more consequential. It helps to explain why people make probability misjudgements and, through them, come to believe weird things. What is racism if not the confusion between the probability that somebody is a criminal, given that he is black, and the probability that he is black, given that he is a criminal?