We examine evidence to test hypotheses about events. Observation, experience and trial-and-error help us update our prior beliefs. My four-year-old child is now an expert at putting on shoes.
Events can be more or less regular. Through repeated experiment, the probability of regular events tends to converge to certain values. The probability that the sun will rise tomorrow is 100%. The probability that a fair coin will land on Heads is 50%. But the probability of irregular events is much more difficult to pin down. What is the probability that it will rain tomorrow? Although we can observe a huge amount of data, we don’t know. Data can certainly be used to calculate a Base Rate: just check how many times it rained in the last N days. As N increases, the Base Rate converges to a certain value. But how useful is that number for knowing the probability that it will rain tomorrow? That probability depends on a vast and vague number of factors: location, time of the year, temperature, humidity, wind, yesterday’s weather, and so on.
Imagine you are asked the question: On the basis of all the available information, will it rain tomorrow? You can only answer Yes or No, not because you are sure that it will or will not rain, but because it will surely either rain or not. So, answering Yes or No helps to determine your accuracy as a weather forecaster. You do this for, say, 2000 days. After almost five and a half years (good luck to you), your answers can be catalogued as in Table 1:
Overall, it rained 600 out of 2000 days: the Base Rate was 30%. You predicted rain half of the time. Of those 1000 calls, 330 times it did rain and 670 times it didn’t. When you said it would not rain, 270 times it did and 730 times it didn’t. Using our notation, you had TPR=55% (=330/600), FPR=48% (=670/1400), FNR=45% and TNR=52%. Your overall accuracy, taken as the average of TPR and TNR, was 54%. Your forecast provided some improvement over the Base Rate, since the Posterior Probability increased to 33% (=330/1000), but, as you might have suspected, as a weather forecaster you were not very different from a coin toss.
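The arithmetic behind these figures can be retraced with a short Python sketch. The function name `forecast_rates` and the dictionary keys are illustrative choices, not notation from the text; the four counts are the ones quoted above. The 54% accuracy is reproduced here as the average of TPR and TNR, which is an assumption about how that figure was obtained; the plain share of correct calls is computed alongside it for comparison.

```python
def forecast_rates(tp, fp, fn, tn):
    """Summarise a 2x2 forecast table (illustrative helper, not from the text)."""
    rainy = tp + fn   # days on which it actually rained
    dry = fp + tn     # days on which it stayed dry
    total = rainy + dry
    tpr = tp / rainy
    tnr = tn / dry
    return {
        "base_rate": rainy / total,
        "TPR": tpr,
        "FPR": fp / dry,
        "FNR": fn / rainy,
        "TNR": tnr,
        "posterior": tp / (tp + fp),          # P(rain | forecast said rain)
        "avg_tpr_tnr": (tpr + tnr) / 2,       # assumed meaning of "overall accuracy"
        "share_correct": (tp + tn) / total,   # fraction of all calls that were right
    }


# Counts quoted in the text: 330 hits, 670 false alarms, 270 misses, 730 correct dry calls.
table1 = forecast_rates(tp=330, fp=670, fn=270, tn=730)
for name, value in table1.items():
    print(f"{name}: {value:.0%}")
```

Under that assumption the average of TPR and TNR rounds to the 54% quoted above, while the share of correct calls (1060 out of 2000) is 53%. Either way, the forecaster is barely better than a coin.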
Imagine, however, that your objective was not to guess tomorrow’s weather to the best of your knowledge and ability, but to look like an accurate rain forecaster. You want to claim: “In the last five and a half years, I correctly called 95% of all rainy days”. What you could do is very simple: call a rainy day most of the time. For example, if you do so 9 out of 10 days (leaving out the hottest, sunniest periods, so as to avoid sounding crazy), your answers may look like Table 2:
Your TPR would be an impressive 95%: 570 correctly forecasted rainy days out of 600. The catch is, of course, that the other 1230 times that you said it would rain, it didn’t: your FPR would be a catastrophic 88%. Hence your overall accuracy would still be 54% and your Posterior Probability would actually fall to 32%.
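A side-by-side comparison makes the trade-off explicit. The sketch below rebuilds the four cells of each table from the totals given in the text (2000 days, 600 of them rainy, plus the number of rain calls and of correct rain calls). The helper `table_from_calls` and the strategy labels are hypothetical names introduced for illustration, and accuracy is again assumed to be the average of TPR and TNR.

```python
def table_from_calls(total_days, rainy_days, rain_calls, hits):
    """Rebuild (TP, FP, FN, TN) from the totals (hypothetical helper)."""
    tp = hits
    fp = rain_calls - hits                 # rain was called but the day stayed dry
    fn = rainy_days - hits                 # rainy days that were not called
    tn = total_days - rainy_days - fp      # dry days correctly called dry
    return tp, fp, fn, tn


strategies = {
    "first forecast": table_from_calls(2000, 600, rain_calls=1000, hits=330),
    "rain 9 days in 10": table_from_calls(2000, 600, rain_calls=1800, hits=570),
}

summary = {}
for label, (tp, fp, fn, tn) in strategies.items():
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    summary[label] = {
        "TPR": tpr,
        "FPR": fpr,
        "posterior": tp / (tp + fp),
        "avg_tpr_tnr": (tpr + (1 - fpr)) / 2,   # assumed meaning of "overall accuracy"
    }
    print(label, {k: f"{v:.0%}" for k, v in summary[label].items()})
```

The jump in TPR is paid for almost one-for-one by the jump in FPR, which is why the averaged accuracy does not move and the Posterior Probability slips.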
But people give more weight to True Positives than to False Positives: they are worried about getting wet more than they mind carrying along an unnecessary umbrella. High rain aversion results in a big reward for True Positives and a mild penalty for False Positives. This is a form of Confirmation Bias.
The Confirmation Bias gives experts an incentive to increase their TPR. In our example, the Base Rate is 30% and, given low accuracy, the Posterior Probability is only a little higher at 33%. Prior Indifference (BR=50%) would boost it, but barely above a coin toss, to 53%. Increasing TPR to 95%, and consequently FPR to 88%, would even lower the Posterior Probability to 32% and, under prior indifference, to 52%. However, by keeping the limelight on the high TPR and obfuscating the high FPR, the Confirmation Bias gives experts a stronger incentive to be overconfident.
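The four Posterior Probabilities in this paragraph follow from Bayes’ theorem applied to TPR, FPR and the Base Rate. The sketch below is one way to check them; the function name `posterior` and the labels are illustrative, and the rates are taken unrounded from the counts quoted earlier (330/600 and 670/1400 for the first forecast, 570/600 and 1230/1400 for the inflated one).

```python
def posterior(tpr, fpr, base_rate):
    """Bayes' theorem: P(rain | forecast says rain)."""
    hit = tpr * base_rate                  # forecast says rain and it rains
    false_alarm = fpr * (1 - base_rate)    # forecast says rain and it stays dry
    return hit / (hit + false_alarm)


forecasters = {
    "first forecast": (330 / 600, 670 / 1400),
    "inflated TPR": (570 / 600, 1230 / 1400),
}

results = {}
for label, (tpr, fpr) in forecasters.items():
    results[label] = {
        "BR=30%": posterior(tpr, fpr, 0.30),
        "BR=50%": posterior(tpr, fpr, 0.50),   # prior indifference
    }
    print(label, {k: f"{v:.0%}" for k, v in results[label].items()})
```

Under prior indifference the formula reduces to TPR/(TPR+FPR), so raising both rates together leaves the Posterior Probability stuck near 50%.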
Overconfidence is the difference between an artificially high TPR and the true TPR which would result from an honest prediction effort. It pays to be overconfident if the gain from a higher TPR exceeds the loss from a higher FPR.
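To see when overconfidence pays, one can attach weights to the two kinds of call. The sketch below is purely illustrative: the linear payoff, the `reward` per True Positive and the `penalty` per False Positive are assumptions introduced here, not quantities from the text. Only the counts (330 and 670 for the honest forecast, 570 and 1230 for the inflated one) come from the example.

```python
def perceived_payoff(tp, fp, reward, penalty):
    """Hypothetical linear payoff: credit for hits minus blame for false alarms."""
    return reward * tp - penalty * fp


honest = {"tp": 330, "fp": 670}      # counts from the first forecast
inflated = {"tp": 570, "fp": 1230}   # counts from calling rain 9 days in 10

# Inflating the forecast adds 240 hits and 560 false alarms, so it pays only
# when the reward/penalty ratio exceeds 560/240.
extra_tp = inflated["tp"] - honest["tp"]
extra_fp = inflated["fp"] - honest["fp"]
break_even_ratio = extra_fp / extra_tp


def overconfidence_pays(reward, penalty):
    """True if the inflated forecast scores higher than the honest one."""
    return (perceived_payoff(reward=reward, penalty=penalty, **inflated)
            > perceived_payoff(reward=reward, penalty=penalty, **honest))


for reward in (1, 2, 3):
    print(f"reward={reward}, penalty=1 -> pays: {overconfidence_pays(reward, 1)}")
print(f"break-even reward/penalty ratio: {break_even_ratio:.2f}")
```

With these counts, an audience has to value a correctly called rainy day more than about 2.3 times as much as it resents a false alarm before the inflated forecast comes out ahead.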
Our weather forecaster is like Dr. Doom: he wants to be perceived as an accurate predictor of rainy days. But he has an obvious disadvantage: his trick can be easily spotted. After a few False Alarms, his credibility will rapidly fade. Dr. Doom, on the other hand, has a whole bag of tricks up his sleeve. When his call for a market downturn turns out to be false, he can always push it forwards and, when a downturn finally arrives, vindicate his prediction. He can also appeal to prudence: it is “better to be safe than sorry”. And he can cultivate his credibility by trumpeting True Positives with fanfare and quietly brushing False Positives under the rug.