As evidence accumulates, it may result in proving a hypothesis true or false, irrespective of prior odds. When the evidential tug of war has a winner, prior odds are no longer relevant. No matter our starting belief, we are 100% convinced that the sun will rise tomorrow. As four or more accurate coaches concur in calling a child a football champion, his father can be rightfully confident that, however unlikely at the start, the hypothesis that his child is a champion is very well supported.
Since evidence accumulates multiplicatively, the tug of war can also be won thanks to even a single piece of conclusive evidence annihilating all other evidence as well as any prior odds. As Tony opened the door, he proved himself conclusively guilty.
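The multiplicative accumulation described above can be sketched in a few lines of Python. This is only an illustration: the function names and the numbers (a 1-in-1,000 prior and three modest Likelihood Ratios) are assumptions chosen for the example, not values from the text. Conclusive evidence is represented here by an infinite Likelihood Ratio, i.e. evidence that could not occur if the hypothesis were false.

```python
import math

def posterior_odds(prior_odds, likelihood_ratios):
    """Multiply the prior odds by the Likelihood Ratio of each piece of evidence."""
    return prior_odds * math.prod(likelihood_ratios)

def odds_to_probability(odds):
    """Convert odds in favour of a hypothesis into a probability."""
    if math.isinf(odds):
        return 1.0
    return odds / (1.0 + odds)

# Hypothetical numbers, for illustration only.
prior_odds = 1 / 999              # a 1-in-1,000 prior probability
mixed_evidence = [2.0, 0.5, 3.0]  # some evidence for, some against

inconclusive = posterior_odds(prior_odds, mixed_evidence)
conclusive = posterior_odds(prior_odds, mixed_evidence + [math.inf])

print(odds_to_probability(inconclusive))  # still close to the prior
print(odds_to_probability(conclusive))    # certainty, whatever the prior
```

Evidence with a zero Likelihood Ratio would work the other way, driving the odds to zero. The sketch does not handle the degenerate case in which both kinds of conclusive evidence appear together.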
But evidence does not always lead to the truth. The tug of war does not always have a winner: it can remain stuck somewhere in the middle, where all we can say is that the hypothesis is probably true, and therefore also probably false. In that case, our beliefs continue to be influenced by our priors.
Dependence on prior beliefs is an inconvenient obstacle to the pursuit of truth. Ideally, we would like evidence to speak for itself, swamp priors and give us certainty. But when evidence is not so obliging, ignoring priors, or pretending they do not exist, is not the right course of action: it is the Prior Indifference Fallacy – the assumption, most often wrong, that the hypothesis under investigation has a 50% prior probability of being true.
Prior indifference is not only the fallacy of hopeful fathers, duped lovers and swayed investors. It is also the error made by statisticians who, by ignoring priors (i.e. setting BO=1), identify Posterior Odds with the Likelihood Ratio: PO=LR. As Bruno de Finetti put it:
Tracing it back to Bayes’s theorem, what goes wrong is that those who do not wish to use it in a legitimate way – on account of certain scruples – have no scruples at all about using it in a manifestly illegitimate way. That is to say, they ignore one of the factors (the prior probability) altogether, and treat the other (the likelihood) as though it in fact meant something other than it actually does. This is the same mistake as is made by someone who has scruples about measuring the arms of a balance (having only a tape-measure at his disposal, rather than a high precision instrument), but is willing to assert that the heavier load will always tilt the balance (thereby implicitly assuming, although without admitting it, that the arms are of equal length!). (Theory of Probability, Volume 2, p. 248).
The typical hypothesis of a statistical model is that some parameter has a certain value. The hypothesis is tested in the light of some evidence, consisting of a set of data. TPR=P(E|H) is the probability of the evidence in case H is true, i.e. in case the parameter has the specified value; and FPR=P(E|not H) is the probability of the evidence in case H is false. Hence the Likelihood Ratio LR=TPR/FPR measures how much more or less likely it is to observe the data in case the parameter has the specified value, compared to the case where it doesn’t.
Bayes’ Theorem says that the odds that H is true in the light of the evidence equal the Likelihood Ratio times the prior odds that H is true. Ignoring prior odds, and equating posterior odds to the Likelihood Ratio, is only appropriate if the accumulated evidence leads to the truth, i.e. if there is an overwhelming amount of confirmative or disconfirmative evidence that cumulatively proves H true or false: the parameter certainly has or certainly does not have the specified value.
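In symbols, using the abbreviations already introduced (PO for posterior odds, BO for prior odds), the relation can be written as below. PP, the posterior probability, is a label added here for convenience and is not part of the original notation.

```latex
\[
PO = LR \times BO,
\qquad
LR = \frac{TPR}{FPR} = \frac{P(E \mid H)}{P(E \mid \text{not } H)}
\]
\[
PP = \frac{PO}{1 + PO}
\]
\[
BO = 1 \;\Longrightarrow\; PO = LR
\]
```

The last line is prior indifference: BO=1 corresponds to a 50% prior probability, and only under that assumption do posterior odds coincide with the Likelihood Ratio.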
This implicit expectation of convergence depends in turn on the assumption that the parameter is ‘out there’, reflecting an ‘objective’ feature of reality. In that case, irrespective of prior odds, it either has or does not have the specified value (assuming for simplicity that it can take one of a finite number of values; with a few qualifications, the same is true in the continuum). Therefore, tilting the balance one way or the other is only a matter of gathering enough data. Hence one might as well start from perfect ignorance and let evidence speak for itself.
But how much data is enough data? When does evidence become overwhelmingly confirmative or disconfirmative? Whatever our priors, we certainly have overwhelming evidence to prove tomorrow’s sunrise. However small the father’s initial prior, his child will almost certainly become a football champion if ten accurate coaches say so. But what if the father asks only one or two coaches? Or what if he asks ten coaches but they express divergent views, so that the product of their Likelihood Ratios is neither very large nor close to zero? In those circumstances the father’s priors do matter, and it does matter that they are, however approximately, correct. If he starts with very low priors, the father will correctly conclude that inadequately supported or unsettled views should leave him sceptical about the chances of his son’s success. But if, blinded by evidence, he neglects the Base Rate and becomes prior indifferent, his posterior odds will be grossly overstated.
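A small numerical sketch may make the father's predicament concrete. All the numbers are hypothetical: a base rate of one champion per thousand children, and coaches who call a champion correctly 95% of the time and wrongly 5% of the time. The coaches' verdicts are also assumed to be independent of one another, and the function names are invented for the example.

```python
import math

def coach_likelihood_ratio(says_champion, tpr=0.95, fpr=0.05):
    """Likelihood Ratio of one coach's verdict.

    tpr = P(coach says 'champion' | child is a champion)
    fpr = P(coach says 'champion' | child is not a champion)
    Both default values are hypothetical.
    """
    if says_champion:
        return tpr / fpr
    return (1 - tpr) / (1 - fpr)

def posterior_probability(prior_probability, verdicts):
    """Posterior probability after a list of independent True/False verdicts."""
    prior_odds = prior_probability / (1 - prior_probability)
    lr = math.prod(coach_likelihood_ratio(v) for v in verdicts)
    odds = prior_odds * lr
    return odds / (1 + odds)

BASE_RATE = 0.001    # hypothetical: one child in a thousand is a champion
INDIFFERENT = 0.5    # the prior implied by the Prior Indifference Fallacy

scenarios = {
    "one coach says yes": [True],
    "two coaches say yes": [True, True],
    "ten coaches say yes": [True] * 10,
    "ten coaches, split 5-5": [True] * 5 + [False] * 5,
}

for label, verdicts in scenarios.items():
    informed = posterior_probability(BASE_RATE, verdicts)
    naive = posterior_probability(INDIFFERENT, verdicts)
    print(f"{label}: base-rate prior {informed:.4f}, indifferent prior {naive:.4f}")
```

Under these assumptions, a single favourable verdict leaves the base-rate posterior below 2%, while the prior-indifferent father already believes it to be 95%. With ten concurring coaches the two posteriors are practically identical, and with an evenly split panel each posterior simply falls back to its own prior.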
Likewise, the available data may be scarce or ambiguous, and therefore insufficient for a precise estimate of the model’s parameter. In such a case, a correct prior probability assigned to the hypothesis that the parameter has the specified value is key to a proper evaluation of the probability that it actually does. Here prior indifference can be just as misleading as it is for the father. This works both ways: a low prior probability will require abundant and convergent evidence; but if the prior is high, less and rougher evidence may suffice.
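How much evidence a given prior calls for can be roughed out as follows. The sketch assumes, purely for illustration, that every piece of evidence is independent, points the same way and carries the same Likelihood Ratio (19 here), and that we want a 99% posterior; none of these figures comes from the text, and the function name is invented.

```python
import math

def pieces_of_evidence_needed(prior_probability, likelihood_ratio, target_probability):
    """Smallest number n of concurring, independent pieces of evidence, each with
    the same Likelihood Ratio (> 1), such that the posterior reaches the target:
    prior_odds * likelihood_ratio**n >= target_odds.
    """
    prior_odds = prior_probability / (1 - prior_probability)
    target_odds = target_probability / (1 - target_probability)
    if prior_odds >= target_odds:
        return 0  # the prior alone already meets the target
    return math.ceil(math.log(target_odds / prior_odds) / math.log(likelihood_ratio))

# Hypothetical priors: a sceptical one, an indifferent one and a confident one.
for prior in (0.001, 0.5, 0.9):
    n = pieces_of_evidence_needed(prior, likelihood_ratio=19, target_probability=0.99)
    print(f"prior {prior}: {n} concurring pieces of evidence")
```

With these inputs the sceptical prior needs four concurring pieces of evidence, the indifferent prior two, and the confident prior one, which is the asymmetry described above.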
A mistaken notion of the goals of scientific inquiry rejects prior dependence as subjective and therefore ‘unscientific’. But there is really nothing more to it than Laplace’s dictum: Extraordinary claims require extraordinary evidence. Whence its corollary: Ordinary claims require ordinary evidence. We are readily disposed to believe that Uri Geller can bend spoons with his hands; but when he says he does it with his mind we want to look a bit closer.
Besides, parameters are not always ‘out there’: they are often just an attribute of our representation of reality. This is definitely the case in economics: there is no such thing as the Marginal Propensity to Consume, the Coefficient of Relative Risk Aversion, or the Weighted Average Cost of Capital. So the expectation that, given enough data, we can certainly discover their true value is not warranted. And the probability of the hypothesis that some parameter has a specified value may not necessarily converge to one of the two boundaries of the probability spectrum, but may stay in the middle and, as such, remain dependent on our priors.
This is not an inconvenience: it is the natural state of scientific inquiry, whose ethos is to be comfortable with uncertainty and remain open to evidence-led reversals of any established truth. Examples abound. Just to pick one of the latest:
After 35 years as dietary gospel, the idea that saturated fat is bad for your heart appears to be melting away like a lump of butter in a hot pan.
Prior dependence is inherent to science, not alien to it. Ignoring it is tempting but wrong. Prior indifference does not eliminate priors: as Irving J. Good put it, it just SUTCs them, i.e. Sweeps them Under The Carpet (Good Thinking, p. 23).
Rather than pretending they don’t exist, statisticians should try to get their priors right.