I put together a shortened version of the Blinded by Evidence paper:

Heuristic Biases and Knightian Uncertainty: The Prior Indifference Fallacy.

The shorter paper concentrates on a single point: seemingly unrelated heuristic biases – Representativeness, Anchoring, Availability and Hindsight – can be explained by a common underlying bias: the Prior Indifference Fallacy. Prior Indifference can in turn be seen as deriving from Knightian uncertainty. The main difference among the four biases is in the type of evidence used to update beliefs.

Comments welcome.

A legal trial is a test of the hypothesis of Guilt. A judge examines evidence to evaluate the probability that the defendant is guilty and decides to convict him if the probability is high enough, or to acquit him if it isn’t. How high the probability of Guilt needs to be for a conviction depends on the standard of proof, which rises with the gravity of the allegation and the corresponding severity of the punishment.

But what determines the standard of proof? Let’s see. The judge has a utility function, defined over two possible states: Guilt or Innocence, and two possible decisions: Convict or Acquit.

The judge draws positive utility U(CG) from convicting a guilty defendant and negative utility U(CI) from convicting an innocent one. And he draws positive utility U(AI) from acquitting an innocent defendant and negative utility U(AG) from acquitting a guilty one. Based on these preferences, the probability of Guilt that leaves the judge indifferent between conviction and acquittal is given by:

(1)   P∙U(CG) + (1−P)∙U(CI) = P∙U(AG) + (1−P)∙U(AI)

Hence:

(2)   P = [U(AI) − U(CI)] / [U(AI) − U(CI) + U(CG) − U(AG)]

(This comes from a paper by Terry Connolly in this collection).

The judge will convict if the probability of Guilt is higher than P and acquit if it is lower. In order to examine the properties of P, we define BB=U(CI)/U(AG), CB=U(AI)/U(CG) and DB=-U(AG)/U(CG). We call BB the Blackstone Bias, after Sir William Blackstone’s Principle: “It is better that ten guilty persons escape than that one innocent suffer”. BB>1 means that the pain of a wrongful conviction – a False Positive – is higher than the pain of a wrongful acquittal – a False Negative. Similarly, we call CB the Compassion Bias, where CB>1 means that the judge draws more pleasure from a rightful acquittal – a True Negative – than from a rightful conviction – a True Positive. Finally, we call DB the Distress Bias, where DB>1 means that the pain of a wrongful acquittal – a False Negative – is higher than the pleasure of a rightful conviction – a True Positive. Using these definitions, (2) can be rewritten as:

(3)   P = (CB + BB∙DB) / (1 + CB + DB + BB∙DB)

where P is a function of the three biases and is independent of the utility function’s metric.

Assume first that the judge has no biases: BB=CB=DB=1. In this case, P=50%: conviction requires a Preponderance of evidence. An unbiased judge convicts if the defendant is more likely to be guilty than innocent. This may be an acceptable standard for minor charges, where the limited size of the penalty renders the judge indifferent between False Positives and False Negatives and between True Positives and True Negatives. As the severity of the punishment increases, however, a conscientious judge will start caring more about avoiding a wrongful conviction than a wrongful acquittal. In this case, assuming for example the Blackstone Principle (BB=10), P increases to 85%: in order to convict, the judge will require Clear and convincing evidence. The same happens if we increase CB to 10, i.e. if the judge cares more about reaching a rightful acquittal than a rightful conviction. If both BB and CB are increased to 10, P increases to 91%, thus entering the Beyond reasonable doubt zone.

Notice that, if BB=CB, then (3) reduces to P=BB/(1+BB), which is 50% for BB=1, 91% for BB=10 and tends to 1 as BB grows further (P=99% requires BB=99). Hence, if BB=CB, increasing DB has no effect on P: as long as the judge is indifferent between the two ways of being wrong and between the two ways of being right, his attitude towards guilt does not matter. DB affects P only if BB≠CB. If, for example, BB=10 and CB=1, then increasing DB from 1 to 10 increases P from 85% to 90%. If, on the other hand, BB=1 and CB=10, then the same DB increase brings P down from 85% to 65%. This makes sense: a higher DB increases the sensitivity to wrongful acquittals and decreases the sensitivity to rightful convictions.

What happens with ‘perverse’ biases, i.e. lower than 1? For example, we can call BB=0.1 the Bismarck Bias: “It is better that ten innocent persons suffer than that one guilty escapes”. In this case, unsurprisingly, P decreases to 35%: the judge requires just about one-third probability of Guilt in order to convict. The same happens if CB=0.1 – which can be called the Callousness Bias. And if BB and CB are both 0.1, P goes all the way down to 9% – unpleasant news for defendants.

Notice that perversion is not about the signs of the utility function: U(CI) and U(AG) are still negative and U(CG) and U(AI) still positive. A perverse judge is not one who draws pleasure from wrongful verdicts and pain from rightful ones. Perversion is about relative utilities: U(CI)>U(AG) – the judge would rather convict an innocent person than acquit a guilty one – and U(AI)<U(CG) – he draws more pleasure from convicting a guilty person than from acquitting an innocent one. Compared to sign reversals, these may appear as secondary perversions. But they are all that is needed to wreak havoc on the standard of proof.
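All of the standards of proof above can be checked numerically. Here is a minimal sketch in Python (the function name is mine; the formula is equation (3), with BB, CB and DB defined as in the text):

```python
def standard_of_proof(BB, CB, DB):
    """Indifference probability of Guilt from (3):
    P = (CB + BB*DB) / (1 + CB + DB + BB*DB)."""
    return (CB + BB * DB) / (1 + CB + DB + BB * DB)

# Unbiased judge: Preponderance of evidence
print(round(standard_of_proof(1, 1, 1), 2))      # 0.5
# Blackstone Principle (BB=10): Clear and convincing evidence
print(round(standard_of_proof(10, 1, 1), 2))     # 0.85
# Blackstone and Compassion together: Beyond reasonable doubt
print(round(standard_of_proof(10, 10, 1), 2))    # 0.91
# Bismarck Bias (BB=0.1)
print(round(standard_of_proof(0.1, 1, 1), 2))    # 0.35
# Both perverse biases: unpleasant news for defendants
print(round(standard_of_proof(0.1, 0.1, 1), 2))  # 0.09
```

Note that P depends only on the three ratios, not on the scale of the utility function: multiplying all four utilities by a constant leaves every argument, and hence P, unchanged.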

In civilised legal systems, the standard of proof is inspired by worthy principles, aimed at safeguarding the rights of the innocent, especially as the severity of punishment increases. Uncivilised systems are characterised by the opposite tendency: a greater focus on hitting the guilty, combined with a lower concern for ‘collateral damage’. In reality, however, the distinction is not as neat: perverse utility functions exist also in advanced democracies, especially where judges have strong incentives to convict and collateral damage is less manifest.

A perverse verdict is the result of a bad decision process where, helped by hindsight, the judge imposes a high cost of False Positives on others, in order to avoid the cost of a Miss on himself. Do you want to catch the thief? Shoot everyone in sight. Do you want a culprit to blame for a negative outcome? Accuse him of negligence – how could he possibly ignore it? He should have known better.

Not as bloody in practice, but as uncivilised in principle.

The Principle of Sufficient Reason and its corollary, the Hindsight Bias, are the source of many of what David H. Fischer called Historians’ Fallacies.

If everything that ever happened was destined to do so in the only possible way, according to predetermined causes, history is just a chronological narration of necessary events unfolding towards an inexorable present. As crazy as this sounds, blatant hindsight has been and still is part and parcel of much historical analysis. In explaining the past, historians are faced with a strong impulse to presume the inevitability of what happened, and to see it in the light of what was then an unknown future.

Was Nazi Germany destined to lose World War II? Of course not. But it did, and the historian’s mission is to provide a satisfactory explanation of why. But what is a satisfactory explanation? As we know, satisfaction is in the eye of the beholder: children are satisfied that Santa Claus comes through the chimney because he is magic, in the same way as the ancient Greeks attributed natural phenomena to the might of some god. The historians’ equivalents of magical explanations are apparently more sophisticated, but essentially as naïve. They rest on a more or less explicit assumption that history is governed by some underlying force, which, like gravity driving a ball to the floor, leads the past towards the present. The assumption has a long and varied genealogy, culminating in Hegel’s delirious fantasies:

The only thought which Philosophy brings with it to the contemplation of History is the simple conception of Reason; that Reason is the Sovereign of the World; that the history of the world, therefore, presents us with a rational process. This conviction and intuition is a hypothesis in the domain of history as such. In that of Philosophy it is no hypothesis. It is there proved by speculative cognition, that Reason – and this term may here suffice us, without investigating the relation sustained by the Universe to the Divine Being – is Substance, as well Infinite Power; its own Infinite Material underlying all the natural and spiritual life which it originates, as also the Infinite Form – that which sets this Material in motion. (The Philosophy of History, p. 9).

Camouflaged in Hegel’s verbal acrobatics is the disarmingly vacuous notion that history has a predestined direction, a higher purpose that, unsurprisingly, ends up coinciding with one’s own ideals – the Prussian state in Hegel’s case, or communism for the ‘upturned Hegelian’ Karl Marx, or, to take a more recent example, liberal democracy for Francis Fukuyama, author of the incredibly unwisely titled The End of History and the Last Man.

Obviously, history has no purpose. There is no such thing as historical necessity, no grand law of historical progress enabling us to explain the past and foretell the future. The point has been made, forcefully and definitively, by Karl Popper in The Poverty of Historicism. This doesn’t mean that ‘history is just one damn thing after another’ (as Arnold Toynbee, another visionary inventor of universal historical laws, protested it wasn’t – according to an overly cited but untraceable quote. Another case for the Quote Investigator). As in any other field of enquiry, historians ask historical questions and build explanations based on historical evidence. They don’t just recount what happened: they explain why, linking a countless number of events in a coherent causal chain. But causes can only be seen after the event. Before it happens, an event is only one of several possibilities, each of which can happen with some probability. Contrary to the Principle of Sufficient Reason, what happened was not bound to do so and its cause was not the only necessary explanation. Causes are only as predictable as the events they explain. They appear as predictable or even perfect evidence only after the fact.

‘What happened’ is the epitome of soft evidence: unrepeatable, uncontrollable, unique. But we can still learn from it, provided that we allow for the possibility that historical reality might have been different from what turned out to be the case. Hugh Trevor-Roper thus made the point:

At any given moment in history there are real alternatives, and to dismiss them as unreal because they were not realized is to take the reality out of the situation. How can we ‘explain what happened and why’ if we only look at what happened and never considered the alternatives, the total pattern of forces whose pressure created the event? It is only if we place ourselves before the alternatives of the past, as of the present, only if we live for a moment, as the men of the time lived, in its still fluid context and among its still unresolved problems, if we see those problems coming upon us, as well as look back on them after they have gone away, that we can draw useful lessons from history. (History and Imagination, p. 363).

In his most famous book, The Last Days of Hitler, Trevor-Roper proved that Hitler had died in his bunker in Berlin and was not alive and living in the West, as claimed by Soviet propaganda. But what if Hitler had managed to escape? And what if Germany had invaded Britain in May 1940? What if it had defeated the Soviet Union in 1941? As well argued by Niall Ferguson in the Introduction to Virtual History, these are not idle questions (the last two are the subject of Andrew Roberts’ and Michael Burleigh’s essays in the collection).

Of course, what-ifs can be more or less intentionally silly: ‘If Cleopatra’s nose had been shorter, the whole face of the earth would have changed’, Pascal noted in his Pensées (392). And for want of a nail … the kingdom was lost. But if the question is plausible, and the alternative it poses is one of the relevant possibilities that contemporaries actually faced at the time, proper counterfactual history can be highly instructive. Seeing history as the ineluctable product of necessary causes, on the other hand, can leave us exposed to a resigned and undiscerning acceptance of whatever happens. In Baruch Fischhoff‘s words:

When we attempt to understand past events, we implicitly test the hypotheses or rules we use both to interpret and to anticipate the world around us. If, in hindsight, we systematically underestimate the surprises that the past held and holds for us, we are subjecting those hypotheses to inordinately weak tests and, presumably, finding little reason to change them. Thus, the very outcome knowledge which gives us the feeling that we understand what the past was all about may prevent us from learning anything from it.

What if the Nazis had won the war? Let’s venture to imagine the winners of this year’s Goebbels Music Awards:

We have seen how the Prior Indifference Fallacy underlies three well-documented cognitive biases: Representativeness, Anchoring and Availability. A fourth one that can be similarly interpreted is the Hindsight Bias: the tendency to regard events as predictable after they have occurred (see Chapter 19 of Thinking, Fast and Slow).

How could US intelligence fail to prevent 9/11? How could the Federal Reserve fail to detect the US housing bubble? How could the SEC fail to spot Bernard Madoff?

As events happen, they require explanations. We want to know why they happened. Explanations make sense of events, linking them to preceding events in a causal chain that gives us a satisfactory account of what turned out to be the case. But causes can only be seen after the events. Before events happen, all we can see are other events – evidence whose links to what happened were only probable. 9/11 was the result of a countless number of preceding events, none of which was bound to happen for certain. High house prices were not destined to cause the 2008 recession. As hard as it is to believe after the fact, Madoff did not look like an obvious fraud.

Nothing that happens is bound to do so. Everything is the result of a long chain of more or less probable events. As common sense as this is, it runs counter to the Principle of Sufficient Reason, according to which there is no such thing as chance: everything is destined to occur in the only possible way, according to its causes. The Hindsight Bias is a corollary of the Principle of Sufficient Reason.

Let’s take Madoff. A few years before the scandal broke out, I was having dinner with a friend who, until a few months earlier, had been the Italian private banking head of a large American firm. He told me he had quit his job and was now working for one of the largest feeder funds of Madoff Investment Securities. ‘Who?’, I said, as he embarked on an enthusiastic description of the split-strike conversion strategy that had allowed Madoff to earn 15% returns year after year, with little volatility and no management fee. “I know what you’re thinking” – he concluded, as I stared at him with an are-you-out-of-your-mind look – “it can’t be true. But it is. It is one of the largest broker-dealer firms on Wall Street and Madoff is one of the best respected hedge fund managers”.

So let’s go back a few years and test the hypothesis ‘Madoff is a crook’. As we know, PO=LR∙BO: the Posterior Odds of the hypothesis equal the Likelihood Ratio of the evidence times the Prior Odds of the hypothesis. In this case, our evidence is the split-strike conversion strategy. Let’s take a very sceptical view of it and say TPR=100%: the probability that Madoff would use that strategy, given that he is a crook, is 100%; and FPR=5%: the probability that he would use the strategy, given that he is not a crook, is only 5%. Hence LR=20: the evidence is highly confirmative of the hypothesis that Madoff is a crook. But in order to measure the probability that Madoff is a crook, given that he uses the strategy, we need to multiply LR by BO: the prior odds that Madoff is a crook. In my perfect ignorance – I didn’t know who he was – I had BO=1, which gave me PO=20 and therefore PP=95%: Madoff was almost certainly a crook – hence my bewilderment. But for my friend – along with thousands of wealthy investors and sophisticated advisors – the prior probability that Madoff was a crook was very small: let’s say one in a thousand. We know these numbers: they are the same as in our child footballer story. According to my friend, then, the probability that Madoff was a crook, in the light of his investment strategy, was only 2%. In fact, it was probably much less than that, given that my friend would have chosen a much higher FPR. With FPR=20%, for example, LR=5 and PP=0.5%. In that case, even after increasing the base rate BR to a more circumspect 1%, PP would still have been less than 5%.
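The whole calculation fits in a few lines of Python (a sketch; the function name is mine, the numbers are the ones used above):

```python
def posterior_prob(prior, tpr, fpr):
    """Odds form of Bayes' Theorem: PO = LR * BO, with LR = TPR/FPR,
    then convert the posterior odds back to a probability."""
    prior_odds = prior / (1 - prior)
    post_odds = (tpr / fpr) * prior_odds
    return post_odds / (1 + post_odds)

# Perfect ignorance: prior 50%, TPR=100%, FPR=5%
print(round(posterior_prob(0.5, 1.0, 0.05), 2))    # 0.95
# My friend's prior: one in a thousand
print(round(posterior_prob(0.001, 1.0, 0.05), 2))  # 0.02
# A more charitable FPR of 20%
print(round(posterior_prob(0.001, 1.0, 0.20), 3))  # 0.005
```

The same evidence, filtered through different priors, yields posteriors of 95% and 2%: the disagreement at the dinner table was entirely about BO, not LR.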

This is not to justify my friend’s or anybody else’s gullibility. But to conclude that they were all utter dunces or, worse, that the feeders ‘could not have possibly ignored’ who Madoff was, and were therefore in cahoots with him, is wrong. The mistake is caused by a Hindsight Bias: once events happen – Madoff’s fraud is discovered – we tend to ignore the state of knowledge on which prior beliefs were formed. Once we find out that Madoff was a crook, we forget that he was a highly respected professional, and mistakenly conclude that his dishonesty was highly predictable. This is a backward Prior Indifference Fallacy: blinded by the evidence of our discovery, we inadvertently shift our and everybody else’s past priors to 50%. In Madoff’s case, these would have been much better priors. But we can only say so with the benefit of hindsight.

In addition, hindsight makes evidence appear more accurate than it was before the event. As we have seen, starting from a low prior of dishonesty, even a very sceptical view of the split-strike conversion strategy was not enough to conclude that Madoff was a crook. After the event, however, we tend to regard the same evidence as conclusive, and retrospectively drop FPR all the way to zero: there was no way that Madoff would have used that strategy if he were not a crook. It can indeed be argued that a closer look at Madoff’s strategy should have convinced anyone that its FPR was virtually 0%: there was near-perfect evidence that Madoff was dishonest, irrespective of his outstanding reputation. And there is no denying that the prospect of hefty returns and advisory fees made some people’s scrutiny not as diligent as it should have been. But for most people this became clear only with the benefit of hindsight.
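The retrospective shrinking of FPR is easy to quantify. Holding the one-in-a-thousand prior fixed, a short Python sketch (the FPR values in the sweep are illustrative, not the author's) shows how far FPR must fall before the evidence overwhelms the prior:

```python
def posterior_prob(prior, tpr, fpr):
    """Posterior probability via the odds form of Bayes' Theorem."""
    prior_odds = prior / (1 - prior)
    post_odds = (tpr / fpr) * prior_odds
    return post_odds / (1 + post_odds)

# Prior that Madoff is a crook: one in a thousand; TPR = 100%
for fpr in (0.20, 0.05, 0.01, 0.001, 0.0001):
    p = posterior_prob(0.001, 1.0, fpr)
    print(f"FPR={fpr}: P(crook | strategy) = {p:.1%}")
```

Only when FPR falls to about the size of the prior does the posterior reach 50%. Against a sufficiently low prior, even evidence twenty times more likely under guilt than under innocence is far from conclusive; near-perfect evidence is needed, and that is precisely what hindsight retrospectively supplies.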

The Hindsight Bias follows from the Principle of Sufficient Reason: everything that happens was bound to do so, according to its causes. So, as causes become clear after the event, we erroneously infer that they were as clear before the event, i.e. that there was conclusive evidence that the event was certainly going to happen. Hence the question: Why didn’t we see it? Or rather: Why didn’t they see it – those who were supposed to know: the controllers, the experts, the advisors? The hindsight answer is: because they were negligent, incompetent, irresponsible. Or worse: they knew it all along – how could they possibly ignore it? – and did nothing.

Backward prior indifference, combined with the spurious accuracy of retrospective evidence, makes hindsight a particularly powerful bias. This is bad news for decision makers. No matter how well designed their decision process might be, the occurrence of a bad outcome – always a possibility in risky conditions – may be taken as proof that the process was not well designed.

This might be true: a bad outcome may reveal a flaw in the process – Madoff’s case is a perfect example. But it is wrong to conclude that a process is badly designed because a bad outcome occurred. A good process needs to balance risk reduction with its associated costs. A process aimed at entirely eliminating risk irrespective of costs is not a well-designed one.

Bad processes are easy to design. You want to eliminate road accidents? Impose a 30kph speed limit. You want to eliminate airport threats? Give each passenger a one-hour check. You want to avoid plane crashes? Ban air travel! Just like in hypothesis testing, a well-balanced decision process requires a proper evaluation of the trade-off between False Negatives and False Positives. The higher the cost of a Miss, the higher our willingness to bear the cost of a False Alarm. But since the latter must have a limit, in most cases the risk of a Miss cannot be eliminated. Planes will crash.

The Hindsight Bias promotes the design of excessively risk averse decision processes. Left to their own devices, decision makers have an incentive to impose a high cost of a False Alarm on others, in order to avoid the cost of a Miss on themselves – including the cost of self-blame and regret. As Baruch Fischhoff, who pioneered the study of the Hindsight Bias, put it:

Consider decision makers who have been caught unprepared by some turn of events and who try to see where they went wrong by re-creating their pre-outcome knowledge state of mind. If, in retrospect, the event appears to have seemed relatively likely, they can do little more than berate themselves for not taking the action that their knowledge seems to have dictated. They might be said to add the insult of regret to the injury inflicted by the event itself. When second-guessed by the hindsightful observer, their misfortune appears as incompetence, folly, or worse. (p. 84)

By skewing the error trade-off towards private risk aversion, the Hindsight Bias can transform risk management into CYA, promoting bureaucracy and inertia against initiative and accountability.

Interestingly, on the other hand, in the same way that a bad outcome does not prove that a decision process was badly designed, a good outcome does not prove that the process was well designed. Again, this might be true: a good outcome may indicate a good process. But it is wrong to conclude that a process is well designed because a good outcome occurred. Just as good decision makers may be wrongly blamed for a bad outcome, bad decision makers may be wrongly praised for a good one. As causes become clear after the event, the question becomes: Why did they see it? And the hindsight answer is: because they were brilliant, talented, prescient. Or better: they knew it all along – sheer genius.

Ultimately, this is also bad news for decision makers. The more they enjoy the praise after a good outcome, the more they will suffer and regret the blame after a bad one. A good decision process cannot be defined by its outcomes. It depends on a clear definition and a balanced attribution of the reward of Hits and the costs of Misses and False Alarms.

‘Ok, I get it (sort of). But what I really mean is: Who cares?’

Which of course is a curt rendering of the second solution to thaumazein. In Baloo‘s immortal words: Forget about your worries and your strife.

This is one of mankind’s greatest achievements. After evolving into the only animal species able to ask Why, humans have been adopting a thousand versions of the first solution to Leibniz’s question as the only obvious, unquestionable possibility. Over time, we have grown ever more curious – from Latin cura, meaning care, concern, trouble – and eager to know: we want to remove our cares and be se-cure, free from the trouble of the unknown. But even as our why-chains unfolded into more satisfactory local explanations, the ultimate answer remained a foregone conclusion, varying wildly in form according to location and upbringing, but not much in substance.

Questioning the obviousness of the first solution has always been, and still is, an unpopular concern. Leibniz’s question is better known in one of its woolly, anthropocentric versions: What on Earth Am I Here For? What is the purpose of life? Where do we come from and where are we going? Leibniz’s own answer continues to be, as it has always been, widely shared. At the same time, however, longer why-chains have been steadily pushing it away from the foreground of everyday life. As we keep deferring the ultimate answer, the need to have one has become, over time, less and less compelling. As Laplace – according to a famous but apparently apocryphal tale – told Napoleon, who had asked him why his Mécanique Céleste never mentioned God: Je n’avais pas besoin de cette hypothèse-là – I had no need of that hypothesis. Local explanations are all we need to know the world and lead in it a fruitful and meaningful life. So much for the ultimate answer: after all, who cares?

The second solution is not an alternative to the first. Most people share both. Dumb fringes aside, there is no longer a place in our lives for supernatural explanations. This is indeed mankind’s greatest conquest: the emancipation of evidence from the shackles of dogma. But the second solution does not really solve thaumazein: it just dissolves it. It doesn’t even say there is no solution – only that we don’t need one. As Baloo eloquently put it:

Don’t spend your time lookin’ around for something you want that can’t be found
When you find out you can live without it and go along not thinkin’ about it
I’ll tell you something true – the bare necessities of life will come to you.

Not long ago in the span of human history, Baloo would have been proscribed as a treacherous infidel. Today, to a greater or lesser degree, we are all Baloos. This doesn’t mean we no longer believe in Leibniz’s ultimate answer. In fact, most people profess a more or less authentic faith in a supernatural entity. But, knowingly or not, it is a highly personal faith, founded on Wittgenstein’s ‘mystical feeling’, rather than a certainty based on evidence.

Mankind’s progress rests on our comfort with uncertainty. We are curious, we do want to know, we dislike uncertainty. But we have learned to live with it – and to do so irrespective of our views on Leibniz’s question. If we agree with Leibniz, we already know the ultimate answer. If, like me, we don’t, we have no idea. In fact, we don’t even have an idea of what an ultimate answer may look like, or of whether there is one at all. We just don’t know: there is no evidence either way. That’s why my belief that there is an ultimate answer is a faith. Unlike Leibniz’s, it is not based on the Principle of Sufficient Reason, but on two different priors: a sense that explanations cannot go on forever and my perhaps irrational trust in the power of Why.

Leibniz’s answer makes no sense to me, but his question – even in its woolly versions – resonates in me with a force that I can dampen but not extinguish. It is part of being human: no one, in his right mind, is impassive to thaumazein. Everyone cares.

‘What the heck is he talking about?’ is an entirely legitimate reaction to reading my latest posts. So let me explain.

The overarching theme in my blog is the relationship between beliefs and evidence, as fruitfully encapsulated in Bayes’ Theorem. In fact, so pervasive is my reference to Bayes that I am myself reminded of Abraham Maslow‘s saying: ‘To a man with a hammer, everything looks like a nail’ (one of Charlie Munger’s favourite quotes). My point, however, is that Bayes’ Theorem is not a tool. It is not what we should do. It is what we do. We are all Bayesian. Indeed, all living creatures are, as from our first breath we try to figure out what on earth is going on:

Bayes’ Theorem describes learning: how we use evidence to update our beliefs and, therefore, our actions and behaviours. We formulate a hypothesis and try to decide whether it is true or false. We do so by collecting evidence and judging whether it is more consistent with the hypothesis being true or with the hypothesis being false. This includes all learning, from a baby’s first steps to the frontiers of science. In this sense then, yes, everything is a nail, in the same way that in physics every thing has a weight (or, more precisely, a mass). In investing, the hypothesis is whether the price is right or wrong. In a trial, it is whether the defendant is guilty or innocent. In dropping a ball, it is whether it lands on the floor or somewhere else. And in learning to walk, it is dozens of tottering hypotheses on the best way to stand up and go. The basic elements are always the same: priors, evidence – hard and soft, conclusive and inconclusive, positive and negative – faith, certainty, accuracy, confidence, trust.
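This iterative character of learning can be sketched as a toy sequential update in Python (all numbers are illustrative, not from the text): start agnostic about a hypothesis, then fold in successive pieces of evidence, each with a likelihood ratio of 2.

```python
def update(belief, tpr, fpr):
    """One Bayesian step: posterior odds = likelihood ratio * prior odds."""
    odds = belief / (1 - belief)
    odds *= tpr / fpr
    return odds / (1 + odds)

belief = 0.5  # start agnostic at 50%
for _ in range(3):  # three pieces of evidence, each with LR = 0.8/0.4 = 2
    belief = update(belief, tpr=0.8, fpr=0.4)
print(round(belief, 2))  # 0.89
```

Yesterday’s posterior becomes today’s prior: the same mechanism, at very different scales, drives a toddler’s tottering hypotheses and a scientist’s.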

We ask ourselves questions and give ourselves answers. Uniquely among animals, our questions are shaped as whys and our answers as explanations, which beget new questions and new answers, in a seemingly unending chain. Such infinite regress is the root of what the ancient Greeks called thaumazein: wonder at what there is. Wittgenstein called it the mystical: ‘Not how the world is, is the mystical, but that it is’ (Tractatus Logico-Philosophicus, 6.44). Since childhood, we find endless why-chains inconceivable. We can envisage boundless space, endless time, countless numbers, but not infinite questions. Explanations cannot go on forever. Every why-chain must have a last ring, an ultimate answer that ends all questions. As children, we expect that grownups have worked it all out. Thaumazein is our amazement at realising that they haven’t.

Dealing with such an inescapable conclusion is part and parcel of being human. One way or another, we all do it. There are two common solutions. One is faith in a self-sustaining super-human entity. The other is to cast the problem aside, get on with life, enjoy it while it lasts and hope for the best.

I have great respect for the first solution, and for people who elect faith – a prior belief requiring no evidence, and that no evidence can change – as a guiding principle to leading a righteous life. The trouble with religion starts when faith is replaced by certainty. It is arguable whether there is more evidence in favour of the God hypothesis or against it. But it is undoubtedly a tough tug of war, where claiming that the evidence is overwhelmingly in favour can only be done by turning a blind eye to the evidence against, or by relying on some faulty piece of conclusive evidence. Authentic religious faith requires no evidence: it regards God as self-evident. It is when faith turns into certainty that religion is liable to produce its worst excesses.

The second solution is by far the most common, as it includes inauthentic faith, professed out of convenience, conformity or a misreading of Pascal’s wager. It is a practical solution: if we have no clue on how to solve a problem, we might as well dissolve it. Forget thaumazein and carry on.

Neither solution works for me. I have no sense of religious faith or, as Wittgenstein called it, the mystical feeling: ‘the feeling of the world as a limited whole’ (Tractatus, 6.45). I find it respectable and often admirable. But I can’t see God as self-evident. When I stare at God’s triangle all I see is a Penrose triangle.

I do, however, dislike religious certainty. I find it smug and irrational, and – as copiously proven in history, past and present – an instrument of senseless conflict and appalling violence.

As for the second solution, I do carry on. In fact, I made an early point of centring my life on solidly practical grounds. But a strong sense of wonder has never left me since childhood. Why is there what there is? Or, as Leibniz famously put it, ‘Why is there something rather than nothing?’ (The Principles of Nature and Grace, Based on Reason, in Philosophical Essays, p. 210). This is the first question, he said, we have the right to ask once we assume the Principle of Sufficient Reason. Leibniz – like Gödel, one of the smartest men who ever lived, but also a practical and worldly polymath – thought the answer was obvious: God – the free, omnipotent, infinitely good creator of the universe. Alas, like the principle on which it is based, Leibniz’s answer – the first solution to thaumazein – makes no sense to me.

But the second solution doesn’t do it either. I agree that we can live with thaumazein, embrace uncertainty and get on with it. Faith is not a prerequisite of a righteous and meaningful life. There is no need for an ultimate answer. A life built on sand is no less beautiful. But that’s not the point: whether or not there is an ultimate answer has nothing to do with our need to know it. The fact that we can live without a solution does not mean that there is none, or that searching for it is a meaningless pursuit.

On the other hand, if there is an ultimate answer, it may well be completely out of our reach. Though it sounds like one, a Theory of Everything – physics’ ongoing attempt to unify General Relativity and Quantum Field Theory – would generate many further questions, rather than end all of them. The ultimate answer is not only a sufficient reason that explains everything. It is also a necessary one, explaining why everything is the way it is and could be no other way. It is like 5+3=8: Q.E.D. No more questions. As Gödel showed, even arithmetic is based on undemonstrated axioms. But these are intuitively and, to our complete satisfaction, self-evidently true.

Will we ever be able to comprehend why there is what there is? Or are we like apes, or even ants, staring at IBM Watson? I think we can. I believe that there must be an ultimate answer and that we can find it. This is my faith.

And this is the heck I am talking about.

The Principle of Sufficient Reason is crazy. But what is the alternative?

Let’s see. If every event has a cause, then whatever caused it was itself caused by other events, which in turn had their own causes, and so on. Every event is tied to other events, in an unbroken chain that must go back to the beginning of time. Everything that ever happened, and everything that will ever happen, was and will be destined to do so, according to the great laws of nature. Events are bound to obey those laws. Laplace’s demon, who knows them, can foresee everything: ‘the future, as the past, is present to its eyes’. Or, as Albert Einstein famously put it in a letter of condolence to the family of his lifelong friend Michele Besso, less than a month before his own death: “the distinction between past, present and future is only a stubbornly persistent illusion”. Einstein loved to discuss the nature of time with his friend Kurt Gödel over long walks outside the Institute for Advanced Study in Princeton. In general relativity, space-time is a four-dimensional manifold, a block universe where everything exists in a tenseless present. In the words of Hermann Weyl: “The objective world simply is, it does not happen”. Gödel approved.

If this is crazy, let’s try the opposite. Events are not tied together. They are free to happen. They are nothing before they come into existence and nothing after they cease to exist. Therefore, Laplace’s demon does not know them and cannot foresee them. Nature is not governed by laws. What we call natural laws are satisfactory local explanations, verbal acts that we share through language and agree to accept.

At first, this sounds just as crazy, if not crazier. Surely, the ball is not free to land on the floor. The fact that it always behaves in exactly the same predictable way must mean that it is subject to a certain law, which we discover by experience but has nothing to do with us: it is there, whether or not we express it and accept it. How we express it may change through time: Aristotle’s natural place, Newton’s law of universal gravitation, Einstein’s theory of general relativity. But what it is must be timeless.

What does ‘there is a law’ mean, however? What there is, what we experience, are not laws but events. An event is what happens, comes out into existence. Another word for it is phenomenon, from Greek phainesthai, meaning ‘to appear’, from phaos – light. But there is a difference: while a phenomenon appears, comes to light from where it ‘simply is’, an event comes out of nothing. It is nothing before it exists and therefore cannot be foreseen. As Hume pointed out, what we see are not laws but events, and their more or less constant conjunction – today we say higher or lower correlation – with other events. This is what we call experience. An experiment is an orderly gathering of what we see – evidence – in order to properly measure such correlation. But, as is (or should be) well known, correlation does not imply causation. More correctly, covariation is a necessary but not sufficient condition for causation. What is needed to turn correlation into causation is a satisfactory explanation. In the overstated words of Friedrich Nietzsche:

Against positivism, which halts at phenomena – “There are only facts” – I would say: No, facts is precisely what there is not, only interpretations. We cannot establish any fact “in itself”: perhaps it is folly to want to do such a thing. (Will to Power, Fragment 481 (1883-1888)).

(Incidentally, Nietzsche would have also been greatly helped by a Word Processor. Unlike Wittgenstein, he wrote and published many books, but had so little regard for order and consistency that he ended up being dismembered by his own interpreters – as one of them, my friend Sossio Giametta, puts it in Il Bue Squartato (The Quartered Ox). Contrary to Nietzsche’s hyperbole, Giametta has done a great job, over a lifetime, in establishing the key facts of Nietzsche’s life and work, and built on them what I think is the most complete, well-rounded and incisive interpretation of his philosophy).

Nietzsche went too far. There are facts – events, evidence, information, data. But he was right in saying that facts are not enough by themselves: they need to be interpreted, i.e. embedded within an explanation. We read facts, but we write explanations. And often our writing includes choosing the data on which our explanations are built. At the same time, however, explanations are interpretations of facts: there cannot be explanations without facts to be explained.

What there is, then, is evidence, which we experience, not explanations, which, as Bruno de Finetti put it, we invent. But what does evidence consist of? In a timeless universe, evidence consists of the coming into light of eternal phenomena, governed by the great laws of nature in accordance with the Principle of Sufficient Reason. Alternatively, evidence consists of events, coming out of nothing and going back into nothing. In such a universe, far from being a stubborn illusion, time is the essential dimension along which events unfold. As Lee Smolin puts it:

Whatever is real in our universe is real in a moment of time, which is one of a succession of moments. (Time Reborn, xiv).

Smolin’s book is a valid attempt to move our perspective away from a timeless universe to a physical world where:

The past was real but is no longer real. We can, however, interpret and analyse the past, because we find evidence of past processes in the present.

The future does not yet exist and is therefore open. We can reasonably infer some predictions, but we cannot predict the future completely. Indeed, the future can produce phenomena that are genuinely novel, in the sense that no knowledge of the past could have anticipated them.

Nothing transcends time, not even the laws of nature. Laws are not timeless. Like everything else, they are features of the present, and they can evolve over time.

It is puzzling, therefore, to find that one of the cornerstones of Smolin’s proposed cosmological theory is none other than the Principle of Sufficient Reason (p. 122), which he customarily attributes to Leibniz rather than, more correctly, to Spinoza. How can that be reconciled with the reality of time, the unreality of the past and the openness of the future? Unsurprisingly, in the sequel to Time Reborn, written together with Roberto Mangabeira Unger, his co-author disagrees:

We cannot show, as the principle of sufficient reason would require, that the world has to be what it is or that it has to be at all. (The Singular Universe and the Reality of Time, p. 514).

This gives Smolin the opportunity to adjust his focus and reinterpret the principle as a ‘heuristic guide to suggest questions to ask and strategies to follow, in the formulation of hypotheses for cosmological and physical theories’ (p. 514). He calls it ‘the principle of differential sufficient reason’:

Given a choice between two competing theories or research programs, the one which decreases the number of questions of the form Why does the universe have property X? for which we cannot give a rational explanation is more likely to be the basis for continued progress of our fundamental understanding of nature (p. 367).

That is: better a theory that answers more whys than a theory that answers fewer. Oh well. A truism more than a principle – and very far from Leibniz’s bold claim. But Smolin is right. Contrary to its original meaning, a theory is not the contemplation of a revealed truth. It is a local explanation that gives a satisfactory answer to as many questions as it can.

Like beauty and evidence, however, satisfaction is in the eyes of the beholder. A few years ago, while they were still enthralled by Santa Claus, I asked my children: How can Santa, who is so fat, come down the chimney? They looked at me with mild contempt, shrugged their shoulders and said: He’s magic! As a counterbalance to their incessant inquisitiveness, children have a penchant for the almighty. Magic means to be able, to have power, from the Proto-Indo-European magh-, from which the noun might and the verb may derive. There is a perfectly satisfactory answer to why Santa comes down the chimney: because he can. In the same vein, our primordial ancestors had Santas for every noticeable natural phenomenon, and many fanciful strategies to propitiate their favours. As they grew up, however, curiosity prevailed. The history of science is the steady unfolding of natural explanations superseding magical shortcuts.

As Smolin says, a persistent search for more encompassing answers to our why questions is a powerful heuristic guide for continued progress. Following his principle means acting as if we are discovering causes. But we aren’t: as Einstein himself was well aware, our theories are ‘free inventions of the human intellect’. They are our answers to our questions. We can call them laws, if we are particularly pleased with how well they work. But they are our laws – not the great laws of nature.

This doesn’t mean that we make them up. We write explanations, but we read facts. We may even decide which facts to read but, ultimately, we submit to them. Explanations are not arbitrary inventions. They are based on the ultimate arbiter: evidence. Evidence is what there is – not phenomena appearing in a timeless universe, but events coming in and out of existence, along time.

Inverting Weyl: The objective world simply happens, it is not.

Horrified outrage at the action of psychopaths has been a common reaction to the Charlie Hebdo massacre. Rightful as it is, however, it is a circular argument: they are mad because they do such things; and they do such things because they are mad. Such is the urge to stay clear of any form of justification, that no explanation is even attempted. There is nothing to explain: it is us against them – a clash of civilizations.

It is a sterile attitude. Explaining is not justifying. A cause is not necessarily a just cause. If we can explain Nazism, Stalinism, wars, crime and violence, we can explain Islamism.

This is done very well, in my opinion, at the Quilliam Foundation.

Maajid Nawaz, co-founder and Chairman of the Foundation, has written a great book on his personal experience and on the roots of Islamism. See him here in an excellent debate at the Richmond Forum.

Another enlightening source of information is this work (in French) by Dounia Bouzar and others on the indoctrination of young Islamists. The study, conducted on the analysis of 160 cases, shows that a major component of the process is exposure via Internet to a number of conspiracy theories, all of which have their typical hallmark: they are built on evidence which is portrayed as conclusive. If one is smart enough – savvy, shrewd, sensitive, pure, untainted, knowledgeable – to see the evidence, there is no need to weigh any other evidence: when you have eliminated the impossible…

This is the main route through which people come to believe weird things. It is not madness. And it is not us against them. They are among us. They are in us. Religion can be a catalyst, but it is not the trigger. Take a look at this:

It takes my breath away.

What do you call people who fall into conclusive evidence traps? There is no adequate, inoffensive term. I will call them conclusionists. What motivates them? Distrust of the other side is a key component: evidence matters as long as it is trusted. But it is not necessary. Trustworthy or not, conclusive evidence proves that the other side is wrong.

Another important factor is a craving for certainty – and here is where religion can play a major role. The more uncomfortable we are with uncertainty, the stronger is the urge to look for conclusive evidence, and the higher the risk that we make it up – or that we place our complete trust in people who tell us they have found it.

Of course, it takes more to turn a conclusionist into a murderer. But understanding the roots of his beliefs can be the key to shake him up, before it is too late.

Like all children, little Kurt Gödel kept asking ‘why’ – so much so that his parents called him Der Herr Warum, Mr Why (Goldstein, p. 54). Unlike most children, however, he was hard to satisfy with a ‘That’s the way it is’ answer. Throughout his life, he was fixated on the Principle of Sufficient Reason: everything has a cause or, as he put it, Die Welt ist vernünftig: the world is intelligible. Like Laplace, Spinoza and Leibniz, going back to the ancient Greeks through the mediaeval Scholastics, Gödel saw the principle as self-evident. There is no such thing as chance: everything that happens was destined to happen, according to the ‘great laws of nature’. Chance is just a measure of mankind’s ignorance of those laws, which are far too complex for us to comprehend. But for Laplace’s demon ‘nothing would be uncertain and the future, as the past, would be present to its eyes’.

Every event has a logical explanation, not simply of why it happened, but of why it was necessarily going to happen in no other possible way than the way it did. Gödel – one of the smartest men ever on earth, alongside an equally impressive list of predecessors – found it obvious. I find it crazy. There might well be a logical explanation for it, but I can’t help seeing it as a misguided principle and the source of a most treacherous pitfall: once we are convinced that there must be a cause, we are bound to find one, irrespective of how much evidence we gather to back it up.

Gödel started early. According to his brother, at the age of eight he suffered from joint rheumatism and high fever, which he learnt could cause permanent heart damage. Since then, and throughout his life, he remained convinced, based on no evidence, that he had an injured heart (Goldstein, p. 56). I remember when I was a child my father bought a one-volume health encyclopaedia – the latest stuff from America – which he soon came to hate and laugh about, because for any symptom he looked up there always was at least one horrible, graphically illustrated cause. This is the earliest memory I have of what has become my fixation: the probability of a hypothesis given some evidence is not the same as the probability of the evidence given the hypothesis. Astonishingly, Gödel didn’t get it. Perhaps, in keeping with his Platonism, he thought that, as there is no such thing as chance, there is no such thing as probability – not a deduction that Laplace would have shared. Be that as it may, things got no better as he grew older. Since there is no chance, Gödel did not believe in Darwinian evolution: “You know Stalin didn’t believe in evolution either, and he was a very intelligent man” was his jaw-dropping conversation stopper with Thomas Nagel (Goldstein p. 32). In middle age, he came to believe in ‘a vast conspiracy, apparently in place for centuries, to suppress the truth “and make men stupid”‘ (Goldstein p. 48). The same men, as Karl Menger recalled, who were responsible for destroying Leibniz’s manuscripts. “Who could have an interest in destroying Leibniz’s writings?” Menger had queried. “Naturally, those people who do not want men to become more intelligent,” was the logician’s reply (Goldstein p. 247). 
In the end, it all sadly turned into full-blown paranoia: to Oskar Morgenstern he ‘reported his suspicions that there were those who were trying to kill him, that his wife Adele had given away all his money, and that his doctors understood nothing of his case and were conspiring against him’. (Goldstein, p. 248).
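The asymmetry that the health encyclopaedia taught me – the probability of a hypothesis given the evidence is not the probability of the evidence given the hypothesis – can be shown with a few invented numbers: a symptom that is very likely given an illness, while the illness remains very unlikely given the symptom.

```python
# P(H|E) vs P(E|H): an illustration with made-up numbers.
# H = "rare illness", E = "a common symptom listed in the encyclopaedia".
p_h = 0.01              # prior: 1% of readers have the illness
p_e_given_h = 0.90      # the symptom is very likely if you have it
p_e_given_not_h = 0.10  # but also fairly common if you don't

# Bayes' theorem: P(H|E) = P(E|H)P(H) / [P(E|H)P(H) + P(E|~H)P(~H)]
p_h_given_e = (p_e_given_h * p_h) / (
    p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
)
print(round(p_h_given_e, 3))  # 0.083: P(E|H) is 0.9, yet P(H|E) is about 1/12
```

A 90% chance of the symptom given the illness coexists with a one-in-twelve chance of the illness given the symptom, because the prior is low and the symptom is common anyway.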

One more proof that intelligence is not a one-dimensional affair. The Principle of Sufficient Reason is not just crazy: it can make one crazy, by luring him into the fabrication of baseless explanations.

In our framework, the principle can be expressed as: for any hypothesis H, there must be some evidence E such that P(H|E)=1 or 0. We call it conclusive evidence, of which there are four types. Conclusive evidence is sufficient to prove that H is certainly true or false: a Smoking Gun (positive evidence) or a Strangler Tie (negative evidence) are sufficient to prove that the hypothesis is true; a Perfect Alibi (positive evidence) or a Barking Dog (negative evidence) are sufficient to prove it false. Conclusive evidence reveals a causal relationship between the evidence and the hypothesis. Let’s take for example the first type: the gun is smoking because the suspect is guilty. That is: it is sufficient to see the effect of a Smoking Gun to conclude that a guilty suspect must be its cause – the efficient cause in Aristotle’s terminology. Notice that it is not necessary: the suspect may be guilty even without a Smoking Gun. But if there is one, he must be guilty. Likewise, assuming there is no smoke without fire, then smoke is conclusive evidence of fire. There is smoke because there is a fire. It is sufficient to see the effect of smoke to conclude that it was caused by a fire. It is not necessary: there can be a fire without smoke. But if there is smoke, there must be a fire. Finally, the ball landing on the floor is conclusive evidence that I dropped it. The ball landed on the floor because I dropped it. It is sufficient to see the effect of the ball landing on the floor to conclude that I caused it to drop. But in this case it is also necessary: the ball could not have landed on the floor unless I dropped it. Dropping the ball is not only conclusive evidence of it landing on the floor: it is perfect evidence. The ball lands on the floor if and only if I drop it.

This is the great thing about conclusive evidence: it brings certainty. As evidence accumulates multiplicatively, even a single piece of conclusive or, even better, perfect evidence allows us to resolve the evidential tug of war in one fell swoop: the hypothesis is certainly true or certainly false, irrespective of initial priors and any amount of evidence accumulated on the other side.
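This multiplicative accumulation can be sketched in odds form: posterior odds equal prior odds times one likelihood ratio per piece of evidence. The numbers below are made up, but they show how a single conclusive piece – a likelihood ratio of zero or infinity – overwhelms any prior and any amount of opposing evidence.

```python
from math import prod, inf

def posterior_odds(prior_odds, likelihood_ratios):
    """Bayes' theorem in odds form: evidence accumulates multiplicatively."""
    return prior_odds * prod(likelihood_ratios)

# An inconclusive tug of war: ratios above and below 1 pull either way.
print(posterior_odds(1.0, [2.0, 0.5, 3.0]))    # 3.0

# A Smoking Gun: P(E|~H) = 0, so its likelihood ratio is infinite;
# posterior odds are infinite (H certainly true) whatever the prior.
print(posterior_odds(0.001, [0.5, 0.5, inf]))  # inf

# A Perfect Alibi: P(E|H) = 0, likelihood ratio zero; H certainly false.
print(posterior_odds(1000.0, [2.0, 0.0]))      # 0.0
```

One zero or one infinity in the product resolves the tug of war in one fell swoop, exactly as a single piece of conclusive evidence does.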

Hence we can see the allure of the Principle of Sufficient Reason: it says that there is conclusive evidence for any hypothesis. The evidence must be there somewhere: if we find it, even a single piece of it is enough to attain unassailable certainty. On the one hand, this is great: it spurs us into asking more and deeper questions in search of the ultimate answer. But on the other hand it is a great menace: the stronger is our desire for conclusive evidence, the higher is the risk that we dream it up. It is a common pitfall, where Gödel’s obsessions share room with Conan Doyle’s naivety and the outright wackiness of assorted conspiracists.

Everybody likes certainty and, pace Benjamin Franklin, there are many things we are completely certain about beyond death and taxes. Also, to a greater or lesser extent, we all dislike uncertainty. Some positively hate it, some others are quite comfortable with it, and in some circumstances might even enjoy it. But, generally speaking, we all prefer certainty to uncertainty: we want to know.

Alas, very often we can’t. Most evidence is inconclusive. Not only about the future, but also about the present as well as the past. Will my child become a football champion? Is Linda a Greenpeace supporter? Was the cab in the accident green or blue? Most hypotheses are torn in a tug of war between confirmative and disconfirmative evidence, where neither side can prevail. When certainty is unattainable, we do not know if something is true or false: we can only believe it is probably true, and therefore probably false.

Such a state of uncertainty goes well beyond our convenient analytical examples. Uncertainty is abundant and pervasive in most matters, from the trifling to the weightiest. Even if the Principle of Sufficient Reason were true, only Laplace’s demon, who can see perfect and timeless evidence, would be certain about them. The rest of us need to come to terms with our ignorance and, following in Laplace’s footsteps, acknowledge uncertainty and deal with it.

We can refuse the challenge, take comfort in the Principle of Sufficient Reason, resolve that there must be a cause and proceed to make it up. It is amazing what even very intelligent people can regard as self-evident. Less blatantly, we can pick and choose the evidence that best fits our dispositions. Or we can accept uncertainty, gather all the evidence that we can see, properly balance it and try our best to come up with well-calibrated probabilities.

Contrary to a common misconception, being comfortable with uncertainty is the very ethos of science. Science is not the repository of incontrovertible truths, “scientifically proven” on the basis of conclusive evidence. As Frank Hahn wrote on the front page of a book of his that I had asked him to sign in my student days: ‘You must regard everything in this book as provisional and not as “science”‘. An unforgettable piece of advice to a young graduate eager to find the truth.

One day every human being will realise that everything that has ever been written in any book has been written by us. The evidence is conclusive.

In discussing one of the main themes in this blog – Intelligent Investing – I have chosen to focus on concepts and methods rather than present specific investment cases. Exceptions have been Meyer Burger (here, here, here and here), Barratt Developments (here, here and here) and Elan (here and here). These have been successful investments. But since another major theme in the blog is Experts, and their more or less deliberate tricks aimed at trumpeting good calls and obfuscating bad ones, here is my take on Ferrexpo, which so far has been a spectacular dud.

I presented my investment case in Ferrexpo about a year ago at the valuconferences.com European Investment Summit. Here is an excerpt from the presentation. Since then, there have been two major developments: the Ukrainian crisis and a steep drop in the price of iron ore. On the first issue, I took the view that, short of an outright Russian invasion of Ukraine, which I regarded as very unlikely, Ferrexpo’s operations were not going to be significantly affected by the turmoil. This turned out to be right. On the second issue, however, I was wrong. I thought that the big three iron ore producers – Rio Tinto, Vale and BHP Billiton – would choose price over volume and limit the expansion of their production capacity. They didn’t. Their expansion plans went ahead, on the assumption that if they didn’t increase production others would, and with the intent of squeezing out the high cost producers, concentrated in China and subsidised by the local government. As a result, the iron ore price fell from 130 dollars per ton to 70. In this new environment, Ferrexpo decided to postpone the capital expenditure plans required to realise its own capacity expansion.

My investment case, however, was not predicated on the assumption of price stability. Of course, a big price drop can’t be good news. But, as long as it is not permanent, it should not have a big impact on valuation. As shown in the graph, Mr Market sees all price moves as permanent:

The high correlation between the stock price of Ferrexpo and the iron ore price (the same is true for the other iron ore producers) means that the market has no idea of where the iron ore price will go. If the price is at 50, as it was between 2008 and 2009, it assumes that it will stay down there forever, and prices Ferrexpo below 1. If it is at 190, as it was in early 2011, it also assumes that it will stay up there forever, and prices the stock at 5.

But the graph also shows very clearly that the iron ore price is highly cyclical, and therefore there is no reason to expect that price levels – high or low – will persist. My valuation shows that Ferrexpo is still cash positive at low iron ore prices. Its production costs are not as low as they are for the big three – returns to scale are very important in the mining business – but the company is certainly not one of the high cost producers that would go out of business in a protracted low price environment. My numbers show that, even at a prudent normalised iron ore price of 100, Ferrexpo’s correct valuation would be above 3. At a price of around 1.8 in late 2013, there was – I thought – a sufficiently ample margin of safety.

Boy, was I wrong. In 2014, as the iron ore price began its steep descent and, to add insult to injury, the Ukrainian crisis intensified, the stock price dropped from 1.8 to 1.3 from January to September. Then in the last two months – shortly after I reiterated my valuation case at the Value Investing Seminar in Italy – the drop accelerated, taking Ferrexpo all the way down to below 0.7. So much for safety.

The move was preceded by steep downgrades from a number of brokerage analysts. No surprise there: as we have seen with Meyer Burger, analysts’ target prices (dashed line) just tread along market prices (solid line). In 2011, when the price was 5, Ferrexpo was a Buy for 69% of analysts. Now, with the price below 1, Buys have dwindled to 28%. Rather than accurate price forecasts, target prices are mere sentiment indicators.

Be that as it may, as an observant Blackstonian investor I proceeded to try to prove myself wrong. So I talked to some analysts, starting with those who had carried out the most aggressive target cuts, in some cases moving straight from Buy to Sell. How could they possibly value at less than 1 a company that I thought was worth at least 3? Ukraine and the iron ore price, was the predictable answer. Yes, I replied, but how does that exactly translate into your valuation? I had to get their spreadsheets – which some of them kindly provided.

The first thing I checked, of course, was the cost of capital in the terminal value. Jack up the cost of capital and you can get any valuation you want. In the most interesting case, however, the WACC was a reasonable 10%. Minus 3% terminal growth, ok. But what’s that? Terminal FCF/(WACC-G) times 0.5? That’s got to be a mistake. I corrected it and, lo and behold, the valuation more than doubled. So I wrote back to the analyst and pointed that out. “Eeek. Well spotted” was his reply, “but all I need to do is to take the terminal growth rate down by a smidgeon!” Which of course was not the case: even if he had halved the growth rate, he would have obtained a much larger valuation. But that’s not the end. Going deeper into the spreadsheet, I saw that, in the ‘Extraordinary Items’ line, the -14.6 million reported in the first half of this year was carried over into the future. I corrected that mistake and the valuation, based on the analyst’s own model, increased by a further third, to 2.6 times his price target. I pointed that out as well. Any qualms? No. The target was reconfirmed a few days ago.
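A sketch of the spreadsheet error, with invented round numbers – the Gordon-growth convention and the inputs below are my assumptions for illustration, not the analyst’s actual model:

```python
# Gordon-growth terminal value: TV = FCF * (1 + g) / (WACC - g).
# (Whether the numerator carries the (1+g) factor varies by convention;
# here it does.) All figures are hypothetical.
def terminal_value(fcf, wacc, g):
    return fcf * (1 + g) / (wacc - g)

fcf, wacc, g = 100.0, 0.10, 0.03   # terminal-year FCF, 10% WACC, 3% growth

correct = terminal_value(fcf, wacc, g)
bugged = correct * 0.5             # the stray "times 0.5" in the spreadsheet

print(correct / bugged)            # 2.0: fixing the bug doubles the terminal value

# Halving the growth rate cannot offset the stray factor:
halved_g = terminal_value(fcf, wacc, 0.015)
print(halved_g > bugged)           # True
```

The terminal value usually dominates a DCF, which is why the analyst’s overall valuation moved so much once the factor was removed.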

Glaring insouciance apart, other models had 13% WACC and more (which has the same effect as halving TFCF/(WACC-G)), justified, as one analyst put it, by Ferrexpo’s ‘tough postcode’. And in all of them the last Free Cash Flow before the terminal period was calculated based on the current iron ore price, or thereabouts. This gives a value, in five to ten years from now, which is more than 50% lower than the 2014 level. No wonder: with the iron ore price at the current level, not for the next couple of years but for ever, iron ore mining would become a very tough business, viable, perhaps, only for the big three. This seems to me a very unlikely scenario.
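The leverage of the discount rate on the terminal multiple is easy to see from the same perpetuity formula. Illustrative numbers only; a 3% terminal growth rate is assumed throughout.

```python
# Terminal multiple TV/FCF = 1 / (WACC - g): small changes in the
# discount rate swing the terminal value a lot. Hypothetical inputs.
g = 0.03  # 3% terminal growth
for wacc in (0.10, 0.13, 0.17):
    print(f"WACC {wacc:.0%}: terminal multiple {1 / (wacc - g):.1f}x")
# WACC 10%: 14.3x; WACC 13%: 10.0x; WACC 17%: 7.1x.
# Pushing WACC from 10% towards 17% halves the terminal multiple.
```

So a few points of extra WACC, justified by a ‘tough postcode’, quietly do the same work as an explicit haircut on the terminal cash flow.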

So it is hard to convince myself that I am the one who’s got the valuation wrong. Analysts are lowering their valuations because the stock price is going down. And why is the stock price going down? Because the iron ore price is going down, of course – just look at the chart! You can’t but admire the exquisite circularity of the argument. Which, by the way, ends up sanctioning the analyst’s own irrelevance: know where the iron ore price is heading and you know where the stock price will go.

In a more basic sense, however, there is no question that I have been proven wrong: I thought it was very unlikely that Ferrexpo’s stock price could reach such a low level. Clearly, the margin of safety was not as wide as I thought. And if the iron ore price stays at current levels for a while, as everybody is expecting, the risk is that the stock price will stay there as well, as the chart suggests.

But I have been there before. If the reasons why the price is where it is are those above, Mr Market will catch up, sooner or later. As Andrew Harding, the CEO of Rio Tinto Iron Ore, said at last week’s Investor Seminar:

Our view remains that the developing world will continue to drive demand for iron ore, through urbanization, industrialization and increasing domestic consumption patterns. On the supply side, we have already seen significant curtailments of iron ore supply from the Chinese domestic sector, as well as reductions from non-traditional suppliers such as Indonesia and Iran. We expect around 125 million tonnes to leave the market this year in response to lower prices. Yes, the present price compared to recent prices is depressed, but the value proposition of our iron ore business runs over decades, not today and not tomorrow.

I wish Ferrexpo’s management could be as forthcoming, rather than taking it on the chin and keeping a very low profile – supposedly to earn the respect of unreciprocating City analysts.

A simple event – dropping a ball on the floor – is sufficient to generate a why-chain that stops not because we have reached the end of the chain, where there are no more questions to be asked, but because we are satisfied with a local explanation. Sometimes, however, we stop because a further question seems positively silly. If a child asks why 5+3=8, his dad shows him five fingers of one hand and three fingers of the other and says: this is what eight means. Why? Because when we put five objects together with three objects we call them eight objects – that’s all. At that point it looks like we have reached the end. Asking why 8, and not 9 or 15, sounds daft. Counting fingers is not just a local explanation: we are completely satisfied with it and cannot even think of any further question to ask. We do not need to repeat the finger experiment to prove that 5+3=8: all we need is one demonstration.

While proofs are ‘arguments from experience, as leave no doubt or opposition’, demonstrations are self-evident beliefs, true on the grounds of pure reason, that no empirical evidence can change. Proofs are open to Cromwell’s rule: I beseech you, in the bowels of Christ, think it possible that you may be mistaken. Should Philae send us a picture of a green rhinoceros, we would be obliged to conclude that, however fantastically unlikely it seemed, there are green rhinos on Comet 67P. But no amount of evidence can convince us that 5+3 is anything but 8. When it comes to numbers, there are lots of questions to be asked, and some of them require a long and winding why-chain. But at the end of the chain, provided that no link is broken, there is no other possibility: Quod Erat Demonstrandum.
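For what it is worth, the finger demonstration can be replayed in a proof assistant. In Lean (assuming Lean 4 syntax), 5+3=8 is closed by pure computation – no experiment, no evidence, just the unfolding of the axioms of the natural numbers:

```lean
-- 5 + 3 = 8 holds by definitional computation on the natural numbers:
-- reflexivity alone closes the proof. Q.E.D. in one line.
example : 5 + 3 = 8 := rfl

-- Equivalently, a decision procedure evaluates both sides and agrees;
-- no repetition of the finger experiment is needed.
theorem five_plus_three_eq_eight : 5 + 3 = 8 := by decide
```

One demonstration is enough; no ring of the chain is left for evidence to break.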

Q.E.D. is a thing of beauty. As such, it is in the eyes of the beholder and some people appreciate it more than others. I remember Walter, at university, a political science student who had passed all his exams except his bête noire: Maths I, and had asked me for help. Walter didn’t have any sense of Q.E.D. ‘Can’t you see? – I would tell him trying to explain some theorem – it must be true.’ ‘Why, why? – he would reply, staring at the page – why does it have to be that way? Can’t it be some other way?’ He was referring – tongue-in-cheek, but not entirely – to the art of political manoeuvring, of which his party heroes, the Christian Democrats, were refined connoisseurs. They were the ones who had coined the expression ‘Convergent Parallels’ to denote and promote a certain degree of collaboration between them and the Communist Party, within the confines of their distinct political traditions. In Walter’s mind, such sophistication was in stark contrast with the crude rigidity of mathematical formulas. This goes a long way in explaining the muddled history of Italian politics. But, in a very different sense, Walter was right.

Any mathematical statement, be it 5+3=8 or the most convoluted theorem, is true within an axiomatic system. In fact, ‘Convergent Parallels’ is an oxymoron within the most ancient of them, Euclidean geometry. Euclid defined parallel lines as ‘straight lines which, being in the same plane and being produced indefinitely in both directions, do not meet one another in either direction’ (Elements, Book I, Definition 23). He then assumed that ‘if a straight line falling on two straight lines makes the interior angles on the same side less than two right angles, the two straight lines, if produced indefinitely, meet on that side on which are the angles less than the two right angles’. This is known as the parallel postulate, the last of Elements’ five axioms.

Axioms are statements assumed to be true by self-evidence or by definition, thus requiring no demonstration. Unlike the first four, however, the fifth axiom does not look nearly as self-evident. Hence many attempts were made, over two thousand years, to demonstrate it as a theorem derived from the other axioms, until in 1868 Eugenio Beltrami showed that this was impossible. A common line of attack in trying to demonstrate the fifth axiom was Reductio ad Absurdum, whereby a statement is shown to be true by showing that its contradiction leads to an impossible, absurd result. But when, around 1830, Nikolai Lobachevsky and János Bolyai explored what happened if they dropped the fifth axiom, they found many strange results but no contradictions. The fifth axiom can be shown to be equivalent to Playfair’s axiom, according to which ‘in a plane, given a line and a point not on it, at most one line parallel to the given line can be drawn through the point’. Lobachevsky and Bolyai assumed instead that more than one line never meeting the given line could be drawn through the point, and found the ensuing Non-Euclidean geometry perfectly consistent.

It was an astonishing result. Until then, geometry was Euclidean geometry. Laplace – who had died just a few years earlier – listed it alongside astronomy and mechanics as one of the supreme feats of the human mind. Spinoza had mimicked Elements in his Ethics, using definitions and axioms to demonstrate propositions – not about lines and triangles, but about such ponderous concepts as nature, God and freedom. Immanuel Kant – who had also been dead for just a few decades – would have been startled to find out that our sense of space was not a pure a priori intuition, but one of many possibilities. Starting from their alternative axiom, Lobachevsky and Bolyai gave rise to Hyperbolic geometry. A few years later, Bernhard Riemann described a new geometry founded on a different alternative to Playfair’s axiom: no line never meeting the given line can be drawn through the point. It is called Elliptic geometry, and its simplest model is a sphere, where lines are meridians, which are parallel at the equator but do meet at the poles.

Christian Democrats were right. Parallels can converge – it depends on the geometry. I doubt my friend Walter was thinking along those lines, but his bemused protestations highlight the fact that even our hardest certainties rest upon undemonstrated assumptions. If we change the assumptions, we get different certainties.

This includes 5+3=8. Kant thought that arithmetic, like geometry, contained synthetic a priori propositions. A priori, because they are independent of experience; synthetic (as opposed to analytic), because they say more than what is implied by their subject (Kant used 5+7=12 and argued that the concept of 12 is not contained in the concepts of 5, 7 and +). The ancient Greeks regarded arithmetic (from arithmos: number) as the epitome of episteme – absolute knowledge that is able to withstand any attempt at refutation. Like Euclidean geometry, arithmetic is an axiomatic system, in which a number of theorems are derived from the smallest possible set of axioms, using truth-preserving rules of inference. Given the axioms, the theorems are demonstrably true, independent of experience. But they are true within the system, i.e. relative to its syntax – the symbols, rules and principles with which the system is put together. In this sense, an axiomatic system is like a computer program, whose algorithms derive results (propositions and theorems) from initial inputs (definitions and axioms). Like a computer program, an axiomatic system is not about anything: change the inputs (e.g. the parallel postulate) and you get different results.

Axiomatic systems have two desirable properties. One is consistency: no proposition within the system can be shown to be true and false; the other is completeness: all propositions can be shown to be either true or false. In his Foundations of Geometry, published in 1899, David Hilbert showed that geometry, Euclidean as well as Non-Euclidean, is consistent and complete. But he could not say the same for arithmetic – on which geometry and most other systems are based. So, when the following year he announced his program, listing 23 unsolved mathematical problems and calling his fellow mathematicians to arms (‘This conviction of the solvability of every mathematical problem is a powerful incentive to the worker. We hear within us the perpetual call: There is the problem. Seek its solution. You can find it by pure reason, for in mathematics there is no ignorabimus’), the second problem in the list was ‘Prove that the axioms of arithmetic are consistent’.

Alas, despite Hilbert’s buoyancy, many of the problems proved hard to solve. In fact, new problems – such as Russell’s paradox, discovered the following year in set theory – kept adding to the pile. But the biggest blow to Hilbert’s program came in 1931, one hundred years after Lobachevsky and Bolyai, when Kurt Gödel demonstrated that arithmetic is incomplete. More precisely:

Gödel’s First Theorem: If an axiomatic system, capable of containing arithmetic and defined by a finite syntax, is consistent, then it is possible to construct a proposition within the system that is true, but cannot be shown to be either true or false. Hence arithmetic cannot be both consistent and complete.

Let’s call the system S and the proposition P. The theorem says that if S is consistent then: a) P is true and b) P cannot be shown to be either true or false. Let’s then set P=’S is consistent’. Hence, if S is consistent then: a) ‘S is consistent’ is true and b) ‘S is consistent’ cannot be shown to be either true or false. It follows that:

Gödel’s Second Theorem: The consistency of an axiomatic system capable of containing arithmetic and defined by a finite syntax cannot be demonstrated within the system.
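
The chain from the First to the Second Theorem can be written a little more formally – a compressed sketch, not a full proof, with G standing for the Gödel sentence of the First Theorem and Con(S) for the arithmetized statement ‘S is consistent’:

```latex
\begin{align*}
&\text{(i)}\quad S \vdash \mathrm{Con}(S) \rightarrow G
  &&\text{(the First Theorem, formalized within } S\text{)}\\
&\text{(ii)}\quad S \nvdash G
  &&\text{(First Theorem, assuming } S \text{ is consistent)}\\
&\text{(iii)}\quad S \nvdash \mathrm{Con}(S)
  &&\text{(from (i) and (ii): a proof of } \mathrm{Con}(S) \text{ would yield a proof of } G\text{)}
\end{align*}
```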

So much for the ultimate goal of Hilbert’s formalist program: to demonstrate that mathematics as a whole is self-consistent. It cannot be done, starting from its very base: arithmetic. Hilbert wanted to demonstrate the consistency of arithmetic from within, without recourse to external ‘intuitions’ embedded in its axioms – especially the troublesome intuition of infinity. Gödel showed that such a finitist demonstration was impossible: consistency has to come from outside the system.

Of course, arithmetic is consistent: there is no arithmetic proposition that is both true and false. But – as with the parallel postulate and its lines ‘being produced indefinitely’ – arithmetic cannot get away from infinity. In fact, Gerhard Gentzen demonstrated consistency in 1936, using transfinite induction. Once we assume the existence of the infinite set of natural numbers – whose sum, remember, equals -1/12 – arithmetic is perfectly consistent: there is no doubt whatsoever that 5+3 is nothing but 8.

Gödel – a mathematical Platonist – was firmly convinced that natural numbers exist ‘out there’, just like Kant, who viewed mathematics as synthetic a priori. Indeed, Gödel interpreted his theorems as demonstrating the very necessity of natural numbers: without them, arithmetic is not even consistent. Whether or not we share Gödel’s outlook – I do not – Gödel’s theorems show that, like geometry, arithmetic is not a self-contained corpus of absolute truths. All arithmetic propositions – including 5+3=8 – rest on undemonstrated axioms, whose truth we assume to be intuitively, and to our complete satisfaction, self-evident. Q.E.D.

Nada se edifica sobre la piedra, todo sobre la arena… – Nothing is built on stone, everything on sand…

While he considered it self-evident that everything has a cause, Laplace knew that causes themselves are not self-evident. Events do not come with their causes and effects attached. We are not demons: we see events, but not their ties to other events. If we want to see the ties, we need to discover them. As David Hume memorably put it:

I shall venture to affirm, as a general proposition, which admits of no exception, that the knowledge of this relation [of cause and effect] is not, in any instance, attained by reasoning a priori; but arises entirely from experience, when we find that any particular objects are constantly conjoined with each other. (Enquiries Concerning Human Understanding, Section IV, Part I, p. 27).

We see the ball, our hand dropping it and the floor on which it lands. But we don’t see the ties between these events until we discover them through experience. By repeated experiment, we learn that, no matter how many times we drop it, the ball will always land on the floor, and it won’t land on the floor unless we drop it. It does so regularly, i.e. according to a rule. The ball is in free fall, but its fall is not free at all. Like us, Isaac Newton discovered the rule by experience – not a ball in his case but, famously, an apple. Unlike us, however, he realised that it was the same rule that forced planets to rotate around the sun, and called it the Law of Universal Gravitation.

Newton discovered the law but, to his eyes, the law was already there – he didn’t make it up. It was one of ‘the great laws of nature’, written in the grand book of the universe, not by trusted authorities but by nature itself for everyone to read, provided they know the language in which the book is written. Galileo Galilei had expressed the same concept a few decades earlier, when, discussing the nature of comets with Orazio Grassi – a Jesuit astronomer writing under the pseudonym Lotario Sarsi – he famously wrote:

Furthermore, I seem to detect in Sarsi the firm belief that in philosophizing one must rely upon the opinions of some famous author, so that if our mind does not marry the thinking of someone else, it remains altogether sterile and fruitless. Perhaps he thinks that philosophy is the creation of a man, a book like the Iliad or Orlando Furioso, in which the least important thing is whether what is written in them is true. Mr Sarsi, that is not the way it is. Philosophy is written in this all-encompassing book that is constantly open before our eyes, that is the universe; but it cannot be understood unless one first learns to understand the language and knows the characters in which it is written. It is written in mathematical language, and its characters are triangles, circles, and other geometrical figures; without these it is humanly impossible to understand a word of it, and one wanders around pointlessly in a dark labyrinth. (The Assayer, p. 183).

The belief that our description of the world coincides with the world itself goes back to ancient Greece, where logos meant ‘word’ – from legein: to say, speak and gather, collect – as well as ‘reason’, ‘logic’. What is spoken, gathered in the book of the universe is the world as it logically is. Later on, the Greek author of John, the fourth, ‘philosophical’ gospel, used the same term in the incipit: ‘In the beginning was the Word, and the Word was with God, and the Word was God’. The logos was Jesus: the link between mankind and the divine.

If what we say is what is, experience is just the unveiling of necessary laws. Balls and planets are bound to obey a law, and experience cannot but confirm it. Thanks to its mathematical formulation, the law allows us such wonders as precisely anticipating where the ball will land and predicting the exact time of tomorrow’s sunrise. We are sure about it, i.e. se-cure: free from the peril of the unknown.

But experience without peril is no experience at all. To experiment means being exposed to the possibility that the tie between the tested hypothesis and its conjoined evidence can fail. Failure can happen in two ways: False Positives – the ball does not land on the floor after I drop it – and False Negatives – the ball lands on the floor without me dropping it. If, after repeated experiment, I observe no failure, I conclude that the ball lands on the floor if and only if I drop it. This can be rephrased as: Dropping the ball is perfect evidence of it landing on the floor; or: Dropping the ball causes it to land on the floor, i.e. the ball lands on the floor because I drop it.

Since this unfailing regularity applies not only to our ball but to all objects (planets included), we call it a law of nature and name it gravity. Gravity is a satisfactory explanation or, as Spinoza and Laplace would say, a sufficient reason for why objects behave the way they do. An overwhelming amount of confirmative evidence proves that Newton’s law is true. But, as Hume pointed out, the law cannot be demonstrated on the grounds of pure reason. There is no a priori reason for which the ball must land on the floor and the earth rotate around the sun. The reason why we are certain about tomorrow’s sunrise is not that it is logically true, but that it has never failed to happen.

Just as science separates true from false, certainty is also a decision. Certus comes from cernere, which, like scire, means to distinguish, discriminate, discern. Laplace himself calculated the probability of tomorrow’s sunrise as (n+1)/(n+2), where n is the number of days in which the sun has risen so far. Hilariously (except to Young Earth Creationists), he assumed n=1,826,213, or 5,000 years, and concluded that ‘it is a bet of 1,826,214 to one that it will rise tomorrow’. To which, however, he hastened to add:

But the number is incomparably greater for him who, recognizing in the totality of phenomena the regulating principle of days and seasons, sees that nothing at the present moment can arrest the course of it. (A Philosophical Essay on Probabilities, p. 19).

What can be incomparably greater than 0.999999? It is BR=1: the faith that nothing can arrest the course of the Mécanique Céleste that Laplace had masterfully described in his five-volume oeuvre. His certainty in it was de jure as well as de facto. Sunrise was not only a sure bet: it was the demonstration of an inexorable principle, which was revealed by experience but in no way endangered by it.
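
Laplace’s arithmetic is easy to check – a minimal sketch, with the day count he himself assumed:

```python
# Laplace's rule of succession: after n successes in n trials, the
# probability of success on the next trial is (n + 1) / (n + 2).
def rule_of_succession(n: int) -> float:
    return (n + 1) / (n + 2)

n = 1_826_213  # Laplace's 5,000 years of recorded sunrises, in days
p = rule_of_succession(n)
print(p)               # a probability just shy of 1
print(n + 1, "to 1")   # the odds p / (1 - p): Laplace's 1,826,214 to one
```

The odds simplify neatly: p/(1−p) = (n+1)/(n+2) divided by 1/(n+2), which is exactly n+1 – the bet of 1,826,214 to one that Laplace quotes.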

If experience is merely reading from a book that has already been written, the only danger is misreading it – as, to his embarrassment, happened to Galilei in his exchange with Sarsi. The Assayer, written in 1623, was Galilei’s rejoinder to Sarsi’s Libra Astronomica ac Philosophica. Libra is a balance, on which Sarsi weighed different views about the origin of comets, three of which had appeared in 1618. Sarsi favoured the view of Tycho Brahe, whose cosmological system was approved by the Jesuits. Brahe thought comets were actual celestial bodies, rather than atmospheric phenomena due to sunlight shining on water vapour, which was Galilei’s view – ironically close to the traditional Aristotelian notion. To the coarse Libra, on whose arms Sarsi had weighed a mixed bag of fanciful arguments, Galilei opposed his refined Saggiatore, the ‘exquisite and fair’ balance used to weigh precious metals. Alas, it did not help: like much of today’s economics, Galilei reached mathematically precise but entirely wrong conclusions.

Galilei did not realise that we don’t just read the great book of the universe: we write it. What we call natural laws are local explanations that satisfy us and successfully stop us from asking further questions. But the reason we stop is not that there are no more questions to be asked. On the contrary, each answer begets new questions.

The more we explain, the more we ask. Our early ancestors were easily satisfied. As long as explanations came from a trusted source, they could be made of the weirdest concoctions of myths and legends – many still very popular. But reasonable explanations could be just as satisfactory, and just as wrong: if every celestial body rotates around it, and every object falls towards its core, the earth must be the centre of the universe. It is our childish urge to keep asking why that breeds new explanations for some of the questions that old explanations could not answer.

At the same time, accepting local explanations, and getting on with them without further questioning, is just as important a prerequisite of our existence. Other animals get by without explanations. They know what happens if, not why it happens. We need to know why, but also to decide what to believe. We do so individually, ultimately leaning upon soft evidence emanating from trusted sources. To believe is to hold dear, to love. Like credere in Latin, it comes from the heart. Each of us can believe anything. But mankind as a whole has nothing else to lean upon but itself. It is we who decide what is written in the book of the universe. Our explanations are verbal acts that we share through language and agree to accept. As Albert Einstein, who rewrote gravity into his theory of general relativity, put it:

Fundamental principles … are free inventions of the human intellect, which cannot be justified either by the nature of that intellect or in any other fashion a priori. (The Herbert Spencer lecture, in Ideas and Opinions, p. 272).

We are children who keep asking questions and grownups who keep inventing answers, with no idea of what the ultimate answer that ends all questions looks like.

Once they figure out the pointlessness of why-chains, children learn to accept local explanations and move on. Explanations are stories that satisfy us and stop us from asking further questions. Most people are content with short, simple answers. Some others are harder to satisfy and require multiple unfoldings. But, sooner or later, we all stop and accept an explanation emanating from a trusted source.

For example: Why did the ball land on the floor? Because I dropped it. This is a wholly satisfactory explanation for most intents and purposes – what else would we want to know? A lot, actually. Why did the ball land on the floor, rather than, for example, stay in mid air? Because of gravity. Gravity? What is gravity? It is one of the four fundamental forces of nature. Forces? What is a force? And why are there four of them? And why does gravity work that way and not in some other way? Or we may ask: Why did the ball land on the floor, rather than go through it? Because the floor is made of a hard material. Hard? What does hard mean? It means that the material consists of tightly arranged atoms. Atoms? What are atoms? Atoms are units of matter composed of a nucleus, made of protons and neutrons, and surrounded by a cloud of electrons. A cloud? What’s in between the nucleus and the electrons? Not a lot, just empty space: if an atom were a football stadium, the nucleus would be a small marble in the middle of it.

‘What? So, going back to my question: why doesn’t the ball – which itself must be made of mostly empty space – go through the floor?’ ‘Because electrons are bound to the nucleus by the electromagnetic force.’ ‘By what? Look, I asked for an explanation, not a headache. I’ve had enough. Whatever you say: I trust you.’ ‘Wait, wait, I haven’t told you about subatomic particles…’ ‘No thanks, I said I’ve had enough. But let me ask you: do you know everything?’ ‘Me? Not at all. I know a lot, but there are still so many unanswered questions. Every answer begets new questions. In fact, I don’t even know what knowing everything means’.

This is mankind’s ultimate enigma: each of us has someone else to trust, but mankind as a whole doesn’t. A well-tried solution to the enigma is to say that there must be an entity – there cannot but be one – which mankind can lean upon. Descriptions vary, but one trait is in common: the entity is such that it needs nothing else – it is self-sustaining. This is a necessary trait: without it, we are just moving the goalposts. But it is a hard one to fathom. It is like trying to figure out the last number: a hopeless endeavour. So, while sympathetic with the goal – enigmas must have a solution – we are at a loss to find one. Hence we revert to the same pattern: trust someone else who knows.

The time-honoured solution approved by trusted authorities has been to evoke some form of supernatural deity, possessing all the required traits, and more. But self-sustainment does not require a deity. The entity doesn’t have to be someone. It can be something, a part of nature or, indeed, nature itself: Deus sive natura. Spinoza called it substance:

By substance, I mean that which is in itself, and is conceived through itself : in other words, that of which a conception can be formed independently of any other conception. (Ethics, Part I, Definition III).

Sub-stance is what stands under, or under-stands, everything. In this sense, turning it upside down, it is the subject matter of what the ancient Greeks called episteme: knowledge that stands firm over (epi-) everything, as absolute truth rather than mere opinion – doxa. Episteme is the unquestionable knowledge of the laws that determine what becomes, or comes to be. As such, it enables the prediction and anticipation (pre-capture) of what comes out: events.

Out of where? Good question. If episteme is able to foresee them, events must be somewhere before they ex-ist. Events do not befall out of nowhere, do not become out of nothing: they appear, come into view from where they already are. Not by chance, then, but be-cause:

From a given definite cause an effect necessarily follows; and, on the other hand, if no definite cause be granted, it is impossible that an effect can follow. (Ethics, Part I, Axiom III).

This is the Principle of Sufficient Reason, as well expressed, one hundred and forty years later, by Pierre-Simon Laplace:

All events, even those which on account of their insignificance do not seem to follow the great laws of nature, are a result of it just as necessarily as the revolutions of the sun. (…)
Present events are connected with preceding ones by a tie based upon the evident principle that a thing cannot occur without a cause which produces it. (A Philosophical Essay on Probabilities, p. 3).

Famously, Laplace imagined an ‘intelligence’ who knows the causes of everything: ‘For it, nothing would be uncertain and the future, as the past, would be present to its eyes’ (p. 4).

If you were impressed with Dr Wise, who could figure out whether you would open one or two boxes, or with the Janken robot, who would beat you hands down at Rock-Paper-Scissors, you would be utterly awestruck by what has somehow come to be known as Laplace’s demon. The demon knows everything – and I mean every thing – because he knows their causes: the links that tie every event to the events that caused it and to the events that it will cause. Events are not free to happen, but are tied together in a network of causes and effects that explains the past and determines the future.

Is such a causal network the self-sustaining entity that solves mankind’s enigma? We don’t know. Our feeble mind is and will always be ‘infinitely removed’ – Laplace’s words – from the demon’s knowledge. All we can do is yearn for it, take care of it, love it. This is the origin of the word philo-sophy, where sophia is wisdom, coming from saphes – clear, evident, true – and in turn from phaos – light. The Principle of Sufficient Reason can only be a Faith: a prior belief that – as Laplace saw it – is self-evident and must be true on the grounds of pure reason.

We may or may not share such faith. But if we do, we need to know what it implies: there is no such thing as chance. All events, no matter how big, small or insignificant, are bound ‘to follow the great laws of nature’. Chance is just mankind’s word for our ignorance – lack of knowledge – of those laws. If you are nodding with approval, you should realise that you were destined to do so, as much as you were destined to exist, your parents to conceive you, and their parents to conceive them, and so on. And that everything you will do tomorrow, and for the rest of your life, is just what you were ordained to do. Likewise, Germany was destined to lose World War II and to win this year’s FIFA World Cup, I was destined to write these words, and the bird that just perched on the tree outside my window was meant to do just that, on that particular branch, with that many leaves, each of them with that shape and cellular composition, each cell with that number of mitochondria, and each mitochondrion… – you get the drift.

Laplace’s demon knows all this, and much, much more. He knows why everything happens, has happened and will happen, because he knows all the laws that determine all events, no matter how complex and chaotic they may be. Besides, he knows why those laws are necessary and cannot but be so. He knows the last ring of all why-chains and the ultimate answer that ends all questions. He knows the absolute, untied truth: episteme.

Everything is, and is bound to become according to necessary laws in the only possible way. Laplace found this self-evident. He trusted it to be the solution to mankind’s enigma. I find it nuts. But what is the alternative?

As they fire their what happens if questions, small children regard the answer as perfect evidence, just as they have been doing in infancy on their own through repeated experiment. As a baby drops a ball from his hands, he learns that the ball always lands on the floor, and that it never lands on the floor unless he drops it.

Perfect evidence is ‘if and only if’: necessary and sufficient. Likewise, as a child asks his dad what happens if he puts a finger on a flame, he intends the answer to mean that the finger will always burn if he puts it there, and that it won’t unless he does.

Dad’s explanation of why this is so strengthens the child’s belief by assigning a cause to the effect. But this fatal move marks the end of parents’ aura of infallibility. Once the why-chain monster is unleashed, the child will soon realise that mum and dad do not have all the answers. Each explanation begets a new question and there is no super-ring at the end of the chain – a brute conundrum which he will learn to deal with one way or another, but will never be able to solve.

As they give up on what soon turns out to be a weary why-game, children learn to accept and get on with local explanations. Parents continue to be their main source of evidence. Some of it will still be perfect and some will be conclusive. But an increasing proportion will be imperfect and inconclusive.

For example: a child picks up a hardened chewing gum from the floor and asks: ‘Can I put it in my mouth?’ After a horrified ‘No!’ comes the next question: ‘Why? What happens if I do?’, to which the correct answer: ‘Nothing, most likely’, is clearly inadequate. The right answer is some variation of: ‘It is dangerous’, which opens up a whole new world: the world of possibilities, where things can happen, rather than will happen. ‘Dangerous’ is an aptly concise way of saying: Take 20,000 children, have half of them chew a gum picked from the floor and the other half a gum from a sealed packet. After a while, some children will get sick. You will see that there will be more sick children among those who chewed gums picked from the floor than among those who chewed packed gums. In a nightmare, the child could reply: ‘Really? Has such an experiment actually been performed?’ Thankfully, it doesn’t happen: children trust their parents.

Trust is measured by the Likelihood Ratio. In this case, the tested hypothesis is: ‘Chewing the floor gum is bad for me’, and the evidence is: ‘Dad says so’. The Likelihood Ratio is the ratio between TPR: the probability that dad says that chewing the floor gum is bad, given that it really is bad, and FPR: the probability that dad says it is bad, given that it actually isn’t. An infallible dad – the perfect hero of small children – has TPR=1 and FPR=0: he is never wrong. Children soon realise that this is not the case – a developmental stage that smart parents accompany and encourage and dumb parents vainly oppose. Most parental evidence is imperfect. Still, while no longer infinite, parents’ Likelihood Ratios remain large and, multiplied by prior odds – equal to one for most hypotheses to perfectly ignorant children – determine children’s posterior odds: if dad says so, it must be right – well, almost certainly.
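
The arithmetic of the paragraph above can be sketched in odds form; the TPR and FPR figures below are illustrative assumptions, not values from the text:

```python
# Odds-form Bayes: posterior odds = prior odds x likelihood ratio,
# one ratio multiplied in per independent source of evidence.
def posterior_odds(prior_odds: float, likelihood_ratios) -> float:
    odds = prior_odds
    for lr in likelihood_ratios:
        odds *= lr
    return odds

def odds_to_prob(odds: float) -> float:
    return odds / (1 + odds)

# A perfectly ignorant child: prior odds = 1 (probability 1/2).
# An imperfect but reliable dad: say TPR = 0.95, FPR = 0.05, so LR = 19.
lr_dad = 0.95 / 0.05
p = odds_to_prob(posterior_odds(1.0, [lr_dad]))
print(round(p, 2))  # 0.95: 'it must be right - well, almost certainly'
```

As new sources arrive, their ratios are simply multiplied in turn – which is why a large parental Likelihood Ratio gradually loses its monopoly on the verdict.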

It is, alas, a temporary biological advantage. As children grow up, their trust in whatever their parents say is bound to be challenged by other sources of soft evidence – other relatives, teachers, friends, and then TV, books, the internet and the whole wide world. ‘Dad is right’ becomes less and less a foregone conclusion. As they add new sources of evidence, children learn to multiply their Likelihood Ratios, in the same way that, since the dawn of civilisation, the Law has been using the judgement of reputable people to reach verdicts. Parents will no longer be the only jurors and, in most cases, will not even be part of the jury. Which, of course, is as it should be.

The only way parental trust can produce a lasting influence on children is by permeating their priors. It is what we call, in its broadest sense, education – a set of beliefs, values, principles and priorities that form the basis upon which evidence is evaluated. Education is not just teaching what happens if, what is true or false, right or wrong: it is explaining why.

Since childhood, we test hypotheses using available evidence to update our priors. Whether we judge a hypothesis to be true or false depends on the evidence, but is based on priors. Evidence is placed within the confines of what we already know. Strong priors, founded on good explanations, help us avoid prior indifference.

This is the power of Why. We have seen it in Tversky and Kahneman’s cab problem. When people are just told that 85% of cabs are Green, they go along with the witness. But when they are told that Green cabs are involved in 85% of the accidents, they successfully reduce their posterior probabilities close to the correct value. They do so because they have a good reason to believe that Blue cabs are less likely to be involved in an accident: sloppier Green cab drivers. Similarly, we have seen it in Newcomb’s Paradox. If we are just told that Dr Wise is infallible, we are tempted to open both boxes. But if, as in the Janken version, we are told why the robot is infallible, we easily recognize how foolish it would be to bet against it.
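
For the cab problem, the numbers work out as follows – assuming the standard figures of Tversky and Kahneman’s version, where 85% of cabs are Green and the witness identifies colours correctly 80% of the time:

```python
# The cab problem: a witness says the cab was Blue. How likely is it
# that it really was? (Standard figures assumed: 15% of cabs are Blue,
# and the witness is right 80% of the time.)
def prob_blue(prior_blue: float, tpr: float, fpr: float) -> float:
    """P(Blue | witness says Blue), by Bayes' Theorem."""
    return tpr * prior_blue / (tpr * prior_blue + fpr * (1 - prior_blue))

p = prob_blue(prior_blue=0.15, tpr=0.80, fpr=0.20)
print(round(p, 2))  # 0.41: well below the witness's 80% reliability
```

Given a reason for the base rate – sloppier Green drivers – people land close to this 0.41; given only the bare 85% statistic, they anchor on the witness’s 80% instead.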

Likewise, in our child footballer story it is easy to imagine that, if the father had a good reason to be sceptical about his child’s chances of success, he would have taken the coach’s opinion with a large grain of salt. Having a good reason to doubt homeopathic medicine, supernatural powers, conspiracy theories and assorted nonsense provides an effective shield against seemingly compelling evidence. If Iago had not been able to melt Othello’s solid priors on his spouse’s loyalty, the Moor would not have killed Desdemona. In general, when evidence runs counter to well-founded priors, updates occur, by and large, according to Bayes’ Theorem.

The bad news, however, is that Bayes’ Theorem works in the same way on equally strong but ill-founded priors. Homeopaths, spiritualists and conspiracists will not be swayed by the most genuinely compelling evidence. Like any power, Why has a dark side. A good explanation is a story that satisfies us and stops us asking more questions. We are satisfied that the earth is round. The trouble is that people can find satisfaction in very odd places.

As our ancestors evolved verbal communication, we can imagine that one of their earliest sounds eventually gave rise to the Proto-Indo-European labiovelar */kw/. This became /hw/ in Proto-Germanic, wh in English and qu in Latin. In a world wholly inscrutable to their emerging consciousness, short, rudimentary expressions of wonder must have been the most common. Over time, they turned into English wh-words: what, who, where, when, and Latin qu-words: quid, quis, quo, quando.

Although we are the only species able to verbalise them, other animals are evidently aware of identification, spatial location and temporal sequence. We don’t know what it is like to be a bat, but – pace Thomas Nagel – it is not difficult to imagine that a bat must have some sense of what (an insect, an eagle, a cave, another bat), very much a sense of where, and also a sense of when, at least in the basic forms of now and later, and possibly before. The closer the animal is to us, the easier it is for us to empathise: simpler with a cat, harder with a rat, near impossible with a gnat, or a brain in a vat.

But there is one wh-word that makes no sense to any other animal: why. When a bat eats an insect, or is eaten by an eagle, it has a sense of what happened, where and when, but not of why. Only humans know why something happened: it is be-cause – what caused it to be.

Answering why is to explain, i.e. to make plain, to unfold. As such, it is a verbal act, impossible without a well-articulated language. Parents know: at one point, children start bombarding them with why questions. But not immediately. Until they develop enough vocabulary, children don’t ask why. Rather, they are interested in knowing what happens if. This is the same question they have learnt to answer by themselves since birth. By the time they start talking, children already know what happens if, for example, they release an object from the grasp of their hands. They know that the object always does more or less the same thing: it falls down – rather than e.g. float in the air, fly out of the window or disappear. They have learnt this by experience, which literally means getting over the peril of the unknown. Just like a cat, a baby has no idea of why an object falls to the ground, but he knows that it does – unfailingly and, therefore, predictably – through the hard evidence of repeated experiment.

Language gives children the opportunity to learn not just from their own experience, but also from the experience of others, and not just by observing what others do, but by listening to, and trusting, what they say. So, for example, to learn what happens if they put a finger on a flame they no longer need to try themselves the hard way: they can ask their dad, and trust that, if dad says it happens that way, that’s the way it always happens. It is through such extension of experience from hard to soft evidence that parents introduce their children to the wonderful world of why. Once they not only tell them what happens if they put a finger on a flame, but explain to them why it happens, they have opened the floodgates:

You can’t do it because the flame will burn your finger.
Why?
Because fire is very hot and your nerves will send a message to the brain to retract the hand.
Why?
Because otherwise your finger will burn.
Why?
Because it can only bear a certain temperature.
Why?

As every parent knows, the dreaded why-chain has only a few rings, before ending abruptly in a more or less emotional …because that’s the way it is!

Each ring is a cause, explaining a fold in the tangle of reality. Children would never stop unfolding: no ring looks like the last to them. So they keep asking, under the reasonable expectation that grownups must have surely figured everything out. The realisation that they haven’t – that in the end no one really knows – is a critical stage in children’s development, roughly coinciding with the acquired awareness of another nasty surprise: death. It is through sombre resignation that children stop asking why and learn to accept local explanations. As in:

Why do I have to brush my teeth every day?
Because otherwise you get caries.
Wh-… All right, fine, whatever.

By that time, parents have ceased to be the exclusive source of soft evidence. As children extend their social reach – go to school, make friends – more and more of what they know is because somebody else said it, they trust it to be true and get on with it.

As adults, most of what we know results from the accumulation of trusted soft evidence. This sets us apart from other animals. They only know by hard evidence and imitation. We know – incomparably more – by soft evidence and trust.

The answer to why is be-cause: a story that only humans can tell.

A few days ago I received an email about my recent post on priors. A reader said he was very surprised that I accepted the claim that saturated fat is not bad for the heart. He said the evidence that saturated fat leads to heart disease is overwhelming and that there is no significant evidence contradicting these findings, other than the activity of groups financed by the meat and dairy industry.

I was also surprised. I had used the New Scientist article on saturated fats only as an example to illustrate the point that even the best-established hypotheses can be challenged by new evidence. I had not said that the evidence proved that saturated fats are ok – nor did the article. New Scientist – not a glossy fashion magazine – referred to a couple of recent studies, published in reputable academic journals (here and here), that cast some guarded doubt on the strength of the received wisdom. The article also said that the papers had been strongly criticised (see here and here) and, after a thorough and informative evaluation of the issue, concluded:

So while dietary libertarians may be gleefully slapping a big fat steak on the griddle and lining up a cream pie with hot fudge for dessert, the dietary advice of the 1970s still stands – for now. In other words, steak and butter can be part of a healthy diet. Just don’t overdo them.

The main point of my post – perhaps the main point of my entire blog – is that, since most evidence is inconclusive, priors matter, and that neglecting them – pretending they do not exist or they are not needed – is a major and consequential fallacy. Prior indifference does not do away with priors – it just sweeps them under the carpet. One may think he is avoiding them, but all he is doing is inadvertently assuming they are 50/50.

Indeed, to be ‘blinded by evidence‘ means that one reads the New Scientist article, understands that there are two camps to the hypothesis and concludes – perhaps aided by the flippant finale – that ‘the truth is in the middle’. That is a mistake: new evidence joins a tug of war where – as is the case here – one side may already be much stronger. That strength should therefore be reflected in the priors against which the new evidence is evaluated. Only if the new evidence is extraordinarily strong itself – always a possibility – will it be sufficient to counterbalance or even overturn the priors. If not, all it can do is stand against them, waiting for reinforcements that may or may not come in the future.

At the same time, solid priors should help us avoid the opposite mistake: distrust. When the tug of war is in the balance, each side is prone to believe the other side is cheating. The temptation is strongest when one side is losing: the harder the evidence on the other side, the sharper the urge to call it fake – it is indeed the staple weapon of conspiracy theorists.

But when we are on the winning side – when the evidence about what we believe is well established – we should resist our basic instincts. So doubting the bona fides of the authors of the two papers in question – or, worse, accusing them of being on the payroll of industry – is unnecessary as well as gratuitous. Unless, of course, one has strong evidence. If not, his priors are wrong!

What is an economy? Ultimately, it is a number of people, mostly strangers to each other, but connected through an intricate network that allows them to produce goods and services in quantities and varieties immensely larger than what they could obtain on their own.

People produce with the aid of capital, both human and non-human. Human capital provides labour services and earns a wage; non-human capital provides non-labour services such as risk bearing, lending, and housing, and earns dividends, interest and rents. The total value of production equals the total value of returns, or revenues.

People produce in order to consume. Consumption is their ultimate payoff. Therefore, just as the value of a company’s capital is the discounted sum of expected dividends, the value of an economy’s capital is the discounted sum of expected consumption. Consumption C corresponds to dividends D and production Y to earnings E. Hence retained earnings E-D are equivalent to savings Y-C=S and the retention ratio H=1-D/E is equivalent to the saving rate s=S/Y. As the sole factor of production, capital K, both human and non-human, corresponds to the company’s book value B. Just as the book value is increased by retained earnings, ∆B=E-D, capital is increased by savings: ∆K=Y-C. Therefore I=S, where I is new investment. In the long run, the return on capital R equals the production-capital ratio Y/K and the growth rate G=H∙R equals (S/Y)(Y/K) =S/K. Hence G=(S/Y)/(K/Y)=s/β, where β=K/Y is the capital-production ratio.

The latter equality is what Thomas Piketty calls the second fundamental law of capitalism: β=s/G (Le Capital au XXIe Siècle, p. 262). Since β=1/R, the law can also be written as s=G/R, which, since s<1, implies R>G. Much of the debate around Piketty’s book revolves around R-G. Our framework makes it clear that, just as a company that entirely retains its earnings is worth nothing to its shareholders, the capital of an economy where all production is saved and none is consumed is, unsurprisingly, worthless. Since consumption cannot be negative, R must be larger than G – for the same reason that the infinite sum of natural numbers must be positive.
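The identities above are easy to check numerically. A minimal sketch, using Piketty’s illustrative figures of a 12% saving rate and 2% growth:

```python
# A numerical check of the identities above, with Piketty's
# illustrative figures: saving rate s = 12%, growth rate G = 2%.
s, G = 0.12, 0.02

beta = s / G        # second fundamental law: beta = s/G, the capital-production ratio
R = G / s           # since beta = 1/R, the law can be rewritten as s = G/R
assert R > G        # s < 1 forces R > G

print(round(beta), round(R, 4))   # 6 and 0.1667: R is well above G
```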

Piketty’s first law (p. 92) also holds in our framework: α=R∙β, where α is the share of non-human capital revenues on total production. This is actually not a law but an identity. In our framework, total capital K is the sum of human capital and non-human capital: K=KH+KNH and total production is the sum of human capital revenues and non-human capital revenues: Y=YH+YNH. Hence α=YNH/Y. This can be decomposed as (YNH/KNH)(KNH/Y), where YNH/KNH=RNH is the return of non-human capital and KNH/Y=βNH is the non-human capital-production ratio. In Piketty’s example, βNH=6, RNH=5% and therefore α=30%: non-human capital earns 30% of total revenues.

A major difference between our framework and Piketty’s is that he defines capital just as non-human capital (p. 82), expressly excluding any consideration of human capital. This doesn’t mean, of course, that in Piketty’s economy there is no labour. In fact, in his example labour earns 70% of total revenues. In our notation, 1-α=YH/Y=(YH/KH)(KH/Y), where YH/KH=RH is the return of human capital and KH/Y=βH is the human capital-production ratio, and the shares of non-human and human capital revenues on total production add up to one:

RNHβNH + RHβH = 1

Although he does not consider it, in Piketty’s economy human capital has an implicit return RH and is implicitly worth a multiple of production βH. And, since the total capital-production ratio β is the sum of the non-human and the human capital-production ratios:

β = K/Y = βNH + βH

then βH=β-βNH, where remember β=1/R=s/G.

This shows an inconsistency in Piketty’s model. In his example, s=12% and G=2%, hence β=6, which is the same multiple he uses to calculate α. But this implies βH=0, which is impossible: if human capital earns 70% of revenues, it must have a value – however implicit – which must be worth some multiple of production. Hence β must be bigger than βNH, which requires either a higher saving rate or a lower growth rate. For example, with G=1% we have β=12 and therefore βH=6: human capital is worth as much as non-human capital and, since it earns 70% of revenues, it must return 70%/6=11.7%. Alternatively, βNH must be smaller than 6, which in turn means that either α must be smaller than 30% or the return of non-human capital must be larger than 5%. For example, with βNH=4 and RNH=7.5% (so as to preserve α=30%), we have βH=2 and RH=70%/2=35%.
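The arithmetic of the inconsistency can be verified directly, using the numbers from the text:

```python
# Checking the inconsistency described above, using the text's numbers.
s, G = 12, 2                  # saving rate and growth rate, in percent
beta = s / G                  # total capital-production ratio: 6.0
beta_NH, R_NH = 6.0, 0.05     # Piketty's non-human capital multiple and return
alpha = R_NH * beta_NH        # first law: non-human capital share, 30%

beta_H = beta - beta_NH       # implied human-capital multiple
print(beta_H)                 # 0.0 - impossible if labour earns 70% of revenues

# One way out, as in the text: lower the growth rate to G = 1%
beta = s / 1                  # 12.0
beta_H = beta - beta_NH       # 6.0
R_H = (1 - alpha) / beta_H    # implied return on human capital
print(round(R_H, 3))          # 0.117, i.e. 70%/6, about 11.7%
```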

One can play around with the numbers, but the important point is that A Country is Not a Company. While labour is a cost to a company and is not part of its capital, human capital is very much part of an economy. The saving rate s measures the savings of all revenues, from human as well as non-human capital, and G is not only the growth rate of non-human capital but incorporates the growth of human capital, including, very importantly, increases in the labour force.

The return on total capital R=G/s is the discount rate of the consumption stream that determines the value of the economy’s capital. As s<1, R is always larger than G. In the long run, R equals the production-capital ratio Y/K, and its inverse β can be seen as the economy’s “PE ratio”, with K=Y/R corresponding to the Tangible Value of the economy’s capital.

Our framework shows that there is no relationship between R-G and the distribution of revenues and wealth between workers, the owners of human capital, and so-called rentiers, the owners of non-human capital. Distribution depends on the relative size of non-human vs. human capital and on their relative returns. s, G and R have nothing to do with it.

Post-war Italian Socialists and Communists supported workers in their class struggle against capitalists. That’s why they wanted to call Italy a “democratic workers’ Republic”. They reckoned that, if for the time being Italy could not be a socialist country, it might at least have a socialist-sounding name. It was, after all, only a matter of time: socialism and communism were ineluctable and just. Ineluctable, for intricate reasons that Karl Marx and his disciples had figured out and that most supporters took for granted in good faith; and just, for a reason that everybody could understand: labour is the ultimate source of value.

Labour is everything. Capital is nothing but a tool of production created by past labour. This is the Labour Theory of Value: the economic value of a good equals the amount of labour required to produce it.

One didn’t need to be a Socialist to subscribe. Marx got the idea from Ricardo, but Abraham Lincoln, for one, also agreed:

Labor is the true standard of value.
Labor is prior to, and independent of, capital. Capital is only the fruit of labor, and could never have existed if labor had not first existed. Labor is the superior of capital, and deserves much the higher consideration.

And Keynes was not a closet socialist when he wrote:

I sympathise, therefore, with the pre-classical doctrine that everything is produced by labour, aided by what used to be called art and is now called technique, by natural resources which are free or cost a rent according to their scarcity or abundance, and by the results of past labour, embodied in assets, which also command a price according to their scarcity or abundance. It is preferable to regard labour, including, of course, the personal services of the entrepreneur and his assistants, as the sole factor of production, operating in a given environment of technique, natural resources, capital equipment and effective demand. (General Theory, Chapter 16, p. 213).

In a broad sense, the primacy of labour is trivially true: capital goods are ultimately made by people – who else? – and even natural resources need people to perform their economic function, through farming, mining etc. Without people’s labour there is no economic value – in fact there is no economy.

But, as Keynes observed, people need aid: from technique (today we would say technology or skills), capital equipment and natural resources. Labour may be seen as the sole – or rather the ultimate – factor of production. But nothing is produced by labour alone.

Take a simple example. Teachers produce teaching. At first sight, teaching is a service solely produced by teachers’ labour. But such a narrow perspective entirely misses the big picture. In order to teach, a teacher has to live. So he needs food, clothing, housing. To move around he needs transportation. To stay healthy he needs medical care. To enjoy life he needs restaurants, cinemas, sports. To teach he needs schools, books, pencils. And so on – you get the drift. The production of teaching requires a complex network of countless other goods, services and resources, ultimately produced by other people.

How does the teacher get all those things? Basically, in three ways. First and foremost he can buy or rent them. Second, he can get them from his properties: e.g. housing from his own house, transportation from his own car. Third, he can get them for free, i.e. without direct disbursement: e.g. a National Health Service, a road network on which to drive his car, a police service to keep him safe, a classroom from the school that hired him.

In one word, he gets them by drawing upon his capital.

‘Capital’ is an ancient word that in the old times had nothing to do with capitalism. At the dawn of civilisation, a man’s capital was the number of heads (caput in Latin) of livestock he owned: in fact, his cattle. Pecunia, Latin for money, derives from pecus, sheep. In ancient Rome, a proletarius was a man without capital, whose only property was his children (proles), who in due course would contribute to the family income and hopefully take care of parents in their old age.

Nowadays our capital is much better assorted. We have cash and other financial assets; real estate and other property; entitlements and other rights that come to us from being members of a community. Most forms of capital earn a return: as cows give milk, sheep give wool, and both give offspring, financial assets give interest and dividends. Real estate gives a rent or, if used by its owner, saves him from paying one. Entitlement capital earns health insurance, pensions and other benefits.

But, now as of yore, the largest part of most people’s capital – proletarians included – is their own caput: themselves. People earn a return by using or lending their labour services. For example, in return for his labour a teacher receives a salary. In this sense, we can say that labour income is the return of Human Capital.

Returns can be spent or saved to accumulate capital. Our teacher spends part of his salary to satisfy his needs and saves the rest to increase his capital. He does the same with the returns of his non-human capital: interest, dividends, rents etc. At the same time, capital increases or decreases, due to changes in its value.

The value of any form of capital is not the amount of labour required to produce it, but the discounted sum of its expected payoffs. This is clear for financial assets, but it is not different for human capital. So, for example, the value of a teacher’s human capital is the discounted sum of his expected salaries, from now until retirement. His salary will, by and large, depend on his technique, i.e. his knowledge and skills as well as any other quality that affects the demand for his services. An unexpected promotion will therefore increase the value of his human capital, and a firing will decrease it. Like any other asset, human capital has a trade-off between risk and return: at the same level of technique, the job of a tenured academic has a lower expected risk and a lower expected return than the job of an investment banker.
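The valuation can be illustrated with a minimal sketch of the discounted sum. The salary, growth, discount and horizon figures below are hypothetical, chosen only to show the mechanics:

```python
# A minimal sketch of human capital as the discounted sum of expected
# salaries. All figures are hypothetical illustrations.
salary = 40_000     # current annual salary
growth = 0.02       # expected salary growth rate
discount = 0.05     # discount rate
years = 30          # years to retirement

human_capital = sum(
    salary * (1 + growth) ** t / (1 + discount) ** t
    for t in range(1, years + 1)
)
print(round(human_capital))   # roughly 790,000: about 20 years' salary
```

An unexpected promotion raises the expected salary stream, and hence the value; a firing lowers it.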

There is however a major difference: while financial assets, real estate and other property can be exchanged for cash at a price approximating their estimated value, human capital can’t. Since the end of slavery, one can rent people’s labour but cannot buy people! So, while salaries have a market value, human capital doesn’t. One can draw upon his non-human capital to satisfy his needs, but cannot sell himself (at least not in an economic sense). The only decision one can take is whether to offer his labour services in the job market or keep his energy and time for himself.

Most people don’t have that privilege: their non-human capital is not sufficient to satisfy their needs. They need to work. Of course, labour has its own virtues: in addition to a salary, a worker gets further education, increased skills, social recognition, personal satisfaction and other perks. But he doesn’t have a choice.

Only a few people have enough non-human capital to afford the choice. They may still decide to work – to reap labour’s virtues, or because they deem it necessary in order to preserve the value of their capital: the cattle would die without the farmer’s labour; and so would the firm without its owner’s guidance. Still, it is their choice: owners could rent other people’s labour to do the job.

This is, in effect, what shareholders do. The owner of a share of IBM does nothing to earn his return. Don’t be swayed by the incidental work that he may or may not do to decide to own the share. The dividend he gets is his return as a shareholder, i.e. as a supplier of equity to an enterprise. Likewise, a bondholder’s return is the interest he gets from lending his capital; and a real estate owner’s return is the rent he gets in exchange for the housing service he supplies to the tenant.

Ownership of non-human capital may well have resulted from past accumulation of human capital returns. As our teacher saved part of his salary, he might have bought IBM shares. But ownership may also be inherited. True, going back in time we may well be able to ascribe all capital to someone’s past labour. In this sense, labour is, ultimately, everything: as Lincoln put it, capital is the fruit of labour and would not exist without it. However, this is a pleasing but irrelevant point. The relevant point is that current and future returns to non-human capital are not a reward to labour. They are a reward to risk bearing, to lending, to housing.

Keynes was right: there is only one factor of production. But it is not labour, it is capital – both human and non-human. Human capital is the inalienable property of individuals: the workers. Non-human capital can be owned and exchanged by individuals – directly or indirectly through companies and other forms of private association – or, to a larger or smaller extent, by the State. But whoever owns it, the supply of non-human capital is as important to the economy as the supply of labour services.

Italy is a democratic Republic founded on capital.

Nah, that wouldn’t have worked either. But it’s true. All countries are.

The US Constitution starts with a Preamble:

We the People of the United States, in Order to form a more perfect Union, establish Justice, insure domestic Tranquility, provide for the common defence, promote the general Welfare, and secure the Blessings of Liberty to ourselves and our Posterity, do ordain and establish this Constitution for the United States of America.

followed by Article 1:

All legislative Powers herein granted shall be vested in a Congress of the United States, which shall consist of a Senate and House of Representatives.

The British do away with grandiloquence and have no single constitutional law – unlike the French, whose current Constitution starts with an even loftier Preamble:

The French people solemnly proclaim their commitment to human rights and the principles of national sovereignty as defined by the Declaration of 1789, confirmed and complemented by the Preamble to the 1946 Constitution and the rights and duties set out in the Charter for the Environment 2004.
Under these principles and that of self-determination of peoples, the Republic offers to overseas territories that express the will to adhere to them new institutions founded on the common ideal of liberty, equality and fraternity and conceived with a view to their democratic evolution.

and moves on to a grand Article 1:

France is an indivisible, secular, democratic and social Republic. It guarantees equality before the law for all citizens, without distinction of origin, race or religion. It shall respect all beliefs. Its organisation is decentralised.
The law favors the equal access of women and men to electoral mandates and elective offices, as well as professional and social responsibilities.

Not to be outdone, the Preamble of the Basic Law for the Federal Republic of Germany reads:

Conscious of their responsibility before God and man, inspired by the determination to promote world peace as an equal partner in a united Europe, the German people, in the exercise of their constituent power, have adopted this Basic Law. Germans in the Länder of Baden-Württemberg, Bavaria, Berlin, Brandenburg, Bremen, Hamburg, Hesse, Lower Saxony, Mecklenburg-Western Pomerania, North Rhine-Westphalia, Rhineland-Palatinate, Saarland, Saxony, Saxony-Anhalt, Schleswig-Holstein and Thuringia have achieved the unity and freedom of Germany in free self-determination. This Basic Law thus applies to the entire German people.

and Article 1 proclaims:

Human dignity shall be inviolable. To respect and protect it shall be the duty of all state authority.
The German people therefore acknowledge inviolable and inalienable human rights as the basis of every community, of peace and of justice in the world.
The following basic rights shall bind the legislature, the executive and the judiciary as directly applicable law.

What about Italy? What do the Italian people solemnly proclaim? Justice, Liberty, Welfare? What are their supreme goals? Liberté, Egalité, Fraternité? What is our Constitution founded on? God and man, human dignity? None of that. Italy is founded on labour.

On what?

Yes, labour. Article 1 of the Italian Constitution (no mucking about with preambles) says:

Italy is a democratic Republic founded on labour.
Sovereignty belongs to the people and is exercised by the people in the forms and within the limits of the Constitution.

What a weird choice. There are many good things that can be said about the virtues of labour, but not that it is a universal principle, an ultimate aspiration or a noble ideal. So what on earth did the Italian constituents have in mind? There is a simple explanation. After the fall of the Fascist regime and the end of the war, the new Italian Parliament was dominated by Christian Democrats, Socialists and Communists, who together held more than three quarters of the seats in the Constituent Assembly. The adopted formulation was proposed as a compromise by the Christian Democrats, after the Socialists and Communists’ proposal – ‘Italy is a democratic workers’ Republic’ – had been turned down by twelve votes!

Like many to follow, it was a botched conciliation. What can ‘founded on labour’ possibly mean? Italian constitutionalists are not short of valiant explanations, centred on labour’s unquestioned ethical value. But any attempt to elevate it to a founding principle is ultimately an artifice.

A reasonable constitutional goal is full employment, as well expressed by Article 4:

The Republic recognises the right of all citizens to work and shall promote such conditions as will make this right effective.
Every citizen has the duty, according to capability and choice, to perform an activity or function that contributes to the material or spiritual progress of society.

But when combined with Article 1, a legitimate goal has often been transformed into an unreasonable demand – to be given a job or to hold on to one, no strings attached.

A workers’ Republic was not a great idea. But a Republic founded on labour means nothing – or anything you like.

As evidence accumulates, it may result in proving a hypothesis true or false, irrespective of prior odds. When the evidential tug of war has a winner, prior odds are no longer relevant. No matter our starting belief, we are 100% convinced that the sun will rise tomorrow. As four or more accurate coaches concur in calling a child a football champion, his father can be rightfully confident that, however unlikely at the start, the hypothesis that his child is a champion is very well supported.

Since evidence accumulates multiplicatively, the tug of war can also be won by a single piece of conclusive evidence, annihilating all other evidence as well as any prior odds. By opening the door, Tony proved himself conclusively guilty.

But evidence does not always lead to the truth. The tug of war does not always have a winner: it can remain stuck somewhere in the middle, where all we can say is that the hypothesis is probably true, and therefore also probably false. In that case, our beliefs continue to be influenced by our priors.

Dependence on prior beliefs is an inconvenient obstacle to the pursuit of truth. Ideally, we would like evidence to speak for itself, swamp priors and give us certainty. But when evidence is not so obliging, ignoring priors, or pretending they do not exist, is not the right course of action: it is the Prior Indifference Fallacy – the assumption, most often wrong, that the hypothesis under investigation has a 50% prior probability of being true.

Prior indifference is not only the fallacy of hopeful fathers, duped lovers and swayed investors. It is also the error made by statisticians who, by ignoring priors (i.e. setting BO=1), identify Posterior Odds with the Likelihood Ratio: PO=LR. As Bruno de Finetti put it:

Tracing it back to Bayes’s theorem, what goes wrong is that those who do not wish to use it in a legitimate way – on account of certain scruples – have no scruples at all about using it in a manifestly illegitimate way. That is to say, they ignore one of the factors (the prior probability) altogether, and treat the other (the likelihood) as though it in fact meant something other than it actually does. This is the same mistake as is made by someone who has scruples about measuring the arms of a balance (having only a tape-measure at his disposal, rather than a high precision instrument), but is willing to assert that the heavier load will always tilt the balance (thereby implicitly assuming, although without admitting it, that the arms are of equal length!). (Theory of Probability, Volume 2, p. 248).

The typical hypothesis of a statistical model is that some parameter has a certain value. The hypothesis is tested in the light of some evidence, consisting of a set of data. TPR=P(E|H) is the probability of the evidence in case H is true, i.e. in case the parameter has the specified value; and FPR=P(E|not H) is the probability of the evidence in case H is false. Hence the Likelihood Ratio LR=TPR/FPR measures how much more or less likely it is to observe the data in case the parameter has the specified value, compared to the case where it doesn’t.

Bayes’ Theorem says that the odds that H is true in the light of the evidence equal the Likelihood Ratio times the prior odds that H is true. Ignoring prior odds, and equating posterior odds to the Likelihood Ratio, is only appropriate if the accumulated evidence leads to the truth, i.e. if there is an overwhelming amount of confirmative or disconfirmative evidence that cumulatively proves H true or false: the parameter certainly has or certainly does not have the specified value.
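As a minimal sketch of the point, with purely hypothetical numbers: posterior odds are the Likelihood Ratio times the prior odds, and setting the prior odds to 1 quietly assumes the arms of de Finetti's balance are of equal length:

```python
# Bayes' Theorem in odds form. All numbers are illustrative assumptions.
def posterior_odds(lr, prior_odds):
    """Posterior odds = Likelihood Ratio x prior odds."""
    return lr * prior_odds

def odds_to_prob(odds):
    """Convert odds to a probability."""
    return odds / (1 + odds)

tpr, fpr = 0.9, 0.1          # assumed P(E|H) and P(E|not H)
lr = tpr / fpr               # Likelihood Ratio = 9
prior_odds_h = 0.01 / 0.99   # an assumed 1% prior probability that H is true

po = posterior_odds(lr, prior_odds_h)
print(odds_to_prob(po))      # ~0.083: the evidence helps, but H remains unlikely

# Prior indifference (prior odds = 1) silently equates PO with LR:
print(odds_to_prob(posterior_odds(lr, 1.0)))  # 0.9: grossly overstated
```

The same nine-fold Likelihood Ratio yields an 8% or a 90% posterior, depending entirely on the prior it multiplies.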

This implicit expectation of convergence depends in turn on the assumption that the parameter is ‘out there’, reflecting an ‘objective’ feature of reality. In that case, irrespective of prior odds, it either has or does not have the specified value (assuming for simplicity that it can take one of a finite number of values; with a few qualifications, the same holds in the continuum). Therefore, tilting the balance one way or the other is only a matter of gathering enough data. Hence one might as well start from perfect ignorance and let evidence speak for itself.

But how much data is enough data? When does evidence become overwhelmingly confirmative or disconfirmative? Whatever our priors, we certainly have overwhelming evidence to prove tomorrow’s sunrise. However small the father’s initial priors, he will be all but certain that his child will become a football champion if ten accurate coaches say so. But what if the father asks only one or two coaches? Or what if he asks ten coaches but they express divergent views, so that the product of their Likelihood Ratios is neither very large nor close to zero? In those circumstances the father’s priors do matter, and it matters that they are, however approximately, correct. If he starts with very low priors, the father will correctly conclude that inadequately supported or unsettled views should leave him sceptical about his son’s chances of success. But if, blinded by evidence, he neglects the Base Rate and becomes prior indifferent, his posterior odds will be grossly overstated.
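The father’s arithmetic can be sketched as follows; the prior and the coaches’ Likelihood Ratios are illustrative assumptions, not figures from the text:

```python
# Combining independent pieces of evidence in odds form.
# All numbers below are assumptions chosen for illustration.
from math import prod

def posterior_prob(lrs, prior_odds):
    """Multiply independent Likelihood Ratios into the prior odds."""
    po = prod(lrs) * prior_odds
    return po / (1 + po)

prior_odds = 1e-6  # assume roughly one child in a million becomes a champion

# Ten accurate coaches, each with LR = 5: evidence overwhelms even this prior.
print(posterior_prob([5] * 10, prior_odds))   # ~0.91

# Only two coaches: the prior still dominates, and scepticism is correct.
print(posterior_prob([5] * 2, prior_odds))    # ~0.000025

# The prior-indifferent father (prior odds = 1) is blinded by the same evidence:
print(posterior_prob([5] * 2, 1.0))           # ~0.96
```

With overwhelming evidence the prior washes out; with one or two coaches, neglecting the Base Rate turns a negligible posterior into near certainty.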

Likewise, the available data may be scarce or ambiguous, and therefore insufficient for a precise estimate of the model’s parameter. In such a case, a correct prior probability assigned to the hypothesis that the parameter has the specified value is key to a proper evaluation of the probability that it actually does; and prior indifference can be just as misleading. This works both ways: a low prior probability will require abundant and convergent evidence, but if the prior is high, less and rougher evidence may suffice.
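To make the trade-off concrete, here is a small sketch computing the Likelihood Ratio needed to lift different priors to an assumed 95% posterior threshold (the threshold and the priors are hypothetical):

```python
# How much evidence does a given prior require? Numbers are assumptions.
def lr_needed(prior_prob, target_prob=0.95):
    """Likelihood Ratio required to lift a prior to the target posterior."""
    prior_odds = prior_prob / (1 - prior_prob)
    target_odds = target_prob / (1 - target_prob)
    return target_odds / prior_odds

print(lr_needed(0.01))   # low prior: LR must reach 1881 -- abundant, convergent evidence
print(lr_needed(0.50))   # indifferent prior: LR = 19
print(lr_needed(0.80))   # high prior: LR = 4.75 -- rougher evidence may suffice
```

A 1% prior demands roughly four hundred times more evidential force than an 80% prior to reach the same posterior.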

A mistaken notion of the goals of scientific inquiry rejects prior dependence as subjective and therefore ‘unscientific’. But there is really nothing more to it than Laplace’s dictum: Extraordinary claims require extraordinary evidence. Whence its corollary: Ordinary claims require ordinary evidence. We are readily disposed to believe that Uri Geller can bend spoons with his hands; but when he says he does it with his mind we want to look a bit closer.

Besides, parameters are not always ‘out there’: they are often just an attribute of our representation of reality. This is definitely the case in economics: there is no such thing as the Marginal Propensity to Consume, the Coefficient of Relative Risk Aversion, or the Weighted Average Cost of Capital. So the expectation that, given enough data, we can certainly discover their true value is not warranted. And the probability of the hypothesis that some parameter has a specified value may not converge to either boundary of the probability spectrum, but may stay in the middle and, as such, remain dependent on our priors.

This is not an inconvenience: it is the natural state of scientific inquiry, whose ethos is to be comfortable with uncertainty and remain open to evidence-led reversals of any established truth. Examples abound. Just to pick one of the latest:

After 35 years as dietary gospel, the idea that saturated fat is bad for your heart appears to be melting away like a lump of butter in a hot pan.

Prior dependence is inherent to science, not alien to it. Ignoring it is tempting but wrong. Prior indifference does not eliminate priors: as Irving J. Good put it, it just SUTC them: Sweep Under The Carpet (Good Thinking, p. 23).

Rather than pretending they don’t exist, statisticians should try to get their priors right.