Smart Beta’s Foundation Problem

March 15, 2018

Richard Wiggins is an academic who has long been involved in the Chartered Financial Analyst program and appears in various financial publications.

We trust our life savings to financial quants who impress us because we don’t understand them. Quantitative research provides a voluptuous cushion of reassurance, but what if it’s all based on bad science?

Factor-beta investors are full of passionate intensity, but we’ve read so many deceptive reports that we get misled into thinking we know something, when in fact we have no more insight into smart beta than a parrot into its profanities.

Little-known concepts like p-hacking and HARKing (hypothesizing after the results are known) raise uncomfortable questions, but few of us have heard of them; they’re neologisms that address some very serious epistemological problems behind the research driving smart-beta products to record sales. There’s strong evidence the whole smart-beta approach is anchored by Nobel prize-winning work that was wrong in the first place.
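P-hacking is easy to demonstrate with a short simulation (my illustration, not from the article; all numbers are made up): generate returns that are pure noise, test 100 equally meaningless "factor" signals against them, and a handful will clear the conventional p < .05 bar by luck alone.

```python
import math
import random

def p_value_two_sided(z):
    """Two-sided p-value for a z statistic under the standard normal."""
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

random.seed(42)
n_obs, n_factors = 250, 100

# One "year" of daily returns that is pure noise.
returns = [random.gauss(0, 1) for _ in range(n_obs)]
mr = sum(returns) / n_obs
sr = math.sqrt(sum((r - mr) ** 2 for r in returns) / n_obs)

false_positives = 0
for _ in range(n_factors):
    # A candidate "factor" signal that is also pure noise.
    signal = [random.gauss(0, 1) for _ in range(n_obs)]
    ms = sum(signal) / n_obs
    ss = math.sqrt(sum((s - ms) ** 2 for s in signal) / n_obs)
    cov = sum((r - mr) * (s - ms) for r, s in zip(returns, signal)) / n_obs
    corr = cov / (sr * ss)
    # For small correlations and large n, z is approximately corr * sqrt(n).
    if p_value_two_sided(corr * math.sqrt(n_obs)) < 0.05:
        false_positives += 1

print(f"{false_positives} of {n_factors} noise factors test 'significant' at p < .05")
```

Roughly five spurious "factors" are expected by construction. Screen thousands of candidate signals, report only the survivors, and the literature fills with results that will not replicate.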

Alarming Results

The rise in “evidence-based investing” is exponential, but few people seem to be aware that there is an analog known as “evidence-based medicine”—and the results are alarming.

In 2011, researchers at the German drug company Bayer reported, in an extensive in-house survey, that more than 75% of published findings could not be validated. It gets worse. In 2012, scientists at the American drug company Amgen published the results of a study in which they selected 53 papers deemed to be “landmark” studies and tried to reproduce them. Only six could be confirmed. This is not a trivial problem.

What do both “evidence-based” approaches have in common? 1) Reproducibility concerns; and 2) a methodology whose backbone is null-hypothesis significance testing and the computation of a p-value.

It is routine to look at a low p-value, like p=.01, and conclude there is only a 1% chance the null hypothesis is true, or that we have 99% confidence that the effect is real. Both of these interpretations are incorrect: a p-value measures how likely data at least this extreme would be if the null hypothesis were true, not how likely the null hypothesis is given the data.
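The distinction matters because of base rates. A back-of-the-envelope Bayes calculation (the 10% prior and 80% power below are illustrative assumptions, not figures from the article) shows that findings "significant at p < .01" can be false far more than 1% of the time:

```python
# If only a small fraction of tested hypotheses are true, false positives
# from the large pool of false hypotheses swamp the nominal error rate.
prior_true = 0.10   # assumed fraction of tested hypotheses that are actually true
power = 0.80        # assumed chance a real effect is detected
alpha = 0.01        # significance threshold

true_positives = prior_true * power         # 0.08 of all tests
false_positives = (1 - prior_true) * alpha  # 0.009 of all tests
share_false = false_positives / (true_positives + false_positives)
print(f"{share_false:.0%} of findings 'significant at p<.01' are false")  # → 10%
```

Shrink the prior (as when thousands of candidate factors are screened) and the share of false "discoveries" climbs steeply, which is the arithmetic behind Ioannidis's argument discussed below.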

The p-value has always been easily misinterpreted and was never as reliable as many scientists presumed. It is often equated with the strength of a relationship, but the p-value reveals almost nothing about the magnitude and relative importance of an effect.
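Nor does a small p-value imply a large effect. In the sketch below (the effect size and volatility numbers are made up for illustration), the same economically negligible edge moves from "insignificant" to overwhelmingly "significant" purely because the sample grows:

```python
import math

def z_test_p_value(mean, sd, n):
    """Two-sided p-value for H0: true mean = 0, given a sample mean."""
    z = mean / (sd / math.sqrt(n))
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# A hypothetical signal worth 0.01% per day against 1% daily volatility.
tiny_effect, sd = 0.0001, 0.01
for n in (1_000, 100_000, 10_000_000):
    print(f"n={n:>10,}  p={z_test_p_value(tiny_effect, sd, n):.4g}")
```

The effect never changes; only the sample size does. That is why a p-value says nothing by itself about whether an effect is big enough to matter.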

A Lot Of Published Research Is Wrong

Practically every modern textbook on scientific research methods teaches some version of this “hypothetico-deductive approach”; it is the gold standard of statistical validity (“Ph.D. standard”) and the ticket to getting into many journals. But we’ve been misapplying it to such an extent that legendary Stanford epidemiologist John Ioannidis wrote a famous paper titled, “Why Most Published Research Findings Are False.”

He wrote it almost 13 years ago, and it is the most widely cited paper that ever appeared in the journal that published it—and you’ve probably never heard of it.

There’s a “truthiness” to research, but it’s widely accepted among data analysts that “much of the scientific literature, perhaps half, may simply be untrue.” It’s easy to find a study to support whatever it is you want to believe; but the greater the financial and other interests and prejudices, the less likely the findings are to be true.

When UK statistician Ronald Fisher introduced hypothesis significance testing and the p-value in the 1920s, he did not intend for it to be a definitive test. He was looking for an approach that could objectively separate interesting findings from background noise.
