Every year we invite some of the investment industry’s most creative thinkers to speak about their work at the Research Affiliates’ Advisory Panel conference. Along with Nobel laureates Vernon Smith and Harry Markowitz, the speakers at our 14th annual meeting included Campbell Harvey, Richard Roll, Andrew Karolyi, Bradford Cornell, Andrew Ang, Charles Gave, Tim Jenkinson, and our very own Rob Arnott.1 The richness of the speakers’ presentations beggars any attempt to summarize them; I’ll limit myself to the points I found most intriguing and illuminating. I also acknowledge that this account may reflect my own capacity for misinterpretation as much as the genius of the speakers’ actual research.
Cam Harvey of Duke University’s Fuqua School of Business and the Man Group, who recently completed a 10-year stint as editor of the Journal of Finance, spoke about revising the traditional t-statistic standard to counter the industry’s collective data-snooping for new factors. Dick Roll presented a protocol for factor identification that helps classify a factor as either behavioral or risk-based in nature. These two topics are at the center of our research agenda (Hsu and Kalesnik, 2014; Hsu, Kalesnik, and Viswanathan, 2015).
Cam has written about the factor proliferation that has resulted from extensive data mining in academia and the investment industry (Harvey, Liu, and Zhu, 2015; Harvey and Liu, 2015). As of year-end 2014, he and his colleagues had turned up 316 supposed factors reported in top journals and selected working papers, with an accelerating pace of new discoveries (roughly 40 per year). Cam’s approach to adjusting the traditional t-stat is mathematically sophisticated but conceptually intuitive. When one runs a backtest to assess a signal that is, in fact, uncorrelated with future returns, the probability of observing a t-stat greater than 2 is 2.5%. However, when thousands upon thousands of such backtests are conducted, the probability of seeing a t-stat greater than 2 starts to approach 100%.
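The arithmetic behind that last point is easy to verify. The short sketch below assumes (for illustration only) that each backtest is independent and carries the 2.5% per-test false-positive rate cited above; real backtests are correlated, so the true numbers differ, but the qualitative conclusion is the same:

```python
# How quickly pure noise produces a "discovery": the chance of seeing at
# least one t-stat > 2 across n independent backtests of factors that are,
# in fact, uncorrelated with future returns.
p_single = 0.025  # per-test false-positive rate (t-stat > 2), per the text

for n_tests in (1, 10, 100, 1000, 10000):
    p_at_least_one = 1 - (1 - p_single) ** n_tests
    print(f"{n_tests:>6} backtests -> P(at least one t > 2) = {p_at_least_one:.1%}")
```

With 100 backtests the chance of at least one spurious "factor" already exceeds 90%, and by a few thousand it is indistinguishable from certainty.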
To establish a sensible criterion for hypothesis testing in the age of dirt-cheap computing power, we need to adjust the t-stat for the aggregate number of backtests that might be performed in any given year by researchers collectively. Recognizing that there are a lot more professors and quantitative analysts running a lot more backtests today than 20 years ago, Cam argued that a t-stat threshold of 3 is certainly warranted now. Applying this standard of significance, Cam also concluded that, outside of the market factor, the other factors that seem to be pervasive and believable are the old classics: the value, low beta, and momentum effects. The newer anomalies are most likely the result of data mining.
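To get a feel for why the threshold must rise with the number of tests, consider a crude Bonferroni-style correction. This is a simplification for illustration, not the actual multiple-testing framework of Harvey, Liu, and Zhu, which is considerably more refined; it simply asks what t-stat keeps the family-wise false-positive rate at 5% as the test count grows:

```python
# Bonferroni-style illustration (a simplified stand-in, NOT the authors'
# actual methodology): the two-sided t-stat threshold required to hold the
# family-wise false-positive rate at 5% as the number of backtests grows.
from statistics import NormalDist

def bonferroni_threshold(n_tests: int, family_alpha: float = 0.05) -> float:
    """Two-sided normal-approximation threshold after a Bonferroni correction."""
    per_test_alpha = family_alpha / n_tests
    return NormalDist().inv_cdf(1 - per_test_alpha / 2)

# 316 is the factor count cited in the text.
for n in (1, 10, 100, 316):
    print(f"{n:>4} tests -> t-stat threshold = {bonferroni_threshold(n):.2f}")
```

One test recovers the familiar 1.96; by a few hundred tests the required threshold is already well past 3, consistent with the direction of Cam’s argument.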
I am happy to note that at Research Affiliates we adopt an even more draconian approach to research. For example, Dr. Feifei Li requires a t-stat greater than 4 from our more overzealous junior researchers. Indeed, as we add to our research team and thus to the number of backtests that we perform in aggregate, we recognize that our “false discovery” rate also increases meaningfully. We must develop, and have developed, procedures for establishing robustness beyond the simple t-stat.
Richard Roll, who was recently appointed Linde Institute Professor of Finance at Caltech, reminded us that there are essentially three types of factor strategies:
- Those that do not appear to be correlated with macro risk exposures yet generate excess returns
- Those that are correlated with macro risks and thus produce excess returns
- Those that seem to be correlated with sources of volatility but don’t give rise to excess returns