‘Smart Beta’ 5: No Alpha Here

May 02, 2014

'Smart-beta' returns come from excess risk, not from some magic alpha.

This blog is the fifth installment of a series reexamining our ideas about "smart beta." Part 1 started with the proposition that defining smart beta in an ETF context is essentially impossible. Part 2 laid out the ground rules to prove the point; Part 3 sank noncap weighting as a method to categorize smart beta; and Part 4 took a wrecking ball to the notion that factor-focused tilts are synonymous with smart beta.

 

Wear black today, because we’re going to a funeral. I’m going to bury the term “smart beta” for good.

I’ve been preparing you for this grim end with a series of blogs in which I’ve proven, step by step, that there’s no definition for smart beta that works consistently in an ETF context. I’ve saved the death blow for today.

In the ground-rules blog—No. 2 in this series—where I explained what a definition of smart beta should do, I listed seven possible definitions of smart beta. Then I started showing why each of the seven fails to meet basic ground rules.

So far, I’ve laid to rest five of the seven definitions: transparency, rules-following and thematic exposure in Part 2; alternative weighting in Part 3; and factor exposure in Part 4.

Today I’ll take on the last two: superior risk-adjusted returns; and improved diversification.

No. 7, diversification, is a mere trifle. I’ll save it for the end of this blog for my most-patient readers. I know you’re all dying for me to dive in to the real heart of the matter: excess risk-adjusted returns. So let’s get to it.

The seductive promise of active management—that your “above-average” manager comes from Lake Wobegon and can consistently produce risk-adjusted outperformance—has invaded the world of passive investing in the form of smart beta. Why else would anyone go to the trouble and expense of rearranging a plain-vanilla index?

In the world of investments, the holy grail, the big payoff, is excess risk-adjusted returns. If a strategy—active or passive—produces more returns than it ought to given the risks taken, then it’s a home run. Like the owners of the New York Yankees, investors are willing to pay dearly for the possibility of a free lunch. (Can you tell I’m a steeped-in-“Moneyball” Oakland A’s fan?)

Even Morningstar, acknowledging the spirit of smart beta, is now categorizing what it calls “strategic beta” strategies according to whether they promote outperformance or promise risk reduction. Morningstar’s researchers know damn well to evaluate performance and risk jointly (see under: Five Star fund).

So, what they’re really saying is that smart-beta funds promise outperformance with no increased risk, or at least, normal performance with reduced risk. Either way, this adds up to promising risk-adjusted excess returns.

Anyone—even my Uncle Milton—can promise that a strategy will produce risk-adjusted outperformance. But can they deliver? My Uncle Milton is long gone, but smart indexing is on a roll, gathering assets and lots of press. These smart-beta funds have left behind a performance record. So there’s something we can test. Shall we?

A first-rate ETF database, fund marketing material and some statistical know-how are all we need to find out how these strategies have lived up to their promises.

I used ETF.com’s database to test 11 widely held U.S. large-cap ETFs with complex strategies, whose marketing material suggests these funds will outperform on a risk-adjusted basis.

 

Six Feet Under

Since I already mentioned we’re going to a funeral, you can guess how these tests came out.

Whether you look at one-, three- or five-year performance, these 11 U.S. large-cap smart-beta funds have produced returns in line with their risks. No more, no less. Whether you look for statistically significant alpha or Sharpe ratios, there’s virtually no risk-adjusted outperformance to back up the marketing claims.

I’ll need a few pages to walk you through these tests and their results. I’ll explain to you what these tests do, why they matter and, most importantly, how to understand the results in an uncertain world.

It’s best to compare apples to apples, so I’ll focus most of my tests on a single equity segment, but I’ll branch out to look at all (non-geared) equity funds before we’re done. Ready?

Complex strategies often debut in the crowded U.S. large-cap space; by now, plenty of smarty-pants U.S. large-cap funds have five years of returns history. The U.S. large-cap segment provides plenty of data to test claims that “smart beta” funds generate risk-adjusted excess returns.

We’ll measure the risks these funds have taken and the returns they’ve earned relative to a plain-vanilla benchmark through March 31, 2014. I’ll use the MSCI USA Large Cap Index. (You can read why I chose MSCI here.)

I tested one-year, three-year and, where available, five-year fund total returns (based on net asset value) against gross total returns for the index. I'll walk you through the math you need to understand the regression results and the Sharpe ratios in the tables below.

The funds appear in order of their total annualized return, from high to low. The MSCI USA Large Cap benchmark data appears in bold.

 

 

1-Year Returns Analysis For Headline Smart-Beta Funds
| Ticker | Fund | Total Return | Goodness of Fit | Beta | Alpha (bold = 95% significance; plain = 90%; -- = not significant) | Sharpe Ratio | Prob. Sharpe Ratio Differs From Benchmark |
|---|---|---|---|---|---|---|---|
| SPHB | PowerShares S&P 500 High Beta | 33.4% | 0.86 | 1.40 | -- | 0.10 | 54% |
| FEX | First Trust Large Cap Core AlphaDEX | 25.2% | 0.95 | 1.10 | -- | 0.11 | 50% |
| EWRI | Guggenheim Russell 1000 Equal Weight | 25.1% | 0.92 | 1.06 | -- | 0.11 | 52% |
| RSP | Guggenheim S&P 500 Equal Weight | 25.0% | 0.96 | 1.07 | -- | 0.11 | 52% |
| CSM | ProShares Large Cap Core Plus | 24.8% | 0.98 | 1.01 | -- | 0.12 | 58% |
| RWL | RevenueShares Large Cap | 23.8% | 0.97 | 1.00 | -- | 0.11 | 55% |
| PRF | PowerShares FTSE RAFI US 1000 | 23.3% | 0.98 | 1.04 | -- | 0.11 | 52% |
| EPS | WisdomTree Earnings 500 | 22.8% | 0.99 | 0.99 | -- | 0.11 | 54% |
| SPHQ | PowerShares S&P 500 High Quality | 21.5% | 0.94 | 0.95 | -- | 0.11 | 51% |
| **BENCHMARK** | **MSCI USA Large Cap Index** | **21.5%** | | | | **0.11** | |
| DLN | WisdomTree LargeCap Dividend | 18.0% | 0.97 | 0.92 | -- | 0.10 | 56% |
| SPLV | PowerShares S&P 500 Low Volatility | 12.7% | 0.81 | 0.84 | -- | 0.07 | 72% |

 

3-Year Returns Analysis For Headline Smart-Beta Funds
| Ticker | Fund | Total Return | Goodness of Fit | Beta | Alpha (bold = 95% significance; plain = 90%; -- = not significant) | Sharpe Ratio | Prob. Sharpe Ratio Differs From Benchmark |
|---|---|---|---|---|---|---|---|
| SPHB | PowerShares S&P 500 High Beta | * | 0.92 | 1.62 | **-11.1%** | 0.02 | 78% |
| SPHQ | PowerShares S&P 500 High Quality | 16.3% | 0.96 | 0.91 | 3.0% | 0.06 | 62% |
| RWL | RevenueShares Large Cap | 15.9% | 0.98 | 1.04 | -- | 0.05 | 52% |
| DLN | WisdomTree LargeCap Dividend | 15.5% | 0.98 | 0.87 | **2.8%** | 0.06 | 62% |
| RSP | Guggenheim S&P 500 Equal Weight | 15.3% | 0.97 | 1.12 | -- | 0.05 | 54% |
| PRF | PowerShares FTSE RAFI US 1000 | 15.3% | 0.99 | 1.05 | -- | 0.05 | 50% |
| EPS | WisdomTree Earnings 500 | 15.3% | 1.00 | 1.00 | -- | 0.05 | 54% |
| CSM | ProShares Large Cap Core Plus | 15.2% | 0.99 | 1.03 | -- | 0.05 | 51% |
| EWRI | Guggenheim Russell 1000 Equal Weight | 15.0% | 0.95 | 1.11 | -- | 0.05 | 55% |
| FEX | First Trust Large Cap Core AlphaDEX | 14.4% | 0.97 | 1.10 | -- | 0.05 | 56% |
| **BENCHMARK** | **MSCI USA Large Cap Index** | **14.3%** | | | | **0.05** | |
| SPLV | PowerShares S&P 500 Low Volatility | * | 0.85 | 0.70 | -- | 0.07 | 70% |

*SPLV and SPHB launched in May 2011. Their statistics are based on 35 months of performance, not 36. All returns and alphas are annualized.

 

5-Year Returns Analysis For Headline Smart-Beta Funds
| Ticker | Fund | Total Return | Goodness of Fit | Beta | Alpha (bold = 95% significance; plain = 90%; -- = not significant) | Sharpe Ratio | Prob. Sharpe Ratio Differs From Benchmark |
|---|---|---|---|---|---|---|---|
| RSP | Guggenheim S&P 500 Equal Weight | 24.5% | 0.96 | 1.17 | -- | 0.07 | 58% |
| PRF | PowerShares FTSE RAFI US 1000 | 24.3% | 0.94 | 1.16 | -- | 0.07 | 57% |
| FEX | First Trust Large Cap Core AlphaDEX | 22.9% | 0.96 | 1.12 | -- | 0.07 | 55% |
| RWL | RevenueShares Large Cap | 21.9% | 0.98 | 1.07 | -- | 0.07 | 57% |
| DLN | WisdomTree LargeCap Dividend | 20.0% | 0.97 | 0.92 | 2.5% | 0.08 | 63% |
| EPS | WisdomTree Earnings 500 | 19.9% | 0.99 | 1.00 | -- | 0.07 | 56% |
| **BENCHMARK** | **MSCI USA Large Cap Index** | **18.7%** | | | | **0.07** | |

All returns and alphas are annualized. All data as of March 31, 2014.

 

There’s enough data there to fill the River Styx. Like Charon, ancient Greece’s mythical ferryman to the underworld, I’ll guide you through it.

The first critical test is goodness of fit, which indicates whether it's fair to compare the fund to our MSCI benchmark. Goodness of fit measures co-movement: the frequency with which the benchmark and the fund both gain (or lose) value on the same day. A reading of 1.00, or 100 percent, is as good as it gets.

We can see that these funds fit our benchmark well, with all but three of our 28 tests syncing with the MSCI USA Large Cap Index on at least 90 percent of trading days, and with three-fourths of our test funds hitting 95 percent co-movement. The MSCI USA Large Cap Index is a fair and well-fitting benchmark for these 11 funds.

Now we can look at our first measure of risk: beta. When a fund has the same risk level as a benchmark, its beta equals 1.00. Higher betas mean more volatility relative to the benchmark. More volatility means more risk.

Look again at the tables above. The funds are sorted by returns, but they might as well be sorted by betas, because the results would be mostly the same, except in the three-year table. More risk equals greater returns in a rising market. No magic here.

When goodness of fit is high, you can multiply the benchmark’s returns by the fund’s beta to find the fund’s predicted return. The difference between the predicted return and the actual return is called alpha. But do be careful about alphas, because, like any statistic, they have a margin of error.

Most of the time, alphas don’t mean anything, because their error bands are wide enough that an alpha of zero, or even of the opposite sign, is within the margin of error. If you don’t think margins of error matter, you must have forgotten the 2012 presidential election.

Statisticians look at the margin of error to determine the likelihood of a result being non-random. The term of art is “significance.” An alpha’s significance is the probability that its error bars don’t include zero.

When statisticians argue over which level of significance to use, they’re debating how wide to draw the error bars. Most require the error bars to be about two standard errors (standard deviation divided by the square root of sample size) wide, or a 95 percent probability that zero is not within the error bars. Generous statisticians allow for 90 percent significance to confirm excess returns. That’s 1.66 standard errors.

Anything inside the error bars is noise, and not statistically different from zero.
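For the curious, the mechanics above can be sketched in a few lines of Python using only NumPy. This is an illustration of the general technique, not ETF.com's exact methodology: regress the fund's daily returns on the benchmark's, then read off beta (the slope), alpha (the intercept), goodness of fit and the alpha's t-statistic.

```python
import numpy as np

def regression_stats(fund_returns, bench_returns):
    """Regress daily fund returns on daily benchmark returns.

    Returns (alpha, beta, r_squared, alpha_t_stat). The t-statistic is
    alpha divided by its standard error; alpha is "noise" when it sits
    within the error bars, i.e., when |t| is small.
    """
    x = np.asarray(bench_returns, dtype=float)
    y = np.asarray(fund_returns, dtype=float)
    n = len(x)
    design = np.column_stack([np.ones(n), x])   # intercept (alpha) + slope (beta)
    (alpha, beta), *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - (alpha + beta * x)
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    r_squared = 1.0 - ss_res / ss_tot           # goodness of fit
    # Standard error of the intercept from the usual OLS formula
    sigma2 = ss_res / (n - 2)
    se_alpha = np.sqrt(sigma2 * (1.0 / n + x.mean() ** 2 / ((x - x.mean()) ** 2).sum()))
    return alpha, beta, r_squared, alpha / se_alpha
```

A |t-statistic| above roughly 1.96 clears the 95 percent bar; above roughly 1.66, the generous 90 percent bar. Daily alphas must be annualized (scaled up by roughly 252 trading days) before comparing them with the tables above.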

 

If static hurts your ears, I recommend headphones for this next section. Because when you check the one-year, three-year and five-year alpha significance from our 11 funds, you’ll find a whole lot of noise. And precious little statistically significant alpha. Oh, and of the two alphas that do pass muster, one is strongly negative.

Let me repeat that: These headline smart-beta funds had no statistically significant risk-adjusted outperformance on a one-, three- or five-year basis. They produced no excess returns, except in two instances (out of 28 tests on 11 funds).

One of those two instances was a huge blooper: negative 11.1 percent per year. You'll find this in the three-year table, along with the WisdomTree LargeCap Dividend Fund's (DLN | A-95) 2.8 percent success.

Even at the most generous threshold of 90 percent significance, only four funds generated meaningful alpha. DLN extended its run to the five-year window, and the PowerShares S&P High Quality (SPHQ | A-78) generated 3 percent per year of excess returns over the three-year period. It’s only fair to note that I couldn’t test SPHQ for the five-year period because it changed its underlying index in 2010. The current version of SPHQ is not quite four years old.

Alphawise, DLN is the only smart-beta fund in our sample to produce long-term risk-adjusted outperformance. The other 10 failed to do so.

It Gets Worse

Sharpe ratios, though broadly consistent with the alphas, tell an even starker story.

Like alphas, Sharpe ratios have a margin of error. To figure out their significance, you have to test whether a fund’s Sharpe ratio is different from the benchmark’s. Again, it’s about error bars.
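Here's one way to run such a test, sketched in Python with NumPy: the Jobson-Korkie test with Memmel's correction. This is an illustration of the standard approach under the assumption of paired daily return series, not necessarily the exact test behind the tables above.

```python
import numpy as np
from math import erf, sqrt

def sharpe_difference_test(fund_returns, bench_returns):
    """Test whether two Sharpe ratios from paired return series differ.

    Jobson-Korkie statistic with Memmel's (2003) variance correction.
    Returns (sr_fund, sr_bench, z_stat, p_two_sided); a large p-value
    means the Sharpe ratios are not statistically different.
    """
    r1 = np.asarray(fund_returns, dtype=float)
    r2 = np.asarray(bench_returns, dtype=float)
    n = len(r1)
    sr1 = r1.mean() / r1.std(ddof=1)
    sr2 = r2.mean() / r2.std(ddof=1)
    rho = np.corrcoef(r1, r2)[0, 1]             # correlation of the two series
    # Asymptotic variance of (sr1 - sr2), Memmel's corrected form
    var = (2.0 * (1.0 - rho)
           + 0.5 * (sr1 ** 2 + sr2 ** 2)
           - sr1 * sr2 * rho ** 2) / n
    z = (sr1 - sr2) / sqrt(var)
    p = 2.0 * (1.0 - 0.5 * (1.0 + erf(abs(z) / sqrt(2.0))))  # two-sided normal p
    return sr1, sr2, z, p
```

When the two series track each other closely, as these funds track the benchmark, the error bars on the Sharpe difference tighten, yet even then the differences in the tables don't register as significant.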

Not a single U.S. large-cap headline smart-beta fund produced a Sharpe ratio that’s statistically different from the benchmark—not in one year, three years or five years. That’s zero risk-adjusted outperformance for 11 headline smart-beta funds over the past five years, according to Sharpe ratios.

I wondered if things might be different outside of the U.S. large-cap segment, so I did a quick experiment.

Using ETF.com’s Fund Finder and Analytics tool, I searched for equity funds with names that contain words associated with self-proclaimed smart-beta strategies: alpha, achievers, beta, dividend, dynamic, earnings, equal, factor, fundamental (or RAFI), income, momentum, quality, revenue, volatility and yield.

I checked how many had statistically significant alpha against a segment-appropriate benchmark, at the 95 percent confidence level.

The answer: fewer than would be expected by chance. Have a look:

 

 

| Period | Number of Funds | Significant Positive Alpha | Significant Negative Alpha | Percent Outperforming |
|---|---|---|---|---|
| One Year | 193 | 2 | 2 | 1.0% |
| Three Years | 128 | 4 | 4 | 3.1% |
| Five Years | 118 | 7 | 3 | 5.9% |

No matter where you look, or what statistics you use, these funds with catchy names and clever strategies are not working magic.

While claims of risk-adjusted outperformance may be the most reliable marker of self-proclaimed smart-beta funds, actual fund performance has been in line with the risks taken, and that has held for the past five years.

One small note: ETF.com’s Analytics system benchmarks high-yield dividend funds against MSCI’s High Dividend Yield indexes. This comparison yielded three of the four significant negative three-year alphas and one of the three negative five-year alphas in the above table.

If actual risk-adjusted outperformance defines smart beta, then the smart-beta club will be exclusive indeed, with the vast majority of clever-sounding strategies barred at the gates.

Issuers who apply smart-beta labels to their fund suites would surely object if they found those funds left off a smart-beta list. Risk-adjusted outperformance as a smart-beta criterion therefore creates groupings that are not acceptable to the ETF community, violating one of the ground rules I established.

In other words, risk-adjusted outperformance fails as a smart-beta criterion. Since risk-adjusted outperformance is what investors actually care about, it’s pretty much “game over” for the term “smart beta.” Time to say your goodbyes, because the end will be quick.

Using our ground rules, we’ve sent six of seven smart-beta criteria to the trash heap. The seventh, diversification, is quite simple to eviscerate.

The final criterion, “improves portfolio diversification,” will fail in the same way as alternative weighting, factor exposure and risk-adjusted outperformance. Requiring smart-beta funds to have a portfolio that is more diversified than a vanilla benchmark will create fund groups that also are not acceptable to the ETF community.

To measure the extent to which adding a fund to a portfolio increases diversification, we need to know what's in the portfolio. I can't know what you hold, dear reader, so I cannot speak to the consequences to your portfolio of adding any fund.

The best I can do is to measure concentration within self-proclaimed smart-beta portfolios.

 

Portfolio Concentration

ETF.com’s Analytics service measures portfolio concentration using the Herfindahl ratio, which is the sum of the squared weights of each constituent. The Federal Trade Commission uses the Herfindahl index to judge if mergers or acquisitions would produce monopolistic conditions within an industry.

ETF.com uses it to measure portfolio concentration: the higher the Herfindahl ratio, the more concentrated, and the less diversified, the portfolio.
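The arithmetic is simple enough to sketch in a few lines of Python. This is a generic illustration of the sum-of-squared-weights calculation, not ETF.com's exact implementation:

```python
def herfindahl_ratio(weights):
    """Herfindahl ratio: sum of squared portfolio weights.

    An equal-weighted portfolio of N holdings scores 1/N, the minimum
    possible; a single-holding portfolio scores 1.0, the maximum.
    """
    total = sum(weights)
    # Normalize so the weights sum to 1, then square and sum
    return sum((w / total) ** 2 for w in weights)
```

As a sanity check, an equal-weighted portfolio of 500 names works out to 1/500 = 0.20 percent, which matches the reading for the Guggenheim S&P 500 Equal Weight fund (RSP) in the table below.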

Look at the Herfindahl ratios for all the popular U.S. large-cap funds (assets > $50 million) with designer strategies. I’ve ranked funds from high to low concentration.

| Ticker | Fund | Herfindahl Ratio |
|---|---|---|
| SPHD | PowerShares S&P 500 High Dividend | 2.11% |
| FTCS | First Trust Capital Strength | 2.00% |
| SDOG | ALPS Sector Dividend Dogs | 2.00% |
| NOBL | ProShares S&P 500 Aristocrats | 1.86% |
| DLN | WisdomTree LargeCap Dividend | 1.21% |
| SPHB | PowerShares S&P 500 High Beta | 1.02% |
| SPLV | PowerShares S&P 500 Low Volatility | 1.01% |
| QQEW | First Trust NASDAQ-100 Equal Weighted | 1.00% |
| EPS | WisdomTree Earnings 500 | 0.97% |
| RWL | RevenueShares Large Cap | 0.91% |
| SPHQ | PowerShares S&P 500 High Quality | 0.89% |
| | **MSCI USA Large Cap** | **0.88%** |
| FNDX | Schwab Fundamental U.S. Large Company | 0.72% |
| PRF | PowerShares FTSE RAFI US 1000 | 0.63% |
| EQL | ALPS Equal Weight | 0.52% |
| FEX | First Trust Large Cap Core AlphaDEX | 0.33% |
| RSP | Guggenheim S&P 500 Equal Weight | 0.20% |
| EWRI | Guggenheim Russell 1000 Equal Weight | 0.12% |

These complex funds are generally less diversified than the benchmark.

Of the 17 funds in the sample, all but six are more concentrated than the MSCI USA Large Cap benchmark. The equal-weighted funds are the most diversified, while the dividend funds are surprisingly concentrated.

The PowerShares S&P 500 High Dividend Portfolio (SPHD | A-39) carries serious single-security risk, with more than 3 percent allocated to each of at least three holdings as of April 30, 2014. With this level of concentration, there's no good evidence that smart-beta funds increase diversification. Put another way, requiring better-than-benchmark diversification as a criterion for sorting smart-beta funds would produce a quite restricted list, one that cuts out almost all the dividend funds.

And that’s it. Smart-beta criterion No. 7 has fallen, along with its six compatriots. They all failed either because they didn’t make meaningful groupings, or because they made groups that are too controversial to gain acceptance in the ETF community.

The hearse is pulling up to the funeral home, with the term “smart beta” in the back. It died from exposure to logic, and by its failure to live up to its own billing.

For closure, here’s a recap of the seven smart-beta definitions and their shortcomings.

 

  1. Transparency—too broad, fails to make meaningful groups
  2. Rules based/quantitative—too broad, fails to make meaningful groups
  3. Thematic/specific segments or objectives—too broad, fails to make meaningful groups
  4. Noncap weighting—results are unacceptable to the ETF community, because too many oddball funds are included
  5. Captures risk premia/factor exposure—results are unacceptable to the ETF community because factor exposure appears in too many funds
  6. Superior risk-adjusted returns—results are unacceptable to the ETF community because only a tiny fraction of designer funds produce statistically significant excess returns
  7. Improves portfolio diversification—results are unacceptable to the ETF community because many designer funds are highly concentrated.

There is no No. 8. We’re done, and so is our hope of defining smart beta.

From now on, if we have to reference this undefinable term, when, for example, the Wall Street Journal asks us “How many smart-beta funds launched in 2014?” we will stigmatize it with quotation marks. Smart beta is dead. And “smart beta,” an occasionally necessary journalistic evil, will wear the cone of shame.

In my next blog, the final one of this series, I’ll lighten the mood with constructive suggestions for how to best describe “smart beta” strategies, from the plain vanilla to the intricately complex. And I’ll also talk about how ETF.com will handle questions from folks who want to talk about this undefinable trend.

This blog ends with a moment of silence, as we pay our respects to the term “smart beta.” We can’t count what we can’t define, and we can’t define smart beta in an ETF context.


At the time this article was written, the author held no positions in the securities mentioned. Contact Elisabeth Kashner at [email protected].

 

 
