# I’m Back, A New Harry Long Strategy, And Plans For Hypothesis-Driven Development

I’m back. For anyone who wants to know “what happened at Graham”: I felt there was very little scaffolding/on-boarding, and Graham’s expectations/requirements changed. That said, I do have a reference from my direct boss, an accomplished quantitative director. In any case, moving on.

Harry Long recently came out with a new strategy posted on SeekingAlpha, and I’d like to test it for robustness to see if it has merit.

Here’s the link to the post.

So, the rules are fairly simple:

ZIV 15%
SPLV 50%
TMF 10%
UUP 20%
VXX 5%

TMF can be approximated with a 3x leveraged TLT. SPLV is also highly similar to XLP — aka the consumer staples SPY sector. Here’s the equity curve comparison to prove it.
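The two proxy claims can be eyeballed with a quick sketch. This is an illustrative comparison, not part of the strategy itself; it assumes Yahoo adjusted closes via quantmod, and the 3x daily-rebalanced TLT approximation ignores TMF's fees and financing costs.

```
require(quantmod)
require(PerformanceAnalytics)

getSymbols(c('SPLV', 'XLP', 'TMF', 'TLT'), from = '1900-01-01')

# Proxy comparisons: SPLV vs. XLP, and TMF vs. 3x daily TLT returns
proxies <- na.omit(cbind(Return.calculate(Ad(SPLV)),
                         Return.calculate(Ad(XLP)),
                         Return.calculate(Ad(TMF)),
                         Return.calculate(Ad(TLT)) * 3))
colnames(proxies) <- c('SPLV', 'XLP', 'TMF', 'TLT3x')

charts.PerformanceSummary(proxies)
cor(proxies) # daily-return correlations within each pair should be close to 1
```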

So, let’s test this thing.

```
require(PerformanceAnalytics)
require(quantmod)

# Note: Yahoo's ZIV and VXX histories only go back to the funds' inceptions;
# longer volatility histories require synthesized data.
getSymbols(c('ZIV', 'XLP', 'TLT', 'UUP', 'VXX'), from = '1900-01-01')

# Daily returns; TMF (3x long bonds) approximated as 3x daily TLT returns,
# and XLP standing in for SPLV
symbols <- na.omit(cbind(Return.calculate(Ad(ZIV)),
                         Return.calculate(Ad(XLP)),
                         Return.calculate(Ad(TLT)) * 3,
                         Return.calculate(Ad(UUP)),
                         Return.calculate(Ad(VXX))))

strat <- Return.portfolio(symbols, weights = c(.15, .5, .1, .2, .05), rebalance_on = 'years')
```

Here are the results:

```
compare <- na.omit(cbind(strat, Return.calculate(Ad(XLP))))
charts.PerformanceSummary(compare)
rbind(table.AnnualizedReturns(compare), maxDrawdown(compare), CalmarRatio(compare))
```

Equity curve (compared against buy and hold XLP)

Statistics:

```
                          portfolio.returns XLP.Adjusted
Annualized Return                 0.0864000    0.0969000
Annualized Std Dev                0.0804000    0.1442000
Annualized Sharpe (Rf=0%)         1.0747000    0.6720000
Worst Drawdown                    0.1349957    0.3238755
Calmar Ratio                      0.6397665    0.2993100
```

In short, this strategy definitely offers a lot more bang for your risk in terms of drawdown and volatility, and so delivers noticeably better risk/reward tradeoffs. However, it doesn’t beat the raw returns of instruments carrying twice its volatility.

Here are the statistics from 2010 onwards.

```
charts.PerformanceSummary(compare['2010::'])
rbind(table.AnnualizedReturns(compare['2010::']), maxDrawdown(compare['2010::']), CalmarRatio(compare['2010::']))

                          portfolio.returns XLP.Adjusted
Annualized Return                0.12050000    0.1325000
Annualized Std Dev               0.07340000    0.1172000
Annualized Sharpe (Rf=0%)        1.64210000    1.1308000
Worst Drawdown                   0.07382878    0.1194072
Calmar Ratio                     1.63192211    1.1094371
```

Equity curve:

Definitely a smoother ride, and for bonus points, it seems some of the hedges helped with the recent market dip. Again, while aggregate returns aren’t as high as simply buying and holding XLP, the Sharpe and Calmar ratios do better on the whole.

Now, let’s do some robustness analysis. While I do not know how Harry Long arrived at his individual asset weights, one thing that can be tested much more easily is the effect of offsetting the rebalancing day. Since this strategy rebalances only once a year, the choice of rebalancing date is an easy lever to stress-test.

```
yearlyEp <- endpoints(symbols, on = 'years')
rebalanceDays <- list()

# Shift the annual rebalancing date forward by 0 through 251 trading days
for(i in 0:251) {
  offset <- yearlyEp + i
  offset[offset > nrow(symbols)] <- nrow(symbols) # clamp to the last observation
  offset[offset == 0] <- 1
  wts <- matrix(rep(c(.15, .5, .1, .2, .05), length(yearlyEp)), ncol = 5, byrow = TRUE)
  wts <- xts(wts, order.by = as.Date(index(symbols)[offset]))
  offsetRets <- Return.portfolio(R = symbols, weights = wts)
  colnames(offsetRets) <- paste0("offset", i)
  rebalanceDays[[i+1]] <- offsetRets
}
rebalanceDays <- do.call(cbind, rebalanceDays)
rebalanceDays <- na.omit(rebalanceDays)

stats <- rbind(table.AnnualizedReturns(rebalanceDays), maxDrawdown(rebalanceDays))
stats[5,] <- stats[1,] / stats[4,] # Calmar ratio: CAGR over max drawdown
```

Here are the plots of CAGR, Sharpe, Calmar, and drawdown vs. offset.

```
plot(as.numeric(stats[1,])~c(0:251), type='l', ylab='CAGR', xlab='offset', main='CAGR vs. offset')
plot(as.numeric(stats[3,])~c(0:251), type='l', ylab='Sharpe Ratio', xlab='offset', main='Sharpe vs. offset')
plot(as.numeric(stats[5,])~c(0:251), type='l', ylab='Calmar Ratio', xlab='offset', main='Calmar vs. offset')
plot(as.numeric(stats[4,])~c(0:251), type='l', ylab='Drawdown', xlab='offset', main='Drawdown vs. offset')
```

In short, this strategy seems somewhat dependent on the rebalancing date, a detail the original article leaves unsaid. Here are the quantiles of the five statistics across the offsets:

```
rownames(stats)[5] <- "Calmar"
apply(stats, 1, quantile)
```
```
     Annualized Return Annualized Std Dev Annualized Sharpe (Rf=0%) Worst Drawdown    Calmar
0%            0.072500             0.0802                  0.881000      0.1201198 0.4207922
25%           0.081925             0.0827                  0.987625      0.1444921 0.4755600
50%           0.087650             0.0837                  1.037250      0.1559238 0.5364758
75%           0.092000             0.0843                  1.090900      0.1744123 0.6230789
100%          0.105100             0.0867                  1.265900      0.1922916 0.8316698
```

While the standard deviation seems fairly robust, the Sharpe ratio can decrease by about a third, the Calmar can be cut in half, and the CAGR can also vary substantially. That said, even under conservative estimates, the Sharpe ratio is fairly solid, and the Calmar outperforms that of XLP in every variation; nevertheless, performance does vary.

Is this strategy investible in its current state? Maybe, depending on your standards of rigor. Up to this point, rebalancing sometime between December and early January seems to substantially outperform other rebalance dates. Perhaps a December/January anomaly effect exists in the literature to justify this; however, the article makes no mention of it. Furthermore, the article doesn’t explain how it arrived at the weights it did.
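One quick way to eyeball the December/January effect is to bucket each offset by the calendar month in which its first shifted rebalance lands, then compare median Sharpe ratios by month. This sketch assumes the `symbols` and `stats` objects from the offset loop above; it is a rough grouping, since later rebalances in each offset series drift across month boundaries.

```
yearlyEp <- endpoints(symbols, on = 'years')

# Map each offset (0-251) to the month its first shifted rebalance lands in
firstRebalance <- index(symbols)[pmin(yearlyEp[2] + 0:251, nrow(symbols))]
rebalanceMonth <- months(firstRebalance)

# Median Sharpe ratio of the offset strategies, grouped by rebalance month
sort(tapply(as.numeric(stats[3, ]), rebalanceMonth, median), decreasing = TRUE)
```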

Which brings me to my next topic: a change for this blog going forward, namely hypothesis-driven trading system development. While this process doesn’t require complicated math, it does require statistical justification for the multiple building blocks of a strategy, and a change in mindset that a great deal of publicly available trading system ideas either gloss over or omit entirely. As one of my most important readers praised this blog for “showing how the sausage is made”, this seems to be the next logical step in the progression.

Here’s the reasoning as to why.

It seems that when presenting trading ideas, there are two schools of thought: those who start from intuition, build a backtest based on that intuition, and see if it generally lines up with some intuitively expected result; and those who believe in a much more systematic, hypothesis-driven, step-by-step framework, justifying as many decisions as possible (ideally every decision) in creating a trading system. The advantage of the former is that it allows many more ideas to be displayed in a much shorter timeframe. However, it has several major drawbacks. First, it hides many concerns about potential overfitting: if all one sees is a final equity curve, nothing is said about the sensitivity of that equity curve to its various input parameters, or about what other ideas were thrown out along the way. Second, without a foundation of strong hypotheses about the economic phenomena being exploited, there is no reason to believe that a strategy won’t simply fail once it’s put into live trading.

And third, which I find most important, such activities ultimately don’t sufficiently impress the industry’s best practitioners. For instance, Tony Cooper took issue with my replication of Trading The Odds’ volatility trading strategy, namely how data-mined it was (according to him in the comments section), and his objections seem to have been completely borne out by its out-of-sample performance.

So, for those looking for plug-and-crank system ideas, that may still happen every so often if someone sends me something particularly interesting, but there’s going to be some all-new content on this blog.

NOTE: while I am currently consulting, I am always open to networking, meeting up (Philadelphia and New York City both work), consulting arrangements, and job discussions. Contact me through my email at ilya.kipnis@gmail.com, or through my LinkedIn, found here.

## 25 thoughts on “I’m Back, A New Harry Long Strategy, And Plans For Hypothesis-Driven Development”

1. Skeptic says:

It’s almost as if he googles ‘what ETF did best over the past 3 years’ and then just throws them all together.

2. I blogged similarly about ‘offsets’ today too. Yearly rebalances just asking for that kinda analysis / ass kicking ; )

• Any guesses as to why? Do all things do better when rebalanced on January?

3. Ilya/John,

I love this new direction, ie. examining performance on different rebalancing dates for non-daily strategies. This is an excellent start in terms of robustness testing.

Of course, the original post is based on 1 decision and n=3 rebalances based on what is almost certainly a datamined universe and ex-post MV optimal weights (or very close).

Ridicolo!

• I myself also wonder how the weights got arrived at.

• That said, there are like 8 annual rebalancing dates. But yeah, definitely a lot of variability in rebalancing date. January effect at play?

There are n=7 rebalances in your tests, but still only one decision. Also, to get an even better sense of the character one might consider varying the rebalance dates randomly using rnorm(252,21).

Am I right that the median Sharpe across all rebalance dates is lower than the Sharpe of the buy and hold strategy for the original ETF? This despite the opportunity to MV optimize the weights based on ex post data?

Ridicolo!!


4. Sorry, tough to see whether your latter robustness tests apply to the period 2007:: or 2010::. If they are from 2007:: then median Sharpe is better than B&H, but the rest of my complaints stand, ie. single decision, based on ex post data, etc.

It’s from 2007. 2010 was just to see how it did post-crash.

5. Glad you are back but sorry for you that Graham did not work out. Very much enjoyed your MVO post and looking at it in Python/ Zipline. Working for others is……er….less than satisfactory. Having worked for myself since 1992 I can’t imagine the horrors of trying to fit in at Graham…or anywhere else for that matter!

6. Pingback: Best Links of the Week | Quantocracy

7. If I may ask, did you sign a contract when you got your job?

“While I do not know how Harry Long arrived at the individual asset weights he did, ”

The answer is obvious: selection bias and curve-fitting using trial-and-error through a website service called ETFreplay.com. Besides the change in rebalancing days, try changing starting days.

There is simply not enough info here to have a representative sample to make inferences. For that you would need a long history that simply does not exist. These allocation strategies remind me of the MACD, RSI and MA strategies of technical analysts in the 1990s. They appeal to those who are ignorant of the effects of selection and data-mining bias.

• Of course there was a contract. But the terms only applied so long as I was employed at Graham.

The charts *are* about changing the starting days. As for not long enough history, there can never be enough of a history =P.

• Maybe I did not express my question correctly. You wrote:

“Graham’s expectations/requirements changed,”

My question was about the contract. Did the contract cover the expectations/requirements or it was vague. Just curious about current state of affairs in the job market.

“there can never be enough of a history ”

Yes but this is not a justification for using too little of a history.

• Oh, the contract was absolutely vague in terms of the leeway. At the end of the day, it was at will, so the moment they didn’t want to go in my direction anymore, they didn’t.

8. Ilya, it’s good to see you back posting 🙂
BTW – I’ve started a blog so please check it out.
http://www.mintegration.eu