The ZOMMA Warthog Index

Harry Long posted another article on SeekingAlpha. As usual, it’s another “looks amazing at first glance, winds up being disappointing compared to first impressions, but is still worth talking about” type of strategy. So here’s the strategy:

Rebalance annually:

50% XLP, 15% GLD, 35% TLT — aka a variation of the stocks/bonds portfolio, just with some gold added in. The twist is the idea that XLP is less risky than SPY. The original article, written by Harry Long, can be found here.

This strategy is about as simple as they come, and is tested just as easily in R.

Here’s the original replication:

require(PerformanceAnalytics)
require(quantmod)
getSymbols("XLP", from="1990-01-01")
getSymbols("GLD", from="1990-01-01")
getSymbols("TLT", from="1990-01-01")
prices <- cbind(Ad(XLP), Ad(GLD), Ad(TLT))
prices <- prices[!is.na(prices[, 2]),]
rets <- Return.calculate(prices)
warthogRets <- Return.portfolio(rets, weights=c(.5, .15, .35), rebalance_on = "years")
getSymbols("SPY", from="1990-01-01")
SPYrets <- Return.calculate(Ad(SPY))
comparison <- merge(warthogRets, SPYrets, join='inner')
charts.PerformanceSummary(comparison)

And the resulting replicated (approximately) equity curve:

So far, so good.

Now, let’s put the claim of outperforming over an entire cycle to the test.

years <- apply.yearly(comparison, Return.cumulative)
years <- years[-1,] #remove 2004
years
sum(years[,1] > years[,2])/nrow(years)
sapply(years, mean)

And here’s the output:

> years
           portfolio.returns SPY.Adjusted
2005-12-30        0.07117683   0.04834811
2006-12-29        0.10846819   0.15843582
2007-12-31        0.14529234   0.05142241
2008-12-31        0.05105269  -0.36791039
2009-12-31        0.03126667   0.26344690
2010-12-31        0.14424559   0.15053339
2011-12-30        0.20384853   0.01897321
2012-12-31        0.07186807   0.15991238
2013-12-31        0.04234810   0.32309145
2014-12-10        0.16070863   0.11534450
> sum(years[,1] > years[,2])/nrow(years)
[1] 0.5
> sapply(years, mean)
portfolio.returns      SPY.Adjusted 
       0.10302756        0.09215978 

So, even with SPY’s 2008 included, this strategy only beats SPY in half of the individual years, though its average annual return is slightly better, which I suppose should be par for the course for any backtest against SPY that runs through the recession.

Now, let’s just remove that one year and see how things stack up:

years <- years[-4,] #remove 2008
sapply(years, mean)

And we obtain the following results:

> sapply(years, mean)
portfolio.returns      SPY.Adjusted 
        0.1088025         0.1432787 

In short, the so-called outperformance largely depends on the sample including the S&P’s terrible 2008. Beyond that, things don’t look so stellar for the strategy.

Lastly, anytime anyone makes a claim about a strategy outperforming a benchmark, one very easy test is to look at the equity curve of a “market neutral” strategy that holds the strategy and shorts the benchmark, and see whether that is an equity curve one would be comfortable holding. Here are the results:

diff <- comparison[,1] - comparison[,2]
charts.PerformanceSummary(diff, main="short SPY against portfolio")

And the resulting equity curve of the outperformance:

Basically, ever since the crisis passed, the strategy has been flat to underperforming against SPY.

To hammer the point home, let’s look at the equity curves from 2009 onwards.

#strategy against SPY post-crisis
charts.PerformanceSummary(comparison["2009::"])

With the result:

To its credit, the strategy’s equity curve is certainly a smoother ride. At the same time, the underperformance is clear as well. I suppose this makes sense as a portfolio invested 35% in bonds should expect to underperform a portfolio fully invested in equities on an absolute returns basis. Nevertheless, a far cry from the marketing the strategy has received in the original article.

Now, as I did before, in It’s Amazing How Well Dumb Things Get Marketed, I’m going to backtest this strategy as far as I can using proxies from Quandl for gold and bonds, by going directly to the futures. Refer to that post to see that the proxies are relatively decent for the ETFs. In this case, I switched from adjusted to close prices, but the concept should still hold.

require(IKTrading)
goldFutures <- quandClean(stemCode = "CHRIS/CME_GC", verbose = TRUE) #continuous front-month gold futures
thirtyBond <- quandClean(stemCode="CHRIS/CME_US", verbose = TRUE) #continuous front-month 30-year treasury bond futures
longPrices <- cbind(Cl(XLP), Cl(goldFutures), Cl(thirtyBond))
longRets <- Return.calculate(longPrices)
longRets <- longRets[!is.na(longRets[,1]),]
names(longRets) <- c("XLP", "Gold", "Bonds")
longWarthog <- Return.portfolio(longRets, weights=c(.5, .15, .35), rebalance_on = "years")
longComparison <- merge(longWarthog, SPYrets, join='inner')
charts.PerformanceSummary(longComparison)

And here’s the resultant equity curve comparison:

So, the idea that the strategy outperforms SPY? It seems to be due to SPY’s horrendous performance in 2008, which is a once-in-several-decades event. As you can see, if you started from XLP’s inception to approximately 2005, the strategy was actually in drawdown for all those years.
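
One quick way to verify that drawdown claim, assuming the longComparison object from above, is PerformanceAnalytics’s drawdown table, which lists the depth, start, trough, and recovery dates of the worst drawdowns:

table.Drawdowns(longComparison[,1], top=5) #five deepest drawdowns of the long-history portfolio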

Let’s look at the risk/reward metrics for this timeframe:

stats <- rbind(Return.annualized(longComparison), 
      maxDrawdown(longComparison),
      SharpeRatio.annualized(longComparison),
      Return.annualized(longComparison)/maxDrawdown(longComparison)
)
rownames(stats)[4] <- "Return Over MaxDD"
stats

And the results:

> stats
                                portfolio.returns SPY.Adjusted
Annualized Return                       0.0409267   0.05344120
Worst Drawdown                          0.2439091   0.55186722
Annualized Sharpe Ratio (Rf=0%)         0.4846184   0.26295594
Return Over MaxDD                       0.1677949   0.09683706

To its credit, the strategy has a better return to risk profile than SPY does, which I hope would be the case for any backtest going up against SPY with 2008 in mind. Nevertheless, I feel there was definitely a veneer of marketing in the original presentation. Overall, the strategy has a bit of merit as a defensive play, but in all honesty, looking at the return over max drawdown, I think there are ways to do better.

Thanks for reading.

NOTE: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

An Update to the Robustness Heuristic and a Variation of a Volatility Strategy

So, before revealing a slight wrinkle on the last strategy I wrote about, I’d like to clear up a bit of confusion regarding Jaekle and Tomasini’s idea of a stable region.

Essentially, the entire idea *is* that similar parameter configurations behave in very similar ways, and so, are supposed to be highly correlated. It does not mean the strategy may not be overfit in other ways, but that incremental changes to a parameter should mean incremental changes to performance, rather than seeing some sort of lucky spike in a sea of poor performance.
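
To make “highly correlated” concrete, here’s a minimal sketch (using the retsList object of per-configuration daily returns built by the script below) that computes the correlation between each configuration and its nearest SMA neighbor:

neighborCors <- sapply(1:(ncol(retsList) - 1), function(i) {
  cor(retsList[, i], retsList[, i + 1], use="complete.obs")
})
summary(neighborCors) #in a stable region, these should all sit very close to 1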

In any case, the one change to the strategy from last week is that rather than getting in at the current close (aka observe the close, execute at the close), the entry happens at the next day’s close.
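
In terms of the code below, that change simply amounts to lagging the signal by two days instead of one. Here’s a toy illustration (the dates and numbers are made up) of how xts’s lag() aligns a signal with close-to-close returns:

require(xts)
days <- as.Date("2014-11-03") + 0:4
sig <- xts(c(1, 1, 0, 0, 1), order.by=days) #hypothetical signal observed at each day's close
rets <- xts(c(.01, -.02, .03, .01, -.01), order.by=days) #hypothetical close-to-close returns
cbind(rets, lag(sig, 1), lag(sig, 2))
#lag(sig, 1): the return over day t is governed by the signal observed at the close of day t-1,
#that is, observe the close and execute at that same close
#lag(sig, 2): the return over day t is governed by the signal observed at the close of day t-2,
#that is, observe the close, then enter at the next day's close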

Again, here’s the strategy script:

require(downloader)
require(quantmod) #for Cl()
require(PerformanceAnalytics)
require(TTR) #for SMA()

download("http://www.cboe.com/publish/scheduledtask/mktdata/datahouse/vxvdailyprices.csv", 
         destfile="vxvData.csv")
download("http://www.cboe.com/publish/ScheduledTask/MktData/datahouse/vxmtdailyprices.csv", 
         destfile="vxmtData.csv")

vxv <- xts(read.zoo("vxvData.csv", header=TRUE, sep=",", format="%m/%d/%Y", skip=2))
vxmt <- xts(read.zoo("vxmtData.csv", header=TRUE, sep=",", format="%m/%d/%Y", skip=2))
ratio <- Cl(vxv)/Cl(vxmt)


download("https://dl.dropboxusercontent.com/s/jk6der1s5lxtcfy/XIVlong.TXT",
         destfile="longXIV.txt")

download("https://dl.dropboxusercontent.com/s/950x55x7jtm9x2q/VXXlong.TXT", 
         destfile="longVXX.txt") #requires downloader package

xiv <- xts(read.zoo("longXIV.txt", format="%Y-%m-%d", sep=",", header=TRUE))
vxx <- xts(read.zoo("longVXX.txt", format="%Y-%m-%d", sep=",", header=TRUE))

xiv <- merge(xiv, ratio, join='inner')
vxx <- merge(vxx, ratio, join='inner')
colnames(xiv)[5] <- colnames(vxx)[5] <- "ratio"

xivRets <- Return.calculate(Cl(xiv))
vxxRets <- Return.calculate(Cl(vxx))

retsList <- list()
count <- 1
for(i in 10:200) {
  ratioSMA <- SMA(ratio, n=i)
  vxxSig <- lag(ratio > 1 & ratio > ratioSMA, 2)
  xivSig <- lag(ratio < 1 & ratio < ratioSMA, 2)
  rets <- vxxSig*vxxRets + xivSig*xivRets
  colnames(rets) <- i
  retsList[[i]]  <- rets
  count <- count+1  
}
retsList <- do.call(cbind, retsList)
colnames(retsList) <- gsub("X", "", colnames(retsList))
charts.PerformanceSummary(retsList)
retsList <- retsList[!is.na(retsList[,191]),]
retsList <- retsList[-1,]

The one change I made is that rather than go with the default lag value of one, I went with a lag of two (recall that a lag of zero would induce look-ahead bias, while the extra day of lag delays the entry to the next day’s close). In any case, let’s run through the process again of analyzing for robustness.

rankComparison <- function(rets, perfAfun="Return.cumulative") {
  fun <- match.fun(perfAfun)
  monthlyFun <- apply.monthly(rets, fun)
  monthlyRank <- t(apply(monthlyFun, MARGIN=1, FUN=rank))
  meanMonthlyRank <- apply(monthlyRank, MARGIN=2, FUN=mean)
  rankMMR <- rank(meanMonthlyRank)
  
  aggFun <- fun(rets)
  aggFunRank <- rank(aggFun)
  
  bothRanks <- data.frame(cbind(aggFunRank, rankMMR, names(rankMMR)), stringsAsFactors=FALSE)
  names(bothRanks) <- c("aggregateRank", "averageMonthlyRank", "configName")
  bothRanks$aggregateRank <- as.numeric(bothRanks$aggregateRank)
  bothRanks$averageMonthlyRank <- as.numeric(bothRanks$averageMonthlyRank)
  bothRanks$sum <- bothRanks[,1] + bothRanks[,2]
  bothRanks <- bothRanks[order(bothRanks$sum, decreasing=TRUE),]
  
  plot(aggFunRank~rankMMR, main=perfAfun)
  print(cor(aggFunRank, meanMonthlyRank))
  return(bothRanks)
}

retRank <- rankComparison(retsList)
sharpeRank <- rankComparison(retsList, perfAfun="SharpeRatio.annualized")

In this case, I added some functionality to not only do the plotting and correlation, but also to spit out a table comparing the rank on the aggregate metric with the rank of the average monthly rank (again, the dual ranking layer), with the table ordered by the sum of the two ranks, starting with the highest.

For instance, here’s the output from the returns comparison:

> retRank <- rankComparison(retsList)
[1] 0.736377

> head(retRank, 20)
    aggregateRank averageMonthlyRank configName sum
62            190                191         62 381
63            189                187         63 376
60            185                189         60 374
66            191                182         66 373
65            187                183         65 370
59            184                185         59 369
56            181                186         56 367
64            188                179         64 367
152           174                190        152 364
61            183                178         61 361
67            186                167         67 353
151           165                188        151 353
57            179                173         57 352
153           167                184        153 351
58            182                164         58 346
154           170                175        154 345
53            164                180         53 344
155           166                176        155 342
158           163                177        158 340
150           157                181        150 338

So, for this configuration, the correlation went down from above .8 to around .74…which is still strong, and lends credence to the idea that the strategy configurations have validity outside some lucky months. The new feature I added was the data frame of the two ranks side by side, along with their configuration name (in this case, my names were simply the SMA parameter, but the names could be anything, such as, say, SMA_60_lag_2), and the sum of the two rankings, which orders the configurations. As there were 191 configurations (SMA ranging from 10 to 200), the best score that could be achieved was 382. Furthermore, note that although there seems to be a strong region from SMA 53 to SMA 67, there also seems to be another region, at least when it comes to absolute return, of an SMA parameter at SMA 150+.

Here’s the same table for annualized Sharpe (this variation takes a bit longer to compute due to the monthly annualized Sharpes).

> sharpeRank <- rankComparison(retsList, perfAfun="SharpeRatio.annualized")
[1] 0.5590881
> head(sharpeRank, 20)
    aggregateRank averageMonthlyRank configName   sum
62            190              191.0         62 381.0
59            185              190.0         59 375.0
61            183              186.5         61 369.5
60            186              181.0         60 367.0
63            189              175.0         63 364.0
66            191              164.0         66 355.0
152           166              173.0        152 339.0
58            182              155.0         58 337.0
56            181              151.0         56 332.0
53            174              153.0         53 327.0
57            179              148.0         57 327.0
151           159              162.0        151 321.0
76            177              143.0         76 320.0
150           152              163.0        150 315.0
54            173              140.0         54 313.0
77            178              131.0         77 309.0
65            187              119.0         65 306.0
143           146              156.0        143 302.0
74            167              132.0         74 299.0
153           161              138.0        153 299.0

So, largely the same sort of results as we see with the annualized returns. A correlation of around .56 gives some cause for concern, which will hopefully show up in the line plot of the ranks of the four metrics (returns, Sharpe, drawdowns, and return to drawdown), revealing the regions with strong performance and the not-so-strong ones.

Here’s the ranking line plot.

aggReturns <- Return.annualized(retsList)
aggSharpe <- SharpeRatio.annualized(retsList)
aggMAR <- Return.annualized(retsList)/maxDrawdown(retsList)
aggDD <- maxDrawdown(retsList)

plot(rank(aggReturns)~as.numeric(colnames(aggReturns)), type="l", ylab="annualized returns rank", xlab="SMA", 
     main="Risk and return rank comparison")
lines(rank(aggSharpe)~as.numeric(colnames(aggSharpe)), type="l", ylab="annualized Sharpe rank", xlab="SMA", col="blue")
lines(rank(aggMAR)~as.numeric(colnames(aggMAR)), type="l", ylab="Max return over max drawdown", xlab="SMA", col="red")
lines(rank(-aggDD)~as.numeric(colnames(aggDD)), type="l", ylab="max DD", xlab="SMA", col="green")
legend("bottomright", c("Return rank", "Sharpe rank", "MAR rank", "Drawdown rank"), pch=0, col=c("black", "blue", "red", "green"))

And the resulting plot:

There are several regions that show similar, strong metrics for similar choices of the SMA parameter when we use a “delayed” entry: namely, the regions around the 60-day, 125-day, and 150-day SMAs.

Let’s look at those configurations.

truncRets <- retsList[,c(51, 116, 141)] #SMA 60, SMA 125, and SMA 150 (column i corresponds to an SMA of i+9)
stats <- data.frame(cbind(t(Return.annualized(truncRets)),
                 t(SharpeRatio.annualized(truncRets)),
                 t(maxDrawdown(truncRets))))
colnames(stats) <- c("A.Return", "A.Sharpe", "Worst_Drawdown")
stats$MAR <- stats[,1]/stats[,3]
stats <- round(stats, 3)

And the results:

> stats
    A.Return A.Sharpe Worst_Drawdown   MAR
60     1.103    2.490          0.330 3.342
125    0.988    2.220          0.368 2.683
150    0.983    2.189          0.404 2.435

And the resulting performance, on both a regular, and log scale:

charts.PerformanceSummary(truncRets)

logRets <- log(cumprod(1+truncRets))
chart.TimeSeries(logRets)


Perfect strategies? There’s probably room for improvement. As good if not better than the volatility strategies posted elsewhere on the internet? Probably. Is there more investigation that can be done regarding the differences in signal delay? Yes.
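
For instance, here’s a quick sketch of one such investigation, assuming the ratio, xivRets, and vxxRets objects from the script above: fix the SMA at 125 days (one of the stable parameters identified earlier) and compare execution delays of one, two, and three days.

ratioSMA125 <- SMA(ratio, n=125)
delayStats <- sapply(1:3, function(k) {
  vxxSig <- lag(ratio > 1 & ratio > ratioSMA125, k)
  xivSig <- lag(ratio < 1 & ratio < ratioSMA125, k)
  rets <- vxxSig*vxxRets + xivSig*xivRets
  rets[is.na(rets)] <- 0 #zero out the SMA warm-up period
  c(Return.annualized(rets), SharpeRatio.annualized(rets), maxDrawdown(rets))
})
rownames(delayStats) <- c("A.Return", "A.Sharpe", "Worst_Drawdown")
colnames(delayStats) <- paste0("lag", 1:3)
delayStats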

So, in conclusion for this post, I’m hoping that the rank comparison heuristic and its new output gives people another tool to consider, along with another vol strategy to consider as well.

Thanks for reading.

NOTE: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

A New Volatility Strategy, And A Heuristic For Analyzing Robustness

This post is motivated by a discussion that arose when I tested a strategy by Frank of Trading The Odds (post here). One point, brought up by Tony Cooper of Double Digit Numerics, the original author of the paper that Trading The Odds now trades (I consider it a huge honor that my blog is read by authors of original trading strategies), is that my heatmap analysis only looked at cross-sectional performance, as opposed to performance over time–that is, performance that was outstanding over the course of the entire backtest could have been the result of a few lucky months. This is a fair point, which I hope this post will address in terms of a heuristic using both visual and analytical outputs.

The strategy for this post is the following, provided to me kindly by Mr. Helmuth Vollmeier (whose help in all my volatility-related investigations cannot be overstated):

Consider VXV and VXMT, the three-month and six-month implied volatility indices on the usual S&P 500. Define contango as VXV/VXMT < 1, and backwardation vice versa. Additionally, take an SMA of said ratio. Go long VXX when the ratio is greater than 1 and above its SMA, and go long XIV when the converse holds. Or in my case, get in at the close when that happens and exit at the next day's close after the converse occurs (that is, my replication is slightly off due to using some rather simplistic coding for illustrative purposes).

In any case, here is the script for setting up the strategy, most of which is just downloading the data–the strategy itself is just a few lines of code:

require(downloader)
require(quantmod) #for Cl()
require(TTR) #for SMA()
require(PerformanceAnalytics)

download("http://www.cboe.com/publish/scheduledtask/mktdata/datahouse/vxvdailyprices.csv", 
         destfile="vxvData.csv")
download("http://www.cboe.com/publish/ScheduledTask/MktData/datahouse/vxmtdailyprices.csv", 
         destfile="vxmtData.csv")

vxv <- xts(read.zoo("vxvData.csv", header=TRUE, sep=",", format="%m/%d/%Y", skip=2))
vxmt <- xts(read.zoo("vxmtData.csv", header=TRUE, sep=",", format="%m/%d/%Y", skip=2))
ratio <- Cl(vxv)/Cl(vxmt)


download("https://dl.dropboxusercontent.com/s/jk6der1s5lxtcfy/XIVlong.TXT",
         destfile="longXIV.txt")

download("https://dl.dropboxusercontent.com/s/950x55x7jtm9x2q/VXXlong.TXT", 
         destfile="longVXX.txt") #requires downloader package

xiv <- xts(read.zoo("longXIV.txt", format="%Y-%m-%d", sep=",", header=TRUE))
vxx <- xts(read.zoo("longVXX.txt", format="%Y-%m-%d", sep=",", header=TRUE))

xiv <- merge(xiv, ratio, join="inner")
vxx <- merge(vxx, ratio, join="inner")
colnames(xiv)[5] <- colnames(vxx)[5] <- "ratio"

xivRets <- Return.calculate(Cl(xiv))
vxxRets <- Return.calculate(Cl(vxx))

retsList <- list()
count <- 1
for(i in 10:200) {
  ratioSMA <- SMA(ratio, n=i)
  vxxSig <- lag(ratio > 1 & ratio > ratioSMA)
  xivSig <- lag(ratio < 1 & ratio < ratioSMA)
  rets <- vxxSig*vxxRets + xivSig*xivRets
  colnames(rets) <- i
  retsList[[i]]  <- rets
  count <- count+1  
}
retsList <- do.call(cbind, retsList)
colnames(retsList) <- gsub("X", "", colnames(retsList))
charts.PerformanceSummary(retsList)
retsList <- retsList[!is.na(retsList[,191]),]
retsList <- retsList[-1,]

retsList <- retsList["::2014-11-28"] #for monthly aggregation, remove start of Dec 2014

About as straightforward as things get (and the results, as we'll see, are solid as well). In this case, I tested every SMA between a 10-day and a classic 200-day SMA. And since this strategy is a single-parameter strategy (unless you want to get into adjusting the ratio's critical values up and down away from 1), instead of heatmaps, we'll make do with basic scatter plots and line plots (which make things about as simple as they come).

The heuristic I decided upon was to take some PerfA functions (Return.annualized, SharpeRatio.annualized, for instance), and compare the rank of the average of their monthly ranks (that is, a two-layer rank, very similar to the process in Flexible Asset Allocation) to the aggregate, whole time-period rank. The idea here is that performance based on a few lucky months may have a high aggregate ranking, but a much lower monthly ranking, which would be reflected in a scatter plot. Ideally, the scatter plot would go from lower left to upper right in terms of ranks comparisons, with a correlation of 1, meaning that the strategy with the best overall return will have the best average monthly return rank, and so on down the list. This can also apply to the Sharpe ratio, and so on.

Here is my off-the-cuff implementation of such an idea:

rankComparison <- function(rets, perfAfun="Return.cumulative") {
  fun <- match.fun(perfAfun)
  monthlyFun <- apply.monthly(rets, fun)
  monthlyRank <- t(apply(monthlyFun, MARGIN=1, FUN=rank))
  meanMonthlyRank <- apply(monthlyRank, MARGIN=2, FUN=mean)
  rankMMR <- rank(meanMonthlyRank)
  
  aggFun <- fun(rets)
  aggFunRank <- rank(aggFun)
  plot(aggFunRank~rankMMR, main=perfAfun)
  print(cor(aggFunRank, meanMonthlyRank))
}

So, I get a chart and a correlation of average monthly ranks against a single-pass whole-period rank. Here are the results for cumulative returns and Sharpe ratio:

> rankComparison(retsList)
[1] 0.8485374

Basically, the interpretation is this: the outliers above and to the left of the main cluster can be interpreted as configurations that had those “few lucky months”, while those to the lower right consistently perform somewhat well, but, for whatever reason, are just stricken with bad luck. However, the critical result we’re looking for is that the best overall performers (the highest aggregate rank) are also those with the most *consistent* performance (the highest monthly rank), which is generally what we see.

Furthermore, the correlation of .85 also lends credence to the idea that this is a robust strategy.

Here’s the process repeated with the annualized Sharpe ratio:

> rankComparison(retsList, perfAfun="SharpeRatio.annualized")
[1] 0.8647353

In other words, an even clearer relationship here, and again, we see that the best performers overall are also the best monthly performers, so we can feel safe in the robustness of the strategy.

So what’s the punchline? Well, now that we’ve established that the best results on aggregate are also the strongest results across time, let’s see whether the various rankings of risk and reward metrics reveal which configurations those are.

Here’s a chart of the aggregate rankings of annualized return (aka cumulative return), annualized Sharpe, MAR (return over max drawdown), and max drawdown.

aggReturns <- Return.annualized(retsList)
aggSharpe <- SharpeRatio.annualized(retsList)
aggMAR <- Return.annualized(retsList)/maxDrawdown(retsList)
aggDD <- maxDrawdown(retsList)

plot(rank(aggReturns)~as.numeric(colnames(aggReturns)), type="l", ylab="annualized returns rank", xlab="SMA",
     main="Risk and return rank comparison")
lines(rank(aggSharpe)~as.numeric(colnames(aggSharpe)), type="l", ylab="annualized Sharpe rank", xlab="SMA", col="blue")
lines(rank(aggMAR)~as.numeric(colnames(aggMAR)), type="l", ylab="Max return over max drawdown", xlab="SMA", col="red")
lines(rank(-aggDD)~as.numeric(colnames(aggDD)), type="l", ylab="max DD", xlab="SMA", col="green")
legend("bottomright", c("Return rank", "Sharpe rank", "MAR rank", "Drawdown rank"), pch=0, col=c("black", "blue", "red", "green"))

And the resulting plot itself:

So, looking at these results, here are some interpretations, moving from left to right:

At the lower end of the SMA, the results are just plain terrible. Sure, the drawdowns are lower, but the returns are in the basement.
The spike around the 50-day SMA makes me question if there is some sort of behavioral bias at work here.
Next, there’s a region with fairly solid performance between that and the 100-day SMA, but it is surrounded on both sides by pretty abysmal performance.
Moving onto the 100-day SMA region, the annualized returns and Sharpe ratios are strong, but get the parameter estimation slightly wrong going forward and there’s a severe risk of incurring heavy drawdowns. The sudden improvement in the drawdown rank is also interesting. Again, is there some sort of bias towards some of the round numbers? (50, 100, etc.)
Lastly, there’s nothing particularly spectacular about the performances until we get to the high 100s and the 200 day SMA, at which point, we see a stable region of configurations with high ranks in all categories.

Let’s look at that region more closely:

truncRets <- retsList[,161:191] #SMA 170 through SMA 200
stats <- data.frame(cbind(t(Return.annualized(truncRets)),
                 t(SharpeRatio.annualized(truncRets)),
                 t(maxDrawdown(truncRets))))
colnames(stats) <- c("A.Return", "A.Sharpe", "Worst_Drawdown")
stats$MAR <- stats[,1]/stats[,3]
stats <- round(stats, 3)

And the results:

> stats
    A.Return A.Sharpe Worst_Drawdown   MAR
170    0.729    1.562          0.427 1.709
171    0.723    1.547          0.427 1.693
172    0.723    1.548          0.427 1.694
173    0.709    1.518          0.427 1.661
174    0.711    1.522          0.427 1.665
175    0.711    1.522          0.427 1.665
176    0.711    1.522          0.427 1.665
177    0.711    1.522          0.427 1.665
178    0.696    1.481          0.427 1.631
179    0.667    1.418          0.427 1.563
180    0.677    1.441          0.427 1.586
181    0.677    1.441          0.427 1.586
182    0.677    1.441          0.427 1.586
183    0.675    1.437          0.427 1.582
184    0.738    1.591          0.427 1.729
185    0.760    1.637          0.403 1.886
186    0.794    1.714          0.403 1.970
187    0.798    1.721          0.403 1.978
188    0.802    1.731          0.403 1.990
189    0.823    1.775          0.403 2.042
190    0.823    1.774          0.403 2.041
191    0.823    1.774          0.403 2.041
192    0.819    1.765          0.403 2.031
193    0.822    1.772          0.403 2.040
194    0.832    1.792          0.403 2.063
195    0.832    1.792          0.403 2.063
196    0.802    1.723          0.403 1.989
197    0.810    1.741          0.403 2.009
198    0.782    1.677          0.403 1.941
199    0.781    1.673          0.403 1.937
200    0.779    1.670          0.403 1.934

So starting from SMA 186 through SMA 200, we see some fairly strong performance–returns in the high 70s to the low 80s, and MARs in the high 1s to low 2s. And since this is about a trading strategy, equity curves are, of course, obligatory. Here is what that looks like:

strongRets <- retsList[,177:191] #SMA 186 through SMA 200
charts.PerformanceSummary(strongRets)

Basically, on aggregate, some very strong performance. However, it is certainly not *smooth* performance. New equity highs are followed by strong drawdowns, which are then followed by a recovery and new, higher equity highs.

To conclude (for the moment, I’ll have a new post on this next week with a slight wrinkle that gets even better results), I hope that I presented not only a simple but effective strategy, but also a simple but effective (if a bit time-consuming, to do all the monthly computations on 191 return streams) heuristic suggested/implied by Tony Cooper of Double Digit Numerics for analyzing the performance and robustness of your trading strategies. Certainly, while many professors and theorists elucidate on robustness (with plenty of math that makes stiff bagels look digestible), I believe not a lot of attention is actually paid to it in more common circles, using more intuitive methods. After all, if someone wanted to be an unscrupulous individual selling trading systems or signals (instead of worrying about the strategy’s capacity for capital), it’s easy to show an overfit equity curve while making up some excuse so as to not reveal the (most likely overfit) strategy. One thing I’d hope this post inspires is for individuals to not only look at equity curves, but also at plots of aggregate against average monthly (or higher-frequency, if the strategies are tested over mere months, as with intraday trading) metric rankings when performing parameter optimization.

Is this heuristic the most robust and advanced that can be done? Probably not. Would one need to employ even more advanced techniques if computing time becomes an issue? Probably (bootstrapping and sampling come to mind). Can this be built on? Of course. *Will* someone build on it? I certainly plan on revisiting this topic in the future.
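
For the curious, here’s a rough sketch of what the bootstrapping idea might look like, assuming the retsList object (and the packages already loaded) from the script above: resample months with replacement, recompute the aggregate ranking for each configuration, and see how well the resampled rankings line up with the full-sample ranking.

monthlyRets <- coredata(apply.monthly(retsList, Return.cumulative)) #one row per month, one column per SMA
set.seed(42)
bootRanks <- replicate(100, {
  sampled <- monthlyRets[sample(nrow(monthlyRets), replace=TRUE), ]
  rank(apply(1 + sampled, 2, prod) - 1) #compound the resampled months, then rank the configurations
})
originalRank <- rank(Return.cumulative(retsList))
#how closely does each resampled ranking agree with the full-sample ranking?
summary(apply(bootRanks, 2, function(x) cor(x, originalRank, method="spearman")))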

Lastly, on the nature of the strategy itself: while Trading The Odds presented a strategy functioning on a very short time frame, I’m surprised that instead, we have a strategy whose parameters are on a much higher end of the numerical spectrum.

Thanks for reading.

NOTE: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

An Update on Flexible Asset Allocation

A few weeks back, after seeing my replication, one of the original authors of the Flexible Asset Allocation paper got in touch with me to suggest a slight adjustment to the code: rather than removing negative-momentum securities before performing any ranking, perform all ranking without taking absolute momentum into account, and only remove negative absolute momentum securities at the very end, after allocating weights.

Here’s the new code:

require(quantmod) #for getSymbols() used further below
require(PerformanceAnalytics)
require(IKTrading) #for stepwiseCorRank()

FAA <- function(prices, monthLookback = 4,
                weightMom = 1, weightVol = .5, weightCor = .5, 
                riskFreeName = NULL, bestN = 3,
                stepCorRank = FALSE, stepStartMethod = c("best", "default"),
                geometric = TRUE, ...) {
  stepStartMethod <- stepStartMethod[1]
  if(is.null(riskFreeName)) {
    prices$zeroes <- 0
    riskFreeName <- "zeroes"
    warning("No risk-free security specified. Recommended to use one of: quandClean('CHRIS/CME_US'), SHY, or VFISX. 
            Using vector of zeroes instead.")
  }
  returns <- Return.calculate(prices)
  monthlyEps <- endpoints(prices, on = "months")
  riskFreeCol <- grep(riskFreeName, colnames(prices))
  tmp <- list()
  dates <- list()
  
  for(i in 2:(length(monthlyEps) - monthLookback)) {
    #subset data
    priceData <- prices[monthlyEps[i]:monthlyEps[i+monthLookback],]
    returnsData <- returns[monthlyEps[i]:monthlyEps[i+monthLookback],]
    
    #perform computations
    momentum <- data.frame(t(t(priceData[nrow(priceData),])/t(priceData[1,]) - 1))
    momentum <- momentum[,!is.na(momentum)]
    #momentum[is.na(momentum)] <- -1 #set any NA momentum to negative 1 to keep R from crashing
    priceData <- priceData[,names(momentum)]
    returnsData <- returnsData[,names(momentum)]
    
    momRank <- rank(momentum)
    vols <- data.frame(StdDev(returnsData))
    volRank <- rank(-vols)
    cors <- cor(returnsData, use = "complete.obs")
    if (stepCorRank) {
      if(stepStartMethod=="best") {
        compositeMomVolRanks <- weightMom*momRank + weightVol*volRank
        maxRank <- compositeMomVolRanks[compositeMomVolRanks==max(compositeMomVolRanks)]
        corRank <- stepwiseCorRank(corMatrix=cors, startNames = names(maxRank), 
                                        bestHighestRank = TRUE, ...)
        
      } else {
        corRank <- stepwiseCorRank(corMatrix=cors, bestHighestRank=TRUE, ...)
      }
    } else {
      corRank <- rank(-rowSums(cors))
    }
    
    totalRank <- rank(weightMom*momRank + weightVol*volRank + weightCor*corRank)
    
    upper <- length(names(returnsData))
    lower <- max(upper-bestN+1, 1)
    topNvals <- sort(totalRank, partial=seq(from=upper, to=lower))[c(upper:lower)]
    
    #compute weights
    longs <- totalRank %in% topNvals #invest in ranks length - bestN or higher (in R, rank 1 is lowest)
    longs[momentum < 0] <- 0 #in previous algorithm, removed momentums < 0, this time, we zero them out at the end.
    longs <- longs/sum(longs) #equal weight all candidates
    longs[longs > 1/bestN] <- 1/bestN #in the event that we have fewer than top N invested into, lower weights to 1/top N
    names(longs) <- names(totalRank)
    
    
    #append removed names (those with momentum < 0)
    removedZeroes <- rep(0, ncol(returns)-length(longs))
    names(removedZeroes) <- names(returns)[!names(returns) %in% names(longs)]
    longs <- c(longs, removedZeroes)
    
    #reorder to be in the same column order as original returns/prices
    longs <- data.frame(t(longs))
    longs <- longs[, names(returns)]
    
    #append lists
    tmp[[i]] <- longs
    dates[[i]] <- index(returnsData)[nrow(returnsData)]
  }
  
  weights <- do.call(rbind, tmp)
  dates <- do.call(c, dates)
  weights <- xts(weights, order.by=as.Date(dates)) 
  weights[, riskFreeCol] <- weights[, riskFreeCol] + 1-rowSums(weights)
  strategyReturns <- Return.rebalancing(R = returns, weights = weights, geometric = geometric)
  colnames(strategyReturns) <- paste(monthLookback, weightMom, weightVol, weightCor, sep="_")
  return(strategyReturns)
}

And here are the new results, both with the original configuration, and using the stepwise correlation ranking algorithm introduced by David Varadi:

mutualFunds <- c("VTSMX", #Vanguard Total Stock Market Index
                 "FDIVX", #Fidelity Diversified International Fund
                 "VEIEX", #Vanguard Emerging Markets Stock Index Fund
                 "VFISX", #Vanguard Short-Term Treasury Fund
                 "VBMFX", #Vanguard Total Bond Market Index Fund
                 "QRAAX", #Oppenheimer Commodity Strategy Total Return 
                 "VGSIX" #Vanguard REIT Index Fund
)

#mid 1997 to end of October 2014
getSymbols(mutualFunds, from="1997-06-30", to="2014-10-30")
tmp <- list()
for(fund in mutualFunds) {
  tmp[[fund]] <- Ad(get(fund))
}

#always use a list when intending to cbind/rbind large quantities of objects
adPrices <- do.call(cbind, args = tmp)
colnames(adPrices) <- gsub(".Adjusted", "", colnames(adPrices))

original <- FAA(adPrices, riskFreeName="VFISX")
swc <- FAA(adPrices, riskFreeName="VFISX", stepCorRank = TRUE)
originalOld <- FAAreturns(adPrices, riskFreeName="VFISX")
swcOld <- FAAreturns(adPrices, riskFreeName="VFISX", stepCorRank=TRUE)
all4 <- cbind(original, swc, originalOld, swcOld)
names(all4) <- c("original", "swc", "origOld", "swcOld")
charts.PerformanceSummary(all4)
> rbind(Return.annualized(all4)*100,
+       maxDrawdown(all4)*100,
+       SharpeRatio.annualized(all4))
                                 original       swc   origOld    swcOld
Annualized Return               12.795205 14.135997 13.221775 14.037137
Worst Drawdown                  11.361801 11.361801 13.082294 13.082294
Annualized Sharpe Ratio (Rf=0%)  1.455302  1.472924  1.377914  1.390025

And the resulting equity curve comparison

Overall, it seems that ranking on relative momentum alone and filtering on absolute momentum only after the weights are allocated improves the downside risk profile ever so slightly compared to removing negative momentum securities ahead of time. In any case, FAAreturns will be the function that removes negative momentum securities ahead of time, and FAA will be the one that removes them after all else is said and done.

I’ll return to the standard volatility trading agenda soon.

Thanks for reading.

Note: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

Trading The Odds Volatility Risk Premium: Addressing Data Mining and Curve-Fitting

Several readers, upon seeing the risk and return ratio along with other statistics in the previous post, stated that the results may have been due to data mining/over-optimization/curve-fitting/overfitting, or otherwise the bad practice of creating an amazing equity curve whose performance will decay out of sample.

Fortunately, there’s a way to test that assertion. In their book “Trading Systems: A New Approach to System Development and Portfolio Optimization”, Urban Jaekle and Emilio Tomasini use the concept of the “stable region” to demonstrate a way of visualizing whether or not a parameter specification is indeed overfit. The idea of a stable region addresses the question: going forward, how robust is a parameter specification to slight changes? If the system just happened to find one good small point in a sea of losers, the strategy is likely to fail going forward. However, if small changes in the parameter specification still result in profitable configurations, then the chosen parameter set is a valid configuration.

As Frank’s trading strategy only has two parameters (standard deviation computation period, aka runSD for the R function, and the SMA period), rather than make line graphs, I decided to do a brute force grid search just to see other configurations, and plotted the results in the form of heatmaps.

Here’s the modified script for the computations (no parallel syntax in use for the sake of simplicity):

download("https://dl.dropboxusercontent.com/s/jk6der1s5lxtcfy/XIVlong.TXT",
         destfile="longXIV.txt")

download("https://dl.dropboxusercontent.com/s/950x55x7jtm9x2q/VXXlong.TXT", 
         destfile="longVXX.txt") #requires downloader package

xiv <- xts(read.zoo("longXIV.txt", format="%Y-%m-%d", sep=",", header=TRUE))
vxx <- xts(read.zoo("longVXX.txt", format="%Y-%m-%d", sep=",", header=TRUE))
vxmt <- xts(read.zoo("vxmtdailyprices.csv", format="%m/%d/%Y", sep=",", header=TRUE))

getSymbols("^VIX", from="2004-03-29")

vixvxmt <- merge(Cl(VIX), Cl(vxmt))
vixvxmt[is.na(vixvxmt[,2]),2] <- vixvxmt[is.na(vixvxmt[,2]),1]

xivRets <- Return.calculate(Cl(xiv))
vxxRets <- Return.calculate(Cl(vxx))

getSymbols("^GSPC", from="1990-01-01")
spyRets <- diff(log(Cl(GSPC)))

t1 <- Sys.time()
MARmatrix <- list()
SharpeMatrix <- list()
for(i in 2:21) {
  
  smaMAR <- list()
  smaSharpe <- list()
  for(j in 2:21){
    spyVol <- runSD(spyRets, n=i)
    annSpyVol <- spyVol*100*sqrt(252)
    vols <- merge(vixvxmt[,2], annSpyVol, join='inner')
    
    
    vols$smaDiff <- SMA(vols[,1] - vols[,2], n=j)
    vols$signal <- vols$smaDiff > 0
    vols$signal <- lag(vols$signal, k = 1)
    
    stratRets <- vols$signal*xivRets + (1-vols$signal)*vxxRets
    #charts.PerformanceSummary(stratRets)
    #stratRets[is.na(stratRets)] <- 0
    #plot(log(cumprod(1+stratRets)))
    
    stats <- data.frame(cbind(Return.annualized(stratRets)*100, 
                              maxDrawdown(stratRets)*100, 
                              SharpeRatio.annualized(stratRets)))
    
    colnames(stats) <- c("Annualized Return", "Max Drawdown", "Annualized Sharpe")
    MAR <- as.numeric(stats[1])/as.numeric(stats[2])    
    smaMAR[[j-1]] <- MAR
    smaSharpe[[j-1]] <- stats[,3]
  }
  rm(vols)
  smaMAR <- do.call(c, smaMAR)
  smaSharpe <- do.call(c, smaSharpe)
  MARmatrix[[i-1]] <- smaMAR
  SharpeMatrix[[i-1]] <- smaSharpe
}
t2 <- Sys.time()
print(t2-t1)

Essentially, just wrap the previous script in a nested for loop over the two parameters.

I chose ggplot2 to plot the heatmaps, for more control over the coloring.

Here’s the heatmap for the MAR ratio (that is, returns over max drawdown):

MARmatrix <- do.call(cbind, MARmatrix)
rownames(MARmatrix) <- paste0("SMA", c(2:21))
colnames(MARmatrix) <- paste0("runSD", c(2:21))
MARlong <- melt(MARmatrix)
colnames(MARlong) <- c("SMA", "runSD", "MAR")
MARlong$SMA <- as.numeric(gsub("SMA", "", MARlong$SMA))
MARlong$runSD <- as.numeric(gsub("runSD", "", MARlong$runSD))
MARlong$scaleMAR <- scale(MARlong$MAR)
ggplot(MARlong, aes(x=SMA, y=runSD, fill=scaleMAR))+geom_tile()+scale_fill_gradient2(high="skyblue", mid="blue", low="red")

Here’s the result:

Immediately, we start to see some answers to questions regarding overfitting. First off, is the parameter set published by TradingTheOdds optimized? Yes. In fact, not only is it optimized, it’s by far and away the best value on the heatmap. However, when discussing overfitting, curve-fitting, and the like, the question to ask isn’t “is this the best parameter set available?”, but rather “is the parameter set in a stable region?” The answer, in my opinion, is yes, as noted by how the results hold up across differing values of the SMA for the 2-day sample standard deviation. Note that this quantity, being a two-observation sample standard deviation, is just the square root of the sum of the two squared deviations from their two-day mean (equivalently, the absolute difference of the two returns divided by the square root of two).
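
A quick sanity check on that last statement, using a made-up vector of daily returns: with n = 2, the sample standard deviation reduces to the absolute difference of the two returns divided by the square root of two.

require(TTR)
x <- c(0.01, -0.02, 0.005, 0.015) #hypothetical daily returns
all.equal(as.numeric(runSD(x, n=2)[-1]), abs(diff(x))/sqrt(2)) #should return TRUE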

Here are the MAR values for those configurations:

> MARmatrix[1:10,1]
    SMA2     SMA3     SMA4     SMA5     SMA6     SMA7     SMA8     SMA9    SMA10    SMA11 
2.471094 2.418934 2.067463 3.027450 2.596087 2.209904 2.466055 1.394324 1.860967 1.650588 

In this case, not only is the region stable, but the MAR values are all above 2 until the SMA9 value.
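
Rather than reading values off the matrix by hand, one compact way to summarize this kind of neighborhood is a small helper that compares the average metric in a block around a chosen cell to the average over the whole grid. The function below is my own ad-hoc sketch, not anything from the original approach; MARmatrix is the matrix built above, with rows indexing the SMA period and columns the runSD period.

neighborhoodCheck <- function(metricMatrix, row, col, radius=1) {
  rows <- max(1, row - radius):min(nrow(metricMatrix), row + radius)
  cols <- max(1, col - radius):min(ncol(metricMatrix), col + radius)
  c(neighborhoodMean=mean(metricMatrix[rows, cols], na.rm=TRUE),
    gridMean=mean(metricMatrix, na.rm=TRUE))
}
neighborhoodCheck(MARmatrix, row=4, col=1) #the SMA5/runSD2 cell and its immediate neighbors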

Furthermore, note that aside from the stable region of the 2-day sample standard deviation, a stable region using a standard deviation of ten days with less smoothing from the SMA (because there’s already an average inherent in the sample standard deviation) also exists. Let’s examine those values.

> MARmatrix[2:5, 9:16]
      runSD10  runSD11  runSD12  runSD13  runSD14  runSD15  runSD16   runSD17
SMA3 1.997457 2.035746 1.807391 1.713263 1.803983 1.994437 1.695406 1.0685859
SMA4 2.167992 2.034468 1.692622 1.778265 1.828703 1.752648 1.558279 1.1782665
SMA5 1.504217 1.757291 1.742978 1.963649 1.923729 1.662687 1.248936 1.0837615
SMA6 1.695616 1.978413 2.004710 1.891676 1.497672 1.471754 1.194853 0.9326545

Apparently, a standard deviation between 2 and 3 weeks with minimal SMA smoothing also produced some results comparable to the 2-day variant.

Off to the northeast of the plot, using longer periods for the parameters simply causes the risk-to-reward performance to drop steeply. This is essentially an illustration of the detriments of lag.

Finally, there’s a small rough patch between the two aforementioned stable regions. Here’s the data for that.

> MARmatrix[1:5, 4:8]
       runSD5    runSD6    runSD7   runSD8   runSD9
SMA2 1.928716 1.5825265 1.6624751 1.033216 1.245461
SMA3 1.528882 1.5257165 1.2348663 1.364103 1.510653
SMA4 1.419722 0.9497827 0.8491229 1.227064 1.396193
SMA5 1.023895 1.0630939 1.3632697 1.547222 1.465033
SMA6 1.128575 1.3793244 1.4085513 1.440324 1.964293

As you can see, there are some patches where the MAR is below 1, and many where it’s below 1.5. All of these are pretty detached from the stable regions.

Let’s repeat this process with the Sharpe Ratio heatmap.

SharpeMatrix <- do.call(cbind, SharpeMatrix)
rownames(SharpeMatrix) <- paste0("SMA", c(2:21))
colnames(SharpeMatrix) <- paste0("runSD", c(2:21))
sharpeLong <- melt(SharpeMatrix)
colnames(sharpeLong) <- c("SMA", "runSD", "Sharpe")
sharpeLong$SMA <- as.numeric(gsub("SMA", "", sharpeLong$SMA))
sharpeLong$runSD <- as.numeric(gsub("runSD", "", sharpeLong$runSD))
ggplot(sharpeLong, aes(x=SMA, y=runSD, fill=Sharpe))+geom_tile()+
  scale_fill_gradient2(high="skyblue", mid="blue", low="darkred", midpoint=1.5)

And the result:

Again, the TradingTheOdds parameter configuration lights up, but among a region of strong configurations. This time, we can see that in comparison to the rest of the heatmap, the northern stable region seems to have become clustered around the 10-day standard deviation (or 11) with SMAs of 2, 3, and 4. The regions to the northeast are also more subdued by comparison, with the Sharpe ratio bottoming out around 1.

Let’s look at the numerical values again for the same regions.

Two-day standard deviation region:

> SharpeMatrix[1:10,1]
    SMA2     SMA3     SMA4     SMA5     SMA6     SMA7     SMA8     SMA9    SMA10    SMA11 
1.972256 2.210515 2.243040 2.496178 1.975748 1.965730 1.967022 1.510652 1.963970 1.778401 

Again, these are numbers the likes of which I haven’t been able to achieve with more conventional strategies, and haven’t really seen anywhere else for strategies on daily data. So either the strategy is fantastic, or something is terribly wrong outside the scope of the parameter optimization.

Two week standard deviation region:

> SharpeMatrix[1:5, 9:16]
      runSD10  runSD11  runSD12  runSD13  runSD14  runSD15  runSD16  runSD17
SMA2 1.902430 1.934403 1.687430 1.725751 1.524354 1.683608 1.719378 1.506361
SMA3 1.749710 1.758602 1.560260 1.580278 1.609211 1.722226 1.535830 1.271252
SMA4 1.915628 1.757037 1.560983 1.585787 1.630961 1.512211 1.433255 1.331697
SMA5 1.684540 1.620641 1.607461 1.752090 1.660533 1.500787 1.359043 1.276761
SMA6 1.735760 1.765137 1.788670 1.687369 1.507831 1.481652 1.318751 1.197707

Again, pretty outstanding numbers.

The rough patch:

> SharpeMatrix[1:5, 4:8]
       runSD5   runSD6   runSD7   runSD8   runSD9
SMA2 1.905192 1.650921 1.667556 1.388061 1.454764
SMA3 1.495310 1.399240 1.378993 1.527004 1.661142
SMA4 1.591010 1.109749 1.041914 1.411985 1.538603
SMA5 1.288419 1.277330 1.555817 1.753903 1.685827
SMA6 1.278301 1.390989 1.569666 1.650900 1.777006

All Sharpe ratios are higher than 1, though some are below 1.5.

So, to conclude this post:

Was the replication using optimized parameters? Yes. However, those optimized parameters were found within a stable (and even strong) region. Furthermore, it isn’t as though the strategy exhibits poor risk-to-return metrics beyond those regions, either. Short of raising the lookback periods on both the moving average and the standard deviation to levels that no longer resemble the original replication, performance was solid to stellar.

Does this necessarily mean that there is nothing wrong with the strategy? No. It could be that the performance is an artifact of “observe the close, enter at the close” optimistic execution assumptions. For instance, quantstrat (the go-to backtest engine in R for more trading-oriented statistics) uses a next-bar execution method that defaults to the *next* day’s close (so if you look back over my quantstrat posts, I use prefer=”open” so as to get the open of the next bar, instead of its close). It could also be that VXMT itself is an instrument that isn’t very well known in the public sphere, seeing as how Yahoo Finance barely has any data on it. Lastly, it could simply be the fact that although the risk to reward ratios seem amazing, many investors/mutual fund managers/etc. probably don’t want to think “I’m down 40-60% from my peak”, even though it’s arguably easier to rein in the excess risk of a strategy with a good reward-to-risk ratio by adding cash (to use a cooking analogy, think about your favorite spice: good in small quantities), than it is to go and find leverage for a good reward-to-risk strategy with very small returns (not to mention incurring all the other risks that come with leverage to begin with, such as a 50% drawdown wiping out an account leveraged two to one).

However, to address the question of overfitting, through a modified technique from Jaekle and Tomasini (2009), these are the results I found.

Thanks for reading.

Note: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

Volatility Risk Premium: Sharpe 2+, Return to Drawdown 3+

First, before starting this post, I’d like to give one last comment about my previous post:

I called Vanguard to inquire about the trading policies on VWEHX and VFISX, and there are two-month cooldown periods (aka frequent-trading policies) on those mutual funds. However, the HYG ETF does indeed pay dividends, so the adjusted ETF variant is most likely the closest performance an investor can expect. Still, a Sharpe ratio higher than 1.25 is nothing to scoff at. Of course, no transaction costs are assumed on any of my strategies, so make sure your broker isn’t ripping you off if you actually intend on seriously investing in anything I publish on this blog (I hear Interactive Brokers charges $1 per transaction), and once again, remember that none of this constitutes official advice.
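
On the transaction cost point, here’s a rough, self-contained sketch of how one might haircut a daily strategy return stream for a flat per-switch commission. The cost and account size below are hypothetical placeholders, and since the account size is held fixed rather than compounding, it’s only an approximation:

require(xts)
deductFlatCosts <- function(stratRets, signal, costPerSwitch=2, accountSize=10000) {
  flips <- abs(diff(as.numeric(signal))) #1 whenever the position changes
  flips[is.na(flips)] <- 0
  flips <- xts(c(0, flips), order.by=index(signal))
  haircut <- flips*costPerSwitch/accountSize #cost as a fraction of a fixed account size
  merged <- merge(stratRets, haircut, join='inner')
  merged[,1] - merged[,2] #cost-adjusted daily returns
}

Applied to something like the stratRets and vols$signal objects built later in this post, with costPerSwitch set to twice the per-trade commission (a switch is a sell plus a buy), this shaves the commission off the return on each day the position flips.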

Now, onto this post:

Judging by the attention some of my previous volatility posts have garnered through my replication of SeekingAlpha strategies, today, I am going to share a strategy whose statistics boggle my mind.

The strategy was presented by TradingTheOdds in this post. I had to replicate it for myself to be sure it worked as advertised, but unless I have something horrendously incorrect, this strategy works…quite well. Here’s the strategy:

Using the actual S&P 500 index, compute the two-day annualized historical volatility. Subtract that from the VXMT, which is the six-month expected volatility of the S&P 500 (prior to 2008, use the actual VIX). Then, take the 5-day SMA of that difference. If this number is above 0, go long XIV, otherwise go long VXX. In my replication, the strategy uses market-on-close orders (AKA observe “near” the close, buy at the close), so the strategy should be taken with a little bit of a grain of salt. Here’s the code:

download("https://dl.dropboxusercontent.com/s/jk6der1s5lxtcfy/XIVlong.TXT",
         destfile="longXIV.txt")

download("https://dl.dropboxusercontent.com/s/950x55x7jtm9x2q/VXXlong.TXT", 
         destfile="longVXX.txt") #requires downloader package

xiv <- xts(read.zoo("longXIV.txt", format="%Y-%m-%d", sep=",", header=TRUE))
vxx <- xts(read.zoo("longVXX.txt", format="%Y-%m-%d", sep=",", header=TRUE))
vxmt <- xts(read.zoo("vxmtdailyprices.csv", format="%m/%d/%Y", sep=",", header=TRUE))

getSymbols("^VIX", from="2004-03-29")

vixvxmt <- merge(Cl(VIX), Cl(vxmt))
vixvxmt[is.na(vixvxmt[,2]),2] <- vixvxmt[is.na(vixvxmt[,2]),1]

getSymbols("^GSPC", from="1990-01-01")
spyRets <- diff(log(Cl(GSPC)))

spyVol <- runSD(spyRets, n=2)
annSpyVol <- spyVol*100*sqrt(252)

vols <- merge(vixvxmt[,2], annSpyVol, join='inner')
vols$smaDiff <- SMA(vols[,1] - vols[,2], n=5)
vols$signal <- vols$smaDiff > 0
vols$signal <- lag(vols$signal, k = 1)

xivRets <- Return.calculate(Cl(xiv))
vxxRets <- Return.calculate(Cl(vxx))
stratRets <- vols$signal*xivRets + (1-vols$signal)*vxxRets

The VXMT data is taken from the link I showed earlier. So, as I interpret it, to me, this strategy seems to be stating this:

Since we are subtracting the near-term historical volatility, which I suppose is meant to be a proxy for the “forecast of current volatility”, from the long-term expected volatility (VXMT), the implied hypothesis seems to be that when that difference is positive, volatility is being overestimated by the VXMT, so we should go short volatility. Conversely, if the near-term historical volatility is higher than the expected volatility, it means we should be long volatility instead. So, here’s the punchline that is the equity curve:

charts.PerformanceSummary(stratRets)

Yes, you’re looking at that correctly–over approximately ten years (slightly longer), this strategy had a cumulative return of about $10,000 for every $1 invested. Just to put this into perspective, here’s a log-scale equity curve.

There are a few noticeable dips which correspond to around 40% drawdowns on the regular scale. Now, let’s look at the usual statistics:

stats <- data.frame(cbind(Return.annualized(stratRets)*100, 
               maxDrawdown(stratRets)*100, 
               SharpeRatio.annualized(stratRets)))

colnames(stats) <- c("Annualized Return", "Max Drawdown", "Annualized Sharpe")
stats$MAR <- as.numeric(stats[1])/as.numeric(stats[2])

With the following result:

                  Annualized Return Max Drawdown Annualized Sharpe      MAR
Annualized Return          137.7875     45.64011          2.491509 3.019001

Risky, as judging from maximum drawdown alone? Yes. But is there risk for the reward? Absolutely.

To put a lower bound on the performance of the strategy, here are the same diagrams and statistics with the signal lagged one more day (that is, since the returns use close-to-close data and work off of the closing prices, the signal is lagged by a day to avoid lookahead bias. Lagging the signal by one more day would mean receiving the signal at the close of day t, but only entering on the close of day t+1).

In effect, this is the coding change:

vols$signal <- lag(vols$signal, k = 1)

Becomes

vols$signal <- lag(vols$signal, k = 2)

Here are the results:

Equity curve/drawdowns:

So, still impressive on the returns, but now there are some very pronounced drawdowns–slightly larger, and, more concerning, longer.

Log equity curve:

Again, slightly more pronounced dips.

Here are the statistics:

                  Annualized Return Max Drawdown Annualized Sharpe      MAR
Annualized Return          84.36546     56.77219          1.521165 1.486035

So still quite respectable for a strategy this understandable.

I’m fairly certain that there’s still room for improvement on this strategy, considering that anywhere there’s a 5-day SMA, there’s often room for better trend-following indicators, and possibly a better measure of SPY’s volatility than the 2-day rolling annualized metric. But as it stands, I think, based on its risk/reward characteristics, this strategy has a place as an aggressive, returns-generating component of a portfolio of strategies.
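
For instance, here’s a minimal sketch of one such substitution, assuming the vols, xivRets, and vxxRets objects from the script above: swap the 5-day SMA for a 5-day EMA and recompute the same statistics.

emaDiff <- EMA(vols[,1] - vols[,2], n=5)
emaSignal <- lag(emaDiff > 0, k=1)
emaRets <- emaSignal*xivRets + (1-emaSignal)*vxxRets
emaRets[is.na(emaRets)] <- 0 #zero out the indicator warm-up period
emaStats <- data.frame(cbind(Return.annualized(emaRets)*100,
                             maxDrawdown(emaRets)*100,
                             SharpeRatio.annualized(emaRets)))
colnames(emaStats) <- c("Annualized Return", "Max Drawdown", "Annualized Sharpe")
emaStats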

Thanks for reading.

Note: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

Predicting High Yield with SPY–a Two Part Post

This post will cover ideas from two individuals: David Varadi of CSS Analytics, with whom I am currently collaborating on some volatility trading strategies (the extent of which I hope will end up as a workable trading strategy–my current replicas of some of VolatilityMadeSimple’s publicly displayed “example” strategies (note: from other blogs, not to be confused with their proprietary strategy) are, I think, too risky to be traded as-is), and Cesar Alvarez, of Alvarez Quant Trading. If his name sounds familiar to some of you, that’s because it should. He used to collaborate (still does?) with Larry Connors of TradingMarkets.com, and I’m pretty sure that sometime in the future, I’ll cover those strategies as well.

The strategy for this post is simple, and taken from this post from CSS Analytics.

Pretty straightforward–compute a 20-day SMA on the SPY (I use unadjusted since that’s what the data would have actually been). When the SPY’s close crosses above the 20-day SMA, buy the high-yield bond index, either VWEHX or HYG, and when the converse happens, move to the cash-substitute security, either VFISX or SHY.

Now, while the above paragraph may make it seem that VWEHX and HYG are perfect substitutes, they aren’t, as no two instruments are exactly alike–a detail that, as my last post showed, one should be mindful of. Even creating a synthetic “equivalent” is never exactly perfect. Even though I try my best to iron out such issues, over the course of generally illustrating an idea, the numbers won’t line up exactly (though hopefully, they’ll be close). In any case, it’s best to leave an asterisk whenever one is forced to use synthetics for the sake of a prolonged backtest.

The other elephant/gorilla in the room (depending on your preference for metaphorical animals), is whether or not to use adjusted data. The upside to that is that dividends are taken into account. The *downside* to that is that the data isn’t the real data, and also assumes a continuous reinvestment of dividends. Unfortunately, shares of a security are not continuous quantities–they are discrete quantities made so by their unit price, so the implicit assumptions in adjusted prices can be optimistic.
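
Just to get a rough sense of how much of the return comes from distributions rather than price appreciation, here’s a quick buy-and-hold comparison for VWEHX (assuming the VWEHX object fetched by the script below):

buyHold <- cbind(Return.calculate(Ad(VWEHX)), Return.calculate(Cl(VWEHX)))
buyHold <- buyHold[-1,] #drop the initial NA row
colnames(buyHold) <- c("VWEHX.Adjusted", "VWEHX.Close")
Return.annualized(buyHold)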

For this particular topic, Cesar Alvarez covered it exceptionally well on his blog post, and I highly recommend readers give that post a read, in addition to following his blog in general. However, just to illustrate the effect, let’s jump into the script.


getSymbols("VWEHX", from="1950-01-01")
getSymbols("SPY", from="1900-01-01")
getSymbols("HYG", from="1990-01-01")
getSymbols("SHY", from="1990-01-01")
getSymbols("VFISX", from="1990-01-01")

spySma20Cl <- SMA(Cl(SPY), n=20)
clSig <- Cl(SPY) > spySma20Cl
clSig <- lag(clSig, 1)

vwehxCloseRets <- Return.calculate(Cl(VWEHX))
vfisxCloseRets <- Return.calculate(Cl(VFISX))
vwehxAdjustRets <- Return.calculate(Ad(VWEHX))
vfisxAdjustRets <- Return.calculate(Ad(VFISX))

hygCloseRets <- Return.calculate(Cl(HYG))
shyCloseRets <- Return.calculate(Cl(SHY))
hygAdjustRets <- Return.calculate(Ad(HYG))
shyAdjustRets <- Return.calculate(Ad(SHY))

mutualAdRets <- vwehxAdjustRets*clSig + vfisxAdjustRets*(1-clSig)
mutualClRets <- vwehxCloseRets*clSig + vfisxCloseRets*(1-clSig)

etfAdRets <- hygAdjustRets*clSig + shyAdjustRets*(1-clSig)
etfClRets <- hygCloseRets*clSig + shyCloseRets*(1-clSig)

Here are the results:


mutualFundBacktest <- merge(mutualAdRets, mutualClRets, join='inner')
charts.PerformanceSummary(mutualFundBacktest)
data.frame(t(rbind(Return.annualized(mutualFundBacktest)*100, 
                   maxDrawdown(mutualFundBacktest)*100,
                   SharpeRatio.annualized(mutualFundBacktest))))

Which produces the following equity curves:

As can be seen, the effect of adjusting or not adjusting the data can be pretty enormous. Here are the corresponding three statistics:

               Annualized.Return Worst.Drawdown Annualized.Sharpe.Ratio..Rf.0..
VWEHX.Adjusted         14.675379       2.954519                        3.979383
VWEHX.Close             7.794086       4.637520                        3.034225

Even without the adjustment, the strategy itself is…very very good, at least from this angle. Let’s look at the ETF variant now.

etfBacktest <- merge(etfAdRets, etfClRets, join='inner')
charts.PerformanceSummary(etfBacktest)
data.frame(t(rbind(Return.annualized(etfBacktest)*100, 
                   maxDrawdown(etfBacktest)*100,
                   SharpeRatio.annualized(etfBacktest))))

The resultant equity curve:

With the corresponding statistics:

             Annualized.Return Worst.Drawdown Annualized.Sharpe.Ratio..Rf.0..
HYG.Adjusted         11.546005       6.344801                       1.4674301
HYG.Close             5.530951       9.454754                       0.6840059

Again, another stark difference. Let’s combine all four variants.

fundsAndETFs <- merge(mutualFundBacktest, etfBacktest, join='inner')
charts.PerformanceSummary(fundsAndETFs)
data.frame(t(rbind(Return.annualized(fundsAndETFs)*100, 
                   maxDrawdown(fundsAndETFs)*100,
                   SharpeRatio.annualized(fundsAndETFs))))

The equity curve:

With the resulting statistics:

               Annualized.Return Worst.Drawdown Annualized.Sharpe.Ratio..Rf.0..
VWEHX.Adjusted         17.424070       2.787889                       4.7521579
VWEHX.Close            11.739849       3.169040                       3.8715923
HYG.Adjusted           11.546005       6.344801                       1.4674301
HYG.Close               5.530951       9.454754                       0.6840059

In short, while the strategy itself seems strong, the particular similar (but not identical) instruments used to implement the strategy make a large difference. So, when backtesting, make sure to understand what taking liberties with the data means. In this case, by turning two levers, the Sharpe Ratio varied from less than 1 to above 4.

Next, I’d like to demonstrate a little trick in quantstrat. Although plenty of example trading strategies derive their indicators (along with signals and rules) solely from the market data of the traded instrument, many strategies incorporate data beyond the price action of the particular security at hand. Examples include the many SPY strategies that incorporate VIX information, or off-instrument signal strategies like this one.

Incorporating off-instrument information into quantstrat simply requires understanding the mktdata object, which is nothing more than an xts-type object. By default, a security may have just the OHLCV and open-interest columns, and most demos in the public space use data only from the instruments themselves. However, it is entirely possible to pre-compute signals and attach them as additional columns.

Here’s a continuation of the script, demonstrating the technique on the unadjusted HYG leg of this trade:

####### BOILERPLATE FROM HERE
require(quantstrat)

currency('USD')
Sys.setenv(TZ="UTC")
symbols <- "HYG"
stock(symbols, currency="USD", multiplier=1)
initDate="1990-01-01"

strategy.st <- portfolio.st <- account.st <- "preCalc"
rm.strat(portfolio.st)
rm.strat(strategy.st)
initPortf(portfolio.st, symbols=symbols, initDate=initDate, currency='USD')
initAcct(account.st, portfolios=portfolio.st, initDate=initDate, currency='USD')
initOrders(portfolio.st, initDate=initDate)
strategy(strategy.st, store=TRUE)
######### TO HERE

clSig <- Cl(SPY) > SMA(Cl(SPY), n=20)

HYG <- merge(HYG, clSig, join='inner')
names(HYG)[7] <- "precomputed_signal"

#no parameters or indicators--we precalculated our signal

add.signal(strategy.st, name="sigThreshold",
           arguments=list(column="precomputed_signal", threshold=.5, 
                          relationship="gt", cross=TRUE),
           label="longEntry")


add.signal(strategy.st, name="sigThreshold",
           arguments=list(column="precomputed_signal", threshold=.5, 
                          relationship="lt", cross=TRUE),
           label="longExit")

add.rule(strategy.st, name="ruleSignal",
         arguments=list(sigcol="longEntry", sigval=TRUE, orderqty=1, ordertype="market",
                        orderside="long", replace=FALSE, prefer="Open"),
         type="exit", path.dep=TRUE)

add.rule(strategy.st, name="ruleSignal", 
         arguments=list(sigcol="longExit", sigval=TRUE, orderqty="all", ordertype="market", 
                        orderside="long", replace=FALSE, prefer="Open"), 
         type="exit", path.dep=TRUE)

#apply strategy
t1 <- Sys.time()
out <- applyStrategy(strategy=strategy.st,portfolios=portfolio.st)
t2 <- Sys.time()
print(t2-t1)

#set up analytics
updatePortf(portfolio.st)
dateRange <- time(getPortfolio(portfolio.st)$summary)[-1]
updateAcct(account.st, dateRange)
updateEndEq(account.st)

As you can see, there are no indicators computed from the actual market data, because the strategy works off of a pre-computed value. The lowest-hanging fruit of applying this methodology, of course, would be to append the VIX index as an indicator for trading strategies on the SPY (a quick sketch of that appears near the end of this post).

And here are the results, trading a unit quantity:

> data.frame(round(t(tradeStats(portfolio.st)[-c(1,2)]),2))
                      HYG
Num.Txns           217.00
Num.Trades         106.00
Net.Trading.PL      36.76
Avg.Trade.PL         0.35
Med.Trade.PL         0.01
Largest.Winner       9.83
Largest.Loser       -2.71
Gross.Profits       67.07
Gross.Losses       -29.87
Std.Dev.Trade.PL     1.67
Percent.Positive    50.00
Percent.Negative    50.00
Profit.Factor        2.25
Avg.Win.Trade        1.27
Med.Win.Trade        0.65
Avg.Losing.Trade    -0.56
Med.Losing.Trade    -0.39
Avg.Daily.PL         0.35
Med.Daily.PL         0.01
Std.Dev.Daily.PL     1.67
Ann.Sharpe           3.33
Max.Drawdown        -7.24
Profit.To.Max.Draw   5.08
Avg.WinLoss.Ratio    2.25
Med.WinLoss.Ratio    1.67
Max.Equity          43.78
Min.Equity          -1.88
End.Equity          36.76

And the corresponding position chart:

Lastly, here are the Vanguard links for VWEHX and VFISX. Apparently, neither charges a redemption fee. I’m not sure whether this means they can be freely traded in a systematic fashion, however.
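
Before concluding, here’s a rough sketch of the earlier remark about appending the VIX being the lowest-hanging fruit of this methodology. The below-20 filter is purely hypothetical and untested; the point is only the mechanics of attaching an off-instrument column:

require(quantmod)
getSymbols("SPY", from="1990-01-01")
getSymbols("^VIX", from="1990-01-01")

# derive some signal from the VIX -- here, an arbitrary, untested "VIX below 20" filter
vixFilter <- Cl(VIX) < 20

# append it as an extra column on SPY, then reference it by name in add.signal,
# exactly as "precomputed_signal" was referenced above
SPY <- merge(SPY, vixFilter, join='inner')
names(SPY)[ncol(SPY)] <- "vix_below_20"

From there, the same sigThreshold and ruleSignal scaffolding used above applies unchanged.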

In conclusion, hopefully this post demonstrated a potentially viable strategy, the importance of understanding the nature of the data you’re working with, and how to pre-compute values in quantstrat.

Thanks for reading.

Note: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.