An Out of Sample Update on DDN’s Volatility Momentum Trading Strategy and Beta Convexity

The first part of this post is a quick update on Tony Cooper’s of Double Digit Numerics’s volatility ETN momentum strategy from the volatility made simple blog (which has stopped updating as of a year and a half ago). The second part will cover Dr. Jonathan Kinlay’s Beta Convexity concept.

So, now that I have the ability to generate a term structure and constant expiry contracts, I decided to revisit some of the strategies on Volatility Made Simple and see if any of them are any good (long story short: all of the publicly detailed ones aren’t so hot besides mine–they either have a massive drawdown in-sample around the time of the crisis, or a massive drawdown out-of-sample).

Why this strategy? Because it seemed different from most of the usual term structure ratio trades (of which mine is an example), so I thought I’d check out how it did since its first publishing date, and because it’s rather easy to understand.

Here’s the strategy:

Take XIV, VXX, ZIV, VXZ, and SHY (this last one as the “risk free” asset), and at the close, invest in whichever has had the highest 83 day momentum (this was the result of optimization done on volatilityMadeSimple).

Here’s the code to do this in R, using the Quandl EOD database. There are two variants tested–observe the close, buy the close (AKA magical thinking), and observe the close, buy tomorrow’s close.

require(quantmod)
require(PerformanceAnalytics)
require(TTR)
require(Quandl)

Quandl.api_key("yourKeyHere")

symbols <- c("XIV", "VXX", "ZIV", "VXZ", "SHY")

prices <- list()
for(i in 1:length(symbols)) {
  price <- Quandl(paste0("EOD/", symbols[i]), start_date="1990-12-31", type = "xts")$Adj_Close
  colnames(price) <- symbols[i]
  prices[[i]] <- price
}
prices <- na.omit(do.call(cbind, prices))
returns <- na.omit(Return.calculate(prices))

# find highest asset, assign column names
topAsset <- function(row, assetNames) {
  out <- row==max(row, na.rm = TRUE)
  names(out) <- assetNames
  out <- data.frame(out)
  return(out)
}

# compute momentum
momentums <- na.omit(xts(apply(prices, 2, ROC, n = 83), order.by=index(prices)))

# find highest asset each day, turn it into an xts
highestMom <- apply(momentums, 1, topAsset, assetNames = colnames(momentums))
highestMom <- xts(t(do.call(cbind, highestMom)), order.by=index(momentums))

# observe today's close, buy tomorrow's close
buyTomorrow <- na.omit(xts(rowSums(returns * lag(highestMom, 2)), order.by=index(highestMom)))

# observe today's close, buy today's close (aka magic thinking)
magicThinking <- na.omit(xts(rowSums(returns * lag(highestMom)), order.by=index(highestMom)))

out <- na.omit(cbind(buyTomorrow, magicThinking))
colnames(out) <- c("buyTomorrow", "magicalThinking")

# results
charts.PerformanceSummary(out['2014-04-11::'], legend.loc = 'top')
rbind(table.AnnualizedReturns(out['2014-04-11::']), maxDrawdown(out['2014-04-11::']))

Pretty simple.

Here are the results.

capture

> rbind(table.AnnualizedReturns(out['2014-04-11::']), maxDrawdown(out['2014-04-11::']))
                          buyTomorrow magicalThinking
Annualized Return          -0.0320000       0.0378000
Annualized Std Dev          0.5853000       0.5854000
Annualized Sharpe (Rf=0%)  -0.0547000       0.0646000
Worst Drawdown              0.8166912       0.7761655

Looks like this strategy didn’t pan out too well. Just a daily reminder that if you’re using fine grid-search to select a particularly good parameter (EG n = 83 days? Maybe 4 21-day trading months, but even that would have been n = 82), you’re asking for a visit from, in the words of Mr. Tony Cooper, a visit from the grim reaper.

****

Moving onto another topic, whenever Dr. Jonathan Kinlay posts something that I think I can replicate that I’d be very wise to do so, as he is a very skilled and experienced practitioner (and also includes me on his blogroll).

A topic that Dr. Kinlay covered is the idea of beta convexity–namely, that an asset’s beta to a benchmark may be different when the benchmark is up as compared to when it’s down. Essentially, it’s the idea that we want to weed out firms that are what I’d deem as “losers in disguise”–I.E. those that act fine when times are good (which is when we really don’t care about diversification, since everything is going up anyway), but do nothing during bad times.

The beta convexity is calculated quite simply: it’s the beta of an asset to a benchmark when the benchmark has a positive return, minus the beta of an asset to a benchmark when the benchmark has a negative return, then squaring the difference. That is, (beta_bench_positive – beta_bench_negative) ^ 2.

Here’s some R code to demonstrate this, using IBM vs. the S&P 500 since 1995.

ibm <- Quandl("EOD/IBM", start_date="1995-01-01", type = "xts")
ibmRets <- Return.calculate(ibm$Adj_Close)

spy <- Quandl("EOD/SPY", start_date="1995-01-01", type = "xts")
spyRets <- Return.calculate(spy$Adj_Close)

rets <- na.omit(cbind(ibmRets, spyRets))
colnames(rets) <- c("IBM", "SPY")

betaConvexity <- function(Ra, Rb) {
  positiveBench <- Rb[Rb > 0]
  assetPositiveBench <- Ra[index(positiveBench)]
  positiveBeta <- CAPM.beta(Ra = assetPositiveBench, Rb = positiveBench)
  
  negativeBench <- Rb[Rb < 0]
  assetNegativeBench <- Ra[index(negativeBench)]
  negativeBeta <- CAPM.beta(Ra = assetNegativeBench, Rb = negativeBench)
  
  out <- (positiveBeta - negativeBeta) ^ 2
  return(out)
}

betaConvexity(rets$IBM, rets$SPY)

For the result:

> betaConvexity(rets$IBM, rets$SPY)
[1] 0.004136034

Thanks for reading.

NOTE: I am always looking to network, and am currently actively looking for full-time opportunities which may benefit from my skill set. If you have a position which may benefit from my skills, do not hesitate to reach out to me. My LinkedIn profile can be found here.

Testing the Hierarchical Risk Parity algorithm

This post will be a modified backtest of the Adaptive Asset Allocation backtest from AllocateSmartly, using the Hierarchical Risk Parity algorithm from last post, because Adam Butler was eager to see my results. On a whole, as Adam Butler had told me he had seen, HRP does not generate outperformance when applied to a small, carefully-constructed, diversified-by-selection universe of asset classes, as opposed to a universe of hundreds or even several thousand assets, where its theoretically superior properties result in it being a superior algorithm.

First off, I would like to thank one Matthew Barry, for helping me modify my HRP algorithm so as to not use the global environment for recursion. You can find his github here.

Here is the modified HRP code.

covMat <- read.csv('cov.csv', header = FALSE)
corMat <- read.csv('corMat.csv', header = FALSE)

clustOrder <- hclust(dist(corMat), method = 'single')$order

getIVP <- function(covMat) {
  invDiag <- 1/diag(as.matrix(covMat))
  weights <- invDiag/sum(invDiag)
  return(weights)
}

getClusterVar <- function(covMat, cItems) {
  covMatSlice <- covMat[cItems, cItems]
  weights <- getIVP(covMatSlice)
  cVar <- t(weights) %*% as.matrix(covMatSlice) %*% weights
  return(cVar)
}

getRecBipart <- function(covMat, sortIx) {
  w <- rep(1,ncol(covMat))
  w <- recurFun(w, covMat, sortIx)
  return(w)
}

recurFun <- function(w, covMat, sortIx) {
  subIdx <- 1:trunc(length(sortIx)/2)
  cItems0 <- sortIx[subIdx]
  cItems1 <- sortIx[-subIdx]
  cVar0 <- getClusterVar(covMat, cItems0)
  cVar1 <- getClusterVar(covMat, cItems1)
  alpha <- 1 - cVar0/(cVar0 + cVar1)
  
  # scoping mechanics using w as a free parameter
  w[cItems0] <- w[cItems0] * alpha
  w[cItems1] <- w[cItems1] * (1-alpha)
  
  if(length(cItems0) > 1) {
    w <- recurFun(w, covMat, cItems0)
  }
  if(length(cItems1) > 1) {
    w <- recurFun(w, covMat, cItems1)
  }
  return(w)
}


out <- getRecBipart(covMat, clustOrder)
out

With covMat and corMat being from the last post. In fact, this function can be further modified by encapsulating the clustering order within the getRecBipart function, but in the interest of keeping the code as similar to Marcos Lopez de Prado’s code as I could, I’ll leave this here.

Anyhow, the backtest will follow. One thing I will mention is that I’m using Quandl’s EOD database, as Yahoo has really screwed up their financial database (I.E. some sector SPDRs have broken data, dividends not adjusted, etc.). While this database is a $50/month subscription, I believe free users can access it up to 150 times in 60 days, so that should be enough to run backtests from this blog, so long as you save your downloaded time series for later use by using write.zoo.

This code needs the tseries library for the portfolio.optim function for the minimum variance portfolio (Dr. Kris Boudt has a course on this at datacamp), and the other standard packages.

A helper function for this backtest (and really, any other momentum rotation backtest) is the appendMissingAssets function, which simply adds on assets not selected to the final weighting and re-orders the weights by the original ordering.

require(tseries)
require(PerformanceAnalytics)
require(quantmod)
require(Quandl)

Quandl.api_key("YOUR_AUTHENTICATION_HERE") # not displaying my own api key, sorry 😦

# function to append missing (I.E. assets not selected) asset names and sort into original order
appendMissingAssets <- function(wts, allAssetNames, wtsDate) {
  absentAssets <- allAssetNames[!allAssetNames %in% names(wts)]
  absentWts <- rep(0, length(absentAssets))
  names(absentWts) <- absentAssets
  wts <- c(wts, absentWts)
  wts <- xts(t(wts), order.by=wtsDate)
  wts <- wts[,allAssetNames]
  return(wts)
}

Next, we make the call to Quandl to get our data.

symbols <- c("SPY", "VGK",	"EWJ",	"EEM",	"VNQ",	"RWX",	"IEF",	"TLT",	"DBC",	"GLD")	

rets <- list()
for(i in 1:length(symbols)) {
  
  # quandl command to download from EOD database. Free users should use write.zoo in this loop.
  
  returns <- Return.calculate(Quandl(paste0("EOD/", symbols[i]), start_date="1990-12-31", type = "xts")$Adj_Close)
  colnames(returns) <- symbols[i]
  rets[[i]] <- returns
}
rets <- na.omit(do.call(cbind, rets))

While Josh Ulrich fixed quantmod to actually get Yahoo data after Yahoo broke the API, the problem is that the Yahoo data is now garbage as well, and I’m not sure how much Josh Ulrich can do about that. I really hope some other provider can step up and provide free, usable EOD data so that I don’t have to worry about readers not being able to replicate the backtest, as my policy for this blog is that readers should be able to replicate the backtests so they don’t just nod and take my word for it. If you are or know of such a provider, please leave a comment so that I can let the blog readers know all about you.

Next, we initialize the settings for the backtest.

invVolWts <- list()
minVolWts <- list()
hrpWts <- list()
ep <- endpoints(rets, on =  "months")
nMonths = 6 # month lookback (6 as per parameters from allocateSmartly)
nVol = 20 # day lookback for volatility (20 ibid)

While the AAA backtest actually uses a 126 day lookback instead of a 6 month lookback, as it trades at the end of every month, that’s effectively a 6 month lookback, give or take a few days out of 126, but the code is less complex this way.

Next, we have our actual backtest.

for(i in 1:(length(ep)-nMonths)) {
  
  # get returns subset and compute absolute momentum
  retSubset <- rets[c(ep[i]:ep[(i+nMonths)]),]
  retSubset <- retSubset[-1,]
  moms <- Return.cumulative(retSubset)
  
  # select top performing assets and subset returns for them
  highRankAssets <- rank(moms) >= 6 # top 5 assets
  posReturnAssets <- moms > 0 # positive momentum assets
  selectedAssets <- highRankAssets & posReturnAssets # intersection of the above
  selectedSubset <- retSubset[,selectedAssets] # subset returns slice
  
  if(sum(selectedAssets)==0) { # if no qualifying assets, zero weight for period
    
    wts <- xts(t(rep(0, ncol(retSubset))), order.by=last(index(retSubset)))
    colnames(wts) <- colnames(retSubset)
    invVolWts[[i]] <- minVolWts[[i]] <- hrpWts[[i]] <- wts
    
  } else if (sum(selectedAssets)==1) { # if one qualifying asset, invest fully into it
    
    wts <- xts(t(rep(0, ncol(retSubset))), order.by=last(index(retSubset)))
    colnames(wts) <- colnames(retSubset)
    wts[, which(selectedAssets==1)] <- 1
    invVolWts[[i]] <- minVolWts[[i]] <- hrpWts[[i]] <- wts
    
  } else { # otherwise, use weighting algorithms
    
    cors <- cor(selectedSubset) # correlation
    volSubset <- tail(selectedSubset, nVol) # 20 day volatility
    vols <- StdDev(volSubset)
    covs <- t(vols) %*% vols * cors
    
    # minimum volatility using portfolio.optim from tseries
    minVolRets <- t(matrix(rep(1, sum(selectedAssets))))
    minVolWt <- portfolio.optim(x=minVolRets, covmat = covs)$pw
    names(minVolWt) <- colnames(covs)
    minVolWt <- appendMissingAssets(minVolWt, colnames(retSubset), last(index(retSubset)))
    minVolWts[[i]] <- minVolWt
    
    # inverse volatility weights
    invVols <- 1/vols 
    invVolWt <- invVols/sum(invVols) 
    invNames <- colnames(invVolWt)
    invVolWt <- as.numeric(invVolWt) 
    names(invVolWt) <- invNames
    invVolWt <- appendMissingAssets(invVolWt, colnames(retSubset), last(index(retSubset)))
    invVolWts[[i]] <- invVolWt
    
    # hrp weights
    clustOrder <- hclust(dist(cors), method = 'single')$order
    hrpWt <- getRecBipart(covs, clustOrder)
    names(hrpWt) <- colnames(covs)
    hrpWt <- appendMissingAssets(hrpWt, colnames(retSubset), last(index(retSubset)))
    hrpWts[[i]] <- hrpWt
  }
}

In a few sentences, this is what happens:

The algorithm takes a subset of the returns (the past six months at every month), and computes absolute momentum. It then ranks the ten absolute momentum calculations, and selects the intersection of the top 5, and those with a return greater than zero (so, a dual momentum calculation).

If no assets qualify, the algorithm invests in nothing. If there’s only one asset that qualifies, the algorithm invests in that one asset. If there are two or more qualifying assets, the algorithm computes a covariance matrix using 20 day volatility multiplied with a 126 day correlation matrix (that is, sd_20′ %*% sd_20 * (elementwise) cor_126. It then computes normalized inverse volatility weights using the volatility from the past 20 days, a minimum variance portfolio with the portfolio.optim function, and lastly, the hierarchical risk parity weights using the HRP code above from Marcos Lopez de Prado’s paper.

Lastly, the program puts together all of the weights, and adds a cash investment for any period without any investments.

invVolWts <- round(do.call(rbind, invVolWts), 3) # round for readability
minVolWts <- round(do.call(rbind, minVolWts), 3)
hrpWts <- round(do.call(rbind, hrpWts), 3)

# allocate to cash if no allocation made due to all negative momentum assets
invVolWts$cash <- 0; invVolWts$cash <- 1-rowSums(invVolWts)
hrpWts$cash <- 0; hrpWts$cash <- 1-rowSums(hrpWts)
minVolWts$cash <- 0; minVolWts$cash <- 1-rowSums(minVolWts)

# cash value will be zero
rets$cash <- 0

# compute backtest returns
invVolRets <- Return.portfolio(R = rets, weights = invVolWts)
minVolRets <- Return.portfolio(R = rets, weights = minVolWts)
hrpRets <- Return.portfolio(R = rets, weights = hrpWts)

Here are the results:

compare <- cbind(invVolRets, minVolRets, hrpRets)
colnames(compare) <- c("invVol", "minVol", "HRP")
charts.PerformanceSummary(compare)
rbind(table.AnnualizedReturns(compare), maxDrawdown(compare), CalmarRatio(compare))  
                             invVol    minVol       HRP
Annualized Return         0.0872000 0.0724000 0.0792000
Annualized Std Dev        0.1208000 0.1025000 0.1136000
Annualized Sharpe (Rf=0%) 0.7221000 0.7067000 0.6968000
Worst Drawdown            0.1548801 0.1411368 0.1593287
Calmar Ratio              0.5629882 0.5131956 0.4968234

In short, in the context of a small, carefully-selected and allegedly diversified (I’ll let Adam Butler speak for that one) universe dominated by the process of which assets to invest in as opposed to how much, the theoretical upsides of an algorithm which simultaneously exploits a covariance structure without needing to invert a covariance matrix can be lost.

However, this test (albeit from 2007 onwards, thanks to ETF inception dates combined with lookback burn-in) confirms what Adam Butler himself told me, which is that HRP hasn’t impressed him, and from this backtest, I can see why. However, in the context of dual momentum rank selection, I’m not convinced that any weighting scheme will realize much better performance than any other.

Thanks for reading.

NOTE: I am always interested in networking and hearing about full-time opportunities related to my skill set. My linkedIn profile can be found here.

A Review of Gary Antonacci’s Dual Momentum Investing Book

This review is a book review of Gary Antonacci’s Dual Momentum Investing book.

The TL;DR: 4.5 out of 5 stars.

So, I honestly have very little criticism of the book beyond the fact that the book sort of insinuates as though equity momentum is the be-all-end-all of investing, which is why I deduct a fraction of a point.

Now, for the book itself: first off, unlike other quantitative trading books I’ve read (aside from Andreas Clenow’s), the book outlines a very simple to follow strategy, to the point that it has already been replicated over at AllocateSmartly. (Side note: I think Walter’s resource at Allocate Smartly is probably the single best one-stop shop for reading up on any tactical asset allocation strategy, as it’s a compendium of many strategies in the risk/return profile of the 7-15% CAGR type strategies, and even has a correlation matrix between them all.)

Regarding the rest of the content, Antonacci does a very thorough job of stepping readers through the history/rationale of momentum, and not just that, but also addressing the alternatives to his strategy.

While the “why momentum works” aspect you can find in this book and others on the subject (I.E. Alpha Architect’s Quantitative Momentum book), I do like the section on other alternative assets. Namely, the book touches upon the fact that commodities no longer trend, so a lot of CTAs are taking it on the chin, and that historically, fixed income has performed much worse from an absolute return than equities. Furthermore, smart beta isn’t (smart), and many of these factors have very low aggregate returns (if they’re significant at all, I believe Wesley Gray at Alpha Architect has a blog post stating that they aren’t). There are a fair amount of footnotes for those interested in corroborating the assertions. Suffice to say, when it comes to strategies that don’t need daily micromanagement, when it comes to how far you can get without leverage (essentially, anything outside the space of volatility trading strategies), equity momentum is about as good as you get.

Antonacci then introduces his readers to his GEM (Global Equities Momentum) strategy, which can be explained in a few sentences: at the end of each month, calculate the total 12-month return of SPY, EAFE, and BIL. If BIL has the highest return, buy AGG for that month, otherwise buy the asset with the highest return. Repeat. That’s literally it, and the performance characteristics, on a risk-adjusted basis, are superior to just about any equity fund tied down to a tiny tracking error. Essentially, the reason for that is that equity markets have bear markets, and a dual momentum strategy lets you preserve your gains instead of giving it back in market corrections (I.E. 2000-2003, 2008, etc.) while keeping pace during the good times.

Lastly, Antonacci provides some ideas for possibly improving on GEM. I may examine on these in the future. However, the low-hanging fruit for improving on this strategy, in my opinion, is to find some other strategies that diversify its drawdowns, and raise its risk-adjusted return profile. Even if the total return goes down, I believe that an interactive brokers account can offer some amount of leverage (either 50% or 100%) to boost the total returns back up, or combine a more diversified portfolio with a volatility strategy.

Lastly, the appendix includes the original dual momentum paper, and a monthly return table for GEM going back to 1974.

All in all, this book is about as accessible and comprehensive as you can get on a solid strategy that readers actually *can* implement themselves in their brokerage account of choice (please use IB or Robinhood because there’s no point paying $8-$10 per trade if you’re retail). That said, I still think that there are venues in which to travel if you’re looking to improve your overall portfolio with GEM as a foundation.

Thanks for reading.

NOTEL I am always interested in networking and hearing about full-time roles which can benefit from my skill set. My linkedin profile can be found here.

The Problem With Depmix For Online Regime Prediction

This post will be about attempting to use the Depmix package for online state prediction. While the depmix package performs admirably when it comes to describing the states of the past, when used for one-step-ahead prediction, under the assumption that tomorrow’s state will be identical to today’s, the hidden markov model process found within the package does not perform to expectations.

So, to start off, this post was motivated by Michael Halls-Moore, who recently posted some R code about using the depmixS4 library to use hidden markov models. Generally, I am loath to create posts on topics I don’t feel I have an absolutely front-to-back understanding of, but I’m doing this in the hope of learning from others on how to appropriately do online state-space prediction, or “regime switching” detection, as it may be called in more financial parlance.

Here’s Dr. Halls-Moore’s post.

While I’ve seen the usual theory of hidden markov models (that is, it can rain or it can be sunny, but you can only infer the weather judging by the clothes you see people wearing outside your window when you wake up), and have worked with toy examples in MOOCs (Udacity’s self-driving car course deals with them, if I recall correctly–or maybe it was the AI course), at the end of the day, theory is only as good as how well an implementation can work on real data.

For this experiment, I decided to take SPY data since inception, and do a full in-sample “backtest” on the data. That is, given that the HMM algorithm from depmix sees the whole history of returns, with this “god’s eye” view of the data, does the algorithm correctly classify the regimes, if the backtest results are any indication?

Here’s the code to do so, inspired by Dr. Halls-Moore’s.

require(depmixS4)
require(quantmod)
getSymbols('SPY', from = '1990-01-01', src='yahoo', adjust = TRUE)
spyRets <- na.omit(Return.calculate(Ad(SPY)))

set.seed(123)

hmm <- depmix(SPY.Adjusted ~ 1, family = gaussian(), nstates = 3, data=spyRets)
hmmfit <- fit(hmm, verbose = FALSE)
post_probs <- posterior(hmmfit)
post_probs <- xts(post_probs, order.by=index(spyRets))
plot(post_probs$state)
summaryMat <- data.frame(summary(hmmfit))
colnames(summaryMat) <- c("Intercept", "SD")
bullState <- which(summaryMat$Intercept > 0)
bearState <- which(summaryMat$Intercept < 0)

hmmRets <- spyRets * lag(post_probs$state == bullState) - spyRets * lag(post_probs$state == bearState)
charts.PerformanceSummary(hmmRets)
table.AnnualizedReturns(hmmRets)

Essentially, while I did select three states, I noted that anything with an intercept above zero is a bull state, and below zero is a bear state, so essentially, it reduces to two states.

With the result:

table.AnnualizedReturns(hmmRets)
                          SPY.Adjusted
Annualized Return               0.1355
Annualized Std Dev              0.1434
Annualized Sharpe (Rf=0%)       0.9448

So, not particularly terrible. The algorithm works, kind of, sort of, right?

Well, let’s try online prediction now.

require(DoMC)

dailyHMM <- function(data, nPoints) {
  subRets <- data[1:nPoints,]
  hmm <- depmix(SPY.Adjusted ~ 1, family = gaussian(), nstates = 3, data = subRets)
  hmmfit <- fit(hmm, verbose = FALSE)
  post_probs <- posterior(hmmfit)
  summaryMat <- data.frame(summary(hmmfit))
  colnames(summaryMat) <- c("Intercept", "SD")
  bullState <- which(summaryMat$Intercept > 0)
  bearState <- which(summaryMat$Intercept < 0)
  if(last(post_probs$state) %in% bullState) {
    state <- xts(1, order.by=last(index(subRets)))
  } else if (last(post_probs$state) %in% bearState) {
    state <- xts(-1, order.by=last(index(subRets)))
  } else {
    state <- xts(0, order.by=last(index(subRets)))
  }
  colnames(state) <- "State"
  return(state)
}

# took 3 hours in parallel
t1 <- Sys.time()
set.seed(123)
registerDoMC((detectCores() - 1))
states <- foreach(i = 500:nrow(spyRets), .combine=rbind) %dopar% {
  dailyHMM(data = spyRets, nPoints = i)
}
t2 <- Sys.time()
print(t2-t1)

So what I did here was I took an expanding window, starting from 500 days since SPY’s inception, and kept increasing it, by one day at a time. My prediction, was, trivially enough, the most recent day, using a 1 for a bull state, and a -1 for a bear state. I ran this process in parallel (on a linux cluster, because windows’s doParallel library seems to not even know that certain packages are loaded, and it’s more messy), and the first big issue is that this process took about three hours on seven cores for about 23 years of data. Not exactly encouraging, but computing time isn’t expensive these days.

So let’s see if this process actually works.

First, let’s test if the algorithm does what it’s actually supposed to do and use one day of look-ahead bias (that is, the algorithm tells us the state at the end of the day–how correct is it even for that day?).


onlineRets <- spyRets * states 
charts.PerformanceSummary(onlineRets)
table.AnnualizedReturns(onlineRets)

With the result:

> table.AnnualizedReturns(onlineRets)
                          SPY.Adjusted
Annualized Return               0.2216
Annualized Std Dev              0.1934
Annualized Sharpe (Rf=0%)       1.1456

So, allegedly, the algorithm seems to do what it was designed to do, which is to classify a state for a given data set. Now, the most pertinent question: how well do these predictions do even one day ahead? You’d think that state space predictions would be parsimonious from day to day, given the long history, correct?


onlineRets <- spyRets * lag(states)
charts.PerformanceSummary(onlineRets)
table.AnnualizedReturns(onlineRets)

With the result:

> table.AnnualizedReturns(onlineRets)
                          SPY.Adjusted
Annualized Return               0.0172
Annualized Std Dev              0.1939
Annualized Sharpe (Rf=0%)       0.0888

That is, without the lookahead bias, the state space prediction algorithm is atrocious. Why is that?

Well, here’s the plot of the states:

In short, the online hmm algorithm in the depmix package seems to change its mind very easily, with obvious (negative) implications for actual trading strategies.

So, that wraps it up for this post. Essentially, the main message here is this: there’s a vast difference between loading doing descriptive analysis (AKA “where have you been, why did things happen”) vs. predictive analysis (that is, “if I correctly predict the future, I get a positive payoff”). In my opinion, while descriptive statistics have their purpose in terms of explaining why a strategy may have performed how it did, ultimately, we’re always looking for better prediction tools. In this case, depmix, at least in this “out-of-the-box” demonstration does not seem to be the tool for that.

If anyone has had success with using depmix (or other regime-switching algorithm in R) for prediction, I would love to see work that details the procedure taken, as it’s an area I’m looking to expand my toolbox into, but don’t have any particular good leads. Essentially, I’d like to think of this post as me describing my own experiences with the package.

Thanks for reading.

NOTE: On Oct. 5th, I will be in New York City. On Oct. 6th, I will be presenting at The Trading Show on the Programming Wars panel.

NOTE: My current analytics contract is up for review at the end of the year, so I am officially looking for other offers as well. If you have a full-time role which may benefit from the skills you see on my blog, please get in touch with me. My linkedin profile can be found here.

How To Compute Turnover With Return.Portfolio in R

This post will demonstrate how to take into account turnover when dealing with returns-based data using PerformanceAnalytics and the Return.Portfolio function in R. It will demonstrate this on a basic strategy on the nine sector SPDRs.

So, first off, this is in response to a question posed by one Robert Wages on the R-SIG-Finance mailing list. While there are many individuals out there with a plethora of questions (many of which can be found to be demonstrated on this blog already), occasionally, there will be an industry veteran, a PhD statistics student from Stanford, or other very intelligent individual that will ask a question on a topic that I haven’t yet touched on this blog, which will prompt a post to demonstrate another technical aspect found in R. This is one of those times.

So, this demonstration will be about computing turnover in returns space using the PerformanceAnalytics package. Simply, outside of the PortfolioAnalytics package, PerformanceAnalytics with its Return.Portfolio function is the go-to R package for portfolio management simulations, as it can take a set of weights, a set of returns, and generate a set of portfolio returns for analysis with the rest of PerformanceAnalytics’s functions.

Again, the strategy is this: take the 9 three-letter sector SPDRs (since there are four-letter ETFs now), and at the end of every month, if the adjusted price is above its 200-day moving average, invest into it. Normalize across all invested sectors (that is, 1/9th if invested into all 9, 100% into 1 if only 1 invested into, 100% cash, denoted with a zero return vector, if no sectors are invested into). It’s a simple, toy strategy, as the strategy isn’t the point of the demonstration.

Here’s the basic setup code:

require(TTR)
require(PerformanceAnalytics)
require(quantmod)

symbols <- c("XLF", "XLK", "XLU", "XLE", "XLP", "XLF", "XLB", "XLV", "XLY")
getSymbols(symbols, src='yahoo', from = '1990-01-01-01')
prices <- list()
for(i in 1:length(symbols)) {
  tmp <- Ad(get(symbols[[i]]))
  prices[[i]] <- tmp
}
prices <- do.call(cbind, prices)

# Our signal is a simple adjusted price over 200 day SMA
signal <- prices > xts(apply(prices, 2, SMA, n = 200), order.by=index(prices))

# equal weight all assets with price above SMA200
returns <- Return.calculate(prices)
weights <- signal/(rowSums(signal)+1e-16)

# With Return.portfolio, need all weights to sum to 1
weights$zeroes <- 1 - rowSums(weights)
returns$zeroes <- 0

monthlyWeights <- na.omit(weights[endpoints(weights, on = 'months'),])
weights <- na.omit(weights)
returns <- na.omit(returns)

So, get the SPDRs, put them together, compute their returns, generate the signal, and create the zero vector, since Return.Portfolio treats weights less than 1 as a withdrawal, and weights above 1 as the addition of more capital (big FYI here).

Now, here’s how to compute turnover:

out <- Return.portfolio(R = returns, weights = monthlyWeights, verbose = TRUE)
beginWeights <- out$BOP.Weight
endWeights <- out$EOP.Weight
txns <- beginWeights - lag(endWeights)
monthlyTO <- xts(rowSums(abs(txns[,1:9])), order.by=index(txns))
plot(monthlyTO)

So, the trick is this: when you call Return.portfolio, use the verbose = TRUE option. This creates several objects, among them returns, BOP.Weight, and EOP.Weight. These stand for Beginning Of Period Weight, and End Of Period Weight.

The way that turnover is computed is simply the difference between how the day’s return moves the allocated portfolio from its previous ending point to where that portfolio actually stands at the beginning of next period. That is, the end of period weight is the beginning of period drift after taking into account the day’s drift/return for that asset. The new beginning of period weight is the end of period weight plus any transacting that would have been done. Thus, in order to find the actual transactions (or turnover), one subtracts the previous end of period weight from the beginning of period weight.

This is what such transactions look like for this strategy.

Something we can do with such data is take a one-year rolling turnover, accomplished with the following code:

yearlyTO <- runSum(monthlyTO, 252)
plot(yearlyTO, main = "running one year turnover")

It looks like this:

This essentially means that one year’s worth of two-way turnover (that is, if selling an entirely invested portfolio is 100% turnover, and buying an entirely new set of assets is another 100%, then two-way turnover is 200%) is around 800% at maximum. That may be pretty high for some people.

Now, here’s the application when you penalize transaction costs at 20 basis points per percentage point traded (that is, it costs 20 cents to transact $100).

txnCosts <- monthlyTO * -.0020
retsWithTxnCosts <- out$returns + txnCosts
compare <- na.omit(cbind(out$returns, retsWithTxnCosts))
colnames(compare) <- c("NoTxnCosts", "TxnCosts20BPs")
charts.PerformanceSummary(compare)
table.AnnualizedReturns(compare)

And the result:


                          NoTxnCosts TxnCosts20BPs
Annualized Return             0.0587        0.0489
Annualized Std Dev            0.1554        0.1553
Annualized Sharpe (Rf=0%)     0.3781        0.3149

So, at 20 basis points on transaction costs, that takes about one percent in returns per year out of this (admittedly, terrible) strategy. This is far from negligible.

So, that is how you actually compute turnover and transaction costs. In this case, the transaction cost model was very simple. However, given that Return.portfolio returns transactions at the individual asset level, one could get as complex as they would like with modeling the transaction costs.

Thanks for reading.

NOTE: I will be giving a lightning talk at R/Finance, so for those attending, you’ll be able to find me there.

Create Amazing Looking Backtests With This One Wrong–I Mean Weird–Trick! (And Some Troubling Logical Invest Results)

This post will outline an easy-to-make mistake in writing vectorized backtests–namely in using a signal obtained at the end of a period to enter (or exit) a position in that same period. The difference in results one obtains is massive.

Today, I saw two separate posts from Alpha Architect and Mike Harris both referencing a paper by Valeriy Zakamulin on the fact that some previous trend-following research by Glabadanidis was done with shoddy results, and that Glabadanidis’s results were only reproducible through instituting lookahead bias.

The following code shows how to reproduce this lookahead bias.

First, the setup of a basic moving average strategy on the S&P 500 index from as far back as Yahoo data will provide.

require(quantmod)
require(xts)
require(TTR)
require(PerformanceAnalytics)

getSymbols('^GSPC', src='yahoo', from = '1900-01-01')
monthlyGSPC <- Ad(GSPC)[endpoints(GSPC, on = 'months')]

# change this line for signal lookback
movAvg <- SMA(monthlyGSPC, 10)

signal <- monthlyGSPC > movAvg
gspcRets <- Return.calculate(monthlyGSPC)

And here is how to institute the lookahead bias.

lookahead <- signal * gspcRets
correct <- lag(signal) * gspcRets

These are the “results”:

compare <- na.omit(cbind(gspcRets, lookahead, correct))
colnames(compare) <- c("S&P 500", "Lookahead", "Correct")
charts.PerformanceSummary(compare)
rbind(table.AnnualizedReturns(compare), maxDrawdown(compare), CalmarRatio(compare))
logRets <- log(cumprod(1+compare))
chart.TimeSeries(logRets, legend.loc='topleft')

Of course, this equity curve is of no use, so here’s one in log scale.

As can be seen, lookahead bias makes a massive difference.

Here are the numerical results:

                            S&P 500  Lookahead   Correct
Annualized Return         0.0740000 0.15550000 0.0695000
Annualized Std Dev        0.1441000 0.09800000 0.1050000
Annualized Sharpe (Rf=0%) 0.5133000 1.58670000 0.6623000
Worst Drawdown            0.5255586 0.08729914 0.2699789
Calmar Ratio              0.1407286 1.78119192 0.2575219

Again, absolutely ridiculous.

Note that when using Return.Portfolio (the function in PerformanceAnalytics), that package will automatically give you the next period’s return, instead of the current one, for your weights. However, for those writing “simple” backtests that can be quickly done using vectorized operations, an off-by-one error can make all the difference between a backtest in the realm of reasonable, and pure nonsense. However, should one wish to test for said nonsense when faced with impossible-to-replicate results, the mechanics demonstrated above are the way to do it.

Now, onto other news: I’d like to thank Gerald M for staying on top of one of the Logical Invest strategies–namely, their simple global market rotation strategy outlined in an article from an earlier blog post.

Up until March 2015 (the date of the blog post), the strategy had performed well. However, after said date?

It has been a complete disaster, which, in hindsight, was evident when I passed it through the hypothesis-driven development framework process I wrote about earlier.

So, while there has been a great deal written about not simply throwing away a strategy because of short-term underperformance, and that anomalies such as momentum and value exist because of career risk due to said short-term underperformance, it’s never a good thing when a strategy creates historically large losses, particularly after being published in such a humble corner of the quantitative financial world.

In any case, this was a post demonstrating some mechanics, and an update on a strategy I blogged about not too long ago.

Thanks for reading.

NOTE: I am always interested in hearing about new opportunities which may benefit from my expertise, and am always happy to network. You can find my LinkedIn profile here.

A Book Review of ReSolve Asset Management’s Adaptive Asset Allocation

This review will review the “Adaptive Asset Allocation: Dynamic Global Portfolios to Profit in Good Times – and Bad” book by the people at ReSolve Asset Management. Overall, this book is a definite must-read for those who have never been exposed to the ideas within it. However, when it comes to a solution that can be fully replicated, this book is lacking.

Okay, it’s been a while since I reviewed my last book, DIY Financial Advisor, from the awesome people at Alpha Architect. This book in my opinion, is set up in a similar sort of format.

This is the structure of the book, and my reviews along with it:

Part 1: Why in the heck you actually need to have a diversified portfolio, and why a diversified portfolio is a good thing. In a world in which there is so much emphasis put on single-security performance, this is certainly something that absolutely must be stated for those not familiar with portfolio theory. It highlights the example of two people–one from Abbott Labs, and one from Enron, who had so much of their savings concentrated in their company’s stock. Mr. Abbott got hit hard and changed his outlook on how to save for retirement, and Mr. Enron was never heard from again. Long story short: a diversified portfolio is good, and a properly diversified portfolio can offset one asset’s zigs with another asset’s zags. This is the key to establishing a stream of returns that will help meet financial goals. Basically, this is your common sense story (humans love being told stories) so as to motivate you to read the rest of the book. It does its job, though for someone like me, it’s more akin to a big “wait for it, wait for it…and there’s the reason why we should read on, as expected”.

Part 2: Something not often brought up in many corners of the quant world (because it’s real life boring stuff) is the importance not only of average returns, but *when* those returns are achieved. Namely, imagine your everyday saver. At the beginning of their careers, they’re taking home less salary and have less money in their retirement portfolio (or speculation portfolio, but the book uses retirement portfolio). As they get into middle age and closer to retirement, they have a lot more money in said retirement portfolio. Thus, strong returns are most vital when there is more cash available *to* the portfolio, and the difference between mediocre returns at the beginning and strong returns at the end of one’s working life as opposed to vice versa is *astronomical* and cannot be understated. Furthermore, once *in* retirement, strong returns in the early years matter far more than returns in the later years once money has been withdrawn out of the portfolio (though I’d hope that a portfolio’s returns can be so strong that one can simply “live off the interest”). Or, put more intuitively: when you have $10,000 in your portfolio, a 20% drawdown doesn’t exactly hurt because you can make more money and put more into your retirement account. But when you’re 62 and have $500,000 and suddenly lose 30% of everything, well, that’s massive. How much an investor wants to avoid such a scenario cannot be understated. Warren Buffett once said that if you can’t bear to lose 50% of everything, you shouldn’t be in stocks. I really like this part of the book because it shows just how dangerous the ideas of “a 50% drawdown is unavoidable” and other “stay invested for the long haul” refrains are. Essentially, this part of the book makes a resounding statement that any financial adviser keeping his or her clients invested in equities when they’re near retirement age is doing something not very advisable, to put it lightly. In my opinion, those who advise pension funds should especially keep this section of the book in mind, since for some people, the long-term may be coming to an end, and what matters is not only steady returns, but to make sure the strategy doesn’t fall off a cliff and destroy decades of hard-earned savings.

Part 3: This part is also one that is a very important read. First off, it lays out in clear terms that the long-term forward-looking valuations for equities are at rock bottom. That is, the expected forward 15-year returns are very low, using approximately 75 years of evidence. Currently, according to the book, equity valuations imply a *negative* 15-year forward return. However, one thing I *will* take issue with is that while forward-looking long-term returns for equities may be very low, if one believed this chart and only invested in the stock market when forecast 15-year returns were above the long term average, one would have missed out on both the 2003-2007 bull runs, *and* the one since 2009 that’s just about over. So, while the book makes a strong case for caution, readers should also take the chart with a grain of salt in my opinion. However, another aspect of portfolio construction that this book covers is how to construct a robust (assets for any economic environment) and coherent (asset classes balanced in number) universe for implementation with any asset allocation algorithm. I think this bears repeating: universe selection is an extremely important topic in the discussion of asset allocation, yet there is very little discussion about it. Most research/topics simply take some “conventional universe”, such as “all stocks on the NYSE”, or “all the stocks in the S&P 500”, or “the entire set of the 50-60 most liquid futures” without consideration for robustness and coherence. This book is the first source I’ve seen that actually puts this topic under a magnifying glass besides “finger in the air pick and choose”.

Part 4: and here’s where I level my main criticism at this book. For those that have read “Adaptive Asset Allocation: A Primer”, this section of the book is basically one giant copy and paste. It’s all one large buildup to “momentum rank + min-variance optimization”. All well and good, until there’s very little detail beyond the basics as to how the minimum variance portfolio was constructed. Namely, what exactly is the minimum variance algorithm in use? Is it one of the poor variants susceptible to numerical instability inherent in inverting sample covariance matrices? Or is it a heuristic like David Varadi’s minimum variance and minimum correlation algorithm? The one feeling I absolutely could not shake was that this book had a perfect opportunity to present a robust approach to minimum variance, and instead, it’s long on concept, short on details. While the theory of “maximize return for unit risk” is all well and good, the actual algorithm to implement that theory into practice is not trivial, with the solutions taught to undergrads and master’s students having some well-known weaknesses. On top of this, one thing that got hammered into my head in the past was that ranking *also* had a weakness at the inclusion/exclusion point. E.G. if, out of ten assets, the fifth asset had a momentum of say, 10.9%, and the sixth asset had a momentum of 10.8%, how are we so sure the fifth is so much better? And while I realize that this book was ultimately meant to be a primer, in my opinion, it would have been a no-objections five-star if there were an appendix that actually went into some detail on how to go from the simple concepts and included a small numerical example of some algorithms that may address the well-known weaknesses. This doesn’t mean Greek/mathematical jargon. Just an appendix that acknowledged that not every reader is someone only picking up his first or second book about systematic investing, and that some of us are familiar with the “whys” and are more interested in the “hows”. Furthermore, I’d really love to know where the authors of this book got their data to back-date some of these ETFs into the 90s.

Part 5: some more formal research on topics already covered in the rest of the book–namely a section about how many independent bets one can take as the number of assets grow, if I remember it correctly. Long story short? You *easily* get the most bang for your buck among disparate asset classes, such as treasuries of various duration, commodities, developed vs. emerging equities, and so on, as opposed to trying to pick among stocks in the same asset class (though there’s some potential for alpha there…just…a lot less than you imagine). So in case the idea of asset class selection, not stock selection wasn’t beaten into the reader’s head before this point, this part should do the trick. The other research paper is something I briefly skimmed over which went into more depth about volatility and retirement portfolios, though I felt that the book covered this topic earlier on to a sufficient degree by building up the intuition using very understandable scenarios.

So that’s the review of the book. Overall, it’s a very solid piece of writing, and as far as establishing the *why*, it does an absolutely superb job. For those that aren’t familiar with the concepts in this book, this is definitely a must-read, and ASAP.

However, for those familiar with most of the concepts and looking for a detailed “how” procedure, this book does not deliver as much as I would have liked. And I realize that while it’s a bad idea to publish secret sauce, I bought this book in the hope of being exposed to a new algorithm presented in the understandable and intuitive language that the rest of the book was written in, and was left wanting.

Still, that by no means diminishes the impact of the rest of the book. For those who are more likely to be its target audience, it’s a 5/5. For those that wanted some specifics, it still has its gem on universe construction.

Overall, I rate it a 4/5.

Thanks for reading.