A Review of Gary Antonacci’s Dual Momentum Investing Book

This review is a book review of Gary Antonacci’s Dual Momentum Investing book.

The TL;DR: 4.5 out of 5 stars.

So, I honestly have very little criticism of the book beyond the fact that the book sort of insinuates as though equity momentum is the be-all-end-all of investing, which is why I deduct a fraction of a point.

Now, for the book itself: first off, unlike other quantitative trading books I’ve read (aside from Andreas Clenow’s), the book outlines a very simple to follow strategy, to the point that it has already been replicated over at AllocateSmartly. (Side note: I think Walter’s resource at Allocate Smartly is probably the single best one-stop shop for reading up on any tactical asset allocation strategy, as it’s a compendium of many strategies in the risk/return profile of the 7-15% CAGR type strategies, and even has a correlation matrix between them all.)

Regarding the rest of the content, Antonacci does a very thorough job of stepping readers through the history/rationale of momentum, and not just that, but also addressing the alternatives to his strategy.

While the “why momentum works” aspect you can find in this book and others on the subject (I.E. Alpha Architect’s Quantitative Momentum book), I do like the section on other alternative assets. Namely, the book touches upon the fact that commodities no longer trend, so a lot of CTAs are taking it on the chin, and that historically, fixed income has performed much worse from an absolute return than equities. Furthermore, smart beta isn’t (smart), and many of these factors have very low aggregate returns (if they’re significant at all, I believe Wesley Gray at Alpha Architect has a blog post stating that they aren’t). There are a fair amount of footnotes for those interested in corroborating the assertions. Suffice to say, when it comes to strategies that don’t need daily micromanagement, when it comes to how far you can get without leverage (essentially, anything outside the space of volatility trading strategies), equity momentum is about as good as you get.

Antonacci then introduces his readers to his GEM (Global Equities Momentum) strategy, which can be explained in a few sentences: at the end of each month, calculate the total 12-month return of SPY, EAFE, and BIL. If BIL has the highest return, buy AGG for that month, otherwise buy the asset with the highest return. Repeat. That’s literally it, and the performance characteristics, on a risk-adjusted basis, are superior to just about any equity fund tied down to a tiny tracking error. Essentially, the reason for that is that equity markets have bear markets, and a dual momentum strategy lets you preserve your gains instead of giving it back in market corrections (I.E. 2000-2003, 2008, etc.) while keeping pace during the good times.

Lastly, Antonacci provides some ideas for possibly improving on GEM. I may examine on these in the future. However, the low-hanging fruit for improving on this strategy, in my opinion, is to find some other strategies that diversify its drawdowns, and raise its risk-adjusted return profile. Even if the total return goes down, I believe that an interactive brokers account can offer some amount of leverage (either 50% or 100%) to boost the total returns back up, or combine a more diversified portfolio with a volatility strategy.

Lastly, the appendix includes the original dual momentum paper, and a monthly return table for GEM going back to 1974.

All in all, this book is about as accessible and comprehensive as you can get on a solid strategy that readers actually *can* implement themselves in their brokerage account of choice (please use IB or Robinhood because there’s no point paying $8-$10 per trade if you’re retail). That said, I still think that there are venues in which to travel if you’re looking to improve your overall portfolio with GEM as a foundation.

Thanks for reading.

NOTEL I am always interested in networking and hearing about full-time roles which can benefit from my skill set. My linkedin profile can be found here.

The Problem With Depmix For Online Regime Prediction

This post will be about attempting to use the Depmix package for online state prediction. While the depmix package performs admirably when it comes to describing the states of the past, when used for one-step-ahead prediction, under the assumption that tomorrow’s state will be identical to today’s, the hidden markov model process found within the package does not perform to expectations.

So, to start off, this post was motivated by Michael Halls-Moore, who recently posted some R code about using the depmixS4 library to use hidden markov models. Generally, I am loath to create posts on topics I don’t feel I have an absolutely front-to-back understanding of, but I’m doing this in the hope of learning from others on how to appropriately do online state-space prediction, or “regime switching” detection, as it may be called in more financial parlance.

Here’s Dr. Halls-Moore’s post.

While I’ve seen the usual theory of hidden markov models (that is, it can rain or it can be sunny, but you can only infer the weather judging by the clothes you see people wearing outside your window when you wake up), and have worked with toy examples in MOOCs (Udacity’s self-driving car course deals with them, if I recall correctly–or maybe it was the AI course), at the end of the day, theory is only as good as how well an implementation can work on real data.

For this experiment, I decided to take SPY data since inception, and do a full in-sample “backtest” on the data. That is, given that the HMM algorithm from depmix sees the whole history of returns, with this “god’s eye” view of the data, does the algorithm correctly classify the regimes, if the backtest results are any indication?

Here’s the code to do so, inspired by Dr. Halls-Moore’s.

require(depmixS4)
require(quantmod)
getSymbols('SPY', from = '1990-01-01', src='yahoo', adjust = TRUE)
spyRets <- na.omit(Return.calculate(Ad(SPY)))

set.seed(123)

hmm <- depmix(SPY.Adjusted ~ 1, family = gaussian(), nstates = 3, data=spyRets)
hmmfit <- fit(hmm, verbose = FALSE)
post_probs <- posterior(hmmfit)
post_probs <- xts(post_probs, order.by=index(spyRets))
plot(post_probs$state)
summaryMat <- data.frame(summary(hmmfit))
colnames(summaryMat) <- c("Intercept", "SD")
bullState <- which(summaryMat$Intercept > 0)
bearState <- which(summaryMat$Intercept < 0)

hmmRets <- spyRets * lag(post_probs$state == bullState) - spyRets * lag(post_probs$state == bearState)
charts.PerformanceSummary(hmmRets)
table.AnnualizedReturns(hmmRets)

Essentially, while I did select three states, I noted that anything with an intercept above zero is a bull state, and below zero is a bear state, so essentially, it reduces to two states.

With the result:

table.AnnualizedReturns(hmmRets)
                          SPY.Adjusted
Annualized Return               0.1355
Annualized Std Dev              0.1434
Annualized Sharpe (Rf=0%)       0.9448

So, not particularly terrible. The algorithm works, kind of, sort of, right?

Well, let’s try online prediction now.

require(DoMC)

dailyHMM <- function(data, nPoints) {
  subRets <- data[1:nPoints,]
  hmm <- depmix(SPY.Adjusted ~ 1, family = gaussian(), nstates = 3, data = subRets)
  hmmfit <- fit(hmm, verbose = FALSE)
  post_probs <- posterior(hmmfit)
  summaryMat <- data.frame(summary(hmmfit))
  colnames(summaryMat) <- c("Intercept", "SD")
  bullState <- which(summaryMat$Intercept > 0)
  bearState <- which(summaryMat$Intercept < 0)
  if(last(post_probs$state) %in% bullState) {
    state <- xts(1, order.by=last(index(subRets)))
  } else if (last(post_probs$state) %in% bearState) {
    state <- xts(-1, order.by=last(index(subRets)))
  } else {
    state <- xts(0, order.by=last(index(subRets)))
  }
  colnames(state) <- "State"
  return(state)
}

# took 3 hours in parallel
t1 <- Sys.time()
set.seed(123)
registerDoMC((detectCores() - 1))
states <- foreach(i = 500:nrow(spyRets), .combine=rbind) %dopar% {
  dailyHMM(data = spyRets, nPoints = i)
}
t2 <- Sys.time()
print(t2-t1)

So what I did here was I took an expanding window, starting from 500 days since SPY’s inception, and kept increasing it, by one day at a time. My prediction, was, trivially enough, the most recent day, using a 1 for a bull state, and a -1 for a bear state. I ran this process in parallel (on a linux cluster, because windows’s doParallel library seems to not even know that certain packages are loaded, and it’s more messy), and the first big issue is that this process took about three hours on seven cores for about 23 years of data. Not exactly encouraging, but computing time isn’t expensive these days.

So let’s see if this process actually works.

First, let’s test if the algorithm does what it’s actually supposed to do and use one day of look-ahead bias (that is, the algorithm tells us the state at the end of the day–how correct is it even for that day?).


onlineRets <- spyRets * states 
charts.PerformanceSummary(onlineRets)
table.AnnualizedReturns(onlineRets)

With the result:

> table.AnnualizedReturns(onlineRets)
                          SPY.Adjusted
Annualized Return               0.2216
Annualized Std Dev              0.1934
Annualized Sharpe (Rf=0%)       1.1456

So, allegedly, the algorithm seems to do what it was designed to do, which is to classify a state for a given data set. Now, the most pertinent question: how well do these predictions do even one day ahead? You’d think that state space predictions would be parsimonious from day to day, given the long history, correct?


onlineRets <- spyRets * lag(states)
charts.PerformanceSummary(onlineRets)
table.AnnualizedReturns(onlineRets)

With the result:

> table.AnnualizedReturns(onlineRets)
                          SPY.Adjusted
Annualized Return               0.0172
Annualized Std Dev              0.1939
Annualized Sharpe (Rf=0%)       0.0888

That is, without the lookahead bias, the state space prediction algorithm is atrocious. Why is that?

Well, here’s the plot of the states:

In short, the online hmm algorithm in the depmix package seems to change its mind very easily, with obvious (negative) implications for actual trading strategies.

So, that wraps it up for this post. Essentially, the main message here is this: there’s a vast difference between loading doing descriptive analysis (AKA “where have you been, why did things happen”) vs. predictive analysis (that is, “if I correctly predict the future, I get a positive payoff”). In my opinion, while descriptive statistics have their purpose in terms of explaining why a strategy may have performed how it did, ultimately, we’re always looking for better prediction tools. In this case, depmix, at least in this “out-of-the-box” demonstration does not seem to be the tool for that.

If anyone has had success with using depmix (or other regime-switching algorithm in R) for prediction, I would love to see work that details the procedure taken, as it’s an area I’m looking to expand my toolbox into, but don’t have any particular good leads. Essentially, I’d like to think of this post as me describing my own experiences with the package.

Thanks for reading.

NOTE: On Oct. 5th, I will be in New York City. On Oct. 6th, I will be presenting at The Trading Show on the Programming Wars panel.

NOTE: My current analytics contract is up for review at the end of the year, so I am officially looking for other offers as well. If you have a full-time role which may benefit from the skills you see on my blog, please get in touch with me. My linkedin profile can be found here.

How To Compute Turnover With Return.Portfolio in R

This post will demonstrate how to take into account turnover when dealing with returns-based data using PerformanceAnalytics and the Return.Portfolio function in R. It will demonstrate this on a basic strategy on the nine sector SPDRs.

So, first off, this is in response to a question posed by one Robert Wages on the R-SIG-Finance mailing list. While there are many individuals out there with a plethora of questions (many of which can be found to be demonstrated on this blog already), occasionally, there will be an industry veteran, a PhD statistics student from Stanford, or other very intelligent individual that will ask a question on a topic that I haven’t yet touched on this blog, which will prompt a post to demonstrate another technical aspect found in R. This is one of those times.

So, this demonstration will be about computing turnover in returns space using the PerformanceAnalytics package. Simply, outside of the PortfolioAnalytics package, PerformanceAnalytics with its Return.Portfolio function is the go-to R package for portfolio management simulations, as it can take a set of weights, a set of returns, and generate a set of portfolio returns for analysis with the rest of PerformanceAnalytics’s functions.

Again, the strategy is this: take the 9 three-letter sector SPDRs (since there are four-letter ETFs now), and at the end of every month, if the adjusted price is above its 200-day moving average, invest into it. Normalize across all invested sectors (that is, 1/9th if invested into all 9, 100% into 1 if only 1 invested into, 100% cash, denoted with a zero return vector, if no sectors are invested into). It’s a simple, toy strategy, as the strategy isn’t the point of the demonstration.

Here’s the basic setup code:

require(TTR)
require(PerformanceAnalytics)
require(quantmod)

symbols <- c("XLF", "XLK", "XLU", "XLE", "XLP", "XLF", "XLB", "XLV", "XLY")
getSymbols(symbols, src='yahoo', from = '1990-01-01-01')
prices <- list()
for(i in 1:length(symbols)) {
  tmp <- Ad(get(symbols[[i]]))
  prices[[i]] <- tmp
}
prices <- do.call(cbind, prices)

# Our signal is a simple adjusted price over 200 day SMA
signal <- prices > xts(apply(prices, 2, SMA, n = 200), order.by=index(prices))

# equal weight all assets with price above SMA200
returns <- Return.calculate(prices)
weights <- signal/(rowSums(signal)+1e-16)

# With Return.portfolio, need all weights to sum to 1
weights$zeroes <- 1 - rowSums(weights)
returns$zeroes <- 0

monthlyWeights <- na.omit(weights[endpoints(weights, on = 'months'),])
weights <- na.omit(weights)
returns <- na.omit(returns)

So, get the SPDRs, put them together, compute their returns, generate the signal, and create the zero vector, since Return.Portfolio treats weights less than 1 as a withdrawal, and weights above 1 as the addition of more capital (big FYI here).

Now, here’s how to compute turnover:

out <- Return.portfolio(R = returns, weights = monthlyWeights, verbose = TRUE)
beginWeights <- out$BOP.Weight
endWeights <- out$EOP.Weight
txns <- beginWeights - lag(endWeights)
monthlyTO <- xts(rowSums(abs(txns[,1:9])), order.by=index(txns))
plot(monthlyTO)

So, the trick is this: when you call Return.portfolio, use the verbose = TRUE option. This creates several objects, among them returns, BOP.Weight, and EOP.Weight. These stand for Beginning Of Period Weight, and End Of Period Weight.

The way that turnover is computed is simply the difference between how the day’s return moves the allocated portfolio from its previous ending point to where that portfolio actually stands at the beginning of next period. That is, the end of period weight is the beginning of period drift after taking into account the day’s drift/return for that asset. The new beginning of period weight is the end of period weight plus any transacting that would have been done. Thus, in order to find the actual transactions (or turnover), one subtracts the previous end of period weight from the beginning of period weight.

This is what such transactions look like for this strategy.

Something we can do with such data is take a one-year rolling turnover, accomplished with the following code:

yearlyTO <- runSum(monthlyTO, 252)
plot(yearlyTO, main = "running one year turnover")

It looks like this:

This essentially means that one year’s worth of two-way turnover (that is, if selling an entirely invested portfolio is 100% turnover, and buying an entirely new set of assets is another 100%, then two-way turnover is 200%) is around 800% at maximum. That may be pretty high for some people.

Now, here’s the application when you penalize transaction costs at 20 basis points per percentage point traded (that is, it costs 20 cents to transact $100).

txnCosts <- monthlyTO * -.0020
retsWithTxnCosts <- out$returns + txnCosts
compare <- na.omit(cbind(out$returns, retsWithTxnCosts))
colnames(compare) <- c("NoTxnCosts", "TxnCosts20BPs")
charts.PerformanceSummary(compare)
table.AnnualizedReturns(compare)

And the result:


                          NoTxnCosts TxnCosts20BPs
Annualized Return             0.0587        0.0489
Annualized Std Dev            0.1554        0.1553
Annualized Sharpe (Rf=0%)     0.3781        0.3149

So, at 20 basis points on transaction costs, that takes about one percent in returns per year out of this (admittedly, terrible) strategy. This is far from negligible.

So, that is how you actually compute turnover and transaction costs. In this case, the transaction cost model was very simple. However, given that Return.portfolio returns transactions at the individual asset level, one could get as complex as they would like with modeling the transaction costs.

Thanks for reading.

NOTE: I will be giving a lightning talk at R/Finance, so for those attending, you’ll be able to find me there.

Create Amazing Looking Backtests With This One Wrong–I Mean Weird–Trick! (And Some Troubling Logical Invest Results)

This post will outline an easy-to-make mistake in writing vectorized backtests–namely in using a signal obtained at the end of a period to enter (or exit) a position in that same period. The difference in results one obtains is massive.

Today, I saw two separate posts from Alpha Architect and Mike Harris both referencing a paper by Valeriy Zakamulin on the fact that some previous trend-following research by Glabadanidis was done with shoddy results, and that Glabadanidis’s results were only reproducible through instituting lookahead bias.

The following code shows how to reproduce this lookahead bias.

First, the setup of a basic moving average strategy on the S&P 500 index from as far back as Yahoo data will provide.

require(quantmod)
require(xts)
require(TTR)
require(PerformanceAnalytics)

getSymbols('^GSPC', src='yahoo', from = '1900-01-01')
monthlyGSPC <- Ad(GSPC)[endpoints(GSPC, on = 'months')]

# change this line for signal lookback
movAvg <- SMA(monthlyGSPC, 10)

signal <- monthlyGSPC > movAvg
gspcRets <- Return.calculate(monthlyGSPC)

And here is how to institute the lookahead bias.

lookahead <- signal * gspcRets
correct <- lag(signal) * gspcRets

These are the “results”:

compare <- na.omit(cbind(gspcRets, lookahead, correct))
colnames(compare) <- c("S&P 500", "Lookahead", "Correct")
charts.PerformanceSummary(compare)
rbind(table.AnnualizedReturns(compare), maxDrawdown(compare), CalmarRatio(compare))
logRets <- log(cumprod(1+compare))
chart.TimeSeries(logRets, legend.loc='topleft')

Of course, this equity curve is of no use, so here’s one in log scale.

As can be seen, lookahead bias makes a massive difference.

Here are the numerical results:

                            S&P 500  Lookahead   Correct
Annualized Return         0.0740000 0.15550000 0.0695000
Annualized Std Dev        0.1441000 0.09800000 0.1050000
Annualized Sharpe (Rf=0%) 0.5133000 1.58670000 0.6623000
Worst Drawdown            0.5255586 0.08729914 0.2699789
Calmar Ratio              0.1407286 1.78119192 0.2575219

Again, absolutely ridiculous.

Note that when using Return.Portfolio (the function in PerformanceAnalytics), that package will automatically give you the next period’s return, instead of the current one, for your weights. However, for those writing “simple” backtests that can be quickly done using vectorized operations, an off-by-one error can make all the difference between a backtest in the realm of reasonable, and pure nonsense. However, should one wish to test for said nonsense when faced with impossible-to-replicate results, the mechanics demonstrated above are the way to do it.

Now, onto other news: I’d like to thank Gerald M for staying on top of one of the Logical Invest strategies–namely, their simple global market rotation strategy outlined in an article from an earlier blog post.

Up until March 2015 (the date of the blog post), the strategy had performed well. However, after said date?

It has been a complete disaster, which, in hindsight, was evident when I passed it through the hypothesis-driven development framework process I wrote about earlier.

So, while there has been a great deal written about not simply throwing away a strategy because of short-term underperformance, and that anomalies such as momentum and value exist because of career risk due to said short-term underperformance, it’s never a good thing when a strategy creates historically large losses, particularly after being published in such a humble corner of the quantitative financial world.

In any case, this was a post demonstrating some mechanics, and an update on a strategy I blogged about not too long ago.

Thanks for reading.

NOTE: I am always interested in hearing about new opportunities which may benefit from my expertise, and am always happy to network. You can find my LinkedIn profile here.

A Book Review of ReSolve Asset Management’s Adaptive Asset Allocation

This review will review the “Adaptive Asset Allocation: Dynamic Global Portfolios to Profit in Good Times – and Bad” book by the people at ReSolve Asset Management. Overall, this book is a definite must-read for those who have never been exposed to the ideas within it. However, when it comes to a solution that can be fully replicated, this book is lacking.

Okay, it’s been a while since I reviewed my last book, DIY Financial Advisor, from the awesome people at Alpha Architect. This book in my opinion, is set up in a similar sort of format.

This is the structure of the book, and my reviews along with it:

Part 1: Why in the heck you actually need to have a diversified portfolio, and why a diversified portfolio is a good thing. In a world in which there is so much emphasis put on single-security performance, this is certainly something that absolutely must be stated for those not familiar with portfolio theory. It highlights the example of two people–one from Abbott Labs, and one from Enron, who had so much of their savings concentrated in their company’s stock. Mr. Abbott got hit hard and changed his outlook on how to save for retirement, and Mr. Enron was never heard from again. Long story short: a diversified portfolio is good, and a properly diversified portfolio can offset one asset’s zigs with another asset’s zags. This is the key to establishing a stream of returns that will help meet financial goals. Basically, this is your common sense story (humans love being told stories) so as to motivate you to read the rest of the book. It does its job, though for someone like me, it’s more akin to a big “wait for it, wait for it…and there’s the reason why we should read on, as expected”.

Part 2: Something not often brought up in many corners of the quant world (because it’s real life boring stuff) is the importance not only of average returns, but *when* those returns are achieved. Namely, imagine your everyday saver. At the beginning of their careers, they’re taking home less salary and have less money in their retirement portfolio (or speculation portfolio, but the book uses retirement portfolio). As they get into middle age and closer to retirement, they have a lot more money in said retirement portfolio. Thus, strong returns are most vital when there is more cash available *to* the portfolio, and the difference between mediocre returns at the beginning and strong returns at the end of one’s working life as opposed to vice versa is *astronomical* and cannot be understated. Furthermore, once *in* retirement, strong returns in the early years matter far more than returns in the later years once money has been withdrawn out of the portfolio (though I’d hope that a portfolio’s returns can be so strong that one can simply “live off the interest”). Or, put more intuitively: when you have $10,000 in your portfolio, a 20% drawdown doesn’t exactly hurt because you can make more money and put more into your retirement account. But when you’re 62 and have $500,000 and suddenly lose 30% of everything, well, that’s massive. How much an investor wants to avoid such a scenario cannot be understated. Warren Buffett once said that if you can’t bear to lose 50% of everything, you shouldn’t be in stocks. I really like this part of the book because it shows just how dangerous the ideas of “a 50% drawdown is unavoidable” and other “stay invested for the long haul” refrains are. Essentially, this part of the book makes a resounding statement that any financial adviser keeping his or her clients invested in equities when they’re near retirement age is doing something not very advisable, to put it lightly. In my opinion, those who advise pension funds should especially keep this section of the book in mind, since for some people, the long-term may be coming to an end, and what matters is not only steady returns, but to make sure the strategy doesn’t fall off a cliff and destroy decades of hard-earned savings.

Part 3: This part is also one that is a very important read. First off, it lays out in clear terms that the long-term forward-looking valuations for equities are at rock bottom. That is, the expected forward 15-year returns are very low, using approximately 75 years of evidence. Currently, according to the book, equity valuations imply a *negative* 15-year forward return. However, one thing I *will* take issue with is that while forward-looking long-term returns for equities may be very low, if one believed this chart and only invested in the stock market when forecast 15-year returns were above the long term average, one would have missed out on both the 2003-2007 bull runs, *and* the one since 2009 that’s just about over. So, while the book makes a strong case for caution, readers should also take the chart with a grain of salt in my opinion. However, another aspect of portfolio construction that this book covers is how to construct a robust (assets for any economic environment) and coherent (asset classes balanced in number) universe for implementation with any asset allocation algorithm. I think this bears repeating: universe selection is an extremely important topic in the discussion of asset allocation, yet there is very little discussion about it. Most research/topics simply take some “conventional universe”, such as “all stocks on the NYSE”, or “all the stocks in the S&P 500”, or “the entire set of the 50-60 most liquid futures” without consideration for robustness and coherence. This book is the first source I’ve seen that actually puts this topic under a magnifying glass besides “finger in the air pick and choose”.

Part 4: and here’s where I level my main criticism at this book. For those that have read “Adaptive Asset Allocation: A Primer”, this section of the book is basically one giant copy and paste. It’s all one large buildup to “momentum rank + min-variance optimization”. All well and good, until there’s very little detail beyond the basics as to how the minimum variance portfolio was constructed. Namely, what exactly is the minimum variance algorithm in use? Is it one of the poor variants susceptible to numerical instability inherent in inverting sample covariance matrices? Or is it a heuristic like David Varadi’s minimum variance and minimum correlation algorithm? The one feeling I absolutely could not shake was that this book had a perfect opportunity to present a robust approach to minimum variance, and instead, it’s long on concept, short on details. While the theory of “maximize return for unit risk” is all well and good, the actual algorithm to implement that theory into practice is not trivial, with the solutions taught to undergrads and master’s students having some well-known weaknesses. On top of this, one thing that got hammered into my head in the past was that ranking *also* had a weakness at the inclusion/exclusion point. E.G. if, out of ten assets, the fifth asset had a momentum of say, 10.9%, and the sixth asset had a momentum of 10.8%, how are we so sure the fifth is so much better? And while I realize that this book was ultimately meant to be a primer, in my opinion, it would have been a no-objections five-star if there were an appendix that actually went into some detail on how to go from the simple concepts and included a small numerical example of some algorithms that may address the well-known weaknesses. This doesn’t mean Greek/mathematical jargon. Just an appendix that acknowledged that not every reader is someone only picking up his first or second book about systematic investing, and that some of us are familiar with the “whys” and are more interested in the “hows”. Furthermore, I’d really love to know where the authors of this book got their data to back-date some of these ETFs into the 90s.

Part 5: some more formal research on topics already covered in the rest of the book–namely a section about how many independent bets one can take as the number of assets grow, if I remember it correctly. Long story short? You *easily* get the most bang for your buck among disparate asset classes, such as treasuries of various duration, commodities, developed vs. emerging equities, and so on, as opposed to trying to pick among stocks in the same asset class (though there’s some potential for alpha there…just…a lot less than you imagine). So in case the idea of asset class selection, not stock selection wasn’t beaten into the reader’s head before this point, this part should do the trick. The other research paper is something I briefly skimmed over which went into more depth about volatility and retirement portfolios, though I felt that the book covered this topic earlier on to a sufficient degree by building up the intuition using very understandable scenarios.

So that’s the review of the book. Overall, it’s a very solid piece of writing, and as far as establishing the *why*, it does an absolutely superb job. For those that aren’t familiar with the concepts in this book, this is definitely a must-read, and ASAP.

However, for those familiar with most of the concepts and looking for a detailed “how” procedure, this book does not deliver as much as I would have liked. And I realize that while it’s a bad idea to publish secret sauce, I bought this book in the hope of being exposed to a new algorithm presented in the understandable and intuitive language that the rest of the book was written in, and was left wanting.

Still, that by no means diminishes the impact of the rest of the book. For those who are more likely to be its target audience, it’s a 5/5. For those that wanted some specifics, it still has its gem on universe construction.

Overall, I rate it a 4/5.

Thanks for reading.

A First Attempt At Applying Ensemble Filters

This post will outline a first failed attempt at applying the ensemble filter methodology to try and come up with a weighting process on SPY that should theoretically be a gradual process to shift from conviction between a bull market, a bear market, and anywhere in between. This is a follow-up post to this blog post.

So, my thinking went like this: in a bull market, as one transitions from responsiveness to smoothness, responsive filters should be higher than smooth filters, and vice versa, as there’s generally a trade-off between the two. In fact, in my particular formulation, the quantity of the square root of the EMA of squared returns punishes any deviation from a flat line altogether (although inspired by Basel’s measure of volatility, which is the square root of the 18-day EMA of squared returns), while the responsiveness quantity punishes any deviation from the time series of the realized prices. Whether these are the two best measures of smoothness and responsiveness is a topic I’d certainly appreciate feedback on.

In any case, an idea I had on the top of my head was that in addition to having a way of weighing multiple filters by their responsiveness (deviation from price action) and smoothness (deviation from a flat line), that by taking the sums of the sign of the difference between one filter and its neighbor on the responsiveness to smoothness spectrum, provided enough ensemble filters (say, 101, so there are 100 differences), one would obtain a way to move from full conviction of a bull market, to a bear market, to anything in between, and have this be a smooth process that doesn’t have schizophrenic swings of conviction.

Here’s the code to do this on SPY from inception to 2003:

require(TTR)
require(quantmod)
require(PerformanceAnalytics)

getSymbols('SPY', from = '1990-01-01')

smas <- list()
for(i in 2:250) {
  smas[[i]] <- SMA(Ad(SPY), n = i)
}
smas <- do.call(cbind, smas)

xtsApply <- function(x, FUN, n, ...) {
  out <- xts(apply(x, 2, FUN, n = n, ...), order.by=index(x))
  return(out)
}

sumIsNa <- function(x){
  return(sum(is.na(x)))
}

ensembleFilter <- function(data, filters, n = 20, conviction = 1, emphasisSmooth = .51) {
  
  # smoothness error
  filtRets <- Return.calculate(filters)
  sqFiltRets <- filtRets * filtRets * 100 #multiply by 100 to prevent instability
  smoothnessError <- sqrt(xtsApply(sqFiltRets, EMA, n = n))
  
  # responsiveness error
  repX <- xts(matrix(data, nrow = nrow(filters), ncol=ncol(filters)), 
              order.by = index(filters))
  dataFilterReturns <- repX/filters - 1
  sqDataFilterQuotient <- dataFilterReturns * dataFilterReturns * 100 #multiply by 100 to prevent instability
  responseError <- sqrt(xtsApply(sqDataFilterQuotient, EMA, n = n))
  
  # place smoothness and responsiveness errors on same notional quantities
  meanSmoothError <- rowMeans(smoothnessError)
  meanResponseError <- rowMeans(responseError)
  ratio <- meanSmoothError/meanResponseError
  ratio <- xts(matrix(ratio, nrow=nrow(filters), ncol=ncol(filters)),
               order.by=index(filters))
  responseError <- responseError * ratio
  
  # for each term in emphasisSmooth, create a separate filter
  ensembleFilters <- list()
  for(term in emphasisSmooth) {
    
    # compute total errors, raise them to a conviction power, find the normalized inverse
    totalError <- smoothnessError * term + responseError * (1-term)
    totalError <- totalError ^ conviction
    invTotalError <- 1/totalError
    normInvError <- invTotalError/rowSums(invTotalError)
    
    # ensemble filter is the sum of candidate filters in proportion
    # to the inverse of their total error
    tmp <- xts(rowSums(filters * normInvError), order.by=index(data))
    
    #NA out time in which one or more filters were NA
    initialNAs <- apply(filters, 1, sumIsNa) 
    tmp[initialNAs > 0] <- NA
    tmpName <- paste("emphasisSmooth", term, sep="_")
    colnames(tmp) <- tmpName
    ensembleFilters[[tmpName]] <- tmp
  }
  
  # compile the filters
  out <- do.call(cbind, ensembleFilters)
  return(out)
}

t1 <- Sys.time()
filts <- ensembleFilter(Ad(SPY), smas, n = 20, conviction = 2, emphasisSmooth = seq(0, 1, by=.01))
t2 <- Sys.time()

par(mfrow=c(3,1))
filtDiffs <- sign(filts[,1:100] - filts[,2:101])
sumDiffs <- xts(rowSums(filtDiffs), order.by=index(filtDiffs))

plot(Ad(SPY)["::2003"])
plot(sumDiffs["::2003"])
plot(diff(sumDiffs["::2003"]))

And here’s the very underwhelming result:

Essentially, while I expected to see changes in conviction of maybe 20 at most, instead, my indicator of sum of sign differences did exactly as I had hoped it wouldn’t, which is to be a very binary sort of mechanic. My intuition was that between an “obvious bull market” and an “obvious bear market” that some differences would be positive, some negative, and that they’d net each other out, and the conviction would be zero. Furthermore, that while any individual crossover is binary, all one hundred signs being either positive or negative would be a more gradual process. Apparently, this was not the case. To continue this train of thought later, one thing to try would be an all-pairs sign difference. Certainly, I don’t feel like giving up on this idea at this point, and, as usual, feedback would always be appreciated.

Thanks for reading.

NOTE: I am currently consulting in an analytics capacity in downtown Chicago. However, I am also looking for collaborators that wish to pursue interesting trading ideas. If you feel my skills may be of help to you, let’s talk. You can email me at ilya.kipnis@gmail.com, or find me on my LinkedIn here.

How well can you scale your strategy?

This post will deal with a quick, finger in the air way of seeing how well a strategy scales–namely, how sensitive it is to latency between signal and execution, using a simple volatility trading strategy as an example. The signal will be the VIX/VXV ratio trading VXX and XIV, an idea I got from Volatility Made Simple’s amazing blog, particularly this post. The three signals compared will be the “magical thinking” signal (observe the close, buy the close, named from the ruleOrderProc setting in quantstrat), buy on next-day open, and buy on next-day close.

Let’s get started.

require(downloader)
require(PerformanceAnalytics)
require(IKTrading)
require(TTR)

download("http://www.cboe.com/publish/scheduledtask/mktdata/datahouse/vxvdailyprices.csv", 
         destfile="vxvData.csv")
download("https://dl.dropboxusercontent.com/s/jk6der1s5lxtcfy/XIVlong.TXT",
         destfile="longXIV.txt")
download("https://dl.dropboxusercontent.com/s/950x55x7jtm9x2q/VXXlong.TXT", 
         destfile="longVXX.txt") #requires downloader package
getSymbols('^VIX', from = '1990-01-01')


xiv <- xts(read.zoo("longXIV.txt", format="%Y-%m-%d", sep=",", header=TRUE))
vxx <- xts(read.zoo("longVXX.txt", format="%Y-%m-%d", sep=",", header=TRUE))
vxv <- xts(read.zoo("vxvData.csv", header=TRUE, sep=",", format="%m/%d/%Y", skip=2))
vixVxv <- Cl(VIX)/Cl(vxv)


xiv <- xts(read.zoo("longXIV.txt", format="%Y-%m-%d", sep=",", header=TRUE))
vxx <- xts(read.zoo("longVXX.txt", format="%Y-%m-%d", sep=",", header=TRUE))

vxxCloseRets <- Return.calculate(Cl(vxx))
vxxOpenRets <- Return.calculate(Op(vxx))
xivCloseRets <- Return.calculate(Cl(xiv))
xivOpenRets <- Return.calculate(Op(xiv))

vxxSig <- vixVxv > 1
xivSig <- 1-vxxSig

magicThinking <- vxxCloseRets * lag(vxxSig) + xivCloseRets * lag(xivSig)
nextOpen <- vxxOpenRets * lag(vxxSig, 2) + xivOpenRets * lag(xivSig, 2)
nextClose <- vxxCloseRets * lag(vxxSig, 2) + xivCloseRets * lag(xivSig, 2)
tradeWholeDay <- (nextOpen + nextClose)/2

compare <- na.omit(cbind(magicThinking, nextOpen, nextClose, tradeWholeDay))
colnames(compare) <- c("Magic Thinking", "Next Open", 
                       "Next Close", "Execute Through Next Day")
charts.PerformanceSummary(compare)
rbind(table.AnnualizedReturns(compare), 
      maxDrawdown(compare), CalmarRatio(compare))

par(mfrow=c(1,1))
chart.TimeSeries(log(cumprod(1+compare), base = 10), legend.loc='topleft', ylab='log base 10 of additional equity',
                 main = 'VIX vx. VXV different execution times')

So here’s the run-through. In addition to the magical thinking strategy (observe the close, buy that same close), I tested three other variants–a variant which transacts the next open, a variant which transacts the next close, and the average of those two. Effectively, I feel these three could give a sense of a strategy’s performance under more realistic conditions–that is, how well does the strategy perform if transacted throughout the day, assuming you’re managing a sum of money too large to just plow into the market in the closing minutes (and if you hope to get rich off of trading, you will have a larger sum of money than the amount you can apply magical thinking to). Ideally, I’d use VWAP pricing, but as that’s not available for free anywhere I know of, that means that readers can’t replicate it even if I had such data.

In any case, here are the results.

Equity curves:

Log scale (for Mr. Tony Cooper and others):

Stats:

                          Magic Thinking Next Open Next Close Execute Through Next Day
Annualized Return               0.814100 0.8922000  0.5932000                 0.821900
Annualized Std Dev              0.622800 0.6533000  0.6226000                 0.558100
Annualized Sharpe (Rf=0%)       1.307100 1.3656000  0.9529000                 1.472600
Worst Drawdown                  0.566122 0.5635336  0.6442294                 0.601014
Calmar Ratio                    1.437989 1.5831686  0.9208586                 1.367510

My reaction? The execute on next day’s close performance being vastly lower than the other configurations (and that deterioration occurring in the most recent years) essentially means that the fills will have to come pretty quickly at the beginning of the day. While the strategy seems somewhat scalable through the lens of this finger-in-the-air technique, in my opinion, if the first full day of possible execution after signal reception will tank a strategy from a 1.44 Calmar to a .92, that’s a massive drop-off, after holding everything else constant. In my opinion, I think this is quite a valid question to ask anyone who simply sells signals, as opposed to manages assets. Namely, how sensitive are the signals to execution on the next day? After all, unless those signals come at 3:55 PM, one is most likely going to be getting filled the next day.

Now, while this strategy is a bit of a tomato can in terms of how good volatility trading strategies can get (they can get a *lot* better in my opinion), I think it made for a simple little demonstration of this technique. Again, a huge thank you to Mr. Helmuth Vollmeier for so kindly keeping up his dropbox all this time for the volatility data!

Thanks for reading.

NOTE: I am currently contracting in a data science capacity in Chicago. You can email me at ilya.kipnis@gmail.com, or find me on my LinkedIn here. I’m always open to beers after work if you’re in the Chicago area.

NOTE 2: Today, on October 21, 2015, if you’re in Chicago, there’s a Chicago R Users Group conference at Jaks Tap at 6:00 PM. Free pizza, networking, and R, hosted by Paul Teetor, who’s a finance guy. Hope to see you there.