This post is about attempting to use the depmixS4 package for online state prediction. While the package performs admirably when it comes to describing the states of the past, when used for one-step-ahead prediction, under the assumption that tomorrow’s state will be identical to today’s, the hidden Markov model process found within the package does not perform to expectations.

So, to start off, this post was motivated by Michael Halls-Moore, who recently posted some R code about using the depmixS4 library to fit hidden Markov models. Generally, I am loath to create posts on topics I don’t feel I have an absolutely front-to-back understanding of, but I’m doing this in the hope of learning from others how to appropriately do online state-space prediction, or “regime switching” detection, as it may be called in more financial parlance.

Here’s Dr. Halls-Moore’s post.

While I’ve seen the usual theory of hidden Markov models (that is, it can rain or it can be sunny, but you can only infer the weather from the clothes you see people wearing outside your window when you wake up), and have worked with toy examples in MOOCs (Udacity’s self-driving car course deals with them, if I recall correctly, or maybe it was the AI course), at the end of the day, theory is only as good as how well an implementation can work on real data.
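For reference, the weather-and-clothes toy example can be written out in a few lines of base R. This is only a sketch: the probabilities are numbers I made up, and `forwardFilter` is a hypothetical helper, not something from depmixS4. It computes the filtered probability of each hidden state given only the observations seen so far.

```r
# Toy two-state HMM: hidden weather, observed clothing.
# All probabilities below are made-up illustration values.
states  <- c("sunny", "rainy")
clothes <- c("tshirt", "coat")

init  <- c(sunny = 0.5, rainy = 0.5)         # initial state distribution
trans <- matrix(c(0.8, 0.2,                   # P(next state | sunny)
                  0.4, 0.6),                  # P(next state | rainy)
                nrow = 2, byrow = TRUE, dimnames = list(states, states))
emit  <- matrix(c(0.9, 0.1,                   # P(clothing | sunny)
                  0.2, 0.8),                  # P(clothing | rainy)
                nrow = 2, byrow = TRUE, dimnames = list(states, clothes))

# Forward filter: P(state_t | observations 1..t), using only past data.
forwardFilter <- function(obs, init, trans, emit) {
  filtered <- matrix(NA_real_, nrow = length(obs), ncol = length(init),
                     dimnames = list(NULL, names(init)))
  belief <- init
  for (t in seq_along(obs)) {
    if (t > 1) belief <- as.vector(belief %*% trans)  # predict one step ahead
    belief <- belief * emit[, obs[t]]                  # weight by the evidence
    belief <- belief / sum(belief)                     # normalize to sum to 1
    filtered[t, ] <- belief
  }
  filtered
}

obs <- c("tshirt", "tshirt", "coat", "coat", "tshirt")
filtered <- forwardFilter(obs, init, trans, emit)
print(round(filtered, 3))
```

Each row depends only on the observations up to that day, which is all the information an online predictor ever has.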

For this experiment, I decided to take SPY data since inception, and do a full in-sample “backtest” on the data. That is, given that the HMM algorithm from depmix sees the whole history of returns, with this “god’s eye” view of the data, does the algorithm correctly classify the regimes, if the backtest results are any indication?

Here’s the code to do so, inspired by Dr. Halls-Moore’s.

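    require(depmixS4)
    require(quantmod)
    require(PerformanceAnalytics)  # needed for Return.calculate and the performance tables

    getSymbols('SPY', from = '1990-01-01', src = 'yahoo', adjust = TRUE)
    spyRets <- na.omit(Return.calculate(Ad(SPY)))

    # fit a three-state Gaussian HMM on the full return history
    set.seed(123)
    hmm <- depmix(SPY.Adjusted ~ 1, family = gaussian(), nstates = 3, data = spyRets)
    hmmfit <- fit(hmm, verbose = FALSE)
    post_probs <- posterior(hmmfit)
    post_probs <- xts(post_probs, order.by = index(spyRets))
    plot(post_probs$state)

    # label each state by the sign of its mean return
    summaryMat <- data.frame(summary(hmmfit))
    colnames(summaryMat) <- c("Intercept", "SD")
    bullState <- which(summaryMat$Intercept > 0)
    bearState <- which(summaryMat$Intercept < 0)

    # use %in% rather than ==, since more than one state can share a sign
    decoded <- as.numeric(post_probs$state)
    bull <- xts(as.numeric(decoded %in% bullState), order.by = index(spyRets))
    bear <- xts(as.numeric(decoded %in% bearState), order.by = index(spyRets))

    # long after a bull day, short after a bear day
    hmmRets <- spyRets * lag(bull) - spyRets * lag(bear)
    charts.PerformanceSummary(hmmRets)
    table.AnnualizedReturns(hmmRets)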

While I did select three states, I treated any state with an intercept above zero as a bull state and any state with an intercept below zero as a bear state, so the model effectively reduces to two states.
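As a small illustration of that collapse from three states to two, here is a base R sketch. The intercepts and the decoded state sequence are hypothetical values I typed in by hand, not output from the fitted model.

```r
# Hypothetical fitted state means (intercepts) and SDs for a three-state model.
summaryMat <- data.frame(Intercept = c(0.0009, -0.0021, 0.0004),
                         SD        = c(0.006, 0.024, 0.011))

bullState <- which(summaryMat$Intercept > 0)   # may contain several states
bearState <- which(summaryMat$Intercept < 0)

# Hypothetical decoded state sequence, one entry per day.
decoded <- c(1, 1, 3, 2, 2, 3, 1)

# %in% copes with several states sharing a sign; comparing with == against a
# length-2 vector would recycle it and compare element by element instead.
position <- ifelse(decoded %in% bullState, 1,
                   ifelse(decoded %in% bearState, -1, 0))
print(position)
```

Here states 1 and 3 both have positive means, so both map to a long position, and state 2 maps to a short one.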

With the result:

    > table.AnnualizedReturns(hmmRets)
                              SPY.Adjusted
    Annualized Return               0.1355
    Annualized Std Dev              0.1434
    Annualized Sharpe (Rf=0%)       0.9448
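For anyone unfamiliar with these three figures, here is a base R sketch of how I understand them to be computed for daily data: a geometric annualized return, a standard deviation scaled by the square root of 252, and their ratio. The helper `annualizedStats` is hypothetical, and the 252-day scale and zero risk-free rate are assumptions. The numbers above are at least consistent with the ratio, since 0.1355 / 0.1434 is roughly 0.945.

```r
# Base R sketch of the three summary figures for a vector of daily returns.
# Assumes 252 trading days per year and a zero risk-free rate.
annualizedStats <- function(rets, scale = 252) {
  rets   <- as.numeric(rets)
  n      <- length(rets)
  annRet <- prod(1 + rets)^(scale / n) - 1   # geometric annualized return
  annSd  <- sd(rets) * sqrt(scale)           # annualized standard deviation
  c(AnnualizedReturn = annRet,
    AnnualizedStdDev = annSd,
    AnnualizedSharpe = annRet / annSd)
}

set.seed(42)
fakeRets <- rnorm(252 * 5, mean = 0.0004, sd = 0.01)  # simulated, not SPY
stats <- annualizedStats(fakeRets)
print(round(stats, 4))
```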

So, not particularly terrible. The algorithm works, kind of, sort of, right?

Well, let’s try online prediction now.

    require(doMC)  # the package name is case-sensitive

    # fit the HMM on the first nPoints rows and return the state of the last day
    dailyHMM <- function(data, nPoints) {
      subRets <- data[1:nPoints,]
      hmm <- depmix(SPY.Adjusted ~ 1, family = gaussian(), nstates = 3, data = subRets)
      hmmfit <- fit(hmm, verbose = FALSE)
      post_probs <- posterior(hmmfit)
      summaryMat <- data.frame(summary(hmmfit))
      colnames(summaryMat) <- c("Intercept", "SD")
      bullState <- which(summaryMat$Intercept > 0)
      bearState <- which(summaryMat$Intercept < 0)
      if(last(post_probs$state) %in% bullState) {
        state <- xts(1, order.by = last(index(subRets)))
      } else if (last(post_probs$state) %in% bearState) {
        state <- xts(-1, order.by = last(index(subRets)))
      } else {
        state <- xts(0, order.by = last(index(subRets)))
      }
      colnames(state) <- "State"
      return(state)
    }

    # took 3 hours in parallel
    t1 <- Sys.time()
    set.seed(123)
    registerDoMC((detectCores() - 1))
    states <- foreach(i = 500:nrow(spyRets), .combine = rbind) %dopar% {
      dailyHMM(data = spyRets, nPoints = i)
    }
    t2 <- Sys.time()
    print(t2 - t1)

So what I did here was take an expanding window, starting from 500 days since SPY’s inception, and keep increasing it one day at a time. My prediction was, trivially enough, the state of the most recent day, using a 1 for a bull state and a -1 for a bear state. I ran this process in parallel (on a Linux cluster, because Windows’s doParallel library seems to not even know that certain packages are loaded, and it’s more messy), and the first big issue is that this process took about three hours on seven cores for about 23 years of data. Not exactly encouraging, but computing time isn’t expensive these days.
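Stripped of the HMM and the parallelism, the expanding-window idea looks like the base R sketch below. Both `expandingWindow` and `toyClassifier` are hypothetical stand-ins: the classifier is just the sign of a trailing mean, not a regime model, and the returns are simulated. The point is only the shape of the loop, in which the label for day i is computed from rows 1 through i and nothing later.

```r
# Generic expanding-window loop. classifyLast stands in for dailyHMM:
# it sees only observations 1..i and returns one label for observation i.
expandingWindow <- function(x, start, classifyLast) {
  ends   <- start:length(x)
  labels <- vapply(ends, function(i) classifyLast(x[1:i]), numeric(1))
  data.frame(end = ends, label = labels)
}

# Toy classifier (not an HMM): sign of the trailing 20-observation mean.
toyClassifier <- function(window) sign(mean(tail(window, 20)))

set.seed(1)
fakeRets <- rnorm(600, mean = 0.0003, sd = 0.01)   # simulated returns
states <- expandingWindow(fakeRets, start = 500, classifyLast = toyClassifier)
print(head(states))

# Sanity check for look-ahead: changing data after day 550 must not change
# any label assigned on or before day 550.
altered <- fakeRets
altered[551:600] <- 0.05
states2 <- expandingWindow(altered, start = 500, classifyLast = toyClassifier)
print(identical(states$label[states$end <= 550],
                states2$label[states2$end <= 550]))
```

The cost grows quickly because every step refits on the whole history so far, which is why the real run took hours.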

So let’s see if this process actually works.

First, let’s test if the algorithm does what it’s actually supposed to do and use one day of look-ahead bias (that is, the algorithm tells us the state at the end of the day–how correct is it even for that day?).

    onlineRets <- spyRets * states
    charts.PerformanceSummary(onlineRets)
    table.AnnualizedReturns(onlineRets)

With the result:

    > table.AnnualizedReturns(onlineRets)
                              SPY.Adjusted
    Annualized Return               0.2216
    Annualized Std Dev              0.1934
    Annualized Sharpe (Rf=0%)       1.1456

So, allegedly, the algorithm seems to do what it was designed to do, which is to classify a state for a given data set. Now, the most pertinent question: how well do these predictions do even one day ahead? You’d think that state predictions would be persistent from day to day, given the long history, correct?

    onlineRets <- spyRets * lag(states)
    charts.PerformanceSummary(onlineRets)
    table.AnnualizedReturns(onlineRets)

With the result:

    > table.AnnualizedReturns(onlineRets)
                              SPY.Adjusted
    Annualized Return               0.0172
    Annualized Std Dev              0.1939
    Annualized Sharpe (Rf=0%)       0.0888

That is, without the lookahead bias, the state space prediction algorithm is atrocious. Why is that?
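Before getting to the why, it may help to see how much damage one day of look-ahead can do in general. The base R sketch below uses simulated i.i.d. returns and a deliberately naive end-of-day signal (the sign of that day’s return). It is an extreme toy case, not a model of what the HMM does, but it shows that the same signal can look brilliant on the day it is computed and be worthless one day later.

```r
set.seed(7)
rets <- rnorm(1000, mean = 0, sd = 0.01)   # simulated i.i.d. daily returns

# A deliberately naive end-of-day signal: the sign of that day's return.
signal <- sign(rets)

# One-day lag in base R: day t trades on the signal from day t - 1.
laggedSignal <- c(NA, head(signal, -1))

sameDay <- rets * signal         # look-ahead: uses information from day t
nextDay <- rets * laggedSignal   # tradable: uses information from day t - 1

print(c(sameDay = mean(sameDay),
        nextDay = mean(nextDay, na.rm = TRUE)))
```

With the same-day alignment every return is non-negative by construction, while the lagged version averages out to roughly nothing.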

Well, here’s the plot of the states:

In short, the online HMM procedure built on the depmixS4 package seems to change its mind very easily, with obvious (negative) implications for actual trading strategies.
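One way to put a number on “changes its mind very easily” is to count how often the state series flips. Here is a base R sketch with a hypothetical helper, `flipStats`, run on two hand-made series rather than on the actual output.

```r
# Count how often a +1/-1 state series changes from one day to the next.
flipStats <- function(states) {
  states <- as.numeric(states)
  runs   <- rle(states)
  c(flips        = length(runs$lengths) - 1,   # number of state changes
    meanRunDays  = mean(runs$lengths),          # average regime length in days
    flipFraction = mean(diff(states) != 0))     # share of days with a change
}

stable <- rep(c(1, -1), each = 50)    # one switch in 100 days
choppy <- rep(c(1, -1), times = 50)   # switches every single day
print(rbind(stable = flipStats(stable), choppy = flipStats(choppy)))
```

Running something like `flipStats(states$State)` on the series from the expanding-window run would show how close it sits to the choppy end of that range.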

So, that wraps it up for this post. Essentially, the main message here is this: there’s a vast difference between doing descriptive analysis (AKA “where have you been, why did things happen”) and predictive analysis (that is, “if I correctly predict the future, I get a positive payoff”). In my opinion, while descriptive statistics have their purpose in terms of explaining why a strategy may have performed how it did, ultimately, we’re always looking for better prediction tools. In this case, depmixS4, at least in this “out-of-the-box” demonstration, does not seem to be the tool for that.

If anyone has had success with using depmixS4 (or another regime-switching algorithm in R) for prediction, I would love to see work that details the procedure taken, as it’s an area I’m looking to expand my toolbox into, but I don’t have any particularly good leads. Essentially, I’d like to think of this post as me describing my own experiences with the package.

Thanks for reading.

NOTE: On Oct. 5th, I will be in New York City. On Oct. 6th, I will be presenting at The Trading Show on the Programming Wars panel.

NOTE: My current analytics contract is up for review at the end of the year, so I am officially looking for other offers as well. If you have a full-time role which may benefit from the skills you see on my blog, please get in touch with me. My LinkedIn profile can be found here.