The Downside of Rankings-Based Strategies

This post will demonstrate a downside of rankings-based strategies, particularly when using data of questionable quality (which, unless one pays multiple thousands of dollars per month for data, one’s data most likely is). Essentially, one small change to the way the strategy filters introduces a massive deterioration in performance, particularly in drawdown terms. This exercise demonstrates another possible way of throwing a curveball at ranking strategies to test for robustness.

So, before starting this post, I have a data request to make:

Does anyone have a working full history of the VEQTOR index? Here’s its prospectus. The mechanics of it aren’t particularly complicated, but as it uses futures data, I can’t rebuild it myself, and I’d be exceptionally grateful if someone could construct it and share its full history.

In any case, a discussion recently came up among myself, Terry Doherty, Cliff Smith, and some others on Seeking Alpha regarding what happened when I substituted a 63-day SMA for the three-month SMA in Cliff Smith’s QTS strategy (quarterly tactical strategy…strategy).

Essentially, by simply substituting a 63-day SMA (that is, using daily data instead of monthly) for a 3-month SMA, the results were drastically affected.

Here’s the new QTS code, now in a function.

qts <- function(prices, nShort = 20, nLong = 105, nMonthSMA = 3, nDaySMA = 63, wRankShort=1, wRankLong=1.01, 
                movAvgType = c("monthly", "daily"), cashAsset="VUSTX", returnNames = FALSE) {
  cashCol <- grep(cashAsset, colnames(prices))
  
  #start our data off on the security with the least data (VGSIX in this case)
  prices <- prices[!is.na(prices[,7]),] 
  
  #cash is not a formal asset in our ranking
  cashPrices <- prices[, cashCol]
  prices <- prices[, -cashCol]
  
  #compute momentums
  rocShort <- prices/lag(prices, nShort) - 1
  rocLong <- prices/lag(prices, nLong) - 1
  
  #take the endpoints of quarter start/end
  quarterlyEps <- endpoints(prices, on="quarters")
  monthlyEps <- endpoints(prices, on = "months")
  
  #take the prices at quarterly endpoints
  quarterlyPrices <- prices[quarterlyEps,]
  
  #short momentum at quarterly endpoints (20 day)
  rocShortQtrs <- rocShort[quarterlyEps,]
  
  #long momentum at quarterly endpoints (105 day)
  rocLongQtrs <- rocLong[quarterlyEps,]
  
  #rank short momentum, best highest rank
  rocSrank <- t(apply(rocShortQtrs, 1, rank))
  
  #rank long momentum, best highest rank
  rocLrank <- t(apply(rocLongQtrs, 1, rank))
  
  #total rank, long slightly higher than short, sum them
  totalRank <- wRankLong * rocLrank + wRankShort * rocSrank 
  
  #function that takes 100% position in highest ranked security
  maxRank <- function(rankRow) {
    return(rankRow==max(rankRow))
  }
  
  #apply above function to our quarterly ranks every quarter
  rankPos <- t(apply(totalRank, 1, maxRank))
  
  #SMA of securities, only use monthly endpoints
  #subset to quarters
  #then filter
  movAvgType = movAvgType[1]
  if(movAvgType=="monthly") {
    monthlyPrices <- prices[monthlyEps,]
    monthlySMAs <- xts(apply(monthlyPrices, 2, SMA, n=nMonthSMA), order.by=index(monthlyPrices))
    quarterlySMAs <- monthlySMAs[index(quarterlyPrices),]
    smaFilter <- quarterlyPrices > quarterlySMAs
  } else if (movAvgType=="daily") {
    smas <- xts(apply(prices, 2, SMA, n=nDaySMA), order.by=index(prices))
    quarterlySMAs <- smas[index(quarterlyPrices),]
    smaFilter <- quarterlyPrices > quarterlySMAs
  } else {
    stop("invalid moving average type")
  }
  
  finalPos <- rankPos*smaFilter
  finalPos <- finalPos[!is.na(rocLongQtrs[,1]),]
  cash <- xts(1-rowSums(finalPos), order.by=index(finalPos))
  finalPos <- merge(finalPos, cash, join='inner')
  
  prices <- merge(prices, cashPrices, join='inner')
  returns <- Return.calculate(prices)
  stratRets <- Return.portfolio(returns, finalPos)
  
  if(returnNames) {
    findNames <- function(pos) {
      return(names(pos[pos==1]))
    }
    tmp <- apply(finalPos, 1, findNames)
    assetNames <- xts(tmp, order.by=as.Date(names(tmp)))
    return(list(assetNames, stratRets))
  }
  return(stratRets)
}

The one change I made is this:

  movAvgType = movAvgType[1]
  if(movAvgType=="monthly") {
    monthlyPrices <- prices[monthlyEps,]
    monthlySMAs <- xts(apply(monthlyPrices, 2, SMA, n=nMonthSMA), order.by=index(monthlyPrices))
    quarterlySMAs <- monthlySMAs[index(quarterlyPrices),]
    smaFilter <- quarterlyPrices > quarterlySMAs
  } else if (movAvgType=="daily") {
    smas <- xts(apply(prices, 2, SMA, n=nDaySMA), order.by=index(prices))
    quarterlySMAs <- smas[index(quarterlyPrices),]
    smaFilter <- quarterlyPrices > quarterlySMAs
  } else {
    stop("invalid moving average type")
  }

In essence, it allows the function to use either a monthly-calculated moving average or a daily one, which is then subset to the quarterly frequency of the rest of the data.

(I also allow the function to return the names of the selected securities.)

So now we can do two tests:

1) The initial parameter settings (20-day short-term momentum, 105-day long-term momentum, equal-weight their ranks (tiebreaker to the long-term), and use a 3-month SMA to filter)
2) The same exact parameter settings, except a 63-day SMA for the filter.

Here’s the code to do that.

#get our data from yahoo, use adjusted prices
symbols <- c("NAESX", #small cap
             "PREMX", #emerging bond
             "VEIEX", #emerging markets
             "VFICX", #intermediate investment grade
             "VFIIX", #GNMA mortgage
             "VFINX", #S&P 500 index
             "VGSIX", #MSCI REIT
             "VGTSX", #total intl stock idx
             "VUSTX") #long term treasury (cash)

getSymbols(symbols, from="1990-01-01")
prices <- list()
for(i in 1:length(symbols)) {
  prices[[i]] <- Ad(get(symbols[i]))  
}
prices <- do.call(cbind, prices)
colnames(prices) <- gsub("\\.[A-z]*", "", colnames(prices))

monthlySMAqts <- qts(prices, returnNames=TRUE)
dailySMAqts <- qts(prices, wRankShort=.95, wRankLong=1.05, movAvgType = "daily", returnNames=TRUE)

retsComparison <- cbind(monthlySMAqts[[2]], dailySMAqts[[2]])
colnames(retsComparison) <- c("monthly SMA qts", "daily SMA qts")
retsComparison <- retsComparison["2003::"]
charts.PerformanceSummary(retsComparison)
rbind(table.AnnualizedReturns(retsComparison), maxDrawdown(retsComparison))

And here are the results:

Statistics:

                          monthly SMA qts daily SMA qts
Annualized Return               0.2745000     0.2114000
Annualized Std Dev              0.1725000     0.1914000
Annualized Sharpe (Rf=0%)       1.5915000     1.1043000
Worst Drawdown                  0.1911616     0.3328411

With the corresponding equity curves:

Here are the instances in which the selections differ thanks to the filters:

selectedNames <- cbind(monthlySMAqts[[1]], dailySMAqts[[1]])
colnames(selectedNames) <- c("Monthly SMA Filter", "Daily SMA Filter")
differentSelections <- selectedNames[selectedNames[,1]!=selectedNames[,2],]
differentSelections

With the results:

           Monthly SMA Filter Daily SMA Filter
1997-03-31 "VGSIX"            "cash"          
2007-12-31 "cash"             "PREMX"         
2008-06-30 "cash"             "VFIIX"         
2008-12-31 "cash"             "NAESX"         
2011-06-30 "cash"             "NAESX"  

Now, of course, many can argue that Yahoo’s data is junk, that my backtest doesn’t reflect reality, and so on, but that would miss the point: this data, while not a perfect realization of reality, may as well have been valid (much as academics use various simulation techniques to synthesize more data or explore other scenarios). All I did here was change the filter to something logically comparable (that is, computing the moving average on a different time scale, which does not in any way change the investment logic). From 2003 onward, this change affected the strategy in only four places. However, those four instances were enough to create some noticeable changes (for the worse) in the strategy’s performance. Essentially, this is the downside of rankings-based strategies: when the number of selected instruments is small (in this case, ONE!), a few small changes in parameters, data, and so on can lead to drastically different results.

As I write this, Cliff Smith already has ideas as to how to counteract this phenomenon. However, in my experience, once a strategy starts getting into “how do we smooth out that one bump on the equity curve” territory, I think it’s time to go back and re-examine the strategy altogether. While the idea of momentum is, of course, sound, with a great deal of literature devoted to it, the idea of selecting just one instrument at a time as the be-all-end-all does not sit well with me. Nevertheless, QTS presents an interesting framework for analyzing small subgroups of securities, and for serving as one layer of an overarching strategy framework in which the return streams are sub-strategies rather than raw instruments.

Thanks for reading.

NOTE: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

The Logical-Invest “Universal Investment Strategy”–A Walk Forward Process on SPY and TLT

I’m sure we’ve all heard about diversified stock and bond portfolios. In its simplest, most diluted form, such a portfolio can be comprised of the SPY and TLT ETFs. The concept, introduced by Logical Invest in a Seeking Alpha article written by Frank Grossman (also see the link here), essentially uses a walk-forward methodology of maximizing a modified Sharpe ratio that is biased heavily toward penalizing volatility rather than rewarding returns. That is, over a 72-day moving window, it maximizes total return across different weighting configurations of a SPY-TLT mix, divided by the standard deviation raised to the power of 5/2. To put that exponent into perspective: at a power of 1, this is the basic Sharpe ratio, and at a power of 0, just a momentum-maximization algorithm.

The process for this strategy is simple: every month, rebalance to whichever multiple-of-5% SPY/TLT weighting previously maximized the following quantity (returns/vol^2.5 over a 72-day window).
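For clarity, here is that quantity written out (my notation, not Logical Invest’s; $R$ is the annualized return of a given weighting configuration over the trailing 72 days, and $\sigma$ its annualized standard deviation over the same window):

\[ \text{modified Sharpe} = \frac{R}{\sigma^{5/2}} \]

At an exponent of 1, this reduces to the usual Sharpe ratio (with a zero risk-free rate), and at an exponent of 0, to pure return maximization.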

Here’s the code for obtaining the data and computing the necessary quantities:

require(quantmod)
require(PerformanceAnalytics)
getSymbols(c("SPY", "TLT"), from="1990-01-01")
returns <- merge(Return.calculate(Ad(SPY)), Return.calculate(Ad(TLT)), join='inner')
returns <- returns[-1,]
configs <- list()
for(i in 1:21) {
  weightSPY <- (i-1)*.05
  weightTLT <- 1-weightSPY
  config <- Return.portfolio(R = returns, weights=c(weightSPY, weightTLT), rebalance_on = "months")
  configs[[i]] <- config
}
configs <- do.call(cbind, configs)
cumRets <- cumprod(1+configs)
period <- 72

roll72CumAnn <- (cumRets/lag(cumRets, period))^(252/period) - 1
roll72SD <- sapply(X = configs, runSD, n=period)*sqrt(252)

Next, the code for creating the weights:

sd_f_factor <- 2.5
modSharpe <- roll72CumAnn/roll72SD^sd_f_factor
monthlyModSharpe <- modSharpe[endpoints(modSharpe, on="months"),]

findMax <- function(data) {
  return(data==max(data))
}

weights <- t(apply(monthlyModSharpe, 1, findMax))
weights <- weights*1
weights <- xts(weights, order.by=as.Date(rownames(weights)))
weights[is.na(weights)] <- 0
weights$zeroes <- 1-rowSums(weights)
configs$zeroes <- 0

That is, simply take the setting that maximizes the monthly modified Sharpe Ratio calculation at each rebalancing date (the end of every month).

Next, here’s the performance:

stratRets <- Return.portfolio(R = configs, weights = weights)
rbind(table.AnnualizedReturns(stratRets), maxDrawdown(stratRets))
charts.PerformanceSummary(stratRets)

Which gives the results:

> rbind(table.AnnualizedReturns(stratRets), maxDrawdown(stratRets))
                          portfolio.returns
Annualized Return                 0.1317000
Annualized Std Dev                0.0990000
Annualized Sharpe (Rf=0%)         1.3297000
Worst Drawdown                    0.1683851

With the following equity curve:

Not perfect, but how does it compare to the ingredients?

Let’s take a look:

stratAndComponents <- merge(returns, stratRets, join='inner')
charts.PerformanceSummary(stratAndComponents)
rbind(table.AnnualizedReturns(stratAndComponents), maxDrawdown(stratAndComponents))
apply.yearly(stratAndComponents, Return.cumulative)

Here are the usual statistics:

> rbind(table.AnnualizedReturns(stratAndComponents), maxDrawdown(stratAndComponents))
                          SPY.Adjusted TLT.Adjusted portfolio.returns
Annualized Return            0.0907000    0.0783000         0.1317000
Annualized Std Dev           0.1981000    0.1381000         0.0990000
Annualized Sharpe (Rf=0%)    0.4579000    0.5669000         1.3297000
Worst Drawdown               0.5518552    0.2659029         0.1683851

In short, it seems the strategy performs far better than either of the ingredients. Let’s see if the equity curve comparison reflects this.

Indeed, it does. While it does indeed have the drawdown in the crisis, both instruments were in drawdown at the time, so it appears that the strategy made the best of a bad situation.

Here are the annual returns:

> apply.yearly(stratAndComponents, Return.cumulative)
           SPY.Adjusted TLT.Adjusted portfolio.returns
2002-12-31  -0.02054891  0.110907611        0.01131366
2003-12-31   0.28179336  0.015936985        0.12566042
2004-12-31   0.10695067  0.087089794        0.09724221
2005-12-30   0.04830869  0.085918063        0.10525398
2006-12-29   0.15843880  0.007178861        0.05294557
2007-12-31   0.05145526  0.102972399        0.06230742
2008-12-31  -0.36794099  0.339612265        0.19590423
2009-12-31   0.26352114 -0.218105306        0.18826736
2010-12-31   0.15056113  0.090181150        0.16436950
2011-12-30   0.01890375  0.339915713        0.24562838
2012-12-31   0.15994578  0.024083393        0.06051237
2013-12-31   0.32303535 -0.133818884        0.13760060
2014-12-31   0.13463980  0.273123290        0.19637382
2015-02-20   0.02773183  0.006922893        0.02788726

2002 was an incomplete year. However, what’s interesting here is that, on the whole, while the strategy rarely, if ever, does as well as the better of the two instruments, it always outperforms the worse of the two–and not only that, but it has delivered a positive performance in every year of the backtest, even when one instrument or the other was taking serious blows, such as SPY in 2008, and TLT in 2009 and 2013.

For the record, here is the weight of SPY in the strategy.

weightSPY <- apply(monthlyModSharpe, 1, which.max)
weightSPY <- do.call(rbind, weightSPY)
weightSPY <- (weightSPY-1)*.05
align <- cbind(weightSPY, stratRets)
align <- na.locf(align)
chart.TimeSeries(align[,1], date.format="%Y", ylab="Weight SPY", main="Weight of SPY in SPY-TLT pair")

Now, while this may serve as a standalone strategy for some people, the takeaway, in my opinion, is that dynamically re-weighting two return streams that share a negative correlation can lead to very strong results compared to the ingredients from which they were formed. Furthermore, rather than simply relying on one number to summarize the relationship between two instruments, Frank Grossman’s approach of actually modeling the combined returns is one I find interesting, and it undoubtedly has applications as a general walk-forward process.
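As a quick sanity check of that negative-correlation claim (using the returns object computed earlier in this post), the following one-liner should come out solidly negative over this sample:

cor(returns)[1, 2] #correlation between SPY and TLT daily returns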

Thanks for reading.

NOTE: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

A Closer Update To David Varadi’s Percentile Channels Strategy

So, thanks to seeing Michael Kapler’s implementation of David Varadi’s percentile channels strategy, I was able to get a better understanding of what was going on. It turns out that rather than looking at the channel value only at the ends of months, the strategy actually keeps track of the channel’s value intra-month. So if, in the middle of the month, you had a sell signal, and at the end of the month the price moved up to intra-channel values, you would still be on a sell signal rather than carrying over the previous month’s end-of-month signal. When all is said and done, it’s not much different from my previous implementation (slightly higher Sharpe, slightly lower returns and drawdowns). In any case, the concept remains the same.

For this implementation, I’m going to use the runquantile function from the caTools package, which works like a generalized runMedian/runMin/runMax from TTR, once you give it the proper arguments (on its defaults, its results are questionable).

Here’s the code:

require(quantmod)
require(caTools)
require(PerformanceAnalytics)
require(TTR)
getSymbols(c("LQD", "DBC", "VTI", "ICF", "SHY"), from="1990-01-01")

prices <- cbind(Ad(LQD), Ad(DBC), Ad(VTI), Ad(ICF), Ad(SHY))
prices <- prices[!is.na(prices[,2]),]
returns <- Return.calculate(prices)
cashPrices <- prices[, 5]
assetPrices <- prices[, -5]

require(caTools)
pctChannelPosition <- function(prices,
                               dayLookback = 60, 
                               lowerPct = .25, upperPct = .75) {
  leadingNAs <- matrix(nrow=dayLookback-1, ncol=ncol(prices), NA)
  
  upperChannels <- runquantile(prices, k=dayLookback, probs=upperPct, endrule="trim")
  upperQ <- xts(rbind(leadingNAs, upperChannels), order.by=index(prices))
  
  lowerChannels <- runquantile(prices, k=dayLookback, probs=lowerPct, endrule="trim")
  lowerQ <- xts(rbind(leadingNAs, lowerChannels), order.by=index(prices))
  
  positions <- xts(matrix(nrow=nrow(prices), ncol=ncol(prices), NA), order.by=index(prices))
  positions[prices > upperQ & lag(prices) < upperQ] <- 1 #cross up
  positions[prices < lowerQ & lag(prices) > lowerQ] <- -1 #cross down
  positions <- na.locf(positions)
  positions[is.na(positions)] <- 0
  
  colnames(positions) <- colnames(prices)
  return(positions)
}

#find our positions, add them up
d60 <- pctChannelPosition(assetPrices)
d120 <- pctChannelPosition(assetPrices, dayLookback = 120)
d180 <- pctChannelPosition(assetPrices, dayLookback = 180)
d252 <- pctChannelPosition(assetPrices, dayLookback = 252)
compositePosition <- (d60 + d120 + d180 + d252)/4

compositeMonths <- compositePosition[endpoints(compositePosition, on="months"),]

returns <- Return.calculate(prices)
monthlySD20 <- xts(sapply(returns[,-5], runSD, n=20), order.by=index(prices))[index(compositeMonths),]
weight <- compositeMonths*1/monthlySD20
weight <- abs(weight)/rowSums(abs(weight))
weight[compositeMonths < 0 | is.na(weight)] <- 0
weight$CASH <- 1-rowSums(weight)

#not actually equal weight--more like composite weight, going with 
#Michael Kapler's terminology here
ewWeight <- abs(compositeMonths)/rowSums(abs(compositeMonths))
ewWeight[compositeMonths < 0 | is.na(ewWeight)] <- 0
ewWeight$CASH <- 1-rowSums(ewWeight)

rpRets <- Return.portfolio(R = returns, weights = weight)
ewRets <- Return.portfolio(R = returns, weights = ewWeight)

Essentially, with runquantile, you need to give it the “trim” argument, then manually append the leading NAs, and then manually turn the result into an xts object, which is annoying. One would think the author of this package would take care of these quality-of-life issues, but no. In any case, there are two strategies at play here: one is the percentile channel risk parity strategy, and the other is what Michael Kapler calls “channel equal weight”. The latter actually *isn’t* an equal-weight strategy, since the composite position values may take the values -1, -.5, 0, .5, and 1 (with a possibility of .25 or .75 early on, when some of the lookback channels still say 0 instead of 1 or -1); rather, it simply uses the composite weights without taking volatility into account at all. But I’m sticking with Michael Kapler’s terminology to be consistent. That said, I don’t personally use Michael Kapler’s SIT package due to the vast differences in syntax between it and the usual R code I’m used to. However, your mileage may vary.
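As a quick arithmetic illustration of those possible composite values (my own toy numbers), each composite is simply the average of the four channel scores:

#each channel score is in {-1, 0, 1}; 0 only occurs before a channel's first signal
(1 + 1 + 1 + 1)/4 #1
(1 + 1 + 1 - 1)/4 #.5
(1 + 1 + 1 + 0)/4 #.75, possible early on when one lookback still reads 0
(1 + 1 - 1 - 1)/4 #0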

In any case, here’s the updated performance:

both <- cbind(rpRets, ewRets)
colnames(both) <- c("RiskParity", "Equal Weight")
charts.PerformanceSummary(both)
rbind(table.AnnualizedReturns(both), maxDrawdown(both))
apply.yearly(both, Return.cumulative)

Which gives us the following output:

> rbind(table.AnnualizedReturns(both), maxDrawdown(both))
                          RiskParity Equal Weight
Annualized Return         0.09380000    0.1021000
Annualized Std Dev        0.06320000    0.0851000
Annualized Sharpe (Rf=0%) 1.48430000    1.1989000
Worst Drawdown            0.06894391    0.1150246

> apply.yearly(both, Return.cumulative)
           RiskParity Equal Weight
2006-12-29 0.08352255   0.07678321
2007-12-31 0.05412147   0.06475540
2008-12-31 0.10663085   0.12212063
2009-12-31 0.11920721   0.19093131
2010-12-31 0.13756460   0.14594317
2011-12-30 0.11744706   0.08707801
2012-12-31 0.07730896   0.06085295
2013-12-31 0.06733187   0.08174173
2014-12-31 0.06435030   0.07357458
2015-02-17 0.01428705   0.01568372

In short, the more naive weighting scheme delivers slightly higher returns but pays dearly for those marginal returns with downside risk.

Here are the equity curves:

So, there you have it. The results David Varadi obtained are legitimate. But nevertheless, I hope this demonstrates how easy it is for the small details to make material differences.

Thanks for reading.

NOTE: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

An Attempt At Replicating David Varadi’s Percentile Channels Strategy

This post will detail an attempt at replicating David Varadi’s percentile channels strategy. As I’m only able to obtain data back to mid-2006, the exact statistics will not be identical. However, over the period for which I do have data, the performance is similar (though not identical) to the corresponding performance presented by David Varadi.

First off, before beginning this post, I’d like to issue a small mea culpa regarding the last post. It turns out that Yahoo’s data, once prices get into the single-digit dollar range, is of questionable accuracy, and thus results from the late 90s on mutual funds with prices in those ranges are questionable as well. As I am an independent blogger, and also make it a policy that readers be able to replicate all of my analysis, I am constrained to free data sources, and sometimes the questionable quality of that data may materially affect results. So, if it’s one of your strategies replicated on this blog and you find contention with my results, I would be more than happy to work with the data used to generate the original results, corroborate them, and make certain that any differences stem solely from using lower-quality, publicly available data. Generally, I find it surprising that a company as large as Yahoo can have such gaping data quality issues in certain aspects, but I’m happy that I was able to replicate the general thrust of QTS very closely.

This replication of David Varadi’s strategy, however, is not one such case–mainly because the data for DBC does not extend back very far (its inception was only in 2006, and the data used by David Varadi’s programmer was obtained from Bloomberg, to which I have no access), and furthermore, I’m not certain my methods are absolutely identical. Nevertheless, the strategy in and of itself is solid.

The way the strategy works is this (per my interpretation of David Varadi’s post and communication with his other programmer). Given four securities (LQD, DBC, VTI, ICF) and a cash security (SHY), do the following:

Find the running n-day quantiles at an upper and a lower percentile. Anything above the upper percentile gets a score of 1, anything below the lower percentile gets a score of -1, and anything between the bounds is left as NA.

Subset these quantities on their monthly endpoints. Any value between the channels (NA) takes the last known value (in short, na.locf). Any initial NAs become zero.

Do this with 60-day, 120-day, 180-day, and 252-day settings at the 25th and 75th percentiles. Add these four tables up (their dimensions are the number of monthly endpoints by the number of securities) and divide by the number of parameter settings (in this case, 4) to obtain a composite position.

Next, obtain a running 20-day standard deviation of the returns (not the prices!), and subset it to the same indices as the composite positions. Take the inverse of these volatility scores and multiply by the composite positions to get inverse-volatility positions. Take their absolute values (some positions may be negative, remember) and normalize. In the beginning, there may be some zero-across-all-assets rows, or other NAs due to a lack of data (e.g., if a monthly endpoint occurs before there is enough data to compute a 20-day standard deviation), all of which will be dealt with. Keep every position whose composite position is positive (that is, scores of .5 or 1; discard scores of zero or lower), and reinvest the remainder in the cash asset (SHY, in our case). Those are the final positions used to generate the returns.

Here’s how it looks in code.

This is the code for obtaining the data (from Yahoo finance) and separating it into cash and non-cash data.

require(quantmod)
require(caTools)
require(PerformanceAnalytics)
require(TTR)
getSymbols(c("LQD", "DBC", "VTI", "ICF", "SHY"), from="1990-01-01")

prices <- cbind(Ad(LQD), Ad(DBC), Ad(VTI), Ad(ICF), Ad(SHY))
prices <- prices[!is.na(prices[,2]),]
returns <- Return.calculate(prices)
cashPrices <- prices[, 5]
assetPrices <- prices[, -5]

This is the function for computing the percentile channel positions for a given parameter setting. Unfortunately, it is not instantaneous, as R’s rollapply function pays a price in speed for its generality. While the caTools package has a runquantile function, as of the time of this writing, I have found differences between its output and TTR’s runMedian, so I’ll have to get in touch with the package’s author.
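To put a rough number on that speed gap, here’s a small benchmark sketch (my own toy example, not part of the strategy; microbenchmark is an extra dependency, and exact timings will vary by machine):

require(zoo)
require(caTools)
require(microbenchmark)
x <- cumsum(rnorm(2000)) + 100 #a toy price-like series
microbenchmark(
  rollapply(x, width = 60, FUN = quantile, probs = .75),
  runquantile(x, k = 60, probs = .75, endrule = "trim"),
  times = 10
)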

pctChannelPosition <- function(prices, rebal_on=c("months", "quarters"),
                             dayLookback = 60, 
                             lowerPct = .25, upperPct = .75) {
  
  upperQ <- rollapply(prices, width=dayLookback, quantile, probs=upperPct)
  lowerQ <- rollapply(prices, width=dayLookback, quantile, probs=lowerPct)
  positions <- xts(matrix(nrow=nrow(prices), ncol=ncol(prices), NA), order.by=index(prices))
  positions[prices > upperQ] <- 1
  positions[prices < lowerQ] <- -1
  
  ep <- endpoints(positions, on = rebal_on[1])
  positions <- positions[ep,]
  positions <- na.locf(positions)
  positions[is.na(positions)] <- 0 
  
  colnames(positions) <- colnames(prices)
  return(positions)
}

The way this function works is simple: it computes a running quantile using rollapply, and then scores anything with a price above its 75th percentile as 1, and anything below its 25th percentile as -1, in accordance with David Varadi’s post.

It then subsets these quantities on months (quarters are also possible–or, for that matter, other values, though the spirit of the strategy seems to be months or quarters), and imputes any NAs with the last known observation, or with zero for initial NAs before any position is found. Something I have found over the course of writing this and the QTS replication is that one need not bother implementing a looping mechanism to allocate positions monthly unless a correlation matrix based on daily data is involved every month–and avoiding the loop makes the code more readable.

Next, we find our composite position.

#find our positions, add them up
d60 <- pctChannelPosition(assetPrices)
d120 <- pctChannelPosition(assetPrices, dayLookback = 120)
d180 <- pctChannelPosition(assetPrices, dayLookback = 180)
d252 <- pctChannelPosition(assetPrices, dayLookback = 252)
compositePosition <- (d60 + d120 + d180 + d252)/4

Next, find the running volatility for the assets, and subset them to the same time period (in this case months) as our composite position. In David Varadi’s example, the parameter is a 20-day lookback.

#find 20-day rolling standard deviations, subset them on identical indices
#to the percentile channel monthly positions
sd20 <- xts(sapply(returns[,-5], runSD, n=20), order.by=index(assetPrices))
monthlySDs <- sd20[index(compositePosition)]

Next, perform the following steps: find the inverse volatility of these quantities, multiply by the composite position score, take the absolute value, and keep any position for which the composite position is greater than zero (that is, has a positive sign). There may be some initial NA rows due to a lack of data (either not enough days to compute a running volatility, or no positive positions yet); those will simply be imputed to zero. Reinvest the remainder in cash.

#compute inverse volatilities
inverseVols <- 1/monthlySDs

#multiply inverse volatilities by composite positions
invVolPos <- inverseVols*compositePosition

#take absolute values of inverse volatility multiplied by position
absInvVolPos <- abs(invVolPos)

#normalize the above quantities
normalizedAbsInvVols <- absInvVolPos/rowSums(absInvVolPos)

#keep only positions with positive composite positions (remove zeroes/negative)
nonCashPos <- normalizedAbsInvVols * sign(compositePosition > 0)
nonCashPos[is.na(nonCashPos)] <- 0 #no positions before we have enough data

#add cash position which is complement of non-cash position
finalPos <- nonCashPos
finalPos$cashPos <- 1-rowSums(finalPos)

And finally, the punchline, how does this strategy perform?

#compute returns
stratRets <- Return.portfolio(R = returns, weights = finalPos)
charts.PerformanceSummary(stratRets)
stats <- rbind(table.AnnualizedReturns(stratRets), maxDrawdown(stratRets))
rownames(stats)[4] <- "Worst Drawdown"
stats

Like this:

> stats
                          portfolio.returns
Annualized Return                0.10070000
Annualized Std Dev               0.06880000
Annualized Sharpe (Rf=0%)        1.46530000
Worst Drawdown                   0.07449537

With the following equity curve:

The statistics are visibly worse than David Varadi’s: a 10% CAGR vs. his 11.1%, a 6.9% annualized standard deviation vs. his 5.72%, a 7.45% max drawdown vs. his 5.5%, and correspondingly worse derived statistics (e.g., MAR). However, my data starts far later, and 1995-1996 seemed to be phenomenal for this strategy. Here are the yearly returns for the data I have:

> apply.yearly(stratRets, Return.cumulative)
           portfolio.returns
2006-12-29        0.11155069
2007-12-31        0.07574266
2008-12-31        0.16921233
2009-12-31        0.14600008
2010-12-31        0.12996371
2011-12-30        0.06092018
2012-12-31        0.07306617
2013-12-31        0.06303612
2014-12-31        0.05967415
2015-02-13        0.01715446

I see a major discrepancy between my returns and David’s in 2011, but beyond that, the pattern of yearly returns seems fairly close. Whether my methodology is incorrect (I think I followed the procedure to the best of my understanding, but of course, if someone sees a mistake in my code, please let me know), or whether it’s the result of using Yahoo’s questionable-quality data, I am uncertain.

However, in my opinion, that doesn’t take away from the validity of the strategy as a whole. With a mid-1 Sharpe ratio on a monthly rebalancing scale and steady new equity highs, I feel that this is a result worth sharing–even if not yet directly corroborated (hopefully it will be).

One last note–some of the readers on David Varadi’s blog have cried foul due to their inability to come close to his results. Since I’ve come close, I feel the results are valid; and since I’m using different data, it is no surprise my results are not identical. However, if anyone has questions about my process, feel free to leave questions and/or comments.

Thanks for reading.

NOTE: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

The Quarterly Tactical Strategy (aka QTS)

This post covers the Quarterly Tactical Strategy (aka QTS), introduced by Cliff Smith in a Seeking Alpha article. It presents a variation on the typical dual-momentum strategy that trades only once a quarter, yet delivers a seemingly solid risk/return profile. However, the article leaves off a protracted period of unimpressive performance at the turn of the millennium.

First off, due to the imprecision of the English language, I received some help from TrendXplorer in implementing this strategy. Those who are fans of Amibroker are highly encouraged to visit his blog.

In any case, this strategy is fairly simple:

Take a group of securities (in this case, 8 mutual funds), and do the following:

Rank a long momentum (105 days) and a short momentum (20 days), and invest in the security with the highest composite rank, with ties going to the long momentum (that is, .501*longRank + .499*shortRank, for instance). If the security with the highest composite rank is above its three-month SMA, invest in that security; otherwise, hold cash.

There are two critical points that must be made here:

1) The three-month SMA is *not* a 63-day SMA. It is precisely a three-point SMA computed on the monthly endpoints of that security (see the short sketch after this list).
2) Unlike in flexible asset allocation or elastic asset allocation, the cash asset is not treated as a formal asset.
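
To make point 1 concrete, here’s a minimal sketch of the two computations side by side (assuming an xts object of daily adjusted prices called prices, as constructed below):

#three-month SMA: an SMA over the last three *monthly closes*
monthEnds <- endpoints(prices, on = "months")
threeMonthSMA <- xts(apply(prices[monthEnds,], 2, SMA, n = 3),
                     order.by = index(prices[monthEnds,]))

#63-day SMA: an SMA over the last 63 *daily closes* -- not the same filter
sixtyThreeDaySMA <- xts(apply(prices, 2, SMA, n = 63), order.by = index(prices))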

Let’s look at the code. Here’s the data–adjusted mutual fund prices. (Although with quarterly turnover, the frequent-trading constraint of not trading out of these funds too quickly is satisfied, I’m not sure how dividends are treated–that is, whether a retail investor would actually realize these returns, less a hopefully tiny transaction cost through their broker–hopefully not much more than $1 per transaction.)

require(quantmod)
require(PerformanceAnalytics)
require(TTR)

#get our data from yahoo, use adjusted prices
symbols <- c("NAESX", #small cap
             "PREMX", #emerging bond
             "VEIEX", #emerging markets
             "VFICX", #intermediate investment grade
             "VFIIX", #GNMA mortgage
             "VFINX", #S&P 500 index
             "VGSIX", #MSCI REIT
             "VGTSX", #total intl stock idx
             "VUSTX") #long term treasury (cash)

getSymbols(symbols, from="1990-01-01")
prices <- list()
for(i in 1:length(symbols)) {
  prices[[i]] <- Ad(get(symbols[i]))  
}
prices <- do.call(cbind, prices)
colnames(prices) <- gsub("\\.[A-z]*", "", colnames(prices))

#define our cash asset and keep track of which column it is
cashAsset <- "VUSTX"
cashCol <- grep(cashAsset, colnames(prices))

#start our data off on the security with the least data (VGSIX in this case)
prices <- prices[!is.na(prices[,7]),] 

#cash is not a formal asset in our ranking
cashPrices <- prices[, cashCol]
prices <- prices[, -cashCol]

Nothing anybody hasn’t seen before up to this point. Get the data, start it off at the most recently incepted mutual fund, separate the cash prices, and move along.

What follows is a rather rough implementation of QTS, not wrapped up in any sort of function that others can plug and play with (though I hope I made the code readable enough for others to tinker with).

Let’s define parameters and compute momentum.

#define our parameters
nShort <- 20
nLong <- 105
nMonthSMA <- 3

#compute momentums
rocShort <- prices/lag(prices, nShort) - 1
rocLong <- prices/lag(prices, nLong) - 1

Now comes some endpoints functionality (or, more colloquially, magic) that the xts library provides. It’s what allows people to get work done in R much faster than in other programming languages.

#take the endpoints of quarter start/end
quarterlyEps <- endpoints(prices, on="quarters")
monthlyEps <- endpoints(prices, on = "months")

#take the prices at quarterly endpoints
quarterlyPrices <- prices[quarterlyEps,]

#short momentum at quarterly endpoints (20 day)
rocShortQtrs <- rocShort[quarterlyEps,]

#long momentum at quarterly endpoints (105 day)
rocLongQtrs <- rocLong[quarterlyEps,]

In short, get the quarterly endpoints (and the monthly ones–we need those for the monthly SMA, which you’ll see shortly) and subset our momentum computations on those quarterly endpoints. Now let’s get the total rank for those subset-on-quarters momentum computations.

#rank short momentum, best highest rank
rocSrank <- t(apply(rocShortQtrs, 1, rank))

#rank long momentum, best highest rank
rocLrank <- t(apply(rocLongQtrs, 1, rank))

#total rank, long slightly higher than short, sum them
totalRank <- 1.01*rocLrank + rocSrank 

#function that takes 100% position in highest ranked security
maxRank <- function(rankRow) {
  return(rankRow==max(rankRow))
}

#apply above function to our quarterly ranks every quarter
rankPos <- t(apply(totalRank, 1, maxRank))

So as you can see, I rank the momentum computations by row, take a weighted sum (in slight favor of the long momentum), and then simply take the security with the highest rank at every period, giving me one 1 in every row and 0s otherwise.

Now let’s do the other end of what determines position, which is the SMA filter. In this case, we need monthly data points for our three-month SMA, and then subset it to quarters to be on the same timescale as the quarterly ranks.

#SMA of securities, only use monthly endpoints
#subset to quarters
#then filter
monthlyPrices <- prices[monthlyEps,]
monthlySMAs <- xts(apply(monthlyPrices, 2, SMA, n=nMonthSMA), order.by=index(monthlyPrices))
quarterlySMAs <- monthlySMAs[index(quarterlyPrices),]
smaFilter <- quarterlyPrices > quarterlySMAs

Now let’s put it together to get our final positions. Our cash position is simply one if we don’t have a single investment in the time period, and zero otherwise.

finalPos <- rankPos*smaFilter
finalPos <- finalPos[!is.na(rocLongQtrs[,1]),]
cash <- xts(1-rowSums(finalPos), order.by=index(finalPos))
finalPos <- merge(finalPos, cash, join='inner')

Now we can finally compute our strategy returns.

prices <- merge(prices, cashPrices, join='inner')
returns <- Return.calculate(prices)
stratRets <- Return.portfolio(returns, finalPos)
table.AnnualizedReturns(stratRets)
maxDrawdown(stratRets)
charts.PerformanceSummary(stratRets)
plot(log(cumprod(1+stratRets)))

So what do things look like?

Like this:

> table.AnnualizedReturns(stratRets)
                          portfolio.returns
Annualized Return                    0.1899
Annualized Std Dev                   0.1619
Annualized Sharpe (Rf=0%)            1.1730
> maxDrawdown(stratRets)
[1] 0.1927991

And since the first equity curve doesn’t give much of an indication in the early years, I’ll take Tony Cooper’s (of Double Digit Numerics) advice and show the log equity curve as well.

In short, from 1997 through 2002, this strategy seemed to go nowhere, and then it took off. Since I was able to get this backtest going back to 1997, it makes me wonder why the Seeking Alpha article started only in 2003, because even with 1997-2002 thrown in, this strategy’s risk/reward profile still looks fairly solid: a CAR/MDD ratio of about 1 (slightly less, but that’s okay for something that turns over so infrequently, and in so few securities!), and a Sharpe higher than 1. Certainly better than what the market itself offered retail investors over the same period of time. Perhaps Cliff Smith himself could chime in regarding his choice of time frame.

In any case, Cliff Smith marketed the strategy as having a CAGR higher than 28%; his article was published on August 15, 2014, with a backtest starting in 2003. Let’s see if we can replicate those results.

stratRets <- stratRets["2002-12-31::2014-08-15"]
table.AnnualizedReturns(stratRets)
maxDrawdown(stratRets)
charts.PerformanceSummary(stratRets)
plot(log(cumprod(1+stratRets)))

Which results in this:

> table.AnnualizedReturns(stratRets)
                          portfolio.returns
Annualized Return                    0.2862
Annualized Std Dev                   0.1734
Annualized Sharpe (Rf=0%)            1.6499
> maxDrawdown(stratRets)
[1] 0.1911616

A far improved risk/return profile without 1997-2002 (or the out-of-sample period after Cliff Smith’s publishing date). Here are the two equity curves in-sample.

In short, the results look better, and the SeekingAlpha article’s results are validated.

Now, let’s look at the out-of-sample periods on their own.

stratRets <- Return.portfolio(returns, finalPos)
earlyOOS <- stratRets["::2002-12-31"]
table.AnnualizedReturns(earlyOOS)
maxDrawdown(earlyOOS)
charts.PerformanceSummary(earlyOOS)

Here are the results:

> table.AnnualizedReturns(earlyOOS)
                          portfolio.returns
Annualized Return                    0.0321
Annualized Std Dev                   0.1378
Annualized Sharpe (Rf=0%)            0.2327
> maxDrawdown(earlyOOS)
[1] 0.1927991

And with the corresponding equity curve (which does not need a log-scale this time).

In short, it basically did nothing for an entire five years. That’s rough, and I definitely don’t like that this period was left off of the Seeking Alpha article. Any time I can extend a backtest further back than a strategy’s original author and find skeletons in the closet (as happened with each and every one of Harry Long’s strategies), it sets off red flags on this end, so I’m hoping there’s some good explanation for leaving off 1997-2002 that I’m simply missing.

Lastly, let’s look at the out-of-sample performance after the publication date.

lateOOS <- stratRets["2014-08-15::"]
charts.PerformanceSummary(lateOOS)
table.AnnualizedReturns(lateOOS)
maxDrawdown(lateOOS)

With the following results:

> table.AnnualizedReturns(lateOOS)
                          portfolio.returns
Annualized Return                    0.0752
Annualized Std Dev                   0.1426
Annualized Sharpe (Rf=0%)            0.5277
> maxDrawdown(lateOOS)
[1] 0.1381713

And the following equity curve:

Basically, while it’s ugly, the strategy made new equity highs over only two more transactions (and in such a small sample size, anything can happen), so I’ll put this one down as a small, ugly win–but a win nevertheless.

If anyone has any questions or comments about this strategy, I’d love to see them, as this is basically a first-pass replica. To Mr. Cliff Smith’s credit, the results check out, and when the worst thing one can say about a strategy is that it had a period of flat performance (namely, when the market crested at the end of the Clinton administration, right before the dot-com bust), that’s not the worst thing in the world. Certainly, I’d call that a lot better than extending the performance and finding an unreported 40% max drawdown (as with all of the leveraged SPY plus volatility ETF strategies I’ve demonstrated), or finding the investment thesis delivering performance so much more disappointing over a longer time horizon that it borders on falsehood (the Zomma warthog index).

More replications (including one requested by several readers) will be upcoming.

Thanks for reading.

NOTE: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

PELTing a Competing Changepoint Algorithm

This post will demonstrate the PELT algorithm from the changepoint package–a competitor to the breakout detection algorithm in Twitter’s BreakoutDetection package. While neither of these algorithms produces satisfactory results, one changepoint location approximation algorithm that makes no distributional assumptions shows potentially promising results.

I received some feedback regarding my first foray into changepoint analysis from Brian Peterson. While some of it was positive, a fair bit more concerned how I could add to my analysis–more boxes I could check off, and so on. One of those boxes was the PELT algorithm, devised by Rebecca Killick of Lancaster University, which I’ll give a quick run-through of below.

In the Twitter paper, PELT was the competing algorithm the paper compared itself to, and while I didn’t think replicating the competing algorithm would be necessary at first, it turns out that, well, it was. So, going forward, I’m going to have more posts demonstrating more aspects of these changepoint detection algorithms. Thus far, the most impressive one has been David Matteson’s e.divisive algorithm in his ecp package. However, its one caveat, for me, is its massively long running time.

Anyhow, without further ado, let’s replicate a diagram found in the lower right-hand corner of page 7 of the original paper. It turns out that it uses the Scribe data set that comes with the BreakoutDetection package, so we have all we need.

require(BreakoutDetection)
require(changepoint)
data(Scribe)
bd <- breakout(Scribe)
pelt <- cpt.meanvar(Scribe)
plot(Scribe, type="l")
abline(v=bd$loc, col="red")
abline(v=pelt@cpts[1], col="blue")
legend(legend = c("Breakout", "PELT"), x = 2, y = 1000, fill = c("red", "blue"))

This gives us the following diagram.

In short, the paper’s claim is corroborated: PELT underperforms even in a simple example, using both packages’ only-one-changepoint methodology. Furthermore, the output of cpt.meanvar is actually an S4-class object (so for those wondering what the @ character is doing, it’s the S4 equivalent of the $ used elsewhere in R).
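For those unfamiliar with S4 objects, here’s a quick look at the structure, using the pelt object created above (the exact slot names come from the changepoint package):

class(pelt) #a formal S4 class from the changepoint package
slotNames(pelt) #its slots, which are accessed with @ rather than $
pelt@cpts #the changepoint location(s)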

Let’s move on to the GSPC data.

suppressMessages(require(quantmod))
suppressMessages(require(PerformanceAnalytics))

suppressMessages(getSymbols("^GSPC", from = "1984-12-25", to = "2013-05-31"))
#these two lines are only needed if Yahoo gives you holidays such as New Year's -- EG 1985-01-01
require(timeDate)
GSPC <- GSPC[!as.character(index(GSPC)) %in% as.character(holidayNYSE(1985:2013)),]

dailySqRets <- Return.calculate(Cl(GSPC))*Return.calculate(Cl(GSPC))
dailySqRets <- dailySqRets["1985::"]

Now we’ll look at a histogram of the daily squared returns and see if it looks like some sort of famous distribution.

plot(hist(dailySqRets, breaks=1000), xlim=c(0, 0.001))

Which results in the following image:

So, an exponential distribution (give or take)? Well, let’s try to do some PELT changepoint analysis using the exponential distribution.

PELTcps <- cpt.meanvar(as.numeric(dailySqRets), 
                       method="PELT", test.stat="Exponential")

Did that work? No. Here’s the error message.

Error in PELT.meanvar.exp(coredata(data), pen.value) : 
  Exponential test statistic requires positive data

Now, squared returns can’t possibly be negative, because that’s just nonsensical. So what does that error mean? Since the exponential test statistic requires strictly positive data, the culprit must be zeroes. Let’s take a look.

dailySqRets[dailySqRets==0]

And the output:

> dailySqRets[dailySqRets==0]
           GSPC.Close
1985-03-28          0
1985-10-08          0
1988-02-04          0
1988-11-02          0
1992-09-03          0
1997-01-28          0
2003-01-10          0
2008-01-03          0

So, this essentially alleges that there were some days on which the close-to-close price didn’t move at all. Let’s take a look.

GSPC["1985-03-27::1985-04-01"]

This gives us the following output:

> GSPC["1985-03-27::1985-04-01"]
           GSPC.Open GSPC.High GSPC.Low GSPC.Close GSPC.Volume GSPC.Adjusted
1985-03-27    178.43    179.80   178.43     179.54   101000000        179.54
1985-03-28    179.54    180.60   179.43     179.54    99780000        179.54
1985-03-29    179.54    180.66   179.54     180.66   101400000        180.66
1985-04-01    180.66    181.27   180.43     181.27    89900000        181.27

Notice that the close prices for the 27th and the 28th are identical, creating a return of zero, which breaks the PELT algorithm. So let’s fix that.

#the method will throw an error with zero returns, so this deals with that
dailySqRets[dailySqRets == 0] <- dailySqRets[dailySqRets==0] + 1e-100 

Essentially, this is just to get past the error messages within the changepoint package. So now, let’s try applying the algorithm once again.

peltCps <- cpt.meanvar(as.numeric(dailySqRets), 
                       method = "PELT", test.stat = "Exponential")

This time, success–it ran! Let’s check the number of changepoints.

length(peltCps@cpts)

Which gives:

[1] 374

…which is vastly different from the e.divisive algorithm’s output in the previous investigation, or from Dr. Robert Frey’s post (20 changepoints, give or take). The upside is that this algorithm works quickly, but that’s not much solace if the answers are unreliable. To dive further into the nature of these changepoints, we will remove the last changepoint (which is simply the length of the data) and compute some summary statistics on how long the complete “regimes” last (that is, dropping the first and last).

newCpts <- peltCps@cpts[-length(peltCps@cpts)]
regimeTime <- diff(newCpts)
summary(regimeTime)
hist(regimeTime, breaks = 100)

…which provides the following output:

> summary(regimeTime)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   2.00    3.00    8.00   19.14   26.00  149.00 

And the following histogram:

In short, most of these “regimes” last less than a month, and half of them don’t even make it out to two weeks. These results are not corroborated by the previously investigated methods. Since more of the academic literature uses differences of log returns, and since the point is to search for changes in the variance regime, that is the procedure that will be employed next; and as the data is continuous and contains negative values, only the Normal distribution is available to choose from when using the PELT method.

logDiffs <- as.numeric(diff(log(Cl(GSPC)))["1985::"])
peltGaussCps <- cpt.var(logDiffs, method="PELT", test.stat="Normal")
length(peltGaussCps@cpts)
fullGaussRegimes <- diff(peltGaussCps@cpts[-length(peltGaussCps@cpts)])
summary(fullGaussRegimes)
hist(fullGaussRegimes, breaks = 100)

Which gives the following output:

> length(peltGaussCps@cpts)
[1] 93
> fullGaussRegimes <- diff(peltGaussCps@cpts[-length(peltGaussCps@cpts)])
> summary(fullGaussRegimes)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   2.00   16.00   48.00   73.76  106.50  498.00 

And the following histogram:

In short, even though the result is improved, it’s still far from reliable, with most regime switches taking place within 2 months, and many of those lasting barely a few days.

Lastly, I’d like to correct an issue from the previous changepoint post, in which I used the “at most one changepoint” method from the BreakoutDetection package. This time, I’ll use the multiple changepoint detection method.

bdcp <- breakout(dailySqRets, min.size = 20, method = "multi")

With the following results:

> length(bdcp$loc)
[1] 0

In short, the breakout algorithm found no changepoints whatsoever on default settings, which, once again, does not line up with the results from the ecp package or from Dr. Frey’s post. Even if the beta (penalty parameter) is decreased to .0001 (from its default of .008), it still fails to find any changes in the squared-returns data, which is disappointing.

However, over the course of exploring the changepoint package, I have found that there is a method directly analogous to Dr. Frey’s cumulative sum of squares method (that is, if you check the help file for cpt.var, one of the test.stat distributions is “CSS”, aka cumulative sum of squares). Two methods employ it, neither of which is PELT, but both of which predate PELT (the binary segmentation and segment neighborhood methods), and both are explained in Dr. Killick’s original paper. A sketch of invoking them appears below.
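Here’s a sketch of invoking that test statistic directly through cpt.var, using the logDiffs from above (note that the segment neighborhood method is O(Q*n^2) and will be extremely slow on roughly 7,000 data points, so only binary segmentation is run here):

cssBinSeg <- cpt.var(logDiffs, method = "BinSeg", test.stat = "CSS", Q = 30)
cpts(cssBinSeg) #the changepoint locations found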

First off, both algorithms contain a penalty term, and the default penalty is the Bayesian Information Criterion (BIC, aka SIC), which here works out to the log of the log of the number of data points. In contrast, the AIC is simply the log of 2, so the BIC will exceed the AIC at 8 data points or higher (a quick check of this appears below). The cpt.var function is mostly a wrapper for further nested wrappers, and essentially, the “cut to the chase” is that we eventually nest down to two non-exported functions: the binary segmentation variance cumulative sum of squares algorithm, and the segment neighborhood variance cumulative sum of squares algorithm.
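Here’s a quick numerical check of that claim (my own illustration, with the penalties as just described: log(log(n)) for the BIC and log(2) for the AIC):

n <- 4:12
cbind(n, BIC = log(log(n)), AIC = log(2))
#the BIC column first exceeds the AIC column at n = 8, since log(log(8)) > log(2)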

So here’s the code for the former (the binary segmentation version):

#the source of changepoint:::binseg.var.css
binseg.var.css <- function (data, Q = 5, pen = 0) 
{
  n = length(data)
  if (n < 4) {
    stop("Data must have atleast 4 observations to fit a changepoint model.")
  }
  if (Q > ((n/2) + 1)) {
    stop(paste("Q is larger than the maximum number of segments", 
               (n/2) + 1))
  }
  y2 = c(0, cumsum(data^2))
  tau = c(0, n)
  cpt = matrix(0, nrow = 2, ncol = Q)
  oldmax = Inf
  for (q in 1:Q) {
    lambda = rep(0, n - 1)
    i = 1
    st = tau[1] + 1
    end = tau[2]
    for (j in 1:(n - 1)) {
      if (j == end) {
        st = end + 1
        i = i + 1
        end = tau[i + 1]
      }
      else {
        lambda[j] = sqrt((end - st + 1)/2) * 
                         ((y2[j + 1] - y2[st])/(y2[end + 1] - y2[st]) - 
                          (j - st + 1)/(end - st + 1))
      }
    }
    k = which.max(abs(lambda))
    cpt[1, q] = k
    cpt[2, q] = min(oldmax, max(abs(lambda), na.rm = T))
    oldmax = min(oldmax, max(abs(lambda), na.rm = T))
    tau = sort(c(tau, k))
  }
  op.cps = NULL
  p = 1:(Q - 1)
  for (i in 1:length(pen)) {
    criterion = (cpt[2, ]) >= pen[i]
    if (sum(criterion) == 0) {
      op.cps = 0
    }
    else {
      op.cps = c(op.cps, max(which((criterion) == TRUE)))
    }
  }
  if (op.cps == Q) {
    warning("The number of changepoints identified is Q, 
             it is advised to increase Q to make sure 
             changepoints have not been missed.")
  }
  return(list(cps = cpt, op.cpts = op.cps, pen = pen))
}

Essentially, the algorithm loops through all the data points (initially, the start is the beginning of the data and the end is the end of the data) and, for each point, computes the quantity defined by the lambda[j] line, which is, again:

lambda[j] = sqrt((end - st + 1)/2) * 
                 ((y2[j + 1] - y2[st])/(y2[end + 1] - y2[st]) - 
                  (j - st + 1)/(end - st + 1))
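
In more conventional notation (my transcription of the line above, with $s$ = st, $e$ = end, and $C_k$ the cumulative sum of squares, so that y2[k + 1] corresponds to $C_k$):

\[ \lambda_j = \sqrt{\frac{e - s + 1}{2}} \left( \frac{C_j - C_{s-1}}{C_e - C_{s-1}} - \frac{j - s + 1}{e - s + 1} \right), \qquad C_k = \sum_{i=1}^{k} x_i^2 \]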

Which, to put it in financial terms, multiplies the square root of half the range by the difference between the percent-B (think Bollinger Bands) of the *values* of the data and the percent-B of the *indices* of the data, for a given start and end location (consecutive changepoint locations). Then, the candidate changepoint is the location of the maximum absolute value of lambda between a starting and ending point, as computed by this code:

    k = which.max(abs(lambda))
    cpt[1, q] = k
    cpt[2, q] = min(oldmax, max(abs(lambda), na.rm = T))
    oldmax = min(oldmax, max(abs(lambda), na.rm = T))
    tau = sort(c(tau, k))

The algorithm then updates tau (the collection of change point locations), which updates the start and end point segment locations, and the algorithm restarts again.

Lastly, at the end, the algorithm loops through the different penalties (in this case, the BIC is simply one constant value, so there may be a special case that allows for some dynamic sort of penalty), and all the changepoints with a lambda value higher than the penalty are returned as candidates. Again, there is no test of significance in the binary segmentation variance cumulative sum of squares algorithm–so if the penalty is manually specified to be zero and the user specifies the maximum number of changepoints (n/2, where n is the length of the data), the algorithm will indeed spit back that many changepoints. In short, while the default settings may do a half-decent job of finding changepoints, it’s possible to deliberately force this algorithm to produce nonsensical output. To be glib, this algorithm isn’t attempting to win the never-ending arms race of sufficiently idiot-proofing something against the universe’s capability to create a better idiot. However, the out-of-the-box settings should sufficiently protect one from absurd results.
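As a contrived illustration of that last point, here’s a sketch against the unexported function displayed above (pen = 0 disables the penalty, so every candidate changepoint passes):

rets <- as.numeric(Return.calculate(Cl(GSPC))["1985::"])
forced <- changepoint:::binseg.var.css(data = rets, Q = 50, pen = 0)
forced$op.cpts #simply equals Q, along with the warning advising to increase Q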

Here’s an illustrative example of a few iterations of this algorithm.

In this case, our data will be the daily returns of the GSPC from 1985 onward.

I’ll set Q (the maximum number of change points) at 30.

Let’s start this off.

#daily returns of GSPC (as obtained earlier in this post) from 1985 onward
data <- as.numeric(Return.calculate(Cl(GSPC))["1985::"])
Q <- 30 #maximum number of change points
n <- length(data)
y2 = c(0, cumsum(data^2)) #cumulative sum of squared returns, with a leading zero
tau = c(0, n) #current change point locations: just the start and end for now
cpt = matrix(0, nrow = 2, ncol = Q) #row 1: locations, row 2: lambda values
oldmax = Inf

Which gives us:

> tau
[1]    0 7328
> cpt
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] 
[1,]    0    0    0    0    0    0    0    0    0    
[2,]    0    0    0    0    0    0    0    0    0  
      [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19]   
[1,]      0     0     0     0     0     0     0     0     0     0
[2,]      0     0     0     0     0     0     0     0     0     0
     [,20] [,21] [,22] [,23] [,24] [,25] [,26] [,27] [,28] [,29] [,30]
[1,]     0     0     0     0     0     0     0     0     0     0     0
[2,]     0     0     0     0     0     0     0     0     0     0     0

In short, tau holds the locations of the change points, and the role of the cpt matrix will be demonstrated shortly.
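As a quick sanity check on the setup, since y2 is just the cumulative sum of squared returns with a leading zero, its last element should equal the total sum of squares:

all.equal(y2[n + 1], sum(data^2)) #TRUE -- y2 runs from 0 up to the total sum of squared returns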

Here’s the first iteration through the main loop.

q <- 1
    lambda = rep(0, n - 1)
    i = 1
    st = tau[1] + 1
    end = tau[2]
    for (j in 1:(n - 1)) {
      if (j == end) {
        st = end + 1
        i = i + 1
        end = tau[i + 1]
      }
      else {
        lambda[j] = sqrt((end - st + 1)/2) * 
                         ((y2[j + 1] - y2[st])/(y2[end + 1] - y2[st]) - 
                          (j - st + 1)/(end - st + 1))
      }
    }
    k = which.max(abs(lambda))
    cpt[1, q] = k
    cpt[2, q] = min(oldmax, max(abs(lambda), na.rm = T))
    oldmax = min(oldmax, max(abs(lambda), na.rm = T))
    tau = sort(c(tau, k))

And after this first iteration of the loop completes, here are the updated values of tau and cpt:

> tau
[1]    0 5850 7328
> cpt
           [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] 
[1,] 5850.00000    0    0    0    0    0    0    0    0    
[2,]   10.51362    0    0    0    0    0    0    0    0  
      [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19]   
[1,]      0     0     0     0     0     0     0     0     0     0
[2,]      0     0     0     0     0     0     0     0     0     0
     [,20] [,21] [,22] [,23] [,24] [,25] [,26] [,27] [,28] [,29] [,30]
[1,]     0     0     0     0     0     0     0     0     0     0     0
[2,]     0     0     0     0     0     0     0     0     0     0     0

In short, tau now contains the location of the new change point (which is also recorded in the first row of the cpt matrix), and the second row of cpt holds the corresponding maximum absolute value of lambda. After this, tau's new entry defines fresh segment start and end points, allowing for the binary segmentation that forms the basis of the algorithm. However, a picture's worth a thousand words, so here are some illustrations. The blue lines denote previous change points, and the red point the new change point.

Here is the code I added in the bottom of the loop for plotting purposes:

    k = which.max(abs(lambda))
    plot(lambda, type = "l", main = paste("Changepoint", q))
    abline(v = tau, col="blue")
    cpt[1, q] = k
    cpt[2, q] = min(oldmax, max(abs(lambda), na.rm = T))
    oldmax = min(oldmax, max(abs(lambda), na.rm = T))
    tau = sort(c(tau, k))
    points(x = k, y = lambda[k], col="red", pch = 19)

And here is the result for the first change point:

The second change point:

The 10th change point:

And the 19th (of 30) change point:

Notice that A) lambda is recomputed after every iteration of the main loop, since it's updated with every new change point, and B) the values of lambda generally decrease as more change point candidates are found, such that the 19th change point is already on the border of the penalty boundary formed by the BIC. Unlike the math in the paper, this particular R implementation does not actually stop once lambda[k] falls below the penalty parameter, so if someone plugged in, say, Q = 300, the algorithm would take around 30 seconds to run.
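If one wanted the R code to respect that stopping rule, a minimal tweak (my own modification, not part of the package) would be to break out of the main loop once the best remaining candidate falls below the penalty; since oldmax is non-increasing, no later candidate can do better:

#hypothetical early-stopping tweak, placed at the end of the main loop body
if (cpt[2, q] < pen[1]) break #no remaining candidate can exceed the penalty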

So, what’s the punch line to this approximate algorithm?

This:

 t1 <- Sys.time()
 binSegCss <- cpt.var(as.numeric(Return.calculate(Cl(GSPC))["1985::"]), 
                      method="BinSeg", test.stat="CSS", Q = 30)
 t2 <- Sys.time()
 print(t2 - t1)

The algorithm ran in a few seconds for me. Here is the output:

> binSegCss@cpts
 [1]  310  720  739  858 1825 2875 3192 4524 4762 5044 5850 6139 6199 6318 6548 6641 6868 6967 7328
> length(binSegCss@cpts)
[1] 19

Since the last "change point" is simply the length of the data, we'll disregard it. In short, 18 change points (so the last of the four pictures above, the 19th change point? That one and all subsequent candidates fell into the realm of noise), which lands right in the ballpark of the post from Keplerian Finance.
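As an aside, I believe the package's cpts accessor performs this truncation automatically, so the following should return the same 18 change points without the final endpoint (treat that as my reading of the documentation rather than gospel):

cpts(binSegCss) #should drop the final endpoint (7328) automatically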

So, as before, let’s create the cumulative sum of squares plot, the time series plot, and the volatility terrain map.

require(xtsExtra)
returns <- Return.calculate(Cl(GSPC))["1985::"]
dailySqReturns <- returns*returns
cumSqReturns <- cumsum(dailySqReturns)
plot(cumSqReturns)
xtsExtra::addLines(index(dailySqReturns)[binSegCss@cpts[-length(binSegCss@cpts)]], on = 1, col="blue", lwd = 2)

plot(Cl(GSPC)["1985::"])
xtsExtra::addLines(index(dailySqReturns)[binSegCss@cpts[-length(binSegCss@cpts)]], on = 1, col="blue", lwd = 2)


#label each day with its regime number; the change points mark the regime boundaries
returns$regime <- NA
for(i in 1:(length(binSegCss@cpts)-1)) {
  returns$regime[binSegCss@cpts[i]] <- i
}
returns$regime <- na.locf(returns$regime) #carry each label forward to the next change point
returns$regime[is.na(returns$regime)] <- 0 #days before the first change point
returns$regime <- returns$regime + 1

#annualized volatility within each regime -- the "volatility terrain map"
returns$annVol <- NA
for(i in 1:max(unique(returns$regime))) {
  regime <- returns[returns$regime==i,]
  annVol <- StdDev.annualized(regime[,1])
  returns$annVol[returns$regime==i,] <- annVol
}

plot(returns$annVol)

Which gives us the following three images:



This last volatility map is even closer to the one found in Keplerian Finance, thus lending credibility to the technique.
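As an aside, the regime-labeling loop above can be collapsed into a one-liner with findInterval, which counts how many change points fall at or before each observation (assuming I have the indexing right, this reproduces the loop's output exactly):

#vectorized alternative to the regime-labeling loop
returns$regime <- findInterval(seq_len(nrow(returns)),
                               binSegCss@cpts[-length(binSegCss@cpts)]) + 1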

In conclusion, it seems that the Twitter breakout detection algorithm (e-divisive with medians) is "too robust" against the type of events in financial data, and thus does not detect enough change points. On the other hand, the PELT algorithm suffers from the opposite issue: by forcing the assumption of a popular distribution from probability theory (e.g. Normal, Exponential, Poisson, Gamma), PELT outputs far too many candidate change points, making its results highly suspect. However, a special case of the binary segmentation algorithm, namely binary segmentation for variance using the cumulative sum of squares, presents interesting results. Although the algorithm is a heuristic approximation, its fast running time and lack of distributional assumptions make it useful for a "quick and dirty" analysis that can potentially cut down on the running time of a more computationally intensive change point detection algorithm.
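To sketch what that "quick and dirty" usage might look like, here's my own illustration (not something from the references), using the segment neighborhood method as the slower second pass ahead of its fuller treatment when this topic is revisited. Note that the second pass reintroduces a distributional assumption via cpt.var's default Normal test statistic:

#fast, assumption-free pass to bound the number of change points
fastPass <- cpt.var(data, method = "BinSeg", test.stat = "CSS", Q = 30)
nFound <- length(fastPass@cpts) - 1 #drop the final endpoint

#slower, exact segment neighborhood pass, bounded by the fast estimate
#(for SegNeigh, Q is the maximum number of *segments*, hence the + 1)
slowPass <- cpt.var(data, method = "SegNeigh", Q = nFound + 1)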

When this topic is revisited, the segment neighborhood method will be examined.

Thanks for reading.

NOTE: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

Comparing Flexible and Elastic Asset Allocation

So recently, I tried to combine Flexible and Elastic Asset Allocation. The operative word being: tried. Essentially, I saw Flexible Asset Allocation as an incomplete algorithm; namely, although it was an excellent method for selecting securities, there had to be a better way to weight assets than a naive equal-weight scheme.

It turns out, the methods outlined in Elastic Asset Allocation weren't doing the trick (that is, a four-month cumulative return raised to the return weight, multiplied by one minus the correlation to a daily-rebalanced equal-weight index of the selected securities, applied to securities with cumulative return greater than zero). Either I managed a marginally higher return at the cost of much higher volatility and a protracted drawdown, or I maintained my Sharpe ratio at the cost of much lower returns. Thus, I scrapped all of it, which was a shame, as I was hoping to combine the two methodologies into one system that would extend the research I previously blogged on. Instead, I decided to have a look at why I was running into the issues I was.
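For reference, here's a minimal sketch of the kind of generalized momentum score I was working from. The exponents, the use of one minus the correlation, and the monthly (rather than daily-rebalanced) equal-weight index are my reading of the EAA paper, not a quote of any EAA function:

#illustrative EAA-style generalized momentum score, under the assumptions above
eaaScore <- function(monthlyRets, wR = 1, wC = 1, wV = 0) {
  ewIndex <- rowMeans(monthlyRets, na.rm = TRUE) #equal-weight index returns
  r <- apply(tail(monthlyRets, 4) + 1, 2, prod) - 1 #roughly four-month cumulative return
  c <- apply(monthlyRets, 2, function(x) cor(x, ewIndex, use = "complete.obs"))
  v <- apply(monthlyRets, 2, sd, na.rm = TRUE) #volatility
  score <- (r^wR) * ((1 - c)^wC) / (v^wV)
  score[r <= 0] <- 0 #only securities with positive cumulative return are eligible
  score
}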

In any case, here’s the quick demo I did.

require(quantmod)
require(PerformanceAnalytics)
require(IKTrading)

symbols <- c("VTSMX", "FDIVX", "VEIEX", "VBMFX", "VFISX", "VGSIX", "QRAAX")

getSymbols(symbols, from="1990-01-01")
prices <- list()
for(i in 1:length(symbols)) {
  prices[[i]] <- Ad(get(symbols[i]))  
}
prices <- do.call(cbind, prices)
colnames(prices) <- gsub("\\.[A-z]*", "", colnames(prices))
ep <- endpoints(prices, "months")
adPrices <- prices
prices <- prices[ep,]
prices <- prices["1997-03::"]
adPrices <- adPrices["1997-04::"]

eaaOffensive <- EAA(monthlyPrices = prices, returnWeights = TRUE, cashAsset = "VBMFX", bestN = 3)
eaaOffNoCrash <- EAA(monthlyPrices = prices, returnWeights = TRUE, cashAsset ="VBMFX", 
                     bestN = 3, enableCrashProtection = FALSE)
faa <- FAA(prices = adPrices, riskFreeName = "VBMFX", bestN = 3, returnWeights = TRUE, stepCorRank = TRUE)
faaNoStepwise <- FAA(prices = adPrices, riskFreeName = "VBMFX", bestN = 3, returnWeights = TRUE, stepCorRank = FALSE)

eaaOffDaily <- Return.portfolio(R = Return.calculate(adPrices), weights = eaaOffensive[[1]])
eaaOffNoCrash <- Return.portfolio(R = Return.calculate(adPrices), weights = eaaOffNoCrash[[1]])
charts.PerformanceSummary(cbind(faa[[2]], eaaOffDaily))

comparison <- cbind(eaaOffDaily, eaaOffNoCrash, faa[[2]], faaNoStepwise[[2]])
colnames(comparison) <- c("Offensive EAA", "Offensive EAA (no crash protection)", "FAA (stepwise)", "FAA (no stepwise)")
charts.PerformanceSummary(comparison)

rbind(table.AnnualizedReturns(comparison), maxDrawdown(comparison))

Essentially, I compared FAA both with and without the stepwise correlation rank algorithm, and offensive EAA both with and without crash protection. The results were disappointing.

Here are the equity curves:

In short, the best default FAA variant handily outperforms any of the EAA variants.

And here are the statistics:

                          Offensive EAA Offensive EAA (no crash protection) FAA (stepwise) FAA (no stepwise)
Annualized Return             0.1247000                           0.1305000      0.1380000          0.131400
Annualized Std Dev            0.1225000                           0.1446000      0.0967000          0.089500
Annualized Sharpe (Rf=0%)     1.0187000                           0.9021000      1.4271000          1.467900
Worst Drawdown                0.1581859                           0.2696754      0.1376124          0.130865

Note of warning: if you run EAA, it seems unwise to do so without crash protection (that is, decreasing your stake in everything but the cash/risk-free asset in proportion to the number of negative-return securities). I didn't include the defensive variant of EAA, since it gives markedly lower returns.
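As a hypothetical illustration of what that scaling does: with 3 of 7 ranked securities showing negative returns, the risky sleeve gets cut to 4/7 of the portfolio and the remainder goes to cash.

#hypothetical numbers, purely to illustrate the crash protection arithmetic
numNeg <- 3; N <- 7
riskyScale <- 1 - numNeg/N #4/7 of the portfolio stays in the ranked assets
cashWeight <- numNeg/N     #3/7 is shifted into the cash asset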

Not that this should discredit EAA, but on the whole, I feel that there should be a way to beat the (usually) equal-weight scheme that FAA employs (sometimes the cash asset gets a larger allocation, when a negative-momentum asset makes it into the top assets by virtue of the rank of its volatility and correlation and ends up getting zeroed out), and that treating FAA as an asset selection mechanism, as opposed to a weighting mechanism, may yield some value. However, I have not yet found that way myself.
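As one hypothetical direction (a sketch of the idea, not something I've validated): keep FAA's selections but swap the equal-weight scheme for inverse trailing volatility. This assumes faa[[1]] holds the weight history, with nonzero entries marking the selected assets, and uses a 63-day volatility window purely as a placeholder.

#sketch: FAA as the selection mechanism, inverse volatility as the weighting mechanism
faaW <- faa[[1]]
dailyRets <- Return.calculate(adPrices)
invVolW <- faaW * 0
for(d in 1:nrow(faaW)) {
  sel <- which(as.numeric(faaW[d, ]) > 0) #assets FAA selected at this rebalance
  if(length(sel) > 0) {
    trailing <- dailyRets[paste0("::", index(faaW)[d]), sel]
    vols <- apply(tail(trailing, 63), 2, sd, na.rm = TRUE) #trailing 63-day volatility
    invVolW[d, sel] <- (1/vols)/sum(1/vols) #weight inversely to volatility
  }
}
faaInvVol <- Return.portfolio(R = dailyRets, weights = invVolW)
charts.PerformanceSummary(cbind(faaInvVol, faa[[2]]))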

Thanks for reading.

NOTE: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.