A Closer Update To David Varadi’s Percentile Channels Strategy

So, thanks to seeing Michael Kapler's implementation of David Varadi's percentile channels strategy, I was able to get a better understanding of what was going on. It turns out that rather than looking at the channel value only at the ends of months, the strategy actually keeps track of the channel's value intra-month. So if in the middle of the month you had a sell signal, and at the end of the month the price moved back up to intra-channel values, you would still be on that sell signal rather than on the previous month's end-of-month signal. When all is said and done, it's not much different from my previous implementation (slightly higher Sharpe, slightly lower returns and drawdowns). In any case, the concept remains the same.

For this implementation, I'm going to use the runquantile function from the caTools package, which works like a generalized runMedian/runMin/runMax from TTR once you give it the proper arguments (its default settings produce questionable results).

Here’s the code:

require(quantmod)
require(caTools)
require(PerformanceAnalytics)
require(TTR)
getSymbols(c("LQD", "DBC", "VTI", "ICF", "SHY"), from="1990-01-01")

prices <- cbind(Ad(LQD), Ad(DBC), Ad(VTI), Ad(ICF), Ad(SHY))
prices <- prices[!is.na(prices[,2]),]
returns <- Return.calculate(prices)
cashPrices <- prices[, 5]
assetPrices <- prices[, -5]

require(caTools)
pctChannelPosition <- function(prices,
                               dayLookback = 60, 
                               lowerPct = .25, upperPct = .75) {
  leadingNAs <- matrix(nrow=dayLookback-1, ncol=ncol(prices), NA)
  
  upperChannels <- runquantile(prices, k=dayLookback, probs=upperPct, endrule="trim")
  upperQ <- xts(rbind(leadingNAs, upperChannels), order.by=index(prices))
  
  lowerChannels <- runquantile(prices, k=dayLookback, probs=lowerPct, endrule="trim")
  lowerQ <- xts(rbind(leadingNAs, lowerChannels), order.by=index(prices))
  
  positions <- xts(matrix(nrow=nrow(prices), ncol=ncol(prices), NA), order.by=index(prices))
  positions[prices > upperQ & lag(prices) < upperQ] <- 1 #cross up
  positions[prices < lowerQ & lag(prices) > lowerQ] <- -1 #cross down
  positions <- na.locf(positions)
  positions[is.na(positions)] <- 0
  
  colnames(positions) <- colnames(prices)
  return(positions)
}

#find our positions, add them up
d60 <- pctChannelPosition(assetPrices)
d120 <- pctChannelPosition(assetPrices, dayLookback = 120)
d180 <- pctChannelPosition(assetPrices, dayLookback = 180)
d252 <- pctChannelPosition(assetPrices, dayLookback = 252)
compositePosition <- (d60 + d120 + d180 + d252)/4

compositeMonths <- compositePosition[endpoints(compositePosition, on="months"),]

returns <- Return.calculate(prices)
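#risk parity sleeve: weight the month-end composite signals by inverse 20-day volatility,
#normalize, zero out negative composite positions, and sweep the remainder into cash (SHY)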
monthlySD20 <- xts(sapply(returns[,-5], runSD, n=20), order.by=index(prices))[index(compositeMonths),]
weight <- compositeMonths*1/monthlySD20
weight <- abs(weight)/rowSums(abs(weight))
weight[compositeMonths < 0 | is.na(weight)] <- 0
weight$CASH <- 1-rowSums(weight)

#not actually equal weight--more like composite weight, going with 
#Michael Kapler's terminology here
ewWeight <- abs(compositeMonths)/rowSums(abs(compositeMonths))
ewWeight[compositeMonths < 0 | is.na(ewWeight)] <- 0
ewWeight$CASH <- 1-rowSums(ewWeight)

rpRets <- Return.portfolio(R = returns, weights = weight)
ewRets <- Return.portfolio(R = returns, weights = ewWeight)

Essentially, with runquantile, you need to give it the "trim" argument, then manually append the leading NAs, and then manually turn the result back into an xts object, which is annoying. One would think the author of this package would take care of these quality-of-life issues, but no. In any case, there are two strategies at play here: one is the percentile channel risk parity strategy, and the other is what Michael Kapler calls "channel equal weight", which actually *isn't* an equal-weight strategy, since the composite position can take the values -1, -.5, 0, .5, or 1 (with .75 or .25 possible early on, when some of the lookback channels still read 0 instead of only 1 or -1); it's simply those weights without taking volatility into account at all. I'm sticking with Michael Kapler's terminology to be consistent. That said, I don't personally use his SIT package, due to the vast differences in syntax between it and the usual R code I'm used to. However, your mileage may vary.
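To illustrate the runquantile quirk mentioned above in isolation, here's a toy sketch on synthetic data (independent of the strategy, purely for demonstration):

require(caTools)
require(xts)

#two random-walk "price" series, 60 observations each
set.seed(123)
toyPrices <- xts(apply(matrix(rnorm(120), ncol=2), 2, cumsum),
                 order.by=Sys.Date()-60:1)

#endrule="trim" keeps only the complete windows, so the output has nrow - k + 1 rows...
trimmed <- runquantile(toyPrices, k=20, probs=.75, endrule="trim")

#...and the k-1 leading NAs plus the xts wrapper have to be re-attached by hand,
#exactly as in the pctChannelPosition function above
upperQ <- xts(rbind(matrix(NA, nrow=19, ncol=2), trimmed), order.by=index(toyPrices))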

In any case, here’s the updated performance:

both <- cbind(rpRets, ewRets)
colnames(both) <- c("RiskParity", "Equal Weight")
charts.PerformanceSummary(both)
rbind(table.AnnualizedReturns(both), maxDrawdown(both))
apply.yearly(both, Return.cumulative)

Which gives us the following output:

> rbind(table.AnnualizedReturns(both), maxDrawdown(both))
                          RiskParity Equal Weight
Annualized Return         0.09380000    0.1021000
Annualized Std Dev        0.06320000    0.0851000
Annualized Sharpe (Rf=0%) 1.48430000    1.1989000
Worst Drawdown            0.06894391    0.1150246

> apply.yearly(both, Return.cumulative)
           RiskParity Equal Weight
2006-12-29 0.08352255   0.07678321
2007-12-31 0.05412147   0.06475540
2008-12-31 0.10663085   0.12212063
2009-12-31 0.11920721   0.19093131
2010-12-31 0.13756460   0.14594317
2011-12-30 0.11744706   0.08707801
2012-12-31 0.07730896   0.06085295
2013-12-31 0.06733187   0.08174173
2014-12-31 0.06435030   0.07357458
2015-02-17 0.01428705   0.01568372

In short, the more naive weighting scheme delivers slightly higher returns but pays dearly for those marginal returns with downside risk.

Here are the equity curves:

So, there you have it. The results David Varadi obtained are legitimate. But nevertheless, I hope this demonstrates how easy it is for the small details to make material differences.

Thanks for reading.

NOTE: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.


An Attempt At Replicating David Varadi’s Percentile Channels Strategy

This post will detail an attempt at replicating David Varadi's percentile channels strategy. As I'm only able to obtain data back to mid 2006, the exact statistics will not be identical. However, over the period for which I do have data, the performance is similar (but not identical) to the corresponding performance presented by David Varadi.

First off, before beginning this post, I'd like to issue a small mea culpa regarding the last post. It turns out that Yahoo's data, once it gets into single-digit dollar prices, is of questionable accuracy, and thus results from the late 90s on mutual funds with prices in those ranges are questionable as well. As I am an independent blogger, and make it a policy that readers be able to replicate all of my analysis, I am constrained to free data sources, and sometimes the questionable quality of that data may materially affect results. So, if it's one of your strategies replicated on this blog and you find contention with my results, I would be more than happy to work with the data used to generate the original results, corroborate them, and confirm that any differences stem solely from the use of lower-quality, publicly available data. Generally, I find it surprising that a company as large as Yahoo can have such gaping data quality issues in certain areas, but I'm happy that I was able to replicate the general thrust of QTS very closely.

This replication of David Varadi's strategy, however, is not one such case–mainly because the data for DBC does not extend back very far (its inception was only in 2006, and the data used by David Varadi's programmer was obtained from Bloomberg, to which I have no access), and furthermore, I'm not certain that my methods are absolutely identical. Nevertheless, the strategy in and of itself is solid.

The way the strategy works is as follows (per my interpretation of David Varadi's post and communication with his other programmer). Given four securities (LQD, DBC, VTI, ICF) and a cash security (SHY), do the following:

Find the running n-day quantile at an upper and a lower percentile. Anything above the upper percentile gets a score of 1, anything below the lower percentile gets a score of -1. Leave the rest (that is, anything between the bounds) as NA.

Subset these quantities on their monthly endpoints. Any value between channels (NA) takes on the last non-NA value (in short, na.locf). Any initial NAs become zero.

Do this with a 60-day, 120-day, 180-day, and 252-day setting at 25th and 75th percentiles. Add these four tables up (their dimensions are the number of monthly endpoints by the number of securities) and divide by the number of parameter settings (in this case, 4 for 60, 120, 180, 252) to obtain a composite position.

Next, obtain a running 20-day standard deviation of the returns (not prices!), and subset it for the same indices as the composite positions. Take the inverse of these volatility scores, and multiply it by the composite positions to get an inverse volatility position. Take its absolute value (some positions may be negative, remember), and normalize. In the beginning, there may be some zero-across-all-assets positions, or other NAs due to lack of data (EG if a monthly endpoint occurs before enough data to compute a 20-day standard deviation, there will be a row of NAs), which will be dealt with. Keep all positions with a positive composite position (that is, scores of .5 or 1, discard all scores of zero or lower), and reinvest the remainder into the cash asset (SHY, in our case). Those are the final positions used to generate the returns.

This is how it looks in code.

This is the code for obtaining the data (from Yahoo finance) and separating it into cash and non-cash data.

require(quantmod)
require(caTools)
require(PerformanceAnalytics)
require(TTR)
getSymbols(c("LQD", "DBC", "VTI", "ICF", "SHY"), from="1990-01-01")

prices <- cbind(Ad(LQD), Ad(DBC), Ad(VTI), Ad(ICF), Ad(SHY))
prices <- prices[!is.na(prices[,2]),]
returns <- Return.calculate(prices)
cashPrices <- prices[, 5]
assetPrices <- prices[, -5]

This is the function for computing the percentile channel positions for a given parameter setting. Unfortunately, it is not instantaneous due to R’s rollapply function paying a price in speed for generality. While the package caTools has a runquantile function, as of the time of this writing, I have found differences between its output and runMedian in TTR, so I’ll have to get in touch with the package’s author.

pctChannelPosition <- function(prices, rebal_on=c("months", "quarters"),
                             dayLookback = 60, 
                             lowerPct = .25, upperPct = .75) {
  
  upperQ <- rollapply(prices, width=dayLookback, quantile, probs=upperPct)
  lowerQ <- rollapply(prices, width=dayLookback, quantile, probs=lowerPct)
  positions <- xts(matrix(nrow=nrow(prices), ncol=ncol(prices), NA), order.by=index(prices))
  positions[prices > upperQ] <- 1
  positions[prices < lowerQ] <- -1
  
  ep <- endpoints(positions, on = rebal_on[1])
  positions <- positions[ep,]
  positions <- na.locf(positions)
  positions[is.na(positions)] <- 0 
  
  colnames(positions) <- colnames(prices)
  return(positions)
}

The way this function works is simple: it computes a running quantile using rollapply, and then scores anything with a price above its 75th percentile as 1, and anything below its 25th percentile as -1, in accordance with David Varadi's post.

It then subsets these quantities on months (quarters is also possible–or for that matter, other values, but the spirit of the strategy seems to be months or quarters), and imputes any NAs with the last known observation, or zero, if it is an initial NA before any position is found. Something I have found over the course of writing this and the QTS strategy is that one need not bother implementing a looping mechanism to allocate positions monthly if there isn’t a correlation matrix based on daily data involved every month, and it makes the code more readable.

Next, we find our composite position.

#find our positions, add them up
d60 <- pctChannelPosition(assetPrices)
d120 <- pctChannelPosition(assetPrices, dayLookback = 120)
d180 <- pctChannelPosition(assetPrices, dayLookback = 180)
d252 <- pctChannelPosition(assetPrices, dayLookback = 252)
compositePosition <- (d60 + d120 + d180 + d252)/4

Next, find the running volatility for the assets, and subset them to the same time period (in this case months) as our composite position. In David Varadi’s example, the parameter is a 20-day lookback.

#find 20-day rolling standard deviations, subset them on identical indices
#to the percentile channel monthly positions
sd20 <- xts(sapply(returns[,-5], runSD, n=20), order.by=index(assetPrices))
monthlySDs <- sd20[index(compositePosition)]

Next, perform the following steps: find the inverse volatility of these quantities, multiply by the composite position score, take the absolute value, and keep any position for which the composite position is greater than zero (or technically speaking, has positive signage). Due to some initial NA rows due to a lack of data (either not enough days to compute a running volatility, or no positive positions yet), those will simply be imputed to zero. Reinvest the remainder in cash.

#compute inverse volatilities
inverseVols <- 1/monthlySDs

#multiply inverse volatilities by composite positions
invVolPos <- inverseVols*compositePosition

#take absolute values of inverse volatility multiplied by position
absInvVolPos <- abs(invVolPos)

#normalize the above quantities
normalizedAbsInvVols <- absInvVolPos/rowSums(absInvVolPos)

#keep only positions with positive composite positions (remove zeroes/negative)
nonCashPos <- normalizedAbsInvVols * sign(compositePosition > 0)
nonCashPos[is.na(nonCashPos)] <- 0 #no positions before we have enough data

#add cash position which is complement of non-cash position
finalPos <- nonCashPos
finalPos$cashPos <- 1-rowSums(finalPos)

And finally, the punchline, how does this strategy perform?

#compute returns
stratRets <- Return.portfolio(R = returns, weights = finalPos)
charts.PerformanceSummary(stratRets)
stats <- rbind(table.AnnualizedReturns(stratRets), maxDrawdown(stratRets))
rownames(stats)[4] <- "Worst Drawdown"
stats

Like this:

> stats
                          portfolio.returns
Annualized Return                0.10070000
Annualized Std Dev               0.06880000
Annualized Sharpe (Rf=0%)        1.46530000
Worst Drawdown                   0.07449537

With the following equity curve:

The statistics are visibly worse than David Varadi's: a 10% CAGR vs. his 11.1%, a 6.9% annualized standard deviation vs. his 5.72%, a 7.45% max drawdown vs. his 5.5%, and correspondingly worse derived statistics (e.g. MAR). However, my data starts far later, and 1995-1996 seemed to be phenomenal for this strategy. Here are the cumulative returns by year for the data I have:

> apply.yearly(stratRets, Return.cumulative)
           portfolio.returns
2006-12-29        0.11155069
2007-12-31        0.07574266
2008-12-31        0.16921233
2009-12-31        0.14600008
2010-12-31        0.12996371
2011-12-30        0.06092018
2012-12-31        0.07306617
2013-12-31        0.06303612
2014-12-31        0.05967415
2015-02-13        0.01715446

I see a major discrepancy between my returns and David's in 2011, but beyond that, the pattern of yearly returns seems fairly close. Whether my methodology is incorrect (I think I followed the procedure to the best of my understanding, but of course, if someone sees a mistake in my code, please let me know), or whether it's the result of using Yahoo's questionable-quality data, I am uncertain.

However, in my opinion, that doesn’t take away from the validity of the strategy as a whole. With a mid-1 Sharpe ratio on a monthly rebalancing scale, and steady new equity highs, I feel that this is a result worth sharing–even if not directly corroborated (yet, hopefully).

One last note–some of the readers on David Varadi’s blog have cried foul due to their inability to come close to his results. Since I’ve come close, I feel that the results are valid, and since I’m using different data, my results are not identical. However, if anyone has questions about my process, feel free to leave questions and/or comments.

Thanks for reading.

NOTE: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

Comparing Flexible and Elastic Asset Allocation

So recently, I tried to combine Flexible and Elastic Asset Allocation. The operative word being–tried. Essentially, I saw Flexible Asset Allocation as an incomplete algorithm: namely, that although it was an excellent method for selecting securities, there had to be a better way to weight the selections than a naive equal-weight scheme.

It turns out the methods outlined in Elastic Asset Allocation weren't doing the trick (that is, a four-month cumulative return raised to the return weight, multiplied by the correlation to a daily-rebalanced equal-weight index of the selected securities with cumulative return greater than zero). Either I managed a marginally higher return at the cost of much higher volatility and a protracted drawdown, or I maintained my Sharpe ratio at the cost of much lower returns. Thus, I scrapped all of it, which was a shame, as I was hoping to combine the two methodologies into one system that would extend the research I previously blogged about. Instead, after scrapping it, I decided to have a look at why I was running into the issues I was.

In any case, here’s the quick demo I did.

require(quantmod)
require(PerformanceAnalytics)
require(IKTrading)

symbols <- c("VTSMX", "FDIVX", "VEIEX", "VBMFX", "VFISX", "VGSIX", "QRAAX")

getSymbols(symbols, from="1990-01-01")
prices <- list()
for(i in 1:length(symbols)) {
  prices[[i]] <- Ad(get(symbols[i]))  
}
prices <- do.call(cbind, prices)
colnames(prices) <- gsub("\\.[A-z]*", "", colnames(prices))
ep <- endpoints(prices, "months")
adPrices <- prices
prices <- prices[ep,]
prices <- prices["1997-03::"]
adPrices <- adPrices["1997-04::"]

eaaOffensive <- EAA(monthlyPrices = prices, returnWeights = TRUE, cashAsset = "VBMFX", bestN = 3)
eaaOffNoCrash <- EAA(monthlyPrices = prices, returnWeights = TRUE, cashAsset ="VBMFX", 
                     bestN = 3, enableCrashProtection = FALSE)
faa <- FAA(prices = adPrices, riskFreeName = "VBMFX", bestN = 3, returnWeights = TRUE, stepCorRank = TRUE)
faaNoStepwise <- FAA(prices = adPrices, riskFreeName = "VBMFX", bestN = 3, returnWeights = TRUE, stepCorRank = FALSE)

eaaOffDaily <- Return.portfolio(R = Return.calculate(adPrices), weights = eaaOffensive[[1]])
eaaOffNoCrash <- Return.portfolio(R = Return.calculate(adPrices), weights = eaaOffNoCrash[[1]])
charts.PerformanceSummary(cbind(faa[[2]], eaaOffDaily))

comparison <- cbind(eaaOffDaily, eaaOffNoCrash, faa[[2]], faaNoStepwise[[2]])
colnames(comparison) <- c("Offensive EAA", "Offensive EAA (no crash protection)", "FAA (stepwise)", "FAA (no stepwise)")
charts.PerformanceSummary(comparison)

rbind(table.AnnualizedReturns(comparison), maxDrawdown(comparison))

Essentially, I compared FAA with the stepwise correlation rank algorithm, without it, and the offensive EAA with and without crash protection. The results were disappointing.

Here are the equity curves:

In short, the best default FAA variant handily outperforms any of the EAA variants.

And here are the statistics:

                          Offensive EAA Offensive EAA (no crash protection) FAA (stepwise) FAA (no stepwise)
Annualized Return             0.1247000                           0.1305000      0.1380000          0.131400
Annualized Std Dev            0.1225000                           0.1446000      0.0967000          0.089500
Annualized Sharpe (Rf=0%)     1.0187000                           0.9021000      1.4271000          1.467900
Worst Drawdown                0.1581859                           0.2696754      0.1376124          0.130865

Note of warning: if you run EAA, it seems unwise to do so without crash protection (that is, decreasing your stake in everything but the cash/risk-free asset in proportion to the number of negative-return securities). I didn't include the defensive variant of EAA, since that gives markedly lower returns.
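To make that parenthetical concrete, here's a loose sketch of the crash protection idea as I read it. This is my own illustration with hypothetical numbers, not the internals of the EAA function:

#scale every non-cash weight by the fraction of securities with non-negative returns,
#and park the freed-up allocation in the cash asset
crashProtect <- function(weights, cumReturns, cashName) {
  cpFraction <- sum(cumReturns < 0)/length(cumReturns)
  nonCash <- setdiff(names(weights), cashName)
  weights[nonCash] <- weights[nonCash] * (1 - cpFraction)
  weights[cashName] <- 1 - sum(weights[nonCash])
  return(weights)
}

#hypothetical example: 3 of 6 securities in the universe had negative returns,
#so the risky weights get cut in half and the difference goes to the cash fund
w <- c(VTSMX=.4, FDIVX=.3, VGSIX=.3, VBMFX=0)
crashProtect(w, cumReturns=c(.05, -.02, -.01, -.03, .01, .02), cashName="VBMFX")
# VTSMX FDIVX VGSIX VBMFX
#  0.20  0.15  0.15  0.50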

Not that this should discredit EAA, but on the whole, I feel there should be a way to beat the weighting scheme FAA employs (usually equal weight, though sometimes the cash asset ends up with a larger allocation when a negative-momentum asset makes it into the top picks by virtue of its volatility and correlation ranks and then gets zeroed out), and that treating FAA as an asset selection mechanism, as opposed to a weighting mechanism, may yield some value. However, I have not yet found it myself.

Thanks for reading.

NOTE: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

An Update on Flexible Asset Allocation

A few weeks back, after seeing my replication, one of the original authors of the Flexible Asset Allocation paper got in touch to suggest a slight adjustment to the code: rather than removing negative-momentum securities before performing any ranking, perform all of the ranking without taking absolute momentum into account, and only remove negative absolute momentum securities at the very end, after allocating weights.

Here’s the new code:

FAA <- function(prices, monthLookback = 4,
                weightMom = 1, weightVol = .5, weightCor = .5, 
                riskFreeName = NULL, bestN = 3,
                stepCorRank = FALSE, stepStartMethod = c("best", "default"),
                geometric = TRUE, ...) {
  stepStartMethod <- stepStartMethod[1]
  if(is.null(riskFreeName)) {
    prices$zeroes <- 0
    riskFreeName <- "zeroes"
    warning("No risk-free security specified. Recommended to use one of: quandClean('CHRIS/CME_US'), SHY, or VFISX. 
            Using vector of zeroes instead.")
  }
  returns <- Return.calculate(prices)
  monthlyEps <- endpoints(prices, on = "months")
  riskFreeCol <- grep(riskFreeName, colnames(prices))
  tmp <- list()
  dates <- list()
  
  for(i in 2:(length(monthlyEps) - monthLookback)) {
    #subset data
    priceData <- prices[monthlyEps[i]:monthlyEps[i+monthLookback],]
    returnsData <- returns[monthlyEps[i]:monthlyEps[i+monthLookback],]
    
    #perform computations
    momentum <- data.frame(t(t(priceData[nrow(priceData),])/t(priceData[1,]) - 1))
    momentum <- momentum[,!is.na(momentum)]
    #momentum[is.na(momentum)] <- -1 #set any NA momentum to negative 1 to keep R from crashing
    priceData <- priceData[,names(momentum)]
    returnsData <- returnsData[,names(momentum)]
    
    momRank <- rank(momentum)
    vols <- data.frame(StdDev(returnsData))
    volRank <- rank(-vols)
    cors <- cor(returnsData, use = "complete.obs")
    if (stepCorRank) {
      if(stepStartMethod=="best") {
        compositeMomVolRanks <- weightMom*momRank + weightVol*volRank
        maxRank <- compositeMomVolRanks[compositeMomVolRanks==max(compositeMomVolRanks)]
        corRank <- stepwiseCorRank(corMatrix=cors, startNames = names(maxRank), 
                                        bestHighestRank = TRUE, ...)
        
      } else {
        corRank <- stepwiseCorRank(corMatrix=cors, bestHighestRank=TRUE, ...)
      }
    } else {
      corRank <- rank(-rowSums(cors))
    }
    
    totalRank <- rank(weightMom*momRank + weightVol*volRank + weightCor*corRank)
    
    upper <- length(names(returnsData))
    lower <- max(upper-bestN+1, 1)
    topNvals <- sort(totalRank, partial=seq(from=upper, to=lower))[c(upper:lower)]
    
    #compute weights
    longs <- totalRank %in% topNvals #invest in ranks length - bestN or higher (in R, rank 1 is lowest)
    longs[momentum < 0] <- 0 #in previous algorithm, removed momentums < 0, this time, we zero them out at the end.
    longs <- longs/sum(longs) #equal weight all candidates
    longs[longs > 1/bestN] <- 1/bestN #in the event that we have fewer than top N invested into, lower weights to 1/top N
    names(longs) <- names(totalRank)
    
    
    #append removed names (those with momentum < 0)
    removedZeroes <- rep(0, ncol(returns)-length(longs))
    names(removedZeroes) <- names(returns)[!names(returns) %in% names(longs)]
    longs <- c(longs, removedZeroes)
    
    #reorder to be in the same column order as original returns/prices
    longs <- data.frame(t(longs))
    longs <- longs[, names(returns)]
    
    #append lists
    tmp[[i]] <- longs
    dates[[i]] <- index(returnsData)[nrow(returnsData)]
  }
  
  weights <- do.call(rbind, tmp)
  dates <- do.call(c, dates)
  weights <- xts(weights, order.by=as.Date(dates)) 
  weights[, riskFreeCol] <- weights[, riskFreeCol] + 1-rowSums(weights)
  strategyReturns <- Return.rebalancing(R = returns, weights = weights, geometric = geometric)
  colnames(strategyReturns) <- paste(monthLookback, weightMom, weightVol, weightCor, sep="_")
  return(strategyReturns)
}

And here are the new results, both with the original configuration, and using the stepwise correlation ranking algorithm introduced by David Varadi:

mutualFunds <- c("VTSMX", #Vanguard Total Stock Market Index
                 "FDIVX", #Fidelity Diversified International Fund
                 "VEIEX", #Vanguard Emerging Markets Stock Index Fund
                 "VFISX", #Vanguard Short-Term Treasury Fund
                 "VBMFX", #Vanguard Total Bond Market Index Fund
                 "QRAAX", #Oppenheimer Commodity Strategy Total Return 
                 "VGSIX" #Vanguard REIT Index Fund
)

#mid 1997 to end of October 2014
getSymbols(mutualFunds, from="1997-06-30", to="2014-10-30")
tmp <- list()
for(fund in mutualFunds) {
  tmp[[fund]] <- Ad(get(fund))
}

#always use a list when intending to cbind/rbind large quantities of objects
adPrices <- do.call(cbind, args = tmp)
colnames(adPrices) <- gsub(".Adjusted", "", colnames(adPrices))

original <- FAA(adPrices, riskFreeName="VFISX")
swc <- FAA(adPrices, riskFreeName="VFISX", stepCorRank = TRUE)
originalOld <- FAAreturns(adPrices, riskFreeName="VFISX")
swcOld <- FAAreturns(adPrices, riskFreeName="VFISX", stepCorRank=TRUE)
all4 <- cbind(original, swc, originalOld, swcOld)
names(all4) <- c("original", "swc", "origOld", "swcOld")
charts.PerformanceSummary(all4)
> rbind(Return.annualized(all4)*100,
+       maxDrawdown(all4)*100,
+       SharpeRatio.annualized(all4))
                                 original       swc   origOld    swcOld
Annualized Return               12.795205 14.135997 13.221775 14.037137
Worst Drawdown                  11.361801 11.361801 13.082294 13.082294
Annualized Sharpe Ratio (Rf=0%)  1.455302  1.472924  1.377914  1.390025

And the resulting equity curve comparison:

Overall, it seems that filtering on absolute momentum after applying all the weightings (ranking only on relative momentum) improves the downside risk profile ever so slightly compared to removing negative-momentum securities ahead of time. In any case, FAAreturns will be the function that removes negative-momentum securities ahead of time, and FAA will be the one that removes them after all else is said and done.

I’ll return to the standard volatility trading agenda soon.

Thanks for reading.

Note: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

Predicting High Yield with SPY–a Two Part Post

This post will cover ideas from two individuals: David Varadi of CSS Analytics, with whom I am currently collaborating on some volatility trading strategies (the extent of which I hope will end up as a workable trading strategy–my current replicas of some of VolatilityMadeSimple's publicly displayed "example" strategies (note: from other blogs, not to be confused with their proprietary strategy) are something I think is too risky to be traded as-is), and Cesar Alvarez, of Alvarez Quant Trading. If his name sounds familiar to some of you, that's because it should. He used to collaborate (still does?) with Larry Connors of TradingMarkets.com, and I'm pretty sure that sometime in the future, I'll cover those strategies as well.

The strategy for this post is simple, and taken from this post from CSS Analytics.

Pretty straightforward–compute a 20-day SMA on the SPY (I use unadjusted since that’s what the data would have actually been). When the SPY’s close crosses above the 20-day SMA, buy the high-yield bond index, either VWEHX or HYG, and when the converse happens, move to the cash-substitute security, either VFISX or SHY.

Now, while the above paragraph may make it seem that VWEHX and HYG are perfect substitutes, they aren't, as no two instruments are exactly alike; as my last post noted, that's a detail one should be mindful of. Even creating a synthetic "equivalent" is never exactly perfect. Even though I try my best to iron out such issues, over the course of generally illustrating an idea, the numbers won't line up exactly (though hopefully, they'll be close). In any case, it's best to leave an asterisk whenever one is forced to use synthetics for the sake of a prolonged backtest.

The other elephant/gorilla in the room (depending on your preference for metaphorical animals), is whether or not to use adjusted data. The upside to that is that dividends are taken into account. The *downside* to that is that the data isn’t the real data, and also assumes a continuous reinvestment of dividends. Unfortunately, shares of a security are not continuous quantities–they are discrete quantities made so by their unit price, so the implicit assumptions in adjusted prices can be optimistic.

For this particular topic, Cesar Alvarez covered it exceptionally well on his blog post, and I highly recommend readers give that post a read, in addition to following his blog in general. However, just to illustrate the effect, let’s jump into the script.


require(quantmod)
require(TTR)
require(PerformanceAnalytics)

getSymbols("VWEHX", from="1950-01-01")
getSymbols("SPY", from="1900-01-01")
getSymbols("HYG", from="1990-01-01")
getSymbols("SHY", from="1990-01-01")
getSymbols("VFISX", from="1990-01-01")

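#signal: SPY close above its 20-day SMA, lagged one day so that today's return uses yesterday's signal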
spySma20Cl <- SMA(Cl(SPY), n=20)
clSig <- Cl(SPY) > spySma20Cl
clSig <- lag(clSig, 1)

vwehxCloseRets <- Return.calculate(Cl(VWEHX))
vfisxCloseRets <- Return.calculate(Cl(VFISX))
vwehxAdjustRets <- Return.calculate(Ad(VWEHX))
vfisxAdjustRets <- Return.calculate(Ad(VFISX))

hygCloseRets <- Return.calculate(Cl(HYG))
shyCloseRets <- Return.calculate(Cl(SHY))
hygAdjustRets <- Return.calculate(Ad(HYG))
shyAdjustRets <- Return.calculate(Ad(SHY))

mutualAdRets <- vwehxAdjustRets*clSig + vfisxAdjustRets*(1-clSig)
mutualClRets <- vwehxCloseRets*clSig + vfisxCloseRets*(1-clSig)

etfAdRets <- hygAdjustRets*clSig + shyAdjustRets*(1-clSig)
etfClRets <- hygCloseRets*clSig + shyCloseRets*(1-clSig)

Here are the results:


mutualFundBacktest <- merge(mutualAdRets, mutualClRets, join='inner')
charts.PerformanceSummary(mutualFundBacktest)
data.frame(t(rbind(Return.annualized(mutualFundBacktest)*100, 
                   maxDrawdown(mutualFundBacktest)*100,
                   SharpeRatio.annualized(mutualFundBacktest))))

Which produces the following equity curves:

As can be seen, the choice to adjust or not can be pretty enormous. Here are the corresponding three statistics:

               Annualized.Return Worst.Drawdown Annualized.Sharpe.Ratio..Rf.0..
VWEHX.Adjusted         14.675379       2.954519                        3.979383
VWEHX.Close             7.794086       4.637520                        3.034225

Even without the adjustment, the strategy itself is…very very good, at least from this angle. Let’s look at the ETF variant now.

etfBacktest <- merge(etfAdRets, etfClRets, join='inner')
charts.PerformanceSummary(etfBacktest)
data.frame(t(rbind(Return.annualized(etfBacktest)*100, 
                   maxDrawdown(etfBacktest)*100,
                   SharpeRatio.annualized(etfBacktest))))

The resultant equity curve:

With the corresponding statistics:

             Annualized.Return Worst.Drawdown Annualized.Sharpe.Ratio..Rf.0..
HYG.Adjusted         11.546005       6.344801                       1.4674301
HYG.Close             5.530951       9.454754                       0.6840059

Again, another stark difference. Let’s combine all four variants.

fundsAndETFs <- merge(mutualFundBacktest, etfBacktest, join='inner')
charts.PerformanceSummary(fundsAndETFs)
data.frame(t(rbind(Return.annualized(fundsAndETFs)*100, 
                   maxDrawdown(fundsAndETFs)*100,
                   SharpeRatio.annualized(fundsAndETFs))))

The equity curve:

With the resulting statistics:

               Annualized.Return Worst.Drawdown Annualized.Sharpe.Ratio..Rf.0..
VWEHX.Adjusted         17.424070       2.787889                       4.7521579
VWEHX.Close            11.739849       3.169040                       3.8715923
HYG.Adjusted           11.546005       6.344801                       1.4674301
HYG.Close               5.530951       9.454754                       0.6840059

In short, while the strategy itself seems strong, the particular similar (but not identical) instruments used to implement the strategy make a large difference. So, when backtesting, make sure to understand what taking liberties with the data means. In this case, by turning two levers, the Sharpe Ratio varied from less than 1 to above 4.

Next, I’d like to demonstrate a little trick in quantstrat. Although plenty of examples of trading strategies only derive indicators (along with signals and rules) from the market data itself, there are also many strategies that incorporate data from outside simply the price action of the particular security at hand. Such examples would be many SPY strategies that incorporate VIX information, or off-instrument signal strategies like this one.

The way to incorporate off-instrument information into quantstrat simply requires understanding what the mktdata object is, which is nothing more than an xts type object. By default, a security may originally have just the OHLCV and open interest columns. Most demos in the public space generally use data only from the instruments themselves. However, it is very much possible to actually pre-compute signals.

Here’s a continuation of the script to demonstrate, with a demo of the unadjusted HYG leg of this trade:

####### BOILERPLATE FROM HERE
require(quantstrat)

currency('USD')
Sys.setenv(TZ="UTC")
symbols <- "HYG"
stock(symbols, currency="USD", multiplier=1)
initDate="1990-01-01"

strategy.st <- portfolio.st <- account.st <- "preCalc"
rm.strat(portfolio.st)
rm.strat(strategy.st)
initPortf(portfolio.st, symbols=symbols, initDate=initDate, currency='USD')
initAcct(account.st, portfolios=portfolio.st, initDate=initDate, currency='USD')
initOrders(portfolio.st, initDate=initDate)
strategy(strategy.st, store=TRUE)
######### TO HERE

clSig <- Cl(SPY) > SMA(Cl(SPY), n=20)

HYG <- merge(HYG, clSig, join='inner')
names(HYG)[7] <- "precomputed_signal"

#no parameters or indicators--we precalculated our signal

add.signal(strategy.st, name="sigThreshold",
           arguments=list(column="precomputed_signal", threshold=.5, 
                          relationship="gt", cross=TRUE),
           label="longEntry")


add.signal(strategy.st, name="sigThreshold",
           arguments=list(column="precomputed_signal", threshold=.5, 
                          relationship="lt", cross=TRUE),
           label="longExit")

add.rule(strategy.st, name="ruleSignal",
         arguments=list(sigcol="longEntry", sigval=TRUE, orderqty=1, ordertype="market",
                        orderside="long", replace=FALSE, prefer="Open"),
         type="enter", path.dep=TRUE)

add.rule(strategy.st, name="ruleSignal", 
         arguments=list(sigcol="longExit", sigval=TRUE, orderqty="all", ordertype="market", 
                        orderside="long", replace=FALSE, prefer="Open"), 
         type="exit", path.dep=TRUE)

#apply strategy
t1 <- Sys.time()
out <- applyStrategy(strategy=strategy.st,portfolios=portfolio.st)
t2 <- Sys.time()
print(t2-t1)

#set up analytics
updatePortf(portfolio.st)
dateRange <- time(getPortfolio(portfolio.st)$summary)[-1]
updateAcct(portfolio.st,dateRange)
updateEndEq(account.st)

As you can see, no indicators are computed from the actual market data, because the strategy works off of a pre-computed value. The lowest-hanging fruit of applying this methodology, of course, would be to append the VIX index as an indicator for trading strategies on the SPY.
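As a rough sketch of that idea (hypothetical and purely illustrative: the ^VIX symbol from Yahoo and the threshold of 20 are assumptions, and this isn't meant as a drop-in addition to the HYG demo above), the pattern is the same as before: merge the off-instrument series into the traded symbol's data and threshold on the new column.

#append the VIX close to SPY's market data...
getSymbols("^VIX", from="1990-01-01")
SPY <- merge(SPY, Cl(VIX), join='inner')
names(SPY)[ncol(SPY)] <- "vix_close"

#...and signal off of it like any other pre-computed column (20 is an arbitrary level)
add.signal(strategy.st, name="sigThreshold",
           arguments=list(column="vix_close", threshold=20,
                          relationship="lt", cross=TRUE),
           label="lowVixEntry")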

And here are the results of the HYG demo, trading a unit quantity:

> data.frame(round(t(tradeStats(portfolio.st)[-c(1,2)]),2))
                      HYG
Num.Txns           217.00
Num.Trades         106.00
Net.Trading.PL      36.76
Avg.Trade.PL         0.35
Med.Trade.PL         0.01
Largest.Winner       9.83
Largest.Loser       -2.71
Gross.Profits       67.07
Gross.Losses       -29.87
Std.Dev.Trade.PL     1.67
Percent.Positive    50.00
Percent.Negative    50.00
Profit.Factor        2.25
Avg.Win.Trade        1.27
Med.Win.Trade        0.65
Avg.Losing.Trade    -0.56
Med.Losing.Trade    -0.39
Avg.Daily.PL         0.35
Med.Daily.PL         0.01
Std.Dev.Daily.PL     1.67
Ann.Sharpe           3.33
Max.Drawdown        -7.24
Profit.To.Max.Draw   5.08
Avg.WinLoss.Ratio    2.25
Med.WinLoss.Ratio    1.67
Max.Equity          43.78
Min.Equity          -1.88
End.Equity          36.76

And the corresponding position chart:

Lastly, here are the Vanguard links for VWEHX and VFISX. Apparently, neither charges a redemption fee. I'm not sure if this means they can be freely traded in a systematic fashion, however.

In conclusion, hopefully this post showed a potentially viable strategy, the importance of understanding the nature of the data you're working with, and how to pre-compute values in quantstrat.

Thanks for reading.

Note: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

Combining FAA and Stepwise Correlation

Since I debuted the stepwise correlation algorithm, I suppose the punchline that people want to see is: does it actually work?

The short answer? Yes, it does.

A slightly longer answer: it works, with the caveat that a better correlation algorithm, one that makes up only 25% of the total sum of weighted ranks, has a marginal (but nevertheless positive) effect on returns. Furthermore, when comparing a max decorrelation weighting using the default single-pass correlation vs. the stepwise version, the stepwise version gives a bumpier ride, but one with visibly larger returns. Furthermore, for this universe, the difference between starting with the security ranked highest by the momentum and volatility components and starting with the security that has the smallest aggregate correlation to all securities is very small. Essentially, from my inspection, the answer to using stepwise correlation is: it's a perfectly viable alternative, if not better.

Here are the functions used in the script:

require(quantmod)
require(PerformanceAnalytics)

stepwiseCorRank <- function(corMatrix, startNames=NULL, stepSize=1, bestHighestRank=FALSE) {
  #edge cases
  if(dim(corMatrix)[1] == 1) {
    return(corMatrix)
  } else if (dim(corMatrix)[1] == 2) {
    ranks <- c(1.5, 1.5)
    names(ranks) <- colnames(corMatrix)
    return(ranks)
  }
  
  if(is.null(startNames)) {
    corSums <- rowSums(corMatrix)
    corRanks <- rank(corSums)
    startNames <- names(corRanks)[corRanks <= stepSize]
  }
  nameList <- list()
  nameList[[1]] <- startNames
  rankList <- list()
  rankCount <- 1
  rankList[[1]] <- rep(rankCount, length(startNames))
  rankedNames <- do.call(c, nameList)
  
  while(length(rankedNames) < nrow(corMatrix)) {
    rankCount <- rankCount+1
    subsetCor <- corMatrix[, rankedNames]
    if(is.matrix(subsetCor)) { #more than one column ranked so far
      subsetCor <- subsetCor[!rownames(corMatrix) %in% rankedNames,]
      if(is.matrix(subsetCor)) { #more than one row (name) remaining
        corSums <- rowSums(subsetCor)
        corSumRank <- rank(corSums)
        lowestCorNames <- names(corSumRank)[corSumRank <= stepSize]
        nameList[[rankCount]] <- lowestCorNames
        rankList[[rankCount]] <- rep(rankCount, min(stepSize, length(lowestCorNames)))
      } else { #1 name remaining
        nameList[[rankCount]] <- rownames(corMatrix)[!rownames(corMatrix) %in% names(subsetCor)]
        rankList[[rankCount]] <- rankCount
      }
    } else {  #first iteration, subset on first name
      subsetCorRank <- rank(subsetCor)
      lowestCorNames <- names(subsetCorRank)[subsetCorRank <= stepSize]
      nameList[[rankCount]] <- lowestCorNames
      rankList[[rankCount]] <- rep(rankCount, min(stepSize, length(lowestCorNames)))
    }    
    rankedNames <- do.call(c, nameList)
  }
  
  ranks <- do.call(c, rankList)
  names(ranks) <- rankedNames
  if(bestHighestRank) {
    ranks <- 1+length(ranks)-ranks
  }
  ranks <- ranks[colnames(corMatrix)] #return to original order
  return(ranks)
}


FAAreturns <- function(prices, monthLookback = 4,
                       weightMom=1, weightVol=.5, weightCor=.5, 
                       riskFreeName="VFISX", bestN=3,
                       stepCorRank = FALSE, stepStartMethod=c("best", "default")) {
  stepStartMethod <- stepStartMethod[1]
  returns <- Return.calculate(prices)
  monthlyEps <- endpoints(prices, on = "months")
  riskFreeCol <- grep(riskFreeName, colnames(prices))
  tmp <- list()
  dates <- list()
  
  for(i in 2:(length(monthlyEps) - monthLookback)) {
    
    #subset data
    priceData <- prices[monthlyEps[i]:monthlyEps[i+monthLookback],]
    returnsData <- returns[monthlyEps[i]:monthlyEps[i+monthLookback],]
    
    #perform computations
    momentum <- data.frame(t(t(priceData[nrow(priceData),])/t(priceData[1,]) - 1))
    priceData <- priceData[, momentum > 0] #remove securities with momentum < 0
    returnsData <- returnsData[, momentum > 0]
    momentum <- momentum[momentum > 0]
    names(momentum) <- colnames(returnsData)
    vol <- as.numeric(-sd.annualized(returnsData))
    
    if(length(momentum) > 1) {
      
      #perform ranking
      if(!stepCorRank) {
        sumCors <- -colSums(cor(returnsData, use="complete.obs"))
        stats <- data.frame(cbind(momentum, vol, sumCors))
        ranks <- data.frame(apply(stats, 2, rank))
        weightRankSum <- weightMom*ranks$momentum + weightVol*ranks$vol + weightCor*ranks$sumCors
        names(weightRankSum) <- rownames(ranks)
      } else {
        corMatrix <- cor(returnsData, use="complete.obs")
        momRank <- rank(momentum)
        volRank <- rank(vol)
        compositeMomVolRanks <- weightMom*momRank + weightVol*volRank
        maxRank <- compositeMomVolRanks[compositeMomVolRanks==max(compositeMomVolRanks)]
        if(stepStartMethod=="default") {
          stepCorRanks <- stepwiseCorRank(corMatrix=corMatrix, startNames = NULL, 
                                          stepSize = 1, bestHighestRank = TRUE)
        } else {
          stepCorRanks <- stepwiseCorRank(corMatrix=corMatrix, startNames = names(maxRank), 
                                          stepSize = 1, bestHighestRank = TRUE)
        }
        weightRankSum <- weightMom*momRank + weightVol*volRank + weightCor*stepCorRanks
      }
      
      totalRank <- rank(weightRankSum)
      
      #find top N values, from http://stackoverflow.com/questions/2453326/fastest-way-to-find-second-third-highest-lowest-value-in-vector-or-column
      #thanks to Dr. Rob J. Hyndman
      upper <- length(names(returnsData))
      lower <- max(upper-bestN+1, 1)
      topNvals <- sort(totalRank, partial=seq(from=upper, to=lower))[c(upper:lower)]
      
      #compute weights
      longs <- totalRank %in% topNvals #invest in ranks length - bestN or higher (in R, rank 1 is lowest)
      longs <- longs/sum(longs) #equal weight all candidates
      longs[longs > 1/bestN] <- 1/bestN #in the event that we have fewer than top N invested into, lower weights to 1/top N
      names(longs) <- names(totalRank)
      
    } else if(length(momentum) == 1) { #only one security had positive momentum 
      longs <- 1/bestN
      names(longs) <- names(momentum)
    } else { #no securities had positive momentum 
      longs <- 1
      names(longs) <- riskFreeName
    }
    
    #append removed names (those with momentum < 0)
    removedZeroes <- rep(0, ncol(returns)-length(longs))
    names(removedZeroes) <- names(returns)[!names(returns) %in% names(longs)]
    longs <- c(longs, removedZeroes)
    
    #reorder to be in the same column order as original returns/prices
    longs <- data.frame(t(longs))
    longs <- longs[, names(returns)]
    
    #append lists
    tmp[[i]] <- longs
    dates[[i]] <- index(returnsData)[nrow(returnsData)]
  }
  
  weights <- do.call(rbind, tmp)
  dates <- do.call(c, dates)
  weights <- xts(weights, order.by=as.Date(dates)) 
  weights[, riskFreeCol] <- weights[, riskFreeCol] + 1-rowSums(weights)
  strategyReturns <- Return.rebalancing(R = returns, weights = weights, geometric = FALSE)
  colnames(strategyReturns) <- paste(monthLookback, weightMom, weightVol, weightCor, sep="_")
  return(strategyReturns)
}

The FAAreturns function has been modified to transplant the stepwise correlation algorithm I discussed earlier. Essentially, the chunk of code that performs the ranking inside the function got a little bit larger, and some new arguments to the function have been introduced.

First off, there's the option to use the stepwise correlation algorithm in the first place–namely, the stepCorRank argument, which defaults to FALSE (the default settings replicate the original FAA idea demonstrated in the first post on this topic). The argument that comes next, stepStartMethod, does the following:

Using the "default" setting, the algorithm starts with the security that is simply least correlated to the others overall (that is, the one with the lowest sum of correlations). The "best" setting instead starts with the security ranked highest by the weighted sum of the prior two factors (momentum and volatility). The function's default for this argument is "best", meaning it starts with the security best ranked by the previous two factors, rather than "default". At the end of the day, I suppose the best way of illustrating the functionality is to take this piece of engineering out for a spin. So here goes!

mutualFunds <- c("VTSMX", #Vanguard Total Stock Market Index
                 "FDIVX", #Fidelity Diversified International Fund
                 "VEIEX", #Vanguard Emerging Markets Stock Index Fund
                 "VFISX", #Vanguard Short-Term Treasury Fund
                 "VBMFX", #Vanguard Total Bond Market Index Fund
                 "QRAAX", #Oppenheimer Commodity Strategy Total Return 
                 "VGSIX" #Vanguard REIT Index Fund
)

#mid 1997 to end of 2012
getSymbols(mutualFunds, from="1997-06-30", to="2012-12-31")
tmp <- list()
for(fund in mutualFunds) {
  tmp[[fund]] <- Ad(get(fund))
}

#always use a list when intending to cbind/rbind large quantities of objects
adPrices <- do.call(cbind, args = tmp)
colnames(adPrices) <- gsub(".Adjusted", "", colnames(adPrices))

original <- FAAreturns(adPrices, stepCorRank=FALSE)
originalSWCbest <- FAAreturns(adPrices, stepCorRank=TRUE)
originalSWCdefault <- FAAreturns(adPrices, stepCorRank=TRUE, stepStartMethod="default")
stepMaxDecorBest <- FAAreturns(adPrices, weightMom=.05, weightVol=.025, 
                               weightCor=1, riskFreeName="VFISX", 
                               stepCorRank = TRUE, stepStartMethod="best")
stepMaxDecorDefault <- FAAreturns(adPrices, weightMom=.05, weightVol=.025, 
                                  weightCor=1, riskFreeName="VFISX", 
                                  stepCorRank = TRUE, stepStartMethod="default")
w311 <- FAAreturns(adPrices, weightMom=3, weightVol=1, weightCor=1, stepCorRank=TRUE)
originalMaxDecor <- FAAreturns(adPrices, weightMom=0, weightVol=1, stepCorRank=FALSE)
tmp <- cbind(original, originalSWCbest, originalSWCdefault, 
             stepMaxDecorBest, stepMaxDecorDefault, w311, originalMaxDecor)
names(tmp) <- c("original", "originalSWCbest", "originalSWCdefault", "SMDB", 
                "SMDD", "w311", "originalMaxDecor")
charts.PerformanceSummary(tmp, colorset=c("black", "orange", "blue", "purple", "green", "red", "darkred"))


statsTable <- data.frame(t(rbind(Return.annualized(tmp)*100, maxDrawdown(tmp)*100, SharpeRatio.annualized(tmp))))
statsTable$ReturnDrawdownRatio <- statsTable[,1]/statsTable[,2]

Same seven securities as the original paper, with the following return streams:

Original: the FAA original replication
originalSWCbest: original weights, stepwise correlation algorithm, using the best security as ranked by momentum and volatility as a starting point.
originalSWCdefault: original weights, stepwise correlation algorithm, using the default (minimum sum of correlations) security as a starting point.
stepMaxDecorBest: a max decorrelation algorithm that sets the momentum and volatility weights at .05 and .025 respectively, compared to 1 for correlation, simply to get the best starting security through the first two factors.
stepMaxDecorDefault: analogous to stepMaxDecorBest, except with the starting security defined as the one with the minimum sum of correlations.
w311: using a weighting of 3, 1, and 1 on momentum, vol, and correlation, respectively, while using the stepwise correlation rank algorithm, starting with the best security (the default for the function), since I suspected that not weighing momentum at 1 or higher was the reason any other equity curves couldn’t top out above the paper’s original.
originalMaxDecor: max decorrelation using the original 1-pass correlation matrix

Here is the performance chart:

Here’s the way I interpret it:

Does David Varadi's stepwise correlation ranking algorithm help performance? From this standpoint, the answer seems to be yes. Using the original paper's parameters, performance over the paper's backtest period is marginally better in terms of the equity curves. Comparing the max decorrelation algorithms (SMDB and SMDD stand for stepwise max decorrelation, best and default start, respectively), the difference is even clearer.

However, I was wondering why I could never actually outdo the original paper's annualized return, so out of interest, I decided to weigh the momentum ranking more heavily than the original paper eventually had it. The result is a bumpier equity curve, but one with a higher annualized return than any of the others. It's also something I didn't try in my walk-forward example (though interested parties can simply modify the original momentum weight vector to contain a 1.5, for instance).
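Using the function defined above, such a variant would look something like this (a hypothetical call for anyone who wants to try it, not the walk-forward code referenced):

heavierMom <- FAAreturns(adPrices, weightMom=1.5, weightVol=.5, weightCor=.5, stepCorRank=TRUE)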

Here’s the table of statistics for the different permutations:

> statsTable
                   Annualized.Return Worst.Drawdown Annualized.Sharpe.Ratio..Rf.0.. ReturnDrawdownRatio
original                    14.43802       13.15625                        1.489724            1.097427
originalSWCbest             14.70544       13.15625                        1.421045            1.117753
originalSWCdefault          14.68145       13.37059                        1.457418            1.098041
SMDB                        13.55656       12.33452                        1.410072            1.099075
SMDD                        13.18864       11.94587                        1.409608            1.104033
w311                        15.76213       13.85615                        1.398503            1.137555
originalMaxDecor            11.89159       11.68549                        1.434220            1.017637

At the end of the day, all of the permutations exhibit solid results, and fall along different ends of the risk/return curve. The original settings exhibit the highest Sharpe Ratio (barely), but not the highest annualized return to max drawdown ratio (which surprisingly, belongs to the setting that overweights momentum).

To wrap this analysis up (since there are other strategies that I wish to replicate), here is the out-of-sample performance of these seven strategies (to Oct 30, 2014):

Maybe not the greatest thing in the world considering the S&P has made some spectacular returns in 2013, but nevertheless, the momentum variant strategies established new equity highs fairly recently, and look to be on their way up from their latest slight drawdown. Here are the statistics for 2013-2014:

statsTable <- data.frame(t(rbind(Return.annualized(tmp["2013::"])*100, maxDrawdown(tmp["2013::"])*100, SharpeRatio.annualized(tmp["2013::"]))))
statsTable$ReturnDrawdownRatio <- statsTable[,1]/statsTable[,2]

> statsTable
                   Annualized.Return Worst.Drawdown Annualized.Sharpe.Ratio..Rf.0.. ReturnDrawdownRatio
original                    9.284678       8.259298                       1.1966581           1.1241485
originalSWCbest             8.308246       9.657667                       0.9627413           0.8602746
originalSWCdefault          8.916144       8.985685                       1.0861781           0.9922609
SMDB                        6.406438       9.657667                       0.8366559           0.6633525
SMDD                        5.641980       5.979313                       0.7840507           0.9435833
w311                        8.921268       9.025100                       1.0142871           0.9884953
originalMaxDecor            2.888778       6.670709                       0.4244202           0.4330542

So, the original parameters are working solidly, the stepwise correlation algorithm seems to be in a slight rut, and the variants without any emphasis on momentum simply aren't that great (they were created purely as illustrative tools to begin with). Whether you prefer to run FAA with these securities or with trading strategies of your own, my only caveat is that transaction costs haven't been taken into consideration (from what I hear, Interactive Brokers charges $1 per transaction, so it shouldn't make a world of difference). Beyond that, I believe these last four posts have shown that FAA is something that works. While it doesn't always work perfectly (e.g. the S&P 500 had a very good 2013), the logic is sound, and the results are solid, even given some rather plain-vanilla securities.

In any case, I think I’ll conclude with the fact that FAA works, and the stepwise correlation algorithm provides a viable alternative to computing your weights. I’ll update my IKTrading package with some formal documentation regarding this algorithm soon.

Thanks for reading.

Introducing Stepwise Correlation Rank

So in the last post, I attempted to replicate the Flexible Asset Allocation paper. I’d like to offer a thanks to Pat of Intelligent Trading Tech (not updated recently, hopefully this will change) for helping me corroborate the results, so that I have more confidence there isn’t an error in my code.

One of the procedures the authors of the FAA paper used is a correlation rank, which I interpreted as the average correlation of each security to the others.
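In code, that interpretation is simply a rank on the row sums of the correlation matrix, using the same convention as the FAA/FAAreturns functions in the previous posts, where a lower total correlation earns a better (higher) rank. A two-line sketch:

#non-stepwise correlation rank: less correlated to everything else = better (higher) rank
avgCorRank <- function(returnsData) {
  rank(-rowSums(cor(returnsData, use="complete.obs")))
}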

The issue, pointed out to me in a phone conversation I had with David Varadi, is this: when considering correlation, shouldn't the correlations the investor is concerned with be those between instruments within the portfolio, as opposed to all the correlations, including those to instruments not in the portfolio? To that end, when selecting assets (or possibly features in general), it conceptually makes more sense to select in a stepwise fashion–that is, start with a subset of the correlation matrix, and then rank assets in order of their correlation to the assets selected so far, as opposed to all of them. This was explained in Mr. Varadi's recent post.

Here’s a work in progress function I wrote to formally code this idea:

stepwiseCorRank <- function(corMatrix, startNames=NULL, stepSize=1, bestHighestRank=FALSE) {
  #edge cases
  if(dim(corMatrix)[1] == 1) {
    return(corMatrix)
  } else if (dim(corMatrix)[1] == 2) {
    ranks <- c(1.5, 1.5)
    names(ranks) <- colnames(corMatrix)
    return(ranks)
  }
  if(is.null(startNames)) {
    corSums <- rowSums(corMatrix)
    corRanks <- rank(corSums)
    startNames <- names(corRanks)[corRanks <= stepSize]
  }
  nameList <- list()
  nameList[[1]] <- startNames
  rankList <- list()
  rankCount <- 1
  rankList[[1]] <- rep(rankCount, length(startNames))
  rankedNames <- do.call(c, nameList)
  
  while(length(rankedNames) < nrow(corMatrix)) {
    rankCount <- rankCount+1
    subsetCor <- corMatrix[, rankedNames]
    if(is.matrix(subsetCor)) { #more than one column ranked so far
      subsetCor <- subsetCor[!rownames(corMatrix) %in% rankedNames,]
      if(is.matrix(subsetCor)) { #more than one row (name) remaining
        corSums <- rowSums(subsetCor)
        corSumRank <- rank(corSums)
        lowestCorNames <- names(corSumRank)[corSumRank <= stepSize]
        nameList[[rankCount]] <- lowestCorNames
        rankList[[rankCount]] <- rep(rankCount, min(stepSize, length(lowestCorNames)))
      } else { #1 name remaining
        nameList[[rankCount]] <- rownames(corMatrix)[!rownames(corMatrix) %in% names(subsetCor)]
        rankList[[rankCount]] <- rankCount
      }
    } else {  #first iteration, subset on first name
      subsetCorRank <- rank(subsetCor)
      lowestCorNames <- names(subsetCorRank)[subsetCorRank <= stepSize]
      nameList[[rankCount]] <- lowestCorNames
      rankList[[rankCount]] <- rep(rankCount, min(stepSize, length(lowestCorNames)))
    }    
    rankedNames <- do.call(c, nameList)
  }
  
  ranks <- do.call(c, rankList)
  names(ranks) <- rankedNames
  if(bestHighestRank) {
    ranks <- 1+length(ranks)-ranks
  }
  ranks <- ranks[colnames(corMatrix)] #return to original order
  return(ranks)
}

The way the function works is that it takes in a correlation matrix, a starting name (if provided), and a step size (that is, how many assets to select per step, so that the process doesn't become extremely long when dealing with large numbers of assets/features). It then iterates: subset the correlation matrix on the starting name, find the name with the minimum correlation, and add it to the list of already-selected names. Next, subset the correlation matrix columns on the selected names and the rows on the not-yet-selected names, and repeat until all names have been accounted for. Due to R's little habit of wiping out labels when a matrix becomes a vector, I had to write some special-case code, which is the reason for the two nested if/else statements (the first one being for the first column subset, and the second being for when there's only one row remaining).

Also, if there’s an edge case (1 or 2 securities), then there is some functionality to handle those trivial cases.
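As a quick sanity check, here's a tiny hand-made correlation matrix (three hypothetical assets, with values chosen purely for illustration) run through the function:

toyCor <- matrix(c(1, .9, .2,
                   .9, 1, .3,
                   .2, .3, 1),
                 nrow=3, dimnames=list(c("A","B","C"), c("A","B","C")))
stepwiseCorRank(toyCor, startNames="A")
# A B C
# 1 3 2

A is the forced starting point, C (barely correlated with A) is selected second, and B, which is highly correlated with A, comes last.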

Here’s a test script I wrote to test this function out:

require(PerformanceAnalytics)
require(quantmod)

#the same seven mutual funds as in the previous FAA posts
mutualFunds <- c("VTSMX", "FDIVX", "VEIEX", "VFISX", "VBMFX", "QRAAX", "VGSIX")

#mid 1997 to end of 2012
getSymbols(mutualFunds, from="1997-06-30", to="2012-12-31")
tmp <- list()
for(fund in mutualFunds) {
  tmp[[fund]] <- Ad(get(fund))
}

#always use a list when intending to cbind/rbind large quantities of objects
adPrices <- do.call(cbind, args = tmp)
colnames(adPrices) <- gsub(".Adjusted", "", colnames(adPrices))

adRets <- Return.calculate(adPrices)

subset <- adRets["2012"]
corMat <- cor(subset)

tmp <- list()
for(i in 1:length(mutualFunds)) {
  rankRow <- stepwiseCorRank(corMat, startNames=mutualFunds[i])
  tmp[[i]] <- rankRow
}
rankDemo <- do.call(rbind, tmp)
rownames(rankDemo) <- mutualFunds
origRank <- rank(rowSums(corMat))
rankDemo <- rbind(rankDemo, origRank)
rownames(rankDemo)[8] <- "Original (VBMFX)"

heatmap(-rankDemo, Rowv=NA, Colv=NA, col=heat.colors(8), margins=c(6,6))

Essentially, using the 2012 year of returns for the 7 FAA mutual funds, I compared how different starting securities changed the correlation ranking sequence.

Here are the results:

               VTSMX FDIVX VEIEX VFISX VBMFX QRAAX VGSIX
VTSMX              1     6     7     4     2     3     5
FDIVX              6     1     7     4     2     5     3
VEIEX              6     7     1     4     2     3     5
VFISX              2     6     7     1     3     4     5
VBMFX              2     6     7     4     1     3     5
QRAAX              5     6     7     4     2     1     3
VGSIX              5     6     7     4     2     3     1
Non-Sequential     5     6     7     2     1     3     4

In short, the algorithm is rather robust to starting security selection, at least judging by this small example. However, comparing VBMFX start to the non-sequential ranking, we see that VFISX changes from rank 2 in the non-sequential to rank 4, with VTSMX going from rank 5 to rank 2. From an intuitive perspective, this makes sense, as both VBMFX and VFISX are bond funds, which have a low correlation with the other 5 equity-based mutual funds, but a higher correlation with each other, thus signifying that the algorithm seems to be working as intended, at least insofar as this small example demonstrates. Here’s a heatmap to demonstrate this in visual form.

The ranking order (starting security) is the vertical axis, and the horizontal are the ranks, from white being first, to red being last. Notice once again that the ranking orders are robust in general (consider each column of colors descending), but each particular ranking order is unique.

So far, this code still has to be tested in terms of its applications to portfolio management and asset allocation, but for those interested in such an idea, it’s my hope that this provides a good reference point.

Thanks for reading.