# Principal Component Momentum?

This post will investigate using Principal Components as part of a momentum strategy.

Recently, I ran across a post from David Varadi that I thought I’d further investigate and translate into code I can explicitly display (as David Varadi doesn’t). Of course, as David Varadi is a quantitative research director with whom I’ve done good work with in the past, I find that trying to investigate his ideas is worth the time spent.

So, here’s the basic idea: in an allegedly balanced universe, containing both aggressive (e.g. equity asset class ETFs) assets and defensive assets (e.g. fixed income asset class ETFs), that principal component analysis, a cornerstone in machine learning, should have some effectiveness at creating an effective portfolio.

I decided to put that idea to the test with the following algorithm:

Using the same assets that David Varadi does, I first use a rolling window (between 6-18 months) to create principal components. Making sure that the SPY half of the loadings is always positive (that is, if the loading for SPY is negative, multiply the first PC by -1, as that’s the PC we use), and then create two portfolios–one that’s comprised of the normalized positive weights of the first PC, and one that’s comprised of the negative half.

Next, every month, I use some momentum lookback period (1, 3, 6, 10, and 12 months), and invest in the portfolio that performed best over that period for the next month, and repeat.

Here’s the source code to do that: (and for those who have difficulty following, I highly recommend James Picerno’s Quantitative Investment Portfolio Analytics in R book.

```require(PerformanceAnalytics)
require(quantmod)
require(stats)
require(xts)

symbols <- c("SPY", "EFA", "EEM", "DBC", "HYG", "GLD", "IEF", "TLT")

# get free data from yahoo
rets <- list()
getSymbols(symbols, src = 'yahoo', from = '1990-12-31')
for(i in 1:length(symbols)) {
colnames(returns) <- symbols[i]
rets[[i]] <- returns
}
rets <- na.omit(do.call(cbind, rets))

# 12 month PC rolling PC window, 3 month momentum window
pcPlusMinus <- function(rets, pcWindow = 12, momWindow = 3) {
ep <- endpoints(rets)

wtsPc1Plus <- NULL
wtsPc1Minus <- NULL

for(i in 1:(length(ep)-pcWindow)) {
# get subset of returns
returnSubset <- rets[(ep[i]+1):(ep[i+pcWindow])]

# perform PCA, get first PC (I.E. pc1)
pcs <- prcomp(returnSubset)
firstPc <- pcs[][,1]

if(firstPc['SPY'] < 0) {
firstPc <- firstPc * -1
}

# create vector for negative values of pc1
wtsMinus <- firstPc * -1
wtsMinus[wtsMinus < 0] <- 0
wtsMinus <- wtsMinus/(sum(wtsMinus)+1e-16) # in case zero weights
wtsMinus <- xts(t(wtsMinus), order.by=last(index(returnSubset)))
wtsPc1Minus[[i]] <- wtsMinus

# create weight vector for positive values of pc1
wtsPlus <- firstPc
wtsPlus[wtsPlus < 0] <- 0
wtsPlus <- wtsPlus/(sum(wtsPlus)+1e-16)
wtsPlus <- xts(t(wtsPlus), order.by=last(index(returnSubset)))
wtsPc1Plus[[i]] <- wtsPlus
}

# combine positive and negative PC1 weights
wtsPc1Minus <- do.call(rbind, wtsPc1Minus)
wtsPc1Plus <- do.call(rbind, wtsPc1Plus)

# get return of PC portfolios
pc1MinusRets <- Return.portfolio(R = rets, weights = wtsPc1Minus)
pc1PlusRets <- Return.portfolio(R = rets, weights = wtsPc1Plus)

# combine them
combine <-na.omit(cbind(pc1PlusRets, pc1MinusRets))
colnames(combine) <- c("PCplus", "PCminus")

momEp <- endpoints(combine)
momWts <- NULL
for(i in 1:(length(momEp)-momWindow)){
momSubset <- combine[(momEp[i]+1):(momEp[i+momWindow])]
momentums <- Return.cumulative(momSubset)
momWts[[i]] <- xts(momentums==max(momentums), order.by=last(index(momSubset)))
}
momWts <- do.call(rbind, momWts)

out <- Return.portfolio(R = combine, weights = momWts)
colnames(out) <- paste("PCwin", pcWindow, "MomWin", momWindow, sep="_")
return(list(out, wtsPc1Minus, wtsPc1Plus, combine))
}

pcWindows <- c(6, 9, 12, 15, 18)
momWindows <- c(1, 3, 6, 10, 12)

permutes <- expand.grid(pcWindows, momWindows)

stratStats <- function(rets) {
stats <- rbind(table.AnnualizedReturns(rets), maxDrawdown(rets))
stats[5,] <- stats[1,]/stats[4,]
stats[6,] <- stats[1,]/UlcerIndex(rets)
rownames(stats) <- "Worst Drawdown"
rownames(stats) <- "Calmar Ratio"
rownames(stats) <- "Ulcer Performance Index"
return(stats)
}

results <- NULL
for(i in 1:nrow(permutes)) {
tmp <- pcPlusMinus(rets = rets, pcWindow = permutes\$Var1[i], momWindow = permutes\$Var2[i])
results[[i]] <- tmp[]
}
results <- do.call(cbind, results)
stats <- stratStats(results)
```

After a cursory look at the results, it seems the performance is fairly miserable with my implementation, even by the standards of tactical asset allocation models (the good ones have a Calmar and Sharpe Ratio above 1)

Here are histograms of the Calmar and Sharpe ratios.  These values are generally too low for my liking. Here’s a screenshot of the table of all 25 results. While my strategy of choosing which portfolio to hold is different from David Varadi’s (momentum instead of whether or not the aggressive portfolio is above its 200-day moving average), there are numerous studies that show these two methods are closely related, yet the results feel starkly different (and worse) compared to his site.

I’d certainly be willing to entertain suggestions as to how to improve the process, which will hopefully create some more meaningful results. I also know that AllocateSmartly expressed interest in implementing something along these lines for their estimable library of TAA strategies, so I thought I’d try to do it and see what results I’d find, which in this case, aren’t too promising.

NOTE: I am networking, and actively seeking a position related to my skill set in either Philadelphia, New York City, or remotely. If you know of a position which may benefit from my skill set, feel free to let me know. You can reach me on my LinkedIn profile here, or email me.

# A Closer Update To David Varadi’s Percentile Channels Strategy

So thanks to seeing Michael Kapler’s implementation of David Varadi’s percentile channels strategy, I was able to get a better understanding of what was going on. It turns out that rather than looking at the channel value only at the ends of months, that the strategy actually keeps track of the channel’s value intra-month. So if in the middle of the month, you had a sell signal and at the end of the month, the price moved up to intra-channel values, you would still be on a sell signal rather than the previous month’s end-of-month signal. It’s not much different than my previous implementation when all is said and done (slightly higher Sharpe, slightly lower returns and drawdowns). In any case, the concept remains the same.

For this implementation, I’m going to use the runquantile function from the caTools package, which contains a function called runquantile that works like a generalized runMedian/runMin/runMax from TTR, once you’re able to give it the proper arguments (on default, its results are questionable).

Here’s the code:

```require(quantmod)
require(caTools)
require(PerformanceAnalytics)
require(TTR)
getSymbols(c("LQD", "DBC", "VTI", "ICF", "SHY"), from="1990-01-01")

prices <- prices[!is.na(prices[,2]),]
returns <- Return.calculate(prices)
cashPrices <- prices[, 5]
assetPrices <- prices[, -5]

require(caTools)
pctChannelPosition <- function(prices,
dayLookback = 60,
lowerPct = .25, upperPct = .75) {

upperChannels <- runquantile(prices, k=dayLookback, probs=upperPct, endrule="trim")

lowerChannels <- runquantile(prices, k=dayLookback, probs=lowerPct, endrule="trim")

positions <- xts(matrix(nrow=nrow(prices), ncol=ncol(prices), NA), order.by=index(prices))
positions[prices > upperQ & lag(prices) < upperQ] <- 1 #cross up
positions[prices < lowerQ & lag(prices) > lowerQ] <- -1 #cross down
positions <- na.locf(positions)
positions[is.na(positions)] <- 0

colnames(positions) <- colnames(prices)
return(positions)
}

#find our positions, add them up
d60 <- pctChannelPosition(assetPrices)
d120 <- pctChannelPosition(assetPrices, dayLookback = 120)
d180 <- pctChannelPosition(assetPrices, dayLookback = 180)
d252 <- pctChannelPosition(assetPrices, dayLookback = 252)
compositePosition <- (d60 + d120 + d180 + d252)/4

compositeMonths <- compositePosition[endpoints(compositePosition, on="months"),]

returns <- Return.calculate(prices)
monthlySD20 <- xts(sapply(returns[,-5], runSD, n=20), order.by=index(prices))[index(compositeMonths),]
weight <- compositeMonths*1/monthlySD20
weight <- abs(weight)/rowSums(abs(weight))
weight[compositeMonths < 0 | is.na(weight)] <- 0
weight\$CASH <- 1-rowSums(weight)

#not actually equal weight--more like composite weight, going with
#Michael Kapler's terminology here
ewWeight <- abs(compositeMonths)/rowSums(abs(compositeMonths))
ewWeight[compositeMonths < 0 | is.na(ewWeight)] <- 0
ewWeight\$CASH <- 1-rowSums(ewWeight)

rpRets <- Return.portfolio(R = returns, weights = weight)
ewRets <- Return.portfolio(R = returns, weights = ewWeight)
```

Essentially, with runquantile, you need to give it the “trim” argument, and then manually append the leading NAs, and then manually turn it into an xts object, which is annoying. One would think that the author of this package would take care of these quality-of-life issues, but no. In any case, there are two strategies at play here–one being the percentile channel risk parity strategy, and the other what Michael Kapler calls “channel equal weight”, which actually *isn’t* an equal weight strategy, since the composite parameter values may take the values (-1, -.5, 0, .5, and 1–with a possibility for .75 or .25 early on when some of the lookback channels still say 0 instead of only 1 or -1), but simply, the weights without taking into account volatility at all, but I’m sticking with Michael Kapler’s terminology to be consistent. That said, I don’t personally use Michael Kapler’s SIT package due to the vast differences in syntax between it and the usual R code I’m used to. However, your mileage may vary.

In any case, here’s the updated performance:

```both <- cbind(rpRets, ewRets)
colnames(both) <- c("RiskParity", "Equal Weight")
charts.PerformanceSummary(both)
rbind(table.AnnualizedReturns(both), maxDrawdown(both))
apply.yearly(both, Return.cumulative)
```

Which gives us the following output:

```> rbind(table.AnnualizedReturns(both), maxDrawdown(both))
RiskParity Equal Weight
Annualized Return         0.09380000    0.1021000
Annualized Std Dev        0.06320000    0.0851000
Annualized Sharpe (Rf=0%) 1.48430000    1.1989000
Worst Drawdown            0.06894391    0.1150246

> apply.yearly(both, Return.cumulative)
RiskParity Equal Weight
2006-12-29 0.08352255   0.07678321
2007-12-31 0.05412147   0.06475540
2008-12-31 0.10663085   0.12212063
2009-12-31 0.11920721   0.19093131
2010-12-31 0.13756460   0.14594317
2011-12-30 0.11744706   0.08707801
2012-12-31 0.07730896   0.06085295
2013-12-31 0.06733187   0.08174173
2014-12-31 0.06435030   0.07357458
2015-02-17 0.01428705   0.01568372
```

In short, the more naive weighting scheme delivers slightly higher returns but pays dearly for those marginal returns with downside risk.

Here are the equity curves: So, there you have it. The results David Varadi obtained are legitimate. But nevertheless, I hope this demonstrates how easy it is for the small details to make material differences.

NOTE: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

# An Attempt At Replicating David Varadi’s Percentile Channels Strategy

This post will detail an attempt at replicating David Varadi’s percentile channels strategy. As I’m only able to obtain data back to mid 2006, the exact statistics will not be identical. However, of the performance I do have, it is similar (but not identical) to the corresponding performance presented by David Varadi.

First off, before beginning this post, I’d like to issue a small mea culpa regarding the last post. It turns out that Yahoo’s data, once it gets into single digit dollar prices, is of questionable accuracy, and thus, results from the late 90s on mutual funds with prices falling into those ranges are questionable, as a result. As I am an independent blogger, and also make it a policy of readers being able to replicate all of my analysis, I am constrained by free data sources, and sometimes, the questionable quality of that data may materially affect results. So, if it’s one of your strategies replicated on this blog, and you find contention with my results, I would be more than happy to work with the data used to generate the original results, corroborate the results, and be certain that any differences in results from using lower-quality, publicly-available data stem from that alone. Generally, I find it surprising that a company as large as Yahoo can have such gaping data quality issues in certain aspects, but I’m happy that I was able to replicate the general thrust of QTS very closely.

This replication of David Varadi’s strategy, however, is not one such case–mainly because the data for DBC does not extend back very far (it was in inception only in 2006, and the data used by David Varadi’s programmer was obtained from Bloomberg, which I have no access to), and furthermore, I’m not certain if my methods are absolutely identical. Nevertheless, the strategy in and of itself is solid.

The way the strategy works is like this (to my interpretation of David Varadi’s post and communication with his other programmer). Given four securities (LQD, DBC, VTI, ICF), and a cash security (SHY), do the following:

Find the running the n-day quantile of an upper and lower percentile. Anything above the upper percentile gets a score of 1, anything lower gets a score of -1. Leave the rest as NA (that is, anything between the bounds).

Subset these quantities on their monthly endpoints. Any value between channels (NA) takes the quantity of the last value. (In short, na.locf). Any initial NAs become zero.

Do this with a 60-day, 120-day, 180-day, and 252-day setting at 25th and 75th percentiles. Add these four tables up (their dimensions are the number of monthly endpoints by the number of securities) and divide by the number of parameter settings (in this case, 4 for 60, 120, 180, 252) to obtain a composite position.

Next, obtain a running 20-day standard deviation of the returns (not prices!), and subset it for the same indices as the composite positions. Take the inverse of these volatility scores, and multiply it by the composite positions to get an inverse volatility position. Take its absolute value (some positions may be negative, remember), and normalize. In the beginning, there may be some zero-across-all-assets positions, or other NAs due to lack of data (EG if a monthly endpoint occurs before enough data to compute a 20-day standard deviation, there will be a row of NAs), which will be dealt with. Keep all positions with a positive composite position (that is, scores of .5 or 1, discard all scores of zero or lower), and reinvest the remainder into the cash asset (SHY, in our case). Those are the final positions used to generate the returns.

This is how it looks like in code.

This is the code for obtaining the data (from Yahoo finance) and separating it into cash and non-cash data.

```require(quantmod)
require(caTools)
require(PerformanceAnalytics)
require(TTR)
getSymbols(c("LQD", "DBC", "VTI", "ICF", "SHY"), from="1990-01-01")

prices <- prices[!is.na(prices[,2]),]
returns <- Return.calculate(prices)
cashPrices <- prices[, 5]
assetPrices <- prices[, -5]
```

This is the function for computing the percentile channel positions for a given parameter setting. Unfortunately, it is not instantaneous due to R’s rollapply function paying a price in speed for generality. While the package caTools has a runquantile function, as of the time of this writing, I have found differences between its output and runMedian in TTR, so I’ll have to get in touch with the package’s author.

```pctChannelPosition <- function(prices, rebal_on=c("months", "quarters"),
dayLookback = 60,
lowerPct = .25, upperPct = .75) {

upperQ <- rollapply(prices, width=dayLookback, quantile, probs=upperPct)
lowerQ <- rollapply(prices, width=dayLookback, quantile, probs=lowerPct)
positions <- xts(matrix(nrow=nrow(prices), ncol=ncol(prices), NA), order.by=index(prices))
positions[prices > upperQ] <- 1
positions[prices < lowerQ] <- -1

ep <- endpoints(positions, on = rebal_on)
positions <- positions[ep,]
positions <- na.locf(positions)
positions[is.na(positions)] <- 0

colnames(positions) <- colnames(prices)
return(positions)
}
```

The way this function works is simple: computes a running quantile using rollapply, and then scores anything with price above its 75th percentile as 1, and anything below the 25th percentile as -1, in accordance with David Varadi’s post.

It then subsets these quantities on months (quarters is also possible–or for that matter, other values, but the spirit of the strategy seems to be months or quarters), and imputes any NAs with the last known observation, or zero, if it is an initial NA before any position is found. Something I have found over the course of writing this and the QTS strategy is that one need not bother implementing a looping mechanism to allocate positions monthly if there isn’t a correlation matrix based on daily data involved every month, and it makes the code more readable.

Next, we find our composite position.

```#find our positions, add them up
d60 <- pctChannelPosition(assetPrices)
d120 <- pctChannelPosition(assetPrices, dayLookback = 120)
d180 <- pctChannelPosition(assetPrices, dayLookback = 180)
d252 <- pctChannelPosition(assetPrices, dayLookback = 252)
compositePosition <- (d60 + d120 + d180 + d252)/4
```

Next, find the running volatility for the assets, and subset them to the same time period (in this case months) as our composite position. In David Varadi’s example, the parameter is a 20-day lookback.

```#find 20-day rolling standard deviations, subset them on identical indices
#to the percentile channel monthly positions
sd20 <- xts(sapply(returns[,-5], runSD, n=20), order.by=index(assetPrices))
monthlySDs <- sd20[index(compositePosition)]
```

Next, perform the following steps: find the inverse volatility of these quantities, multiply by the composite position score, take the absolute value, and keep any position for which the composite position is greater than zero (or technically speaking, has positive signage). Due to some initial NA rows due to a lack of data (either not enough days to compute a running volatility, or no positive positions yet), those will simply be imputed to zero. Reinvest the remainder in cash.

```#compute inverse volatilities
inverseVols <- 1/monthlySDs

#multiply inverse volatilities by composite positions
invVolPos <- inverseVols*compositePosition

#take absolute values of inverse volatility multiplied by position
absInvVolPos <- abs(invVolPos)

#normalize the above quantities
normalizedAbsInvVols <- absInvVolPos/rowSums(absInvVolPos)

#keep only positions with positive composite positions (remove zeroes/negative)
nonCashPos <- normalizedAbsInvVols * sign(compositePosition > 0)
nonCashPos[is.na(nonCashPos)] <- 0 #no positions before we have enough data

#add cash position which is complement of non-cash position
finalPos <- nonCashPos
finalPos\$cashPos <- 1-rowSums(finalPos)
```

And finally, the punchline, how does this strategy perform?

```#compute returns
stratRets <- Return.portfolio(R = returns, weights = finalPos)
charts.PerformanceSummary(stratRets)
stats <- rbind(table.AnnualizedReturns(stratRets), maxDrawdown(stratRets))
rownames(stats) <- "Worst Drawdown"
stats
```

Like this:

```> stats
portfolio.returns
Annualized Return                0.10070000
Annualized Std Dev               0.06880000
Annualized Sharpe (Rf=0%)        1.46530000
Worst Drawdown                   0.07449537
```

With the following equity curve: The statistics are visibly worse than David Varadi’s 10% vs. 11.1% CAGR, 6.9% annualized standard deviation vs. 5.72%, 7.45% max drawdown vs. 5.5%, and derived statistics (EG MAR). However, my data starts far later, and 1995-1996 seemed to be phenomenal for this strategy. Here are the cumulative returns for the data I have:

```> apply.yearly(stratRets, Return.cumulative)
portfolio.returns
2006-12-29        0.11155069
2007-12-31        0.07574266
2008-12-31        0.16921233
2009-12-31        0.14600008
2010-12-31        0.12996371
2011-12-30        0.06092018
2012-12-31        0.07306617
2013-12-31        0.06303612
2014-12-31        0.05967415
2015-02-13        0.01715446
```

I see a major discrepancy between my returns and David’s returns in 2011, but beyond that, the results seem to be somewhere close in the pattern of yearly returns. Whether my methodology is incorrect (I think I followed the procedure to the best of my understanding, but of course, if someone sees a mistake in my code, please let me know), or whether it’s the result of using Yahoo’s questionable quality data, I am uncertain.

However, in my opinion, that doesn’t take away from the validity of the strategy as a whole. With a mid-1 Sharpe ratio on a monthly rebalancing scale, and steady new equity highs, I feel that this is a result worth sharing–even if not directly corroborated (yet, hopefully).

One last note–some of the readers on David Varadi’s blog have cried foul due to their inability to come close to his results. Since I’ve come close, I feel that the results are valid, and since I’m using different data, my results are not identical. However, if anyone has questions about my process, feel free to leave questions and/or comments.

NOTE: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

# Comparing Flexible and Elastic Asset Allocation

So recently, I tried to combine Flexible and Elastic Asset Allocation. The operative word being–tried. Essentially, I saw Flexible Asset Allocation as an incomplete algorithm — namely that although it was an excellent method for selecting securities, that there had to have been a better way to weigh stocks than a naive equal-weight scheme.

It turns out, the methods outlined in Elastic Asset Allocation weren’t doing the trick (that is, a four month cumulative return raised to the return weight multiplied by the correlation to a daily-rebalanced equal-weight index of the selected securities with cumulative return greater than zero). Either I managed a marginally higher return at the cost of much higher volatility and protracted drawdown, or I maintained my Sharpe ratio at the cost of much lower returns. Thus, I scrapped all of it, which was a shame as I was hoping to be able to combine the two methodologies into one system that would extend the research I previously blogged on. Instead, after scrapping it, I decided to have a look as to why I was running into the issues I was.

In any case, here’s the quick demo I did.

```require(quantmod)
require(PerformanceAnalytics)

symbols <- c("VTSMX", "FDIVX", "VEIEX", "VBMFX", "VFISX", "VGSIX", "QRAAX")

getSymbols(symbols, from="1990-01-01")
prices <- list()
for(i in 1:length(symbols)) {
}
prices <- do.call(cbind, prices)
colnames(prices) <- gsub("\\.[A-z]*", "", colnames(prices))
ep <- endpoints(prices, "months")
prices <- prices[ep,]
prices <- prices["1997-03::"]

eaaOffensive <- EAA(monthlyPrices = prices, returnWeights = TRUE, cashAsset = "VBMFX", bestN = 3)
eaaOffNoCrash <- EAA(monthlyPrices = prices, returnWeights = TRUE, cashAsset ="VBMFX",
bestN = 3, enableCrashProtection = FALSE)
faa <- FAA(prices = adPrices, riskFreeName = "VBMFX", bestN = 3, returnWeights = TRUE, stepCorRank = TRUE)
faaNoStepwise <- FAA(prices = adPrices, riskFreeName = "VBMFX", bestN = 3, returnWeights = TRUE, stepCorRank = FALSE)

eaaOffDaily <- Return.portfolio(R = Return.calculate(adPrices), weights = eaaOffensive[])
eaaOffNoCrash <- Return.portfolio(R = Return.calculate(adPrices), weights = eaaOffNoCrash[])

comparison <- cbind(eaaOffDaily, eaaOffNoCrash, faa[], faaNoStepwise[])
colnames(comparison) <- c("Offensive EAA", "Offensive EAA (no crash protection)", "FAA (stepwise)", "FAA (no stepwise)")
charts.PerformanceSummary(comparison)

rbind(table.AnnualizedReturns(comparison), maxDrawdown(comparison))
```

Essentially, I compared FAA with the stepwise correlation rank algorithm, without it, and the offensive EAA with and without crash protection. The results were disappointing.

Here are the equity curves: In short, the best default FAA variant handily outperforms any of the EAA variants.

And here are the statistics:

```                          Offensive EAA Offensive EAA (no crash protection) FAA (stepwise) FAA (no stepwise)
Annualized Return             0.1247000                           0.1305000      0.1380000          0.131400
Annualized Std Dev            0.1225000                           0.1446000      0.0967000          0.089500
Annualized Sharpe (Rf=0%)     1.0187000                           0.9021000      1.4271000          1.467900
Worst Drawdown                0.1581859                           0.2696754      0.1376124          0.130865
```

Note of warning: if you run EAA, it seems it’s unwise to do it without crash protection (aka decreasing your stake in everything but the cash/risk free asset by a proportion of the number of negative return securities). I didn’t include the defensive variant of EAA since that gives markedly lower returns.

Not that this should discredit EAA, but on a whole, I feel that there should probably be a way to beat the (usually) equal-weight weighting scheme (sometimes the cash asset gets a larger value due to a negative momentum asset making it into the top assets by virtue of the rank of its volatility and correlation, and ends up getting zeroed out) that FAA employs, and that treating FAA as an asset selection mechanism as opposed to a weighting mechanism may yield some value. However, I have not yet found it myself.

NOTE: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

# An Update on Flexible Asset Allocation

A few weeks back, after seeing my replication, one of the original authors of the Flexible Asset Allocation paper got in touch with me to tell me to make a slight adjustment to the code, in that rather than remove any negative-momentum securities before performing any ranking, to perform all ranking without taking absolute momentum into account, and only removing negative absolute momentum securities at the very end, after allocating weights.

Here’s the new code:

```FAA <- function(prices, monthLookback = 4,
weightMom = 1, weightVol = .5, weightCor = .5,
riskFreeName = NULL, bestN = 3,
stepCorRank = FALSE, stepStartMethod = c("best", "default"),
geometric = TRUE, ...) {
stepStartMethod <- stepStartMethod
if(is.null(riskFreeName)) {
prices\$zeroes <- 0
riskFreeName <- "zeroes"
warning("No risk-free security specified. Recommended to use one of: quandClean('CHRIS/CME_US'), SHY, or VFISX.
}
returns <- Return.calculate(prices)
monthlyEps <- endpoints(prices, on = "months")
riskFreeCol <- grep(riskFreeName, colnames(prices))
tmp <- list()
dates <- list()

for(i in 2:(length(monthlyEps) - monthLookback)) {
#subset data
priceData <- prices[monthlyEps[i]:monthlyEps[i+monthLookback],]
returnsData <- returns[monthlyEps[i]:monthlyEps[i+monthLookback],]

#perform computations
momentum <- data.frame(t(t(priceData[nrow(priceData),])/t(priceData[1,]) - 1))
momentum <- momentum[,!is.na(momentum)]
#momentum[is.na(momentum)] <- -1 #set any NA momentum to negative 1 to keep R from crashing
priceData <- priceData[,names(momentum)]
returnsData <- returnsData[,names(momentum)]

momRank <- rank(momentum)
vols <- data.frame(StdDev(returnsData))
volRank <- rank(-vols)
cors <- cor(returnsData, use = "complete.obs")
if (stepCorRank) {
if(stepStartMethod=="best") {
compositeMomVolRanks <- weightMom*momRank + weightVol*volRank
maxRank <- compositeMomVolRanks[compositeMomVolRanks==max(compositeMomVolRanks)]
corRank <- stepwiseCorRank(corMatrix=cors, startNames = names(maxRank),
bestHighestRank = TRUE, ...)

} else {
corRank <- stepwiseCorRank(corMatrix=cors, bestHighestRank=TRUE, ...)
}
} else {
corRank <- rank(-rowSums(cors))
}

totalRank <- rank(weightMom*momRank + weightVol*volRank + weightCor*corRank)

upper <- length(names(returnsData))
lower <- max(upper-bestN+1, 1)
topNvals <- sort(totalRank, partial=seq(from=upper, to=lower))[c(upper:lower)]

#compute weights
longs <- totalRank %in% topNvals #invest in ranks length - bestN or higher (in R, rank 1 is lowest)
longs[momentum < 0] <- 0 #in previous algorithm, removed momentums < 0, this time, we zero them out at the end.
longs <- longs/sum(longs) #equal weight all candidates
longs[longs > 1/bestN] <- 1/bestN #in the event that we have fewer than top N invested into, lower weights to 1/top N
names(longs) <- names(totalRank)

#append removed names (those with momentum < 0)
removedZeroes <- rep(0, ncol(returns)-length(longs))
names(removedZeroes) <- names(returns)[!names(returns) %in% names(longs)]
longs <- c(longs, removedZeroes)

#reorder to be in the same column order as original returns/prices
longs <- data.frame(t(longs))
longs <- longs[, names(returns)]

#append lists
tmp[[i]] <- longs
dates[[i]] <- index(returnsData)[nrow(returnsData)]
}

weights <- do.call(rbind, tmp)
dates <- do.call(c, dates)
weights <- xts(weights, order.by=as.Date(dates))
weights[, riskFreeCol] <- weights[, riskFreeCol] + 1-rowSums(weights)
strategyReturns <- Return.rebalancing(R = returns, weights = weights, geometric = geometric)
colnames(strategyReturns) <- paste(monthLookback, weightMom, weightVol, weightCor, sep="_")
return(strategyReturns)
}
```

And here are the new results, both with the original configuration, and using the stepwise correlation ranking algorithm introduced by David Varadi:

```mutualFunds <- c("VTSMX", #Vanguard Total Stock Market Index
"FDIVX", #Fidelity Diversified International Fund
"VEIEX", #Vanguard Emerging Markets Stock Index Fund
"VFISX", #Vanguard Short-Term Treasury Fund
"VBMFX", #Vanguard Total Bond Market Index Fund
"QRAAX", #Oppenheimer Commodity Strategy Total Return
"VGSIX" #Vanguard REIT Index Fund
)

#mid 1997 to end of 2012
getSymbols(mutualFunds, from="1997-06-30", to="2014-10-30")
tmp <- list()
for(fund in mutualFunds) {
}

#always use a list hwne intending to cbind/rbind large quantities of objects
adPrices <- do.call(cbind, args = tmp)

swc <- FAA(adPrices, riskFreeName="VFISX", stepCorRank = TRUE)
all4 <- cbind(original, swc, originalOld, swcOld)
names(all4) <- c("original", "swc", "origOld", "swcOld")
charts.PerformanceSummary(all4)
```
```> rbind(Return.annualized(all4)*100,
+       maxDrawdown(all4)*100,
+       SharpeRatio.annualized(all4))
original       swc   origOld    swcOld
Annualized Return               12.795205 14.135997 13.221775 14.037137
Worst Drawdown                  11.361801 11.361801 13.082294 13.082294
Annualized Sharpe Ratio (Rf=0%)  1.455302  1.472924  1.377914  1.390025
```

And the resulting equity curve comparison Overall, it seems filtering on absolute momentum after applying all weightings using only relative momentum to rank actually improves downside risk profiles ever so slightly compared to removing negative momentum securities ahead of time. In any case, FAAreturns will be the function that removes negative momentum securities ahead of time, and FAA will be the ones that removes them after all else is said and done.

Note: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

# Predicting High Yield with SPY–a Two Part Post

This post will cover ideas from two individuals: David Varadi of CSS Analytics with whom I am currently collaborating on some volatility trading strategies (the extent of which I hope will end up as a workable trading strategy–my current replica of some of VolatilityMadeSimple’s publicly displayed “example” strategies (note, from other blogs, not to be confused with their proprietary strategy) are something that I think is too risky to be traded as-is), and Cesar Alvarez, of Alvarez Quant Trading. If his name sounds familiar to some of you, that’s because it should. He used to collaborate (still does?) with Larry Connors of TradingMarkets.com, and I’m pretty sure that sometime in the future, I’ll cover those strategies as well.

The strategy for this post is simple, and taken from this post from CSS Analytics.

Pretty straightforward–compute a 20-day SMA on the SPY (I use unadjusted since that’s what the data would have actually been). When the SPY’s close crosses above the 20-day SMA, buy the high-yield bond index, either VWEHX or HYG, and when the converse happens, move to the cash-substitute security, either VFISX or SHY.

Now, while the above paragraph may make it seem that VWEHX and HYG are perfect substitutes, well, they aren’t, as no two instruments are exactly alike, which, as could be noted from my last post, is a detail that one should be mindful of. Even creating a synthetic “equivalent” is never exactly perfect. Even though I try my best to iron out such issues, over the course of generally illustrating an idea, the numbers won’t line up exactly (though hopefully, they’ll be close). In any case, it’s best to leave an asterisk whenever one is forced to use synthetics for the sake of a prolonged backtest.

The other elephant/gorilla in the room (depending on your preference for metaphorical animals), is whether or not to use adjusted data. The upside to that is that dividends are taken into account. The *downside* to that is that the data isn’t the real data, and also assumes a continuous reinvestment of dividends. Unfortunately, shares of a security are not continuous quantities–they are discrete quantities made so by their unit price, so the implicit assumptions in adjusted prices can be optimistic.

For this particular topic, Cesar Alvarez covered it exceptionally well on his blog post, and I highly recommend readers give that post a read, in addition to following his blog in general. However, just to illustrate the effect, let’s jump into the script.

```
getSymbols("VWEHX", from="1950-01-01")
getSymbols("SPY", from="1900-01-01")
getSymbols("HYG", from="1990-01-01")
getSymbols("SHY", from="1990-01-01")
getSymbols("VFISX", from="1990-01-01")

spySma20Cl <- SMA(Cl(SPY), n=20)
clSig <- Cl(SPY) > spySma20Cl
clSig <- lag(clSig, 1)

vwehxCloseRets <- Return.calculate(Cl(VWEHX))
vfisxCloseRets <- Return.calculate(Cl(VFISX))

hygCloseRets <- Return.calculate(Cl(HYG))
shyCloseRets <- Return.calculate(Cl(SHY))

mutualClRets <- vwehxCloseRets*clSig + vfisxCloseRets*(1-clSig)

etfClRets <- hygCloseRets*clSig + shyCloseRets*(1-clSig)

```

Here are the results:

```
charts.PerformanceSummary(mutualFundBacktest)
data.frame(t(rbind(Return.annualized(mutualFundBacktest)*100,
maxDrawdown(mutualFundBacktest)*100,
SharpeRatio.annualized(mutualFundBacktest))))

```

Which produces the following equity curves: As can be seen, the choice to adjust or not can be pretty enormous. Here are the corresponding three statistics:

```               Annualized.Return Worst.Drawdown Annualized.Sharpe.Ratio..Rf.0..
VWEHX.Close             7.794086       4.637520                        3.034225
```

Even without the adjustment, the strategy itself is…very very good, at least from this angle. Let’s look at the ETF variant now.

```etfBacktest <- merge(etfAdRets, etfClRets, join='inner')
charts.PerformanceSummary(etfBacktest)
data.frame(t(rbind(Return.annualized(etfBacktest)*100,
maxDrawdown(etfBacktest)*100,
SharpeRatio.annualized(etfBacktest))))
```

The resultant equity curve: With the corresponding statistics:

```             Annualized.Return Worst.Drawdown Annualized.Sharpe.Ratio..Rf.0..
HYG.Close             5.530951       9.454754                       0.6840059
```

Again, another stark difference. Let’s combine all four variants.

```fundsAndETFs <- merge(mutualFundBacktest, etfBacktest, join='inner')
charts.PerformanceSummary(fundsAndETFs)
data.frame(t(rbind(Return.annualized(fundsAndETFs)*100,
maxDrawdown(fundsAndETFs)*100,
SharpeRatio.annualized(fundsAndETFs))))
```

The equity curve: With the resulting statistics:

```               Annualized.Return Worst.Drawdown Annualized.Sharpe.Ratio..Rf.0..
VWEHX.Close            11.739849       3.169040                       3.8715923
HYG.Close               5.530951       9.454754                       0.6840059
```

In short, while the strategy itself seems strong, the particular similar (but not identical) instruments used to implement the strategy make a large difference. So, when backtesting, make sure to understand what taking liberties with the data means. In this case, by turning two levers, the Sharpe Ratio varied from less than 1 to above 4.

Next, I’d like to demonstrate a little trick in quantstrat. Although plenty of examples of trading strategies only derive indicators (along with signals and rules) from the market data itself, there are also many strategies that incorporate data from outside simply the price action of the particular security at hand. Such examples would be many SPY strategies that incorporate VIX information, or off-instrument signal strategies like this one.

The way to incorporate off-instrument information into quantstrat simply requires understanding what the mktdata object is, which is nothing more than an xts type object. By default, a security may originally have just the OHLCV and open interest columns. Most demos in the public space generally use data only from the instruments themselves. However, it is very much possible to actually pre-compute signals.

Here’s a continuation of the script to demonstrate, with a demo of the unadjusted HYG leg of this trade:

```####### BOILERPLATE FROM HERE
require(quantstrat)

currency('USD')
Sys.setenv(TZ="UTC")
symbols <- "HYG"
stock(symbols, currency="USD", multiplier=1)
initDate="1990-01-01"

strategy.st <- portfolio.st <- account.st <- "preCalc"
rm.strat(portfolio.st)
rm.strat(strategy.st)
initPortf(portfolio.st, symbols=symbols, initDate=initDate, currency='USD')
initAcct(account.st, portfolios=portfolio.st, initDate=initDate, currency='USD')
initOrders(portfolio.st, initDate=initDate)
strategy(strategy.st, store=TRUE)
######### TO HERE

clSig <- Cl(SPY) > SMA(Cl(SPY), n=20)

HYG <- merge(HYG, clSig, join='inner')
names(HYG) <- "precomputed_signal"

#no parameters or indicators--we precalculated our signal

arguments=list(column="precomputed_signal", threshold=.5,
relationship="gt", cross=TRUE),
label="longEntry")

arguments=list(column="precomputed_signal", threshold=.5,
relationship="lt", cross=TRUE),
label="longExit")

arguments=list(sigcol="longEntry", sigval=TRUE, orderqty=1, ordertype="market",
orderside="long", replace=FALSE, prefer="Open"),
type="exit", path.dep=TRUE)

arguments=list(sigcol="longExit", sigval=TRUE, orderqty="all", ordertype="market",
orderside="long", replace=FALSE, prefer="Open"),
type="exit", path.dep=TRUE)

#apply strategy
t1 <- Sys.time()
out <- applyStrategy(strategy=strategy.st,portfolios=portfolio.st)
t2 <- Sys.time()
print(t2-t1)

#set up analytics
updatePortf(portfolio.st)
dateRange <- time(getPortfolio(portfolio.st)\$summary)[-1]
updateAcct(portfolio.st,dateRange)
updateEndEq(account.st)
```

As you can see, no indicators computed from the actual market data, because the strategy used a pre-computed value to work off of. The lowest-hanging fruit of applying this methodology, of course, would be to append the VIX index as an indicator for trading strategies on the SPY.

And here are the results, trading a unit quantity:

```> data.frame(round(t(tradeStats(portfolio.st)[-c(1,2)]),2))
HYG
Num.Txns           217.00
Largest.Winner       9.83
Largest.Loser       -2.71
Gross.Profits       67.07
Gross.Losses       -29.87
Percent.Positive    50.00
Percent.Negative    50.00
Profit.Factor        2.25
Avg.Daily.PL         0.35
Med.Daily.PL         0.01
Std.Dev.Daily.PL     1.67
Ann.Sharpe           3.33
Max.Drawdown        -7.24
Profit.To.Max.Draw   5.08
Avg.WinLoss.Ratio    2.25
Med.WinLoss.Ratio    1.67
Max.Equity          43.78
Min.Equity          -1.88
End.Equity          36.76
```

And the corresponding position chart: Lastly, here are the vanguard links for VWEHX and VFISX. Apparently, neither charge a redemption fee. I’m not sure if this means that they can be freely traded in a systematic fashion, however.

In conclusion, hopefully this post showed a potentially viable strategy, understanding the nature of the data you’re working with, and how to pre-compute values in quantstrat.

Note: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

# Combining FAA and Stepwise Correlation

Since I debuted the stepwise correlation algorithm, I suppose the punchline that people want to see is: does it actually work?

The short answer? Yes, it does.

A slightly longer answer: it works, With the caveat that having a better correlation algorithm that makes up 25% of the total sum of weighted ranks only has a marginal (but nevertheless positive) effect on returns. Furthermore, when comparing a max decorrelation weighting using default single-pass correlation vs. stepwise, the stepwise gives a bumpier ride, but one with visibly larger returns. Furthermore, for this universe, the difference between starting at the security ranked highest by the momentum and volatility components, or with the security that has the smallest aggregate correlation to all securities, is very small. Essentially, from my inspection, the answer to using stepwise correlation is: “it’s a perfectly viable alternative, if not better.”

Here are the functions used in the script:

```require(quantmod)
require(PerformanceAnalytics)

stepwiseCorRank <- function(corMatrix, startNames=NULL, stepSize=1, bestHighestRank=FALSE) {
#edge cases
if(dim(corMatrix) == 1) {
return(corMatrix)
} else if (dim(corMatrix) == 2) {
ranks <- c(1.5, 1.5)
names(ranks) <- colnames(corMatrix)
return(ranks)
}

if(is.null(startNames)) {
corSums <- rowSums(corMatrix)
corRanks <- rank(corSums)
startNames <- names(corRanks)[corRanks <= stepSize]
}
nameList <- list()
nameList[] <- startNames
rankList <- list()
rankCount <- 1
rankList[] <- rep(rankCount, length(startNames))
rankedNames <- do.call(c, nameList)

while(length(rankedNames) < nrow(corMatrix)) {
rankCount <- rankCount+1
subsetCor <- corMatrix[, rankedNames]
if(class(subsetCor) != "numeric") {
subsetCor <- subsetCor[!rownames(corMatrix) %in% rankedNames,]
if(class(subsetCor) != "numeric") {
corSums <- rowSums(subsetCor)
corSumRank <- rank(corSums)
lowestCorNames <- names(corSumRank)[corSumRank <= stepSize]
nameList[[rankCount]] <- lowestCorNames
rankList[[rankCount]] <- rep(rankCount, min(stepSize, length(lowestCorNames)))
} else { #1 name remaining
nameList[[rankCount]] <- rownames(corMatrix)[!rownames(corMatrix) %in% names(subsetCor)]
rankList[[rankCount]] <- rankCount
}
} else {  #first iteration, subset on first name
subsetCorRank <- rank(subsetCor)
lowestCorNames <- names(subsetCorRank)[subsetCorRank <= stepSize]
nameList[[rankCount]] <- lowestCorNames
rankList[[rankCount]] <- rep(rankCount, min(stepSize, length(lowestCorNames)))
}
rankedNames <- do.call(c, nameList)
}

ranks <- do.call(c, rankList)
names(ranks) <- rankedNames
if(bestHighestRank) {
ranks <- 1+length(ranks)-ranks
}
return(ranks)
}

FAAreturns <- function(prices, monthLookback = 4,
weightMom=1, weightVol=.5, weightCor=.5,
riskFreeName="VFISX", bestN=3,
stepCorRank = FALSE, stepStartMethod=c("best", "default")) {
stepStartMethod <- stepStartMethod
returns <- Return.calculate(prices)
monthlyEps <- endpoints(prices, on = "months")
riskFreeCol <- grep(riskFreeName, colnames(prices))
tmp <- list()
dates <- list()

for(i in 2:(length(monthlyEps) - monthLookback)) {

#subset data
priceData <- prices[monthlyEps[i]:monthlyEps[i+monthLookback],]
returnsData <- returns[monthlyEps[i]:monthlyEps[i+monthLookback],]

#perform computations
momentum <- data.frame(t(t(priceData[nrow(priceData),])/t(priceData[1,]) - 1))
priceData <- priceData[, momentum > 0] #remove securities with momentum < 0
returnsData <- returnsData[, momentum > 0]
momentum <- momentum[momentum > 0]
names(momentum) <- colnames(returnsData)
vol <- as.numeric(-sd.annualized(returnsData))

if(length(momentum) > 1) {

#perform ranking
if(!stepCorRank) {
sumCors <- -colSums(cor(returnsData, use="complete.obs"))
stats <- data.frame(cbind(momentum, vol, sumCors))
ranks <- data.frame(apply(stats, 2, rank))
weightRankSum <- weightMom*ranks\$momentum + weightVol*ranks\$vol + weightCor*ranks\$sumCors
names(weightRankSum) <- rownames(ranks)
} else {
corMatrix <- cor(returnsData, use="complete.obs")
momRank <- rank(momentum)
volRank <- rank(vol)
compositeMomVolRanks <- weightMom*momRank + weightVol*volRank
maxRank <- compositeMomVolRanks[compositeMomVolRanks==max(compositeMomVolRanks)]
if(stepStartMethod=="default") {
stepCorRanks <- stepwiseCorRank(corMatrix=corMatrix, startNames = NULL,
stepSize = 1, bestHighestRank = TRUE)
} else {
stepCorRanks <- stepwiseCorRank(corMatrix=corMatrix, startNames = names(maxRank),
stepSize = 1, bestHighestRank = TRUE)
}
weightRankSum <- weightMom*momRank + weightVol*volRank + weightCor*stepCorRanks
}

totalRank <- rank(weightRankSum)

#find top N values, from http://stackoverflow.com/questions/2453326/fastest-way-to-find-second-third-highest-lowest-value-in-vector-or-column
#thanks to Dr. Rob J. Hyndman
upper <- length(names(returnsData))
lower <- max(upper-bestN+1, 1)
topNvals <- sort(totalRank, partial=seq(from=upper, to=lower))[c(upper:lower)]

#compute weights
longs <- totalRank %in% topNvals #invest in ranks length - bestN or higher (in R, rank 1 is lowest)
longs <- longs/sum(longs) #equal weight all candidates
longs[longs > 1/bestN] <- 1/bestN #in the event that we have fewer than top N invested into, lower weights to 1/top N
names(longs) <- names(totalRank)

} else if(length(momentum) == 1) { #only one security had positive momentum
longs <- 1/bestN
names(longs) <- names(momentum)
} else { #no securities had positive momentum
longs <- 1
names(longs) <- riskFreeName
}

#append removed names (those with momentum < 0)
removedZeroes <- rep(0, ncol(returns)-length(longs))
names(removedZeroes) <- names(returns)[!names(returns) %in% names(longs)]
longs <- c(longs, removedZeroes)

#reorder to be in the same column order as original returns/prices
longs <- data.frame(t(longs))
longs <- longs[, names(returns)]

#append lists
tmp[[i]] <- longs
dates[[i]] <- index(returnsData)[nrow(returnsData)]
}

weights <- do.call(rbind, tmp)
dates <- do.call(c, dates)
weights <- xts(weights, order.by=as.Date(dates))
weights[, riskFreeCol] <- weights[, riskFreeCol] + 1-rowSums(weights)
strategyReturns <- Return.rebalancing(R = returns, weights = weights, geometric = FALSE)
colnames(strategyReturns) <- paste(monthLookback, weightMom, weightVol, weightCor, sep="_")
return(strategyReturns)
}

```

The FAAreturns function has been modified to transplant the stepwise correlation algorithm I discussed earlier. Essentially, the chunk of code that performs the ranking inside the function got a little bit larger, and some new arguments to the function have been introduced.

First off, there’s the option to use the stepwise correlation algorithm in the first place–namely, the stepCorRank defaulting to FALSE (the default settings replicate the original FAA idea demonstrated in the first post on this idea). However, the argument that comes next, the stepStartMethod argument does the following:

Using the “default” setting, the algorithm will start off using the security that is simply least correlated among the securities (that is, the lowest sum of correlations among securities). However, the “best” setting instead will use the weighted sum of ranks using the prior two factors (momentum and volatility). This argument defaults to using the best security (aka the one best ranked prior by the previous two factors), as opposed to the default. At the end of the day, I suppose the best way of illustrating functionality is with some examples of taking this piece of engineering out for a spin. So here goes!

```mutualFunds <- c("VTSMX", #Vanguard Total Stock Market Index
"FDIVX", #Fidelity Diversified International Fund
"VEIEX", #Vanguard Emerging Markets Stock Index Fund
"VFISX", #Vanguard Short-Term Treasury Fund
"VBMFX", #Vanguard Total Bond Market Index Fund
"QRAAX", #Oppenheimer Commodity Strategy Total Return
"VGSIX" #Vanguard REIT Index Fund
)

#mid 1997 to end of 2012
getSymbols(mutualFunds, from="1997-06-30", to="2012-12-31")
tmp <- list()
for(fund in mutualFunds) {
}

#always use a list hwne intending to cbind/rbind large quantities of objects
adPrices <- do.call(cbind, args = tmp)

weightCor=1, riskFreeName="VFISX",
stepCorRank = TRUE, stepStartMethod="best")
weightCor=1, riskFreeName="VFISX",
stepCorRank = TRUE, stepStartMethod="default")
w311 <- FAAreturns(adPrices, weightMom=3, weightVol=1, weightCor=1, stepCorRank=TRUE)
originalMaxDecor <- FAAreturns(adPrices, weightMom=0, weightVol=1, stepCorRank=FALSE)
tmp <- cbind(original, originalSWCbest, originalSWCdefault,
stepMaxDecorBest, stepMaxDecorDefault, w311, originalMaxDecor)
names(tmp) <- c("original", "originalSWCbest", "originalSWCdefault", "SMDB",
"SMDD", "w311", "originalMaxDecor")
charts.PerformanceSummary(tmp, colorset=c("black", "orange", "blue", "purple", "green", "red", "darkred"))

statsTable <- data.frame(t(rbind(Return.annualized(tmp)*100, maxDrawdown(tmp)*100, SharpeRatio.annualized(tmp))))
statsTable\$ReturnDrawdownRatio <- statsTable[,1]/statsTable[,2]
```

Same seven securities as the original paper, with the following return streams:

Original: the FAA original replication
originalSWCbest: original weights, stepwise correlation algorithm, using the best security as ranked by momentum and volatility as a starting point.
originalSWCdefault: original weights, stepwise correlation algorithm, using the default (minimum sum of correlations) security as a starting point.
stepMaxDecorBest: a max decorrelation algorithm that sets the momentum and volatility weights at .05 and .025 respectively, compared to 1 for correlation, simply to get the best starting security through the first two factors.
stepMaxDecorDefault: analogous to originalSWCdefault, except with the starting security being defined as the one with minimum sum of correlations.
w311: using a weighting of 3, 1, and 1 on momentum, vol, and correlation, respectively, while using the stepwise correlation rank algorithm, starting with the best security (the default for the function), since I suspected that not weighing momentum at 1 or higher was the reason any other equity curves couldn’t top out above the paper’s original.
originalMaxDecor: max decorrelation using the original 1-pass correlation matrix

Here is the performance chart: Here’s the way I interpret it:

Does David Varadi’s stepwise correlation ranking algorithm help performance? From this standpoint, the answers lead to yes. Using the original paper’s parameters, the performance over the paper’s backtest period is marginally better in terms of the equity curves. Comparing max decorrelation algorithms (SMDB and SMDD stand for stepwise max decorrelation best and default, respectively), the difference is even more clear.

However, I was wondering why I could never actually outdo the original paper’s annualized return, and out of interest, decided to more heavily weigh the momentum ranking than the original paper eventually had it set at. The result is a bumpier equity curve, but one that has a higher annualized return than any of the others. It’s also something that I didn’t try in my walk-forward example (though interested parties can simply modify the original momentum vector to contain a 1.5 weight, for instance).

Here’s the table of statistics for the different permutations:

```> statsTable
Annualized.Return Worst.Drawdown Annualized.Sharpe.Ratio..Rf.0.. ReturnDrawdownRatio
original                    14.43802       13.15625                        1.489724            1.097427
originalSWCbest             14.70544       13.15625                        1.421045            1.117753
originalSWCdefault          14.68145       13.37059                        1.457418            1.098041
SMDB                        13.55656       12.33452                        1.410072            1.099075
SMDD                        13.18864       11.94587                        1.409608            1.104033
w311                        15.76213       13.85615                        1.398503            1.137555
originalMaxDecor            11.89159       11.68549                        1.434220            1.017637
```

At the end of the day, all of the permutations exhibit solid results, and fall along different ends of the risk/return curve. The original settings exhibit the highest Sharpe Ratio (barely), but not the highest annualized return to max drawdown ratio (which surprisingly, belongs to the setting that overweights momentum).

To wrap this analysis up (since there are other strategies that I wish to replicate), here is the out-of-sample performance of these seven strategies (to Oct 30, 2014): Maybe not the greatest thing in the world considering the S&P has made some spectacular returns in 2013, but nevertheless, the momentum variant strategies established new equity highs fairly recently, and look to be on their way up from their latest slight drawdown. Here are the statistics for 2013-2014:

```statsTable <- data.frame(t(rbind(Return.annualized(tmp["2013::"])*100, maxDrawdown(tmp["2013::"])*100, SharpeRatio.annualized(tmp["2013::"]))))
statsTable\$ReturnDrawdownRatio <- statsTable[,1]/statsTable[,2]

> statsTable
Annualized.Return Worst.Drawdown Annualized.Sharpe.Ratio..Rf.0.. ReturnDrawdownRatio
original                    9.284678       8.259298                       1.1966581           1.1241485
originalSWCbest             8.308246       9.657667                       0.9627413           0.8602746
originalSWCdefault          8.916144       8.985685                       1.0861781           0.9922609
SMDB                        6.406438       9.657667                       0.8366559           0.6633525
SMDD                        5.641980       5.979313                       0.7840507           0.9435833
w311                        8.921268       9.025100                       1.0142871           0.9884953
originalMaxDecor            2.888778       6.670709                       0.4244202           0.4330542
```

So, the original parameters are working solidly, the stepwise correlation algorithm seems to be in a slight rut, and the variants without any emphasis on momentum simply aren’t that great (they were created purely as illustrative tools to begin with). Whether you prefer to run FAA with these securities, or with trading strategies of your own, my only caveat is that transaction costs haven’t been taken into consideration (from what I hear, interactive brokers charges you \$1 per transaction, so it shouldn’t make a world of a difference), but beyond that, I believe these last four posts have shown that FAA is something that works. While it doesn’t always work perfectly (EG the S&P 500 had a very good 2013), the logic is sound, and the results are solid, even given some rather plain-vanilla type securities.

In any case, I think I’ll conclude with the fact that FAA works, and the stepwise correlation algorithm provides a viable alternative to computing your weights. I’ll update my IKTrading package with some formal documentation regarding this algorithm soon.

# Introducing Stepwise Correlation Rank

So in the last post, I attempted to replicate the Flexible Asset Allocation paper. I’d like to offer a thanks to Pat of Intelligent Trading Tech (not updated recently, hopefully this will change) for helping me corroborate the results, so that I have more confidence there isn’t an error in my code.

One of the procedures the authors of the FAA paper used is a correlation rank, which I interpreted as the average correlation of each security to the others.

The issue, pointed out to me in a phone conversation I had with David Varadi is that when considering correlation, shouldn’t the correlations the investor is concerned about be between instruments within the portfolio, as opposed to simply all the correlations, including to instruments not in the portfolio? To that end, when selecting assets (or possibly features in general), conceptually, it makes more sense to select in a stepwise fashion–that is, start off at a subset of the correlation matrix, and then rank assets in order of their correlation to the heretofore selected assets, as opposed to all of them. This was explained in Mr. Varadi’s recent post.

Here’s a work in progress function I wrote to formally code this idea:

```stepwiseCorRank <- function(corMatrix, startNames=NULL, stepSize=1, bestHighestRank=FALSE) {
#edge cases
if(dim(corMatrix) == 1) {
return(corMatrix)
} else if (dim(corMatrix) == 2) {
ranks <- c(1.5, 1.5)
names(ranks) <- colnames(corMatrix)
return(ranks)
}
if(is.null(startNames)) {
corSums <- rowSums(corMatrix)
corRanks <- rank(corSums)
startNames <- names(corRanks)[corRanks <= stepSize]
}
nameList <- list()
nameList[] <- startNames
rankList <- list()
rankCount <- 1
rankList[] <- rep(rankCount, length(startNames))
rankedNames <- do.call(c, nameList)

while(length(rankedNames) < nrow(corMatrix)) {
rankCount <- rankCount+1
subsetCor <- corMatrix[, rankedNames]
if(class(subsetCor) != "numeric") {
subsetCor <- subsetCor[!rownames(corMatrix) %in% rankedNames,]
if(class(subsetCor) != "numeric") {
corSums <- rowSums(subsetCor)
corSumRank <- rank(corSums)
lowestCorNames <- names(corSumRank)[corSumRank <= stepSize]
nameList[[rankCount]] <- lowestCorNames
rankList[[rankCount]] <- rep(rankCount, min(stepSize, length(lowestCorNames)))
} else { #1 name remaining
nameList[[rankCount]] <- rownames(corMatrix)[!rownames(corMatrix) %in% names(subsetCor)]
rankList[[rankCount]] <- rankCount
}
} else {  #first iteration, subset on first name
subsetCorRank <- rank(subsetCor)
lowestCorNames <- names(subsetCorRank)[subsetCorRank <= stepSize]
nameList[[rankCount]] <- lowestCorNames
rankList[[rankCount]] <- rep(rankCount, min(stepSize, length(lowestCorNames)))
}
rankedNames <- do.call(c, nameList)
}

ranks <- do.call(c, rankList)
names(ranks) <- rankedNames
if(bestHighestRank) {
ranks <- 1+length(ranks)-ranks
}
return(ranks)
}
```

So the way the function works is that it takes in a correlation matrix, a starting name (if provided), and a step size (that is, how many assets to select per step, so that the process doesn’t become extremely long when dealing with larger amounts of assets/features). Then, it iterates–subset the correlation matrix on the starting name, and find the minimum value, and add it to a list of already-selected names. Next, subset the correlation matrix columns on the selected names, and the rows on the not selected names, and repeat, until all names have been accounted for. Due to R’s little habit of wiping out labels when a matrix becomes a vector, I had to write some special case code, which is the reason for two nested if/else statements (the first one being for the first column subset, and the second being for when there’s only one row remaining).

Also, if there’s an edge case (1 or 2 securities), then there is some functionality to handle those trivial cases.

Here’s a test script I wrote to test this function out:

```require(PerformanceAnalytics)
require(quantmod)

#mid 1997 to end of 2012
getSymbols(mutualFunds, from="1997-06-30", to="2012-12-31")
tmp <- list()
for(fund in mutualFunds) {
}

#always use a list hwne intending to cbind/rbind large quantities of objects
adPrices <- do.call(cbind, args = tmp)

corMat <- cor(subset)

tmp <- list()
for(i in 1:length(mutualFunds)) {
rankRow <- stepwiseCorRank(corMat, startNames=mutualFunds[i])
tmp[[i]] <- rankRow
}
rankDemo <- do.call(rbind, tmp)
rownames(rankDemo) <- mutualFunds
origRank <- rank(rowSums(corMat))
rankDemo <- rbind(rankDemo, origRank)
rownames(rankDemo) <- "Original (VBMFX)"

heatmap(-rankDemo, Rowv=NA, Colv=NA, col=heat.colors(8), margins=c(6,6))
```

Essentially, using the 2012 year of returns for the 7 FAA mutual funds, I compared how different starting securities changed the correlation ranking sequence.

Here are the results:

```               VTSMX FDIVX VEIEX VFISX VBMFX QRAAX VGSIX
VTSMX              1     6     7     4     2     3     5
FDIVX              6     1     7     4     2     5     3
VEIEX              6     7     1     4     2     3     5
VFISX              2     6     7     1     3     4     5
VBMFX              2     6     7     4     1     3     5
QRAAX              5     6     7     4     2     1     3
VGSIX              5     6     7     4     2     3     1
Non-Sequential     5     6     7     2     1     3     4
```

In short, the algorithm is rather robust to starting security selection, at least judging by this small example. However, comparing VBMFX start to the non-sequential ranking, we see that VFISX changes from rank 2 in the non-sequential to rank 4, with VTSMX going from rank 5 to rank 2. From an intuitive perspective, this makes sense, as both VBMFX and VFISX are bond funds, which have a low correlation with the other 5 equity-based mutual funds, but a higher correlation with each other, thus signifying that the algorithm seems to be working as intended, at least insofar as this small example demonstrates. Here’s a heatmap to demonstrate this in visual form. The ranking order (starting security) is the vertical axis, and the horizontal are the ranks, from white being first, to red being last. Notice once again that the ranking orders are robust in general (consider each column of colors descending), but each particular ranking order is unique.

So far, this code still has to be tested in terms of its applications to portfolio management and asset allocation, but for those interested in such an idea, it’s my hope that this provides a good reference point.

# An Attempt At Replicating Flexible Asset Allocation (FAA)

Since the people at Alpha Architect were so kind as to feature my blog in a post, I figured I’d investigate an idea that I first found out about from their site–namely, flexible asset allocation. Here’s the SSRN, and the corresponding Alpha Architect post.

Here’s the script I used for this replication, which is completely self-contained.

```require(PerformanceAnalytics)
require(quantmod)

mutualFunds <- c("VTSMX", #Vanguard Total Stock Market Index
"FDIVX", #Fidelity Diversified International Fund
"VEIEX", #Vanguard Emerging Markets Stock Index Fund
"VFISX", #Vanguard Short-Term Treasury Fund
"VBMFX", #Vanguard Total Bond Market Index Fund
"QRAAX", #Oppenheimer Commodity Strategy Total Return
"VGSIX" #Vanguard REIT Index Fund
)

#mid 1997 to end of 2012
getSymbols(mutualFunds, from="1997-06-30", to="2012-12-31")
tmp <- list()
for(fund in mutualFunds) {
}

#always use a list hwne intending to cbind/rbind large quantities of objects
adPrices <- do.call(cbind, args = tmp)

FAAreturns <- function(prices, monthLookback = 4,
weightMom=1, weightVol=.5, weightCor=.5,
riskFreeName="VFISX", bestN=3) {

returns <- Return.calculate(prices)
monthlyEps <- endpoints(prices, on = "months")
riskFreeCol <- grep(riskFreeName, colnames(prices))
tmp <- list()
dates <- list()

for(i in 2:(length(monthlyEps) - monthLookback)) {

#subset data
priceData <- prices[monthlyEps[i]:monthlyEps[i+monthLookback],]
returnsData <- returns[monthlyEps[i]:monthlyEps[i+monthLookback],]

#perform computations
momentum <- data.frame(t(t(priceData[nrow(priceData),])/t(priceData[1,]) - 1))
priceData <- priceData[, momentum > 0] #remove securities with momentum < 0
returnsData <- returnsData[, momentum > 0]
momentum <- momentum[momentum > 0]
names(momentum) <- colnames(returnsData)

vol <- as.numeric(-sd.annualized(returnsData))
#sumCors <- -colSums(cor(priceData[endpoints(priceData, on="months")]))
sumCors <- -colSums(cor(returnsData, use="complete.obs"))
stats <- data.frame(cbind(momentum, vol, sumCors))

if(nrow(stats) > 1) {

#perform ranking
ranks <- data.frame(apply(stats, 2, rank))
weightRankSum <- weightMom*ranks\$momentum + weightVol*ranks\$vol + weightCor*ranks\$sumCors
totalRank <- rank(weightRankSum)

#find top N values, from http://stackoverflow.com/questions/2453326/fastest-way-to-find-second-third-highest-lowest-value-in-vector-or-column
#thanks to Dr. Rob J. Hyndman
upper <- length(names(returnsData))
lower <- max(upper-bestN+1, 1)
topNvals <- sort(totalRank, partial=seq(from=upper, to=lower))[c(upper:lower)]

#compute weights
longs <- totalRank %in% topNvals #invest in ranks length - bestN or higher (in R, rank 1 is lowest)
longs <- longs/sum(longs) #equal weight all candidates
longs[longs > 1/bestN] <- 1/bestN #in the event that we have fewer than top N invested into, lower weights to 1/top N
names(longs) <- rownames(ranks)

} else if(nrow(stats) == 1) { #only one security had positive momentum
longs <- 1/bestN
names(longs) <- rownames(stats)
} else { #no securities had positive momentum
longs <- 1
names(longs) <- riskFreeName
}

#append removed names (those with momentum < 0)
removedZeroes <- rep(0, ncol(returns)-length(longs))
names(removedZeroes) <- names(returns)[!names(returns) %in% names(longs)]
longs <- c(longs, removedZeroes)

#reorder to be in the same column order as original returns/prices
longs <- data.frame(t(longs))
longs <- longs[, names(returns)]

#append lists
tmp[[i]] <- longs
dates[[i]] <- index(returnsData)[nrow(returnsData)]
}

weights <- do.call(rbind, tmp)
dates <- do.call(c, dates)
weights <- xts(weights, order.by=as.Date(dates))
weights[, riskFreeCol] <- weights[, riskFreeCol] + 1-rowSums(weights)
strategyReturns <- Return.rebalancing(R = returns, weights = weights, geometric = FALSE)
return(strategyReturns)
}

N3vol1cor1 <- FAAreturns(adPrices, weightVol = 1, weightCor = 1)
minRisk <- FAAreturns(adPrices, weightMom = 0, weightVol=1, weightCor=1)
pureMomentum <- FAAreturns(adPrices, weightMom=1, weightVol=0, weightCor=0)
maxDecor <- FAAreturns(adPrices, weightMom=0, weightVol=0, weightCor=1)
momDecor <- FAAreturns(adPrices, weightMom=1, weightVol=0, weightCor=1)

all <- cbind(replicaAttempt, bestN4, N3vol1cor1, minRisk, pureMomentum, maxDecor, momDecor)
colnames(all) <- c("Replica Attempt", "N4", "vol_1_cor_1", "minRisk", "pureMomentum", "maxDecor", "momDecor")
charts.PerformanceSummary(all, colorset=c("black", "red", "blue", "green", "darkgrey", "purple", "orange"))

stats <- data.frame(t(rbind(Return.annualized(all)*100,
maxDrawdown(all)*100,
SharpeRatio.annualized(all))))
stats\$Return_To_Drawdown <- stats[,1]/stats[,2]
```

Here’s the formal procedure:

Using the monthly endpoint functionality in R, every month, looking over the past four months, I computed momentum as the most recent price over the first price in the observed set (that is, the price four months ago) minus one, and instantly removed any funds with a momentum less than zero (this was a suggestion from Mr. David Varadi of CSS Analytics, with whom I’ll be collaborating in the near future). Next, with the pared down universe, I ranked the funds by momentum, by annualized volatility (the results are identical with just standard deviation), and by the sum of the correlations with each other. Since volatility and correlation are worse at higher values, I multiplied each by negative one. Next, I invested in the top N funds every period, or if there were fewer than N funds with positive momentum, each remaining fund received a weight of 1/N, with the rest eventually being placed into the “risk-free” asset, in this case, VFISX. All price and return data were daily adjusted (as per the SSRN paper) data.

However, my results do not match the paper’s (or Alpha Architect’s) in that I don’t see the annualized returns breaking 20%, nor, most importantly, do I see the single-digit drawdowns. I hope my code is clear for every step as to what the discrepancy may be, but that aside, let me explain what the idea is.

The idea is, from those that are familiar with trend following, that in addition to seeking return through the momentum anomaly (stacks of literature available on the simple idea that what goes up will keep going up to an extent), that there is also a place for risk management. This comes in the form of ranking correlation and volatility, and giving different weights to each individual component rank (that is, momentum has a weight of 1, correlation .5, and volatility .5). Next, the weighted sum of the ranks is then also ranked (so two layers of ranking) for a final aggregate rank.

Unfortunately, when it comes to the implementation, the code has to be cluttered with some data munging and edge-case checking, which takes a little bit away from the readability. To hammer a slight technical tangent home, in R, whenever one plans on doing iterated appending (E.G. one table that’s repeatedly appended), due to R copying an object on assignment when doing repeated rbinding or cbinding, but simply appending the last iteration onto a list object, outside of tiny data frames, it’s always better to use a list and only call rbind/cbind once at the end. The upside to data frames is that they’re much easier to print out to a console and to do vectorized operations on. However, lists are more efficient when it comes to iteration.

In any case, here’s an examination of some variations of this strategy.

The first is a simple attempt at replication (3 of 7 securities, 1 weight to momentum, .5 to volatility and correlation each). The second is that same setting, just with the top four securities instead of the top three. A third one is with three securities, but double the weighting to the two risk metrics (vol & cor). The next several are conceptual permutations–a risk minimization profile that puts no weight on the actual nature of momentum (analogous to what the Smart Beta folks would call min-vol), a pure momentum strategy (disregard vol and cor), a max decorrelation strategy (all weight on correlation), and finally, a hybrid of momentum and max decorrelation.

Here is the performance chart: Overall, this looks like evidence of robustness, given that I fundamentally changed the nature of the strategies in quite a few cases, rather than simply tweaked the weights here or there. The momentum/decorrelation hybrid is a bit difficult to see, so here’s a clearer image for how it compared with the original strategy. Overall, a slightly smoother ride, though slightly lower in terms of returns. Here’s the table comparing all seven variations:

```> stats
Annualized.Return Worst.Drawdown Annualized.Sharpe.Ratio..Rf.0.. Return_To_Drawdown
Replica Attempt          14.43802      13.156252                        1.489724          1.0974268
N4                       12.48541      10.212778                        1.492447          1.2225281
vol_1_cor_1              12.86459      12.254390                        1.608721          1.0497944
minRisk                  11.26158       9.223409                        1.504654          1.2209786
pureMomentum             13.88501      14.401121                        1.135252          0.9641619
maxDecor                 11.89159      11.685492                        1.434220          1.0176368
momDecor                 14.03615      10.951574                        1.489358          1.2816563
```

Overall, there doesn’t seem to be any objectively best variant, though pure momentum is definitely the worst (as may be expected, otherwise the original paper wouldn’t be as meaningful). If one is looking for return to max drawdown, then the momentum/max decorrelation hybrid stands out, though the 4-security variant and minimum risk variants also work (though they’d have to be leveraged a tiny bit to get the annualized returns to the same spot). On Sharpe Ratio, the variant with double the original weighting on volatility and correlation stands out, though its return to drawdown ratio isn’t the greatest.

However, the one aspect that I take away from this endeavor is that the number of assets were relatively tiny, and the following statistic:

```> SharpeRatio.annualized(Return.calculate(adPrices))
VTSMX     FDIVX     VEIEX    VFISX    VBMFX       QRAAX     VGSIX
Annualized Sharpe Ratio (Rf=0%) 0.2520994 0.3569858 0.2829207 1.794041 1.357554 -0.01184516 0.3062336
```

Aside from the two bond market funds, which are notorious for lower returns for lower risk, the Sharpe ratios of the individual securities are far below 1. The strategy itself, on the other hand, has very respectable Sharpe ratios, working with some rather sub-par components.

Simply put, consider running this asset allocation heuristic on your own set of strategies, as opposed to pre-set funds. Furthermore, it is highly likely that the actual details of the ranking algorithm can be improved, from different ranking metrics (add drawdown?) to more novel concepts such as stepwise correlation ranking/selection.