How well can you scale your strategy?

This post will deal with a quick, finger-in-the-air way of seeing how well a strategy scales–namely, how sensitive it is to latency between signal and execution, using a simple volatility trading strategy as an example. The signal will be the VIX/VXV ratio trading VXX and XIV, an idea I got from Volatility Made Simple’s amazing blog, particularly this post. The execution assumptions compared will be the “magical thinking” fill (observe the close, buy that same close, named after the ruleOrderProc setting in quantstrat), buying on the next day’s open, and buying on the next day’s close, along with the average of the latter two.

Let’s get started.

require(downloader)
require(quantmod) # getSymbols lives here
require(PerformanceAnalytics)
require(IKTrading)
require(TTR)

download("http://www.cboe.com/publish/scheduledtask/mktdata/datahouse/vxvdailyprices.csv", 
         destfile="vxvData.csv")
download("https://dl.dropboxusercontent.com/s/jk6der1s5lxtcfy/XIVlong.TXT",
         destfile="longXIV.txt")
download("https://dl.dropboxusercontent.com/s/950x55x7jtm9x2q/VXXlong.TXT", 
         destfile="longVXX.txt") #requires downloader package
getSymbols('^VIX', from = '1990-01-01')


xiv <- xts(read.zoo("longXIV.txt", format="%Y-%m-%d", sep=",", header=TRUE))
vxx <- xts(read.zoo("longVXX.txt", format="%Y-%m-%d", sep=",", header=TRUE))
vxv <- xts(read.zoo("vxvData.csv", header=TRUE, sep=",", format="%m/%d/%Y", skip=2))
vixVxv <- Cl(VIX)/Cl(vxv)


# close-to-close and open-to-open returns for each instrument
vxxCloseRets <- Return.calculate(Cl(vxx))
vxxOpenRets <- Return.calculate(Op(vxx))
xivCloseRets <- Return.calculate(Cl(xiv))
xivOpenRets <- Return.calculate(Op(xiv))

# long VXX when VIX trades above VXV, long XIV otherwise
vxxSig <- vixVxv > 1
xivSig <- 1-vxxSig

# lag(sig, 1): observe the close, transact that same close ("magical thinking")
# lag(sig, 2): transact at the next day's open or close; both return series
# need the extra day of lag to avoid acting before the signal exists
magicThinking <- vxxCloseRets * lag(vxxSig) + xivCloseRets * lag(xivSig)
nextOpen <- vxxOpenRets * lag(vxxSig, 2) + xivOpenRets * lag(xivSig, 2)
nextClose <- vxxCloseRets * lag(vxxSig, 2) + xivCloseRets * lag(xivSig, 2)
tradeWholeDay <- (nextOpen + nextClose)/2 # average of the two next-day fills

compare <- na.omit(cbind(magicThinking, nextOpen, nextClose, tradeWholeDay))
colnames(compare) <- c("Magic Thinking", "Next Open", 
                       "Next Close", "Execute Through Next Day")
charts.PerformanceSummary(compare)
rbind(table.AnnualizedReturns(compare), 
      maxDrawdown(compare), CalmarRatio(compare))

par(mfrow=c(1,1))
chart.TimeSeries(log(cumprod(1+compare), base = 10), legend.loc='topleft', ylab='log base 10 of additional equity',
                 main = 'VIX vs. VXV different execution times')
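
As an optional extension (not part of the original run-through), the same stats can be recomputed over a recent subsample to see whether the rankings hold up in later years; the 2014 cutoff below is an arbitrary choice of mine:

# same stats over a recent window only (the start date is arbitrary)
recent <- compare["2014::"]
rbind(table.AnnualizedReturns(recent), 
      maxDrawdown(recent), CalmarRatio(recent))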

So here’s the run-through. In addition to the magical thinking strategy (observe the close, buy that same close), I tested three other variants–a variant which transacts at the next open, a variant which transacts at the next close, and the average of those two. Effectively, these three give a sense of a strategy’s performance under more realistic conditions–that is, how well the strategy performs if transacted throughout the day, assuming you’re managing a sum of money too large to just plow into the market in the closing minutes (and if you hope to get rich off of trading, you will have a larger sum of money than the amount you can apply magical thinking to). Ideally, I’d use VWAP pricing, but as that isn’t available for free anywhere I know of, readers couldn’t replicate the results even if I had such data.
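
For those who want something slightly closer to an intraday fill without paying for VWAP data, one crude stand-in is the average of the day’s OHLC prices. Here’s a sketch under that assumption–the ohlcAvg and nextAvgFill names are mine, an OHLC average is emphatically not a real VWAP, and it assumes the Dropbox files carry high and low columns alongside the opens and closes used above:

# crude intraday-fill proxy: average of the day's OHLC prices
# (a coarse stand-in, not an actual volume-weighted price)
ohlcAvg <- function(ohlc) (Op(ohlc) + Hi(ohlc) + Lo(ohlc) + Cl(ohlc))/4
vxxAvgRets <- Return.calculate(ohlcAvg(vxx))
xivAvgRets <- Return.calculate(ohlcAvg(xiv))
nextAvgFill <- vxxAvgRets * lag(vxxSig, 2) + xivAvgRets * lag(xivSig, 2)
# could be cbound with 'compare' above for an apples-to-apples look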

In any case, here are the results.

Equity curves:

Log scale (for Mr. Tony Cooper and others):

Stats:

                          Magic Thinking Next Open Next Close Execute Through Next Day
Annualized Return               0.814100 0.8922000  0.5932000                 0.821900
Annualized Std Dev              0.622800 0.6533000  0.6226000                 0.558100
Annualized Sharpe (Rf=0%)       1.307100 1.3656000  0.9529000                 1.472600
Worst Drawdown                  0.566122 0.5635336  0.6442294                 0.601014
Calmar Ratio                    1.437989 1.5831686  0.9208586                 1.367510

My reaction? The performance of executing on the next day’s close being vastly lower than that of the other configurations (with that deterioration occurring in the most recent years) essentially means that the fills will have to come pretty quickly at the beginning of the day. While the strategy seems somewhat scalable through the lens of this finger-in-the-air technique, if the first full day of possible execution after signal reception tanks a strategy from a 1.44 Calmar to a .92 with everything else held constant, that’s a massive drop-off. In my opinion, this is quite a valid question to ask anyone who simply sells signals, as opposed to managing assets: namely, how sensitive are the signals to execution on the next day? After all, unless those signals come at 3:55 PM, one is most likely going to be getting filled the next day.
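
Relatedly, a cheap way to put a number on how exposed a strategy is to this effect is to count how often the signal actually flips sides, since every flip is a latency-sensitive fill. A quick sketch using the vxxSig series from above (the flips-per-year arithmetic is my own addition, not part of the original test):

# count position changes; fewer flips means fewer latency-sensitive fills
flips <- abs(diff(vxxSig * 1))
sum(flips, na.rm = TRUE)                  # total number of flips
sum(flips, na.rm = TRUE) / nyears(vxxSig) # rough flips per year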

Now, while this strategy is a bit of a tomato can in terms of how good volatility trading strategies can get (they can get a *lot* better, in my opinion), I think it made for a simple little demonstration of this technique. Again, a huge thank you to Mr. Helmuth Vollmeier for so kindly keeping his Dropbox up all this time for the volatility data!

Thanks for reading.

NOTE: I am currently contracting in a data science capacity in Chicago. You can email me at ilya.kipnis@gmail.com, or find me on my LinkedIn here. I’m always open to beers after work if you’re in the Chicago area.

NOTE 2: Today, on October 21, 2015, if you’re in Chicago, there’s a Chicago R Users Group conference at Jaks Tap at 6:00 PM. Free pizza, networking, and R, hosted by Paul Teetor, who’s a finance guy. Hope to see you there.

12 thoughts on “How well can you scale your strategy?”

  1. Pingback: How well can you scale your strategy? | Mubashir Qasim

  2. Pingback: Quantocracy's Daily Wrap for 10/21/2015 | Quantocracy

  3. Hi Ilya,
    Thanks once again for sharing all the work you do here on the blog. Your demonstration of scalability confirms to me how unrealistic some of these trading strategies are for individual investors. This is why I was a bit disappointed to see you wanting to move in the direction of hypothesis testing and reliance on academic papers. A lot of the work academic papers report involves the use of large data sets from CRSP, and from that they suggest there are statistically significant results that can be used for trading. In fact, when you try them, they don’t work. So the Brian Peterson framework might work for the institutional crowd, but not for individual investors. Just my two cents.

    Fred.

    • These two things are actually completely independent of one another. My hypothesis testing was done on a strategy that retail investors (at least in America) can indeed replicate, at least in their IRAs.

      Yes, some investment ideas, such as holding a portfolio of 100 stocks or whatnot, are ridiculous for a retail investor. Luckily, there are ETFs to do a bunch of cool things that retail investors otherwise wouldn’t be able to do.

  4. The title is misleading. This post is about how fill assumptions affect strategy performance, not about the capacity at which you can scale the strategy (i.e. how much capital you can deploy with it).

    • That sort of goes hand in hand. If you plow so much capital into a strategy that you get worse fills, your strategy performance suffers. At least, that’s what my line of thinking was!

      • It’s true that market impact becomes a factor for some level of capital, but you’re not measuring that here. You’re just calculating a rough fill price estimate because you only have daily OHLC data. Nothing wrong with that, but it doesn’t come close to telling you how much capital you can deploy without market impact overwhelming your edge.

  5. I fully agree with Joshua. You are not measuring market impact, nor are you measuring the capacity of a strategy. It is a question of timing. Institutional investors measure the gap between trade idea and actual execution price via pre- and post-trade analysis. If you want more insight, contact me by email.

  6. Pingback: Best Links of the Week | Quantocracy

  7. The comment by Joshua is valid, but I thought that Ilya meant “scaling in time”. Regardless, what concerns me more is that XIV started trading on November 30, 2010, and the backtest starts in December of 2007. Apparently, someone has estimated the price series before that–is that the case? If so, I would be very careful in drawing any conclusions from that. My backtest results for next open also show high CAGR, on the order of 60%, but Sharpe is much lower, below 0.70 (Rf = 0). For next close I get 30% CAGR, much less, which is expected. When you backtest for a fill at next close, do you also exit the position at next close?

    Anyway, this is good work but the drawdown levels are prohibitive.

    • Yes, the exit is also at the next close. Volatility trading strategies at daily frequencies indeed have very high drawdowns, but IMO, it’s easier to simply scale down a very high-volatility strategy (new strategy: old strategy * 20%, cash * 80%, for instance) than it is to leverage a low-volatility strategy.
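
      To make that arithmetic concrete, here’s a minimal sketch using the compare object from the post, assuming the cash sleeve earns a flat 0%:

      # 20% strategy, 80% cash (cash assumed to earn nothing)
      scaledDown <- 0.2 * compare[, "Next Open"]
      rbind(table.AnnualizedReturns(scaledDown), maxDrawdown(scaledDown))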

  8. Ilya, I won’t ask for contact information regarding Helmuth Vollmeier, but would like to draw attention to a possible glitch that has appeared in the VXX data you list at his dropbox URL. The adjusted prices look fine and match what Yahoo lists back to August 5, 2014, but starting Aug 4 and prior there is a 0.25x glitch… values seem low by a factor of 4 (i.e., values match on Aug 5, but on Aug 4 adjusted close is 126.20 in Yahoo and only 31.55 in Helmuth’s data). The 4x correction needed is easily made, but use of the dropbox data in your script does not account for this. Perhaps you could drop a line to him about this issue.
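
    If the 4x discrepancy is confirmed, a correction along these lines could be applied before running the script above–a hypothetical patch, with the cutoff date and factor taken from the observation above; verify against another data source first:

    # hypothetical patch: scale everything through 2014-08-04 up by 4
    # (cutoff and factor as reported above; verify before trusting)
    vxxFixed <- vxx
    vxxFixed["::2014-08-04"] <- vxxFixed["::2014-08-04"] * 4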
