GARCH and a rudimentary application to Vol Trading

This post will review Kris Boudt’s Datacamp course and introduce some concepts from it, discuss GARCH, present an application of it to volatility trading strategies, and close with a somewhat more general review of Datacamp.

So, recently, Kris Boudt, one of the highest-ranking individuals on the open-source R/Finance totem pole (contrary to popular belief, I am not the be-all end-all of coding R in finance…probably just one of the more visible individuals due to not needing to run a trading desk), taught a course on Datacamp on GARCH models.

Naturally, an opportunity to learn from one of the most intelligent individuals in the field in a hand-held course does not come along every day. In fact, on Datacamp, you can find courses from some of the most intelligent individuals in the R/Finance community, such as Joshua Ulrich, Ross Bennett (teaching PortfolioAnalytics, no less), David Matteson, and, well, just about everyone short of Doug Martin and Brian Peterson themselves. That said, most of those courses are rather introductory, but occasionally, you get a course that covers a production-tier library that allows one to do some non-trivial things, such as this course, which covers Alexios Ghalanos’s rugarch library.

Ultimately, the course is definitely good for showing the basics of rugarch. And, given how I blog and use tools, I wholly subscribe to the 80/20 philosophy–essentially that you can get pretty far using basic building blocks in creative ways, or just taking a particular punchline and applying it to some data, and throwing it into a trading strategy to see how it does.

But before we do that, let’s discuss what GARCH is.

While I’ll save the Greek notation for those that feel inclined to do a google search, here’s the acronym:

Generalized Auto-Regressive Conditional Heteroskedasticity

What it means:

Generalized: a more general form of the ARCH (Auto-Regressive Conditional Heteroskedasticity) model, whose pieces are spelled out below.

Auto-Regressive: past values are used as inputs to predict future values.

Conditional: the current value differs given a past value.

Heteroskedasticity: varying volatility. That is, consider the VIX. It isn’t one constant level, such as 20. It varies with respect to time.

Or, to summarize: “use past volatility to predict future volatility because it changes over time.”

Now, there are some things that we know from empirical observation about looking at volatility in financial time series–namely that volatility tends to cluster–high vol is followed by high vol, and vice versa. That is, you don’t just have one-off huge moves one day, then calm moves like nothing ever happened. Also, volatility tends to revert over longer periods of time. That is, VIX doesn’t stay elevated for protracted periods of time, so more often than not, betting on its abatement can make some money (assuming the timing is correct).

Now, in the case of finance, which birthed the original GARCH, 3 individuals (Glosten-Jagannathan-Runkle) extended the model to take into account the fact that volatility in an asset spikes in the face of negative returns. That is, when did the VIX reach its heights? In the biggest of bear markets in the financial crisis. So, there’s an asymmetry in the face of positive and negative returns. This is called the GJR-GARCH model.
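For those who’d rather see the recursion than the Greek, here’s a minimal base R sketch of a GJR-GARCH(1,1) variance update. The function name gjrVariance and the parameter values (omega, alpha, gamma, beta) are made up for illustration and are not estimates from any data; the point is only that a negative shock gets the extra gamma term, so it raises next-period variance more than a positive shock of the same size.

```r
# GJR-GARCH(1,1) conditional variance recursion.
# Parameters are illustrative assumptions, not fitted values.
gjrVariance <- function(eps, omega = 1e-6, alpha = 0.03, gamma = 0.10, beta = 0.90) {
  n <- length(eps)
  sigma2 <- numeric(n)
  # start at the unconditional variance implied by the parameters
  # (assumes negative shocks occur half the time)
  sigma2[1] <- omega / (1 - alpha - gamma / 2 - beta)
  for (t in 2:n) {
    negShock <- as.numeric(eps[t - 1] < 0)  # 1 if the previous shock was negative
    sigma2[t] <- omega + (alpha + gamma * negShock) * eps[t - 1]^2 + beta * sigma2[t - 1]
  }
  sigma2
}

# same-sized shock, opposite signs
up <- gjrVariance(c(0.02, 0))
down <- gjrVariance(c(-0.02, 0))
sqrt(c(up = up[2], down = down[2]))  # next-day volatility after each shock
```

With these made-up parameters, the 2% loss produces a noticeably higher next-day variance than the 2% gain, which is the asymmetry the GJR extension is meant to capture.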

Now, here’s where the utility of the rugarch package comes in–rather than needing to reprogram every piece of math, Alexios Ghalanos has undertaken that effort for the good of everyone else, and implemented a whole multitude of prepackaged GARCH models that allow the end user to simply pick the type of GARCH model that best fits the assumptions the end user thinks best apply to the data at hand.

So, here’s the how-to.

First off, we’re going to get data for SPY from Yahoo finance, then specify our GARCH model.

The GARCH model has three components. The first is the mean model–that is, assumptions about the ARMA time series nature of the returns (in this case, I just assumed an AR(1)). The second is the variance model, which is the part in which you specify the type of GARCH model, along with variance targeting (which essentially forces an assumption of some amount of mean reversion, and which I had to use to actually get the GARCH model to converge in all cases). The last is the distribution model of the returns. In many models, there’s some in-built assumption of normality. In rugarch, however, you can relax that assumption by specifying something such as “std”–that is, the Student T Distribution–or in this case, “sstd”, the Skewed Student T Distribution. And when one thinks about S&P 500 returns, a skewed Student T distribution seems most reasonable: positive returns usually arise as a large collection of small gains, but losses occur in large chunks, so we want a distribution that can capture this property if the need arises.


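require(quantmod)              # getSymbols, Ad, Cl (also loads xts and TTR)
require(PerformanceAnalytics)  # Return.calculate and performance statistics
require(rugarch)               # ugarchspec, ugarchroll
require(downloader)            # download

# get SPY data from Yahoo
getSymbols("SPY", from = '1990-01-01')

spyRets = na.omit(Return.calculate(Ad(SPY)))

# GJR GARCH with an AR(1) mean model under a skewed Student T distribution for returns
gjrSpec = ugarchspec(mean.model = list(armaOrder = c(1,0)),
                      variance.model = list(model = "gjrGARCH",
                                            variance.targeting = TRUE),
                      distribution.model = "sstd")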

As you can see, with a single function call, the user can specify a very extensive model encapsulating assumptions about both the returns and the model which governs their variance. Once the model is specified, it’s equally simple to use it to create a rolling out-of-sample prediction–that is, just plug your data in, and after some burn-in period, you start to get predictions for a variety of metrics. Here’s the code to do that.

# Use rolling window of 504 days, refitting the model every 22 trading days
t1 = Sys.time()
garchroll = ugarchroll(gjrSpec, data = spyRets, 
                       n.start = 504, refit.window = "moving", refit.every = 22)
t2 = Sys.time()
print(t2 - t1)

# convert predictions to data frame
garchroll = as.data.frame(garchroll)

In this case, I use a rolling 504 day window that refits every 22 days (approximately 1 trading month). To note, if the window is too short, you may run into fail-to-converge instances, which would disallow converting the predictions to a data frame. The rolling predictions take about four minutes to run on the server instance I use, so refitting every single day is most likely not advised.

Here’s what the predictions look like:

                      Mu       Sigma      Skew    Shape Shape(GIG)      Realized
1995-01-30  6.635618e-06 0.005554050 0.9456084 4.116495          0 -0.0043100611
1995-01-31  4.946798e-04 0.005635425 0.9456084 4.116495          0  0.0039964165
1995-02-01  6.565350e-06 0.005592726 0.9456084 4.116495          0 -0.0003310769
1995-02-02  2.608623e-04 0.005555935 0.9456084 4.116495          0  0.0059735255
1995-02-03 -1.096157e-04 0.005522957 0.9456084 4.116495          0  0.0141870212
1995-02-06 -5.922663e-04 0.005494048 0.9456084 4.116495          0  0.0042281655

The salient quantity here is the Sigma quantity–that is, the prediction for daily volatility. This is the quantity that we want to compare against the VIX.

So the strategy we’re going to be investigating is essentially what I’ve seen referred to as VRP–the Volatility Risk Premium in Tony Cooper’s seminal paper, Easy Volatility Investing.

The idea of the VRP is that we compare some measure of realized volatility (e.g. a running standard deviation, or GARCH predictions from past data) to the VIX, which is an implied volatility (so, purely forward looking). When realized volatility (past/current measured) is greater than implied volatility, people are in a panic. Conversely, when implied volatility is greater than realized volatility, things are as they should be, and it should be feasible to harvest the volatility risk premium by shorting volatility (analogous to selling insurance).

The instruments we’ll be using for this are ZIV and VXZ. ZIV because SVXY is no longer supported on InteractiveBrokers or RobinHood, and then VXZ is its long volatility counterpart.

We’ll be using close-to-close returns; that is, get the signal on Monday morning, and transact on Monday’s close, rather than observe data on Friday’s close, and transact around that time period as well (also known as magical thinking, according to Brian Peterson).
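To make the signal and the two-day lag concrete, here’s a small base R sketch on toy numbers. Everything in it is invented for illustration (the volatility levels, the returns, and the helper lagBy); it only shows the mechanics of “compare realized to implied, then wait two bars before earning the return”.

```r
# Toy data; all numbers are invented for illustration only.
realized <- c(12, 14, 25, 30, 18, 15)  # annualized realized vol estimate, in VIX points
implied  <- c(15, 16, 20, 24, 22, 19)  # implied vol (VIX close)
shortVolRets <- c(0.01, 0.02, -0.05, -0.03, 0.02, 0.01)   # hypothetical short-vol instrument
longVolRets  <- c(-0.01, -0.02, 0.04, 0.03, -0.02, -0.01) # hypothetical long-vol instrument

# short volatility when implied exceeds realized, long volatility otherwise
shortSig <- realized < implied
longSig  <- realized > implied

# hypothetical helper: shift a vector forward by k observations, padding with NA
lagBy <- function(x, k) c(rep(NA, k), head(x, -k))

# the signal from day t's close is seen on day t+1, traded at the close of t+1,
# and first earns the close-to-close return of day t+2
stratRets <- lagBy(shortSig, 2) * shortVolRets + lagBy(longSig, 2) * longVolRets
stratRets
```

In this toy series, the lagged signal is still short volatility when the spike arrives, which is the same failure mode as walking into a real volatility explosion.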

getSymbols('^VIX', from = '1990-01-01')

# convert GARCH sigma predictions to same scale as the VIX by annualizing, multiplying by 100
# (the data frame's row names are the prediction dates)
garchPreds = xts(garchroll$Sigma * sqrt(252) * 100, order.by = as.Date(rownames(garchroll)))
diff = garchPreds - Ad(VIX)


download('', destfile='VXZlong.txt')
download('', destfile='ZIVlong.txt')

ziv = xts(read.zoo('ZIVlong.txt', format='%Y-%m-%d', sep = ',', header=TRUE))
vxz = xts(read.zoo('VXZlong.txt', format = '%Y-%m-%d', sep = ',', header = TRUE))

zivRets = na.omit(Return.calculate(Cl(ziv)))
vxzRets = na.omit(Return.calculate(Cl(vxz)))
vxzRets['2014-08-05'] = .045

zivSig = diff < 0 
vxzSig = diff > 0 

garchOut = lag(zivSig, 2) * zivRets + lag(vxzSig, 2) * vxzRets

histSpy = runSD(spyRets, n = 21, sample = FALSE) * sqrt(252) * 100
spyDiff = histSpy - Ad(VIX)

zivSig = spyDiff < 0 
vxzSig = spyDiff > 0 

spyOut = lag(zivSig, 2) * zivRets + lag(vxzSig, 2) * vxzRets

avg = (garchOut + spyOut)/2
compare = na.omit(cbind(garchOut, spyOut, avg))
colnames(compare) = c("gjrGARCH", "histVol", "avg")

With the following output:

stratStats <- function(rets) {
  stats <- rbind(table.AnnualizedReturns(rets), maxDrawdown(rets))
  stats[5,] = stats[1,]/stats[4,]
  stats[6,] = stats[1,]/UlcerIndex(rets)
  rownames(stats)[4] = "Worst Drawdown"
  rownames(stats)[5] = "Calmar Ratio"
  rownames(stats)[6] = "Ulcer Performance Index"
  return(stats)
}


> stratStats(compare)
                           gjrGARCH   histVol       avg
Annualized Return         0.2195000 0.2186000 0.2303000
Annualized Std Dev        0.2936000 0.2947000 0.2614000
Annualized Sharpe (Rf=0%) 0.7477000 0.7419000 0.8809000
Worst Drawdown            0.4310669 0.5635507 0.4271594
Calmar Ratio              0.5092017 0.3878977 0.5391429
Ulcer Performance Index   1.3563017 1.0203611 1.5208926

So, to comment on this strategy: this is definitely not something you will take and trade out of the box. Both variants of this strategy, when forced to choose a side, walk straight into the Feb 5 volatility explosion. Luckily, switching between ZIV and VXZ keeps the account from completely exploding in a spectacular failure. To note, both variants of the VRP strategy, GJR GARCH and the 21 day rolling realized volatility, suffer their own period of spectacularly large drawdown–the historical volatility in 2007-2008, and currently. This year has just been miserable for any reasonable volatility strategy, though: I myself am down 20%, and I’ve seen other strategists down that much as well in their primary strategies.

That said, I do think that over time, and if using the tail-end-of-the-curve instruments such as VXZ and ZIV (now that XIV is gone and SVXY no longer supported on several brokers such as Interactive Brokers and RobinHood), there are a number of strategies that might be feasible to pass off as a sort of trading analogue to machine learning’s “weak learner”.

That said, I’m not sure how many vastly different types of ways to approach volatility trading there are that make logical sense from an intuitive perspective (that is, “these two quantities have this type of relationship, which should give a consistent edge in trading volatility” rather than “let’s over-optimize these two parameters until we eliminate every drawdown”).

While I’ve written about the VIX3M/VIX6M ratio in the past, which has formed the basis of my proprietary trading strategy, I’d certainly love to investigate other volatility trading ideas out in public. For instance, I’d love to start the volatility trading equivalent of an AllocateSmartly type website–just a compendium of a reasonable suite of volatility trading strategies, track them, charge a subscription fee, and let users customize their own type of strategies. However, the prerequisite for that is that there are a lot of reasonable ways to trade volatility that don’t just walk into tail-end events such as the 2007-2008 transition, Feb 5, and so on.

Furthermore, as some recruiters have told me that I should also cross-post my blog scripts on my Github, I’ll start doing that also, from now on.

One last topic: a general review of Datacamp. As some of you may know, I instruct a course on datacamp. But furthermore, I’ve spent quite a bit of time taking courses (particularly in Python) on there as well, thanks to having access by being an instructor.

Generally, here’s the gist of it: Datacamp is a terrific resource for getting your feet wet and getting a basic overview of what technologies are out there. Generally, courses follow a “few minutes of lecture, do exercises using the exact same syntax you saw in the lecture”, with a lot of the skeleton already written for you, so you don’t wind up endlessly guessing. Generally, my procedure will be: “try to complete the exercise, and if I fail, go back and look at the slides to find an analogous block of code, change some names, and fill in”. 

Ultimately, if the world of data science, machine learning, and some quantitative finance is completely new to you–if you’re the kind of person that reads my blog, and completely glosses past the code: *this* is the resource for you, and I recommend it wholeheartedly. You’ll take some courses that give you a general tour of what data scientists, and occasionally, quants, do. And in some cases, you may have a professor in a fairly advanced field, like Kris Boudt, teach a fairly advanced topic, like the state-of-the art rugarch package (this *is* an industry-used package, and is actively maintained by Alexios Ghalanos, an economist at Amazon, so it’s far more than a pedagogical tool).

That said, for someone like me, who’s trying to port his career-capable R skills to Python to land a job (my last contract ended recently, so I am formally searching for a new role), Datacamp doesn’t *quite* do the trick–just yet. While there is a large catalog of courses, it does feel like there’s a lot of breadth, though not sure how much depth in terms of getting proficient enough to land interviews on the sole merits of DataCamp course completions. While there are Python course tracks (EG python developer, which I completed, and Python data analyst, which I also completed), I’m not sure they’re sufficient in terms of “this track was developed with partnership in industry–complete this capstone course, and we have industry partners willing to interview you”.

Also, from what I’ve seen of quantitative finance taught in Python, and having to rebuild all functions from numpy/pandas, I am puzzled as to how people do quantitative finance in Python without libraries like PerformanceAnalytics, rugarch, quantstrat, PortfolioAnalytics, and so on. Those libraries make expressing and analyzing investment ideas far more efficient, and remove a great chance of making something like an off-by-one error (also known as look-ahead bias in trading). So far, I haven’t seen the Python end of Datacamp dive deep into quantitative finance, and I hope that changes in the near future.

So, as a summary, I think this is a fantastic site for code-illiterate individuals to get their hands dirty and their feet wet with some coding, but I think the opportunity to create an economic, democratized, interest to career a-la-carte, self-paced experience is still very much there for the taking. And given the quality of instructors that Datacamp has worked with in the past (David Matteson–*the* regime change expert, I think–along with many other experts), I think Datacamp has a terrific opportunity to capitalize here.

So, if you’re the kind of person who glosses past the code: don’t gloss anymore. You can now take courses to gain an understanding of what my code does, and ask questions about it.

Thanks for reading.

NOTE: I am currently looking for networking opportunities and full-time roles related to my skill set. Feel free to download my resume or contact me on LinkedIn.


Principal Component Momentum?

This post will investigate using Principal Components as part of a momentum strategy.

Recently, I ran across a post from David Varadi that I thought I’d further investigate and translate into code I can explicitly display (as David Varadi doesn’t). Of course, as David Varadi is a quantitative research director with whom I’ve done good work in the past, I find that trying to investigate his ideas is worth the time spent.

So, here’s the basic idea: in an allegedly balanced universe, containing both aggressive assets (e.g. equity asset class ETFs) and defensive assets (e.g. fixed income asset class ETFs), principal component analysis, a cornerstone in machine learning, should have some effectiveness at creating an effective portfolio.

I decided to put that idea to the test with the following algorithm:

Using the same assets that David Varadi does, I first use a rolling window (between 6-18 months) to create principal components. I make sure that the SPY loading of the first PC is always positive (that is, if the loading for SPY is negative, I multiply the first PC by -1, as that’s the PC we use), and then create two portfolios–one comprised of the normalized positive weights of the first PC, and one comprised of the normalized negative half.
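As a minimal sketch of just that weight-splitting step, here it is in base R on simulated data. The returns are random numbers driven by one made-up “risk-on” factor (riskOn), and the ticker names are only column labels borrowed for readability, so nothing here reflects real market behavior.

```r
set.seed(42)
# Simulated daily returns driven by one common factor; numbers are invented.
n <- 250
riskOn <- rnorm(n, sd = 0.01)
simRets <- cbind(SPY = riskOn + rnorm(n, sd = 0.003),
                 EEM = 1.2 * riskOn + rnorm(n, sd = 0.004),
                 IEF = -0.3 * riskOn + rnorm(n, sd = 0.002),
                 TLT = -0.6 * riskOn + rnorm(n, sd = 0.003))

# loadings of the first principal component
firstPc <- prcomp(simRets)$rotation[, 1]

# PCA signs are arbitrary, so pin SPY's loading to be positive
if (firstPc["SPY"] < 0) firstPc <- -firstPc

# positive half: zero out negative loadings, then normalize to sum to 1
wtsPlus <- pmax(firstPc, 0)
wtsPlus <- wtsPlus / sum(wtsPlus)

# negative half: flip the sign, zero out what was positive, normalize
wtsMinus <- pmax(-firstPc, 0)
wtsMinus <- wtsMinus / sum(wtsMinus)

round(rbind(wtsPlus, wtsMinus), 3)
```

In this simulated universe, the “plus” portfolio holds only the equity-like columns and the “minus” portfolio only the bond-like ones, which is the aggressive/defensive split the idea relies on.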

Next, every month, I use some momentum lookback period (1, 3, 6, 10, and 12 months), and invest in the portfolio that performed best over that period for the next month, and repeat.

Here’s the source code to do that (and for those who have difficulty following, I highly recommend James Picerno’s Quantitative Investment Portfolio Analytics in R book):


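require(quantmod)              # getSymbols, Ad, endpoints
require(PerformanceAnalytics)  # Return.calculate, Return.portfolio

symbols <- c("SPY", "EFA", "EEM", "DBC", "HYG", "GLD", "IEF", "TLT")  

# get free data from yahoo
rets <- list()
getSymbols(symbols, src = 'yahoo', from = '1990-12-31')
for(i in 1:length(symbols)) {
  returns <- Return.calculate(Ad(get(symbols[i])))
  colnames(returns) <- symbols[i]
  rets[[i]] <- returns
}
# bind the per-symbol return series into one xts object, keeping common dates only
rets <- na.omit(do.call(cbind, rets))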

# 12 month rolling PC window, 3 month momentum window
pcPlusMinus <- function(rets, pcWindow = 12, momWindow = 3) {
  ep <- endpoints(rets)

  wtsPc1Plus <- NULL
  wtsPc1Minus <- NULL
  for(i in 1:(length(ep)-pcWindow)) {
    # get subset of returns
    returnSubset <- rets[(ep[i]+1):(ep[i+pcWindow])]
    # perform PCA, get first PC (I.E. pc1)
    pcs <- prcomp(returnSubset) 
    firstPc <- pcs[[2]][,1] # second element of prcomp output is the rotation (loadings) matrix
    # make sure SPY always has a positive loading
    # otherwise, SPY and related assets may have negative loadings sometimes
    # positive loadings other times, and creates chaotic return series
    if(firstPc['SPY'] < 0) {
      firstPc <- firstPc * -1
    }
    # create vector for negative values of pc1
    wtsMinus <- firstPc * -1
    wtsMinus[wtsMinus < 0] <- 0
    wtsMinus <- wtsMinus/(sum(wtsMinus)+1e-16) # in case zero weights
    # date the weights at the last observation of the PCA window
    wtsMinus <- xts(t(wtsMinus), order.by=last(index(returnSubset)))
    wtsPc1Minus[[i]] <- wtsMinus
    # create weight vector for positive values of pc1
    wtsPlus <- firstPc
    wtsPlus[wtsPlus < 0] <- 0
    wtsPlus <- wtsPlus/(sum(wtsPlus)+1e-16)
    wtsPlus <- xts(t(wtsPlus), order.by=last(index(returnSubset)))
    wtsPc1Plus[[i]] <- wtsPlus
  }
  # combine positive and negative PC1 weights
  wtsPc1Minus <- do.call(rbind, wtsPc1Minus)
  wtsPc1Plus <- do.call(rbind, wtsPc1Plus)
  # get return of PC portfolios
  pc1MinusRets <- Return.portfolio(R = rets, weights = wtsPc1Minus)
  pc1PlusRets <- Return.portfolio(R = rets, weights = wtsPc1Plus)
  # combine them
  combine <-na.omit(cbind(pc1PlusRets, pc1MinusRets))
  colnames(combine) <- c("PCplus", "PCminus")
  momEp <- endpoints(combine)
  momWts <- NULL
  for(i in 1:(length(momEp)-momWindow)){
    momSubset <- combine[(momEp[i]+1):(momEp[i+momWindow])]
    momentums <- Return.cumulative(momSubset)
    # weight of 1 on the portfolio with the higher momentum, 0 on the other
    momWts[[i]] <- xts(momentums==max(momentums), order.by=last(index(momSubset)))
  }
  momWts <- do.call(rbind, momWts)
  out <- Return.portfolio(R = combine, weights = momWts)
  colnames(out) <- paste("PCwin", pcWindow, "MomWin", momWindow, sep="_")
  return(list(out, wtsPc1Minus, wtsPc1Plus, combine))
}

pcWindows <- c(6, 9, 12, 15, 18)
momWindows <- c(1, 3, 6, 10, 12)

permutes <- expand.grid(pcWindows, momWindows)

stratStats <- function(rets) {
  stats <- rbind(table.AnnualizedReturns(rets), maxDrawdown(rets))
  stats[5,] <- stats[1,]/stats[4,]
  stats[6,] <- stats[1,]/UlcerIndex(rets)
  rownames(stats)[4] <- "Worst Drawdown"
  rownames(stats)[5] <- "Calmar Ratio"
  rownames(stats)[6] <- "Ulcer Performance Index"
  return(stats)
}

results <- NULL
for(i in 1:nrow(permutes)) {
  tmp <- pcPlusMinus(rets = rets, pcWindow = permutes$Var1[i], momWindow = permutes$Var2[i])
  results[[i]] <- tmp[[1]]
}
# one column of strategy returns per (pcWindow, momWindow) combination
results <- do.call(cbind, results)
stats <- stratStats(results)

After a cursory look at the results, it seems the performance is fairly miserable with my implementation, even by the standards of tactical asset allocation models (the good ones have a Calmar and Sharpe Ratio above 1).

Here are histograms of the Calmar and Sharpe ratios.


These values are generally too low for my liking. Here’s a screenshot of the table of all 25 results.


While my strategy of choosing which portfolio to hold is different from David Varadi’s (momentum instead of whether or not the aggressive portfolio is above its 200-day moving average), there are numerous studies that show these two methods are closely related, yet the results feel starkly different (and worse) compared to his site.

I’d certainly be willing to entertain suggestions as to how to improve the process, which will hopefully create some more meaningful results. I also know that AllocateSmartly expressed interest in implementing something along these lines for their estimable library of TAA strategies, so I thought I’d try to do it and see what results I’d find, which in this case, aren’t too promising.

Thanks for reading.

NOTE: I am networking, and actively seeking a position related to my skill set in either Philadelphia, New York City, or remotely. If you know of a position which may benefit from my skill set, feel free to let me know. You can reach me on my LinkedIn profile here, or email me.

A Review of James Picerno’s Quantitative Investment Portfolio Analytics in R

This is a review of James Picerno’s Quantitative Investment Portfolio Analytics in R. Overall, it’s about as fantastic a book as you can get on portfolio optimization until you start getting into corner cases stemming from large amounts of assets.

Here’s a quick summary of what the book covers:

1) How to install R.

2) How to create some rudimentary backtests.

3) Momentum.

4) Mean-Variance Optimization.

5) Factor Analysis

6) Bootstrapping/Monte-Carlo simulations.

7) Modeling Tail Risk

8) Risk Parity/Vol Targeting

9) Index replication

10) Estimating impacts of shocks

11) Plotting in ggplot

12) Downloading/saving data.

All in all, the book teaches the reader many fantastic techniques to get started doing some basic portfolio management using asset-class ETFs, and under the assumption of ideal data–that is, that there are few assets with concurrent starting times, that the number of assets is much smaller than the number of observations (I.E. 10 asset class ETFs, 90 day lookback windows, for instance), and other attributes taken for granted to illustrate concepts. I myself have used these concepts time and again (and, in fact, covered some of these topics on this blog, such as volatility targeting, momentum, and mean-variance), but in some of the work projects I’ve done, the trouble begins when the number of assets grows larger than the number of observations, or when assets move in or out of the investable universe (EG a new company has an IPO or a company goes bankrupt/merges/etc.). It also does not go into the PortfolioAnalytics package, developed by Ross Bennett and Brian Peterson. Having recently started to use this package for a real-world problem, it produces some very interesting results and its potential is immense, with the large caveat that you need an immense amount of computing power to generate lots of results for large-scale problems, which renders it impractical for many individual users. A quadratic optimization on a backtest with around 2400 periods and around 500 assets per rebalancing period (days) took about eight hours on a cloud server (when done sequentially to preserve full path dependency).

However, aside from delving into some somewhat-edge-case appears-more-in-the-professional-world topics, this book is extremely comprehensive. Simply, as far as managing a portfolio of asset-class ETFs (essentially, what the inimitable Adam Butler and crew from ReSolve Asset Management talk about, along with Walter’s fantastic site, AllocateSmartly), this book will impart a lot of knowledge that goes into doing those things. While it won’t make you as comfortable as say, an experienced professional like myself is at writing and analyzing portfolio optimization backtests, it will allow you to do a great deal of your own analysis, and certainly a lot more than anyone using Excel.

While I won’t rehash what the book covers in this post, what I will say is that it does cover some of the material I’ve posted in years past. And furthermore, rather than spending half the book about topics such as motivations, behavioral biases, and so on, this book goes right into the content that readers should know in order to execute the tasks they desire. Furthermore, the content is presented in a very coherent, English-and-code, matter-of-fact way, as opposed to a bunch of abstract mathematical derivations that treats practical implementation as an afterthought. Essentially, when one buys a cookbook, they don’t get it to read half of it for motivations as to why they should bake their own cake, but on how to do it. And as far as density of how-to, this book delivers in a way I think that other authors should strive to emulate.

Furthermore, I think that this book should be required reading for any analyst wanting to work in the field. It’s a very digestible “here’s how you do X” type of book. I.E. “here’s a data set, write a backtest based on these momentum rules, use an inverse-variance weighting scheme, do a Fama-French factor analysis on it”.

In any case, in my opinion, for anyone doing any sort of tactical asset allocation analysis in R, get this book now. For anyone doing any sort of tactical asset allocation analysis in spreadsheets, buy this book sooner than now, and then see the previous sentence. In any case, I’ll certainly be keeping this book on my shelf and referencing it if need be.

Thanks for reading.

Note: I am currently contracting but am currently on the lookout for full-time positions in New York City. If you know of a position which may benefit from my skills, please let me know. My LinkedIn profile can be found here.

A Different Way To Think About Drawdown — Geometric Calmar Ratio

This post will discuss the idea of the geometric Calmar ratio — a way to modify the Calmar ratio to account for compounding returns.

So, one thing that recently had me sort of annoyed in terms of my interpretation of the Calmar ratio is this: essentially, the way I interpret it is that it’s a back of the envelope measure of how many years it takes you to recover from the worst loss. That is, if a strategy makes 10% a year (on average), and has a loss of 10%, well, intuition serves that from that point on, on average, it’ll take about a year to make up that loss–that is, a Calmar ratio of 1. Put another way, it means that on average, a strategy will make money at the end of 252 trading days.

But, that isn’t really the case in all circumstances. If an investment manager is looking to create a small, meager return for their clients, and is looking to make somewhere between 5-10%, then sure, the Calmar ratio approximation and interpretation makes sense in that context. Or, it makes sense in the context of “every year, we withdraw all profits and deposit to make up for any losses”. But in the context of a hedge fund trying to create large, market-beating returns for its investors, those hedge funds can have fairly substantial drawdowns.

Citadel–one of the gold standards of the hedge fund industry, had a drawdown of more than 50% during the financial crisis, and of course, there was at least one fund that blew up in the storm-in-a-teacup volatility spike on Feb. 5 (in other words, if those guys were professionals, what does that make me? Or if I’m an amateur, what does that make them?).

In any case, in order to recover from such losses, it’s clear that a strategy would need to make back a lot more than what it lost. Lose 25%? You need a 33% gain to get back to the high water mark. Lose 33%? 50% to get back to even. Lose 50%? 100%. Beyond that? You get the idea.
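The arithmetic behind those numbers is short enough to sketch in base R. The function name requiredGain is just a label for this illustration; the formula is the gain needed to turn (1 - drawdown) back into 1.

```r
# gain needed to get back to the prior high after losing a fraction dd
requiredGain <- function(dd) 1 / (1 - dd) - 1

losses <- c(0.25, 1/3, 0.50, 0.75)
recovery <- requiredGain(losses)
round(recovery, 4)
# 0.3333 0.5000 1.0000 3.0000
```

Note that the relationship is convex: going from a 50% loss to a 75% loss triples the gain needed to recover.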

In order to capture this dynamic, we should write a new Calmar ratio to express this idea.

So here’s a function to compute the geometric calmar ratio:


geomCalmar <- function(r) {
  rAnn <- Return.annualized(r)
  maxDD <- maxDrawdown(r)
  # gain required to get from the drawdown trough back to the high water mark
  toHighwater <- 1/(1-maxDD) - 1
  out <- rAnn/toHighwater
  return(out)
}

So, let's compare how some symbols stack up. We'll take a high-volatility name (AMZN), the good old S&P 500 (SPY), and a very low volatility instrument (SHY).

getSymbols(c('AMZN', 'SPY', 'SHY'), from = '1990-01-01')
rets <- na.omit(cbind(Return.calculate(Ad(AMZN)), Return.calculate(Ad(SPY)), Return.calculate(Ad(SHY))))
compare <- rbind(table.AnnualizedReturns(rets), maxDrawdown(rets), CalmarRatio(rets), geomCalmar(rets))
rownames(compare)[6] <- "Geometric Calmar"

The returns start from July 31, 2002. Here are the statistics.

                           AMZN.Adjusted SPY.Adjusted SHY.Adjusted
Annualized Return             0.3450000   0.09110000   0.01940000
Annualized Std Dev            0.4046000   0.18630000   0.01420000
Annualized Sharpe (Rf=0%)     0.8528000   0.48860000   1.36040000
Worst Drawdown                0.6525491   0.55189461   0.02231459
Calmar Ratio                  0.5287649   0.16498652   0.86861760
Geometric Calmar              0.1837198   0.07393135   0.84923475

For my own proprietary volatility trading strategy, a strategy which has a Calmar above 2 (interpretation: finger in the air means that you make a new equity high every six months in the worst case scenario), here are the statistics:

> CalmarRatio(stratRetsAggressive[[2]]['2011::'])
Calmar Ratio 3.448497
> geomCalmar(stratRetsAggressive[[2]]['2011::'])
Annualized Return 2.588094

Essentially, because of the nature of losses compounding, the geometric Calmar ratio will always be lower than the standard Calmar ratio, which is to be expected when dealing with the geometric nature of compounding returns.
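One way to see why: since 1/(1 - DD) - 1 equals DD/(1 - DD), the geometric Calmar works out to the standard Calmar multiplied by (1 - DD), which is below 1 for any nonzero drawdown. Here’s a quick base R check of that identity against the numbers in the table above (values copied from the table, so they carry its rounding).

```r
# Calmar ratios and worst drawdowns copied from the comparison table above
calmar <- c(AMZN = 0.5287649, SPY = 0.16498652, SHY = 0.86861760)
maxDD  <- c(AMZN = 0.6525491, SPY = 0.55189461, SHY = 0.02231459)

# geometric Calmar = Calmar * (1 - max drawdown)
geom <- calmar * (1 - maxDD)
round(geom, 4)
```

The result reproduces the table’s Geometric Calmar row up to rounding, and it also shows that the penalty is tiny for a shallow drawdown like SHY’s and severe for a deep one like AMZN’s.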

I hope that this gives individuals some thought about re-evaluating the Calmar Ratio.

Thanks for reading.

NOTES: registration for R/Finance 2018 is open. As usual, I’ll be giving a lightning talk, this time on volatility trading.

I am currently contracting and seek network opportunities, along with information about prospective full time roles starting in July. Those interested in my skill set can feel free to reach out to me on LinkedIn here.

Creating a Table of Monthly Returns With R and a Volatility Trading Interview

This post will cover two aspects: the first will be a function to convert daily returns into a table of monthly returns, complete with drawdowns and annual returns. The second will be an interview I had with David Lincoln (now on youtube) to talk about the events of Feb. 5, 2018, and my philosophy on volatility trading.

So, to start off with, a function that I wrote that’s supposed to mimic PerformanceAnalytics’s table.CalendarReturns is below. What table.CalendarReturns is supposed to do is to create a month X year table of monthly returns with months across and years down. However, it never seemed to give me the output I was expecting, so I went and wrote another function.

Here’s the code for the function:


require(PerformanceAnalytics)  # maxDrawdown, Return.cumulative
require(data.table)            # data.table, dcast, year, month
require(scales)                # comma

# helper functions
pastePerc <- function(x) {return(paste0(comma(x),"%"))}
rowGsub <- function(x) {x <- gsub("NA%", "NA", x);x}

calendarReturnTable <- function(rets, digits = 3, percent = FALSE) {
  # get maximum drawdown using daily returns
  dds <- apply.yearly(rets, maxDrawdown)
  # get monthly returns
  rets <- apply.monthly(rets, Return.cumulative)
  # convert to data frame with year, month, and monthly return value
  dfRets <- cbind(year(index(rets)), month(index(rets)), coredata(rets))
  # convert to data table and reshape into year x month table
  dfRets <- data.frame(dfRets)
  colnames(dfRets) <- c("Year", "Month", "Value")
  monthNames <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
  for(i in 1:length(monthNames)) {
    dfRets$Month[dfRets$Month==i] <- monthNames[i]
  }
  dfRets <- data.table(dfRets)
  dfRets <- data.table::dcast(dfRets, Year~Month)
  # create row names and rearrange table in month order
  dfRets <- data.frame(dfRets)
  yearNames <- dfRets$Year
  rownames(dfRets) <- yearNames; dfRets$Year <- NULL
  dfRets <- dfRets[,monthNames]
  # append yearly returns and drawdowns
  yearlyRets <- apply.yearly(rets, Return.cumulative)
  dfRets$Annual <- as.numeric(yearlyRets)
  dfRets$DD <- as.numeric(dds)
  # convert to percentage
  if(percent) {
    dfRets <- dfRets * 100
  }
  # round for formatting
  dfRets <- apply(dfRets, 2, round, digits)
  # paste the percentage sign
  if(percent) {
    dfRets <- apply(dfRets, 2, pastePerc)
    dfRets <- apply(dfRets, 2, rowGsub)
    dfRets <- data.frame(dfRets)
    rownames(dfRets) <- yearNames
  }
  return(dfRets)
}

Here’s what the output looks like.

spy <- Quandl("EOD/SPY", type='xts', start_date='1990-01-01')
spyRets <- Return.calculate(spy$Adj_Close)
calendarReturnTable(spyRets, percent = FALSE)
        Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct    Nov    Dec Annual    DD
1993  0.000  0.011  0.022 -0.026  0.027  0.004 -0.005  0.038 -0.007  0.020 -0.011  0.012  0.087 0.047
1994  0.035 -0.029 -0.042  0.011  0.016 -0.023  0.032  0.038 -0.025  0.028 -0.040  0.007  0.004 0.085
1995  0.034  0.041  0.028  0.030  0.040  0.020  0.032  0.004  0.042 -0.003  0.044  0.016  0.380 0.026
1996  0.036  0.003  0.017  0.011  0.023  0.009 -0.045  0.019  0.056  0.032  0.073 -0.024  0.225 0.076
1997  0.062  0.010 -0.044  0.063  0.063  0.041  0.079 -0.052  0.048 -0.025  0.039  0.019  0.335 0.112
1998  0.013  0.069  0.049  0.013 -0.021  0.043 -0.014 -0.141  0.064  0.081  0.056  0.065  0.287 0.190
1999  0.035 -0.032  0.042  0.038 -0.023  0.055 -0.031 -0.005 -0.022  0.064  0.017  0.057  0.204 0.117
2000 -0.050 -0.015  0.097 -0.035 -0.016  0.020 -0.016  0.065 -0.055 -0.005 -0.075 -0.005 -0.097 0.171
2001  0.044 -0.095 -0.056  0.085 -0.006 -0.024 -0.010 -0.059 -0.082  0.013  0.078  0.006 -0.118 0.288
2002 -0.010 -0.018  0.033 -0.058 -0.006 -0.074 -0.079  0.007 -0.105  0.082  0.062 -0.057 -0.216 0.330
2003 -0.025 -0.013  0.002  0.085  0.055  0.011  0.018  0.021 -0.011  0.054  0.011  0.050  0.282 0.137
2004  0.020  0.014 -0.013 -0.019  0.017  0.018 -0.032  0.002  0.010  0.013  0.045  0.030  0.107 0.075
2005 -0.022  0.021 -0.018 -0.019  0.032  0.002  0.038 -0.009  0.008 -0.024  0.044 -0.002  0.048 0.070
2006  0.024  0.006  0.017  0.013 -0.030  0.003  0.004  0.022  0.027  0.032  0.020  0.013  0.158 0.076
2007  0.015 -0.020  0.012  0.044  0.034 -0.015 -0.031  0.013  0.039  0.014 -0.039 -0.011  0.051 0.099
2008 -0.060 -0.026 -0.009  0.048  0.015 -0.084 -0.009  0.015 -0.094 -0.165 -0.070  0.010 -0.368 0.476
2009 -0.082 -0.107  0.083  0.099  0.058 -0.001  0.075  0.037  0.035 -0.019  0.062  0.019  0.264 0.271
2010 -0.036  0.031  0.061  0.015 -0.079 -0.052  0.068 -0.045  0.090  0.038  0.000  0.067  0.151 0.157
2011  0.023  0.035  0.000  0.029 -0.011 -0.017 -0.020 -0.055 -0.069  0.109 -0.004  0.010  0.019 0.186
2012  0.046  0.043  0.032 -0.007 -0.060  0.041  0.012  0.025  0.025 -0.018  0.006  0.009  0.160 0.097
2013  0.051  0.013  0.038  0.019  0.024 -0.013  0.052 -0.030  0.032  0.046  0.030  0.026  0.323 0.056
2014 -0.035  0.046  0.008  0.007  0.023  0.021 -0.013  0.039 -0.014  0.024  0.027 -0.003  0.135 0.073
2015 -0.030  0.056 -0.016  0.010  0.013 -0.020  0.023 -0.061 -0.025  0.085  0.004 -0.017  0.013 0.119
2016 -0.050 -0.001  0.067  0.004  0.017  0.003  0.036  0.001  0.000 -0.017  0.037  0.020  0.120 0.103
2017  0.018  0.039  0.001  0.010  0.014  0.006  0.021  0.003  0.020  0.024  0.031  0.012  0.217 0.026
2018  0.056 -0.031     NA     NA     NA     NA     NA     NA     NA     NA     NA     NA  0.023 0.101

And with percentage formatting:

calendarReturnTable(spyRets, percent = TRUE)
Using 'Value' as value column. Use 'value.var' to override
         Jan      Feb     Mar     Apr     May     Jun     Jul      Aug      Sep      Oct     Nov     Dec   Annual      DD
1993  0.000%   1.067%  2.241% -2.559%  2.697%  0.367% -0.486%   3.833%  -0.726%   1.973% -1.067%  1.224%   8.713%  4.674%
1994  3.488%  -2.916% -4.190%  1.121%  1.594% -2.288%  3.233%   3.812%  -2.521%   2.843% -3.982%  0.724%   0.402%  8.537%
1995  3.361%   4.081%  2.784%  2.962%  3.967%  2.021%  3.217%   0.445%   4.238%  -0.294%  4.448%  1.573%  38.046%  2.595%
1996  3.558%   0.319%  1.722%  1.087%  2.270%  0.878% -4.494%   1.926%   5.585%   3.233%  7.300% -2.381%  22.489%  7.629%
1997  6.179%   0.957% -4.414%  6.260%  6.321%  4.112%  7.926%  -5.180%   4.808%  -2.450%  3.870%  1.910%  33.478% 11.203%
1998  1.288%   6.929%  4.876%  1.279% -2.077%  4.259% -1.351% -14.118%   6.362%   8.108%  5.568%  6.541%  28.688% 19.030%
1999  3.523%  -3.207%  4.151%  3.797% -2.287%  5.538% -3.102%  -0.518%  -2.237%   6.408%  1.665%  5.709%  20.388% 11.699%
2000 -4.979%  -1.523%  9.690% -3.512% -1.572%  1.970% -1.570%   6.534%  -5.481%  -0.468% -7.465% -0.516%  -9.730% 17.120%
2001  4.446%  -9.539% -5.599%  8.544% -0.561% -2.383% -1.020%  -5.933%  -8.159%   1.302%  7.798%  0.562% -11.752% 28.808%
2002 -0.980%  -1.794%  3.324% -5.816% -0.593% -7.376% -7.882%   0.680% -10.485%   8.228%  6.168% -5.663% -21.588% 32.968%
2003 -2.459%  -1.348%  0.206%  8.461%  5.484%  1.066%  1.803%   2.063%  -1.089%   5.353%  1.092%  5.033%  28.176% 13.725%
2004  1.977%   1.357% -1.320% -1.892%  1.712%  1.849% -3.222%   0.244%   1.002%   1.288%  4.451%  3.015%  10.704%  7.526%
2005 -2.242%   2.090% -1.828% -1.874%  3.222%  0.150%  3.826%  -0.937%   0.800%  -2.365%  4.395% -0.190%   4.827%  6.956%
2006  2.401%   0.573%  1.650%  1.263% -3.012%  0.264%  0.448%   2.182%   2.699%   3.152%  1.989%  1.337%  15.847%  7.593%
2007  1.504%  -1.962%  1.160%  4.430%  3.392% -1.464% -3.131%   1.283%   3.870%   1.357% -3.873% -1.133%   5.136%  9.925%
2008 -6.046%  -2.584% -0.903%  4.766%  1.512% -8.350% -0.899%   1.545%  -9.437% -16.519% -6.961%  0.983% -36.807% 47.592%
2009 -8.211% -10.745%  8.348%  9.935%  5.845% -0.068%  7.461%   3.694%   3.545%  -1.923%  6.161%  1.907%  26.364% 27.132%
2010 -3.634%   3.119%  6.090%  1.547% -7.945% -5.175%  6.830%  -4.498%   8.955%   3.820%  0.000%  6.685%  15.057% 15.700%
2011  2.330%   3.474%  0.010%  2.896% -1.121% -1.688% -2.000%  -5.498%  -6.945%  10.915% -0.406%  1.044%   1.888% 18.609%
2012  4.637%   4.341%  3.216% -0.668% -6.006%  4.053%  1.183%   2.505%   2.535%  -1.820%  0.566%  0.900%  15.991%  9.687%
2013  5.119%   1.276%  3.798%  1.921%  2.361% -1.336%  5.168%  -2.999%   3.168%   4.631%  2.964%  2.589%  32.307%  5.552%
2014 -3.525%   4.552%  0.831%  0.695%  2.321%  2.064% -1.344%   3.946%  -1.379%   2.355%  2.747% -0.256%  13.462%  7.273%
2015 -2.963%   5.620% -1.574%  0.983%  1.286% -2.029%  2.259%  -6.095%  -2.543%   8.506%  0.366% -1.718%   1.252% 11.910%
2016 -4.979%  -0.083%  6.724%  0.394%  1.701%  0.350%  3.647%   0.120%   0.008%  -1.734%  3.684%  2.028%  12.001% 10.306%
2017  1.789%   3.929%  0.126%  0.993%  1.411%  0.637%  2.055%   0.292%   2.014%   2.356%  3.057%  1.209%  21.700%  2.609%
2018  5.636%  -3.118%      NA      NA      NA      NA      NA       NA       NA       NA      NA      NA   2.342% 10.102%
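For readers without these packages installed, the same year-by-month layout can be sketched in base R. This is only an illustration on made-up returns (the toyRets series and its date range are assumptions, not SPY data), and it skips the drawdown column and the percent formatting:

```r
# toy daily returns: hypothetical data for illustration, not SPY
dates <- seq(as.Date("2017-11-01"), as.Date("2018-02-28"), by = "day")
set.seed(42)
toyRets <- rnorm(length(dates), mean = 0.0005, sd = 0.01)

# year and month labels; month.abb keeps the columns in calendar order
yr <- format(dates, "%Y")
mon <- factor(month.abb[as.integer(format(dates, "%m"))], levels = month.abb)

# compound the daily returns within each (year, month) cell
compound <- function(x) prod(1 + x) - 1
monthly <- tapply(toyRets, list(yr, mon), compound)
annual <- tapply(toyRets, yr, compound)

# months with no data stay NA, as in the 2018 row above
toyTable <- cbind(round(monthly, 3), Annual = round(as.vector(annual), 3))
print(toyTable)
```

The point of the sketch is the compounding step: monthly and annual figures are products of (1 + daily return), not sums.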

That covers it for the function. Now, onto volatility trading. Dodging the February short volatility meltdown has, in my opinion, been one of the best out-of-sample validators for my volatility trading research. My subscriber numbers confirm it: I’ve received 12 new subscribers this month, as individuals interested in the volatility trading space have gained a newfound respect for the risk management that my system uses. After all, it’s the down months that vindicate system traders like myself who do not employ leverage in the up times. Those interested in following my trades can subscribe here.

Furthermore, I recently got a chance to speak with David Lincoln about my background, my philosophy on trading in general, and trading volatility in particular. Those interested can view the interview here.

Thanks for reading.

NOTE: I am currently interested in networking, full-time positions related to my skill set, and long-term consulting projects. Those interested in discussing professional opportunities can find me on LinkedIn after writing a note expressing their interest.

Which Implied Volatility Ratio Is Best?

This post will be about comparing a volatility signal using three different variations of implied volatility indices to predict when to enter a short volatility position.

In volatility trading, there are three separate implied volatility indices that have a somewhat long history for trading–the VIX (everyone knows this one), the VXV (more recently renamed the VIX3M), which is like the VIX except over a three-month period, and the VXMT, which measures six-month implied volatility.

This gives rise to three separate implied volatility ratios to investigate: VIX/VIX3M (aka VXV), VIX/VXMT, and VIX3M/VXMT, as predictors for entering a short (or long) volatility position.

So, let’s get the data.



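library(data.table)            # fread
library(quantmod)              # Cl; also loads xts, zoo and TTR (SMA)
library(PerformanceAnalytics)  # Return.calculate and the performance statistics

# the path or URL of the VIX history CSV goes in the empty string
VIX <- fread("", skip = 1)
VIXdates <- VIX$Date
VIX$Date <- NULL; VIX <- xts(VIX, order.by = as.Date(VIXdates, format = '%m/%d/%Y'))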

vxv <- xts(read.zoo("vxvData.csv", header=TRUE, sep=",", format="%m/%d/%Y", skip=2))
vxmt <- xts(read.zoo("vxmtData.csv", header=TRUE, sep=",", format="%m/%d/%Y", skip=2))


xiv <- xts(read.zoo("longXIV.txt", format="%Y-%m-%d", sep=",", header=TRUE))

xivRets <- Return.calculate(Cl(xiv))

One quick strategy to investigate is simple–the idea that the ratio should be below 1 (i.e. contango in the implied volatility term structure) and decreasing (below a moving average). So when the ratio is below 1 (that is, with longer-term implied volatility greater than shorter-term) and below its 60-day moving average, the strategy takes a position in XIV.

Here’s the code to do that.

vixVix3m <- Cl(VIX)/Cl(vxv)
vixVxmt <- Cl(VIX)/Cl(vxmt)
vix3mVxmt <- Cl(vxv)/Cl(vxmt)

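stratStats <- function(rets) {
  # annualized return, std dev and Sharpe, plus worst drawdown as row 4
  stats <- rbind(table.AnnualizedReturns(rets), maxDrawdown(rets))
  # Calmar ratio: annualized return over worst drawdown
  stats[5,] <- stats[1,]/stats[4,]
  # Ulcer Performance Index: annualized return over Ulcer Index
  stats[6,] <- stats[1,]/UlcerIndex(rets)
  rownames(stats)[4] <- "Worst Drawdown"
  rownames(stats)[5] <- "Calmar Ratio"
  rownames(stats)[6] <- "Ulcer Performance Index"
  return(stats)
}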

maShort <- SMA(vixVix3m, 60)
maMed <- SMA(vixVxmt, 60)
maLong <- SMA(vix3mVxmt, 60)

sigShort <- vixVix3m < 1 & vixVix3m < maShort
sigMed <- vixVxmt < 1 & vixVxmt < maMed 
sigLong <- vix3mVxmt < 1 & vix3mVxmt < maLong 

retsShort <- lag(sigShort, 2) * xivRets 
retsMed <- lag(sigMed, 2) * xivRets 
retsLong <- lag(sigLong, 2) * xivRets

compare <- na.omit(cbind(retsShort, retsMed, retsLong))
colnames(compare) <- c("Short", "Medium", "Long")

With the following performance:


> stratStats(compare)
                              Short    Medium     Long
Annualized Return         0.5485000 0.6315000 0.638600
Annualized Std Dev        0.3874000 0.3799000 0.378900
Annualized Sharpe (Rf=0%) 1.4157000 1.6626000 1.685600
Worst Drawdown            0.5246983 0.5318472 0.335756
Calmar Ratio              1.0453627 1.1873711 1.901976
Ulcer Performance Index   3.7893478 4.6181788 5.244137

In other words, the VIX3M/VXMT ratio sports the lowest drawdown (by a large margin) along with the highest returns.
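As a quick sanity check on how the Calmar Ratio row is built (annualized return over worst drawdown, per stratStats), here is the arithmetic redone by hand for the Long column, using the rounded figures printed in the table:

```r
# figures copied from the Long (VIX3M/VXMT) column of the table above
annRet <- 0.6386
worstDD <- 0.335756

# Calmar ratio: annualized return divided by worst drawdown
calmar <- annRet / worstDD
round(calmar, 3)   # 1.902
```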

So, when people talk about which implied volatility ratio to use, I think this offers some strong evidence for using the longer-horizon ratio (VIX3M/VXMT) as the term structure signal. It’s also why it forms the basis of my subscription strategy.

Thanks for reading.

NOTE: I am currently seeking a full-time position (remote or in the northeast U.S.) related to my skill set demonstrated on this blog. Please message me on LinkedIn if you know of any opportunities which may benefit from my skill set.

Replicating Volatility ETN Returns From CBOE Futures

This post will demonstrate how to replicate the volatility ETNs (XIV, VXX, ZIV, VXZ) from CBOE futures, thereby allowing any individual to create synthetic ETN returns from before their inception, free of cost.

So, before I get to the actual algorithm, it depends on an update to the term structure algorithm I shared some months back.

In that algorithm, mistakenly (or for the purpose of simplicity), I used calendar days as the time to expiry, when it should have been business days, which exclude weekends and holidays, the latter being an irritating artifact to keep track of.

So here’s the salient change, in the loop that calculates times to expiry:


masterlist <- list()
timesToExpiry <- list()
for(i in 1:length(contracts)) {
  # obtain data
  contract <- contracts[i]
  dataFile <- paste0(stem, contract, "_VX.csv")
  expiryYear <- paste0("20",substr(contract, 2, 3))
  expiryMonth <- monthMaps$monthNum[monthMaps$futureStem == substr(contract,1,1)]
  expiryDate <- dates$dates[dates$dateMon == paste(expiryYear, expiryMonth, sep="-")]
  data <- tryCatch(
    {
      # read the contract's CSV (fread assumed here, as used elsewhere on this blog)
      suppressWarnings(fread(dataFile))
    }, error = function(e){return(NULL)}
  )
  if(!is.null(data)) {
    # create dates
    dataDates <- as.Date(data$`Trade Date`, format = '%m/%d/%Y')
    # create time to expiration xts
    toExpiry <- xts(bizdays(dataDates, expiryDate), order.by=dataDates)
    colnames(toExpiry) <- contract
    timesToExpiry[[i]] <- toExpiry
    # get settlements
    settlement <- xts(data$Settle, order.by=dataDates)
    colnames(settlement) <- contract
    masterlist[[i]] <- settlement
  }
}

The one salient line, in particular, is this:

toExpiry <- xts(bizdays(dataDates, expiryDate), order.by=dataDates)

What is this bizdays function? It comes from the bizdays package in R.

There’s also the tradingHolidays.R script, which makes further use of the bizdays package. Here’s what goes on under the hood in tradingHolidays.R, for those that wish to replicate the code:

easters <- read.csv("easters.csv", header = FALSE)
easterDates <- as.Date(paste0(substr(easters$V2, 1, 6), easters$V3), format = '%m/%d/%Y')-2

nonEasters <- read.csv("nonEasterHolidays.csv", header = FALSE)
nonEasterDates <- as.Date(paste0(substr(nonEasters$V2, 1, 6), nonEasters$V3), format = '%m/%d/%Y')

weekdayNonEasters <- nonEasterDates[which(!weekdays(nonEasterDates) %in% c("Saturday", "Sunday"))]

hurricaneSandy <- as.Date(c("2012-10-29", "2012-10-30"))

holidays <- sort(c(easterDates, weekdayNonEasters, hurricaneSandy))
holidays <- holidays[holidays > as.Date("2003-12-31") & holidays < as.Date("2019-01-01")]


create.calendar("HolidaysUS", holidays, weekdays = c("saturday", "sunday"))
bizdays.options$set(default.calendar = "HolidaysUS")

There are two CSVs that I manually compiled, but will share screenshots of–they are the Easter holidays (which have to be shifted from Sunday back to Friday, since the market closes on Good Friday, hence the -2 in the code above), and the rest of the national holidays.

Here is what the easters csv looks like:


And the nonEasterHolidays CSV, which contains New Year’s Day, MLK Jr. Day, President’s Day, Memorial Day, Independence Day, Labor Day, Thanksgiving Day, and Christmas Day (along with their observed dates):

Furthermore, we need to adjust for the two days that equities were not trading due to Hurricane Sandy.

So then, the list of holidays looks like this:

> holidays
  [1] "2004-01-01" "2004-01-19" "2004-02-16" "2004-04-09" "2004-05-31" "2004-07-05" "2004-09-06" "2004-11-25"
  [9] "2004-12-24" "2004-12-31" "2005-01-17" "2005-02-21" "2005-03-25" "2005-05-30" "2005-07-04" "2005-09-05"
 [17] "2005-11-24" "2005-12-26" "2006-01-02" "2006-01-16" "2006-02-20" "2006-04-14" "2006-05-29" "2006-07-04"
 [25] "2006-09-04" "2006-11-23" "2006-12-25" "2007-01-01" "2007-01-02" "2007-01-15" "2007-02-19" "2007-04-06"
 [33] "2007-05-28" "2007-07-04" "2007-09-03" "2007-11-22" "2007-12-25" "2008-01-01" "2008-01-21" "2008-02-18"
 [41] "2008-03-21" "2008-05-26" "2008-07-04" "2008-09-01" "2008-11-27" "2008-12-25" "2009-01-01" "2009-01-19"
 [49] "2009-02-16" "2009-04-10" "2009-05-25" "2009-07-03" "2009-09-07" "2009-11-26" "2009-12-25" "2010-01-01"
 [57] "2010-01-18" "2010-02-15" "2010-04-02" "2010-05-31" "2010-07-05" "2010-09-06" "2010-11-25" "2010-12-24"
 [65] "2011-01-17" "2011-02-21" "2011-04-22" "2011-05-30" "2011-07-04" "2011-09-05" "2011-11-24" "2011-12-26"
 [73] "2012-01-02" "2012-01-16" "2012-02-20" "2012-04-06" "2012-05-28" "2012-07-04" "2012-09-03" "2012-10-29"
 [81] "2012-10-30" "2012-11-22" "2012-12-25" "2013-01-01" "2013-01-21" "2013-02-18" "2013-03-29" "2013-05-27"
 [89] "2013-07-04" "2013-09-02" "2013-11-28" "2013-12-25" "2014-01-01" "2014-01-20" "2014-02-17" "2014-04-18"
 [97] "2014-05-26" "2014-07-04" "2014-09-01" "2014-11-27" "2014-12-25" "2015-01-01" "2015-01-19" "2015-02-16"
[105] "2015-04-03" "2015-05-25" "2015-07-03" "2015-09-07" "2015-11-26" "2015-12-25" "2016-01-01" "2016-01-18"
[113] "2016-02-15" "2016-03-25" "2016-05-30" "2016-07-04" "2016-09-05" "2016-11-24" "2016-12-26" "2017-01-02"
[121] "2017-01-16" "2017-02-20" "2017-04-14" "2017-05-29" "2017-07-04" "2017-09-04" "2017-11-23" "2017-12-25"
[129] "2018-01-01" "2018-01-15" "2018-02-19" "2018-03-30" "2018-05-28" "2018-07-04" "2018-09-03" "2018-11-22"
[137] "2018-12-25"

So once we have a list of holidays, we use the bizdays package to set the holidays and weekends (Saturday and Sunday) as our non-business days, and use that function to calculate the correct times to expiry.
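To make the calendar-day versus business-day difference concrete, here is a minimal base R sketch. The countBizDays helper is hypothetical (it is not part of the bizdays package, and the post's actual code uses bizdays with the calendar created above), and the expiry date is just an assumed example:

```r
# hypothetical helper, not part of the bizdays package:
# counts weekdays that are not holidays in the interval (from, to]
countBizDays <- function(from, to, holidays) {
  days <- seq(from + 1, to, by = "day")
  # %u gives the ISO weekday number: 6 = Saturday, 7 = Sunday
  isWeekend <- format(days, "%u") %in% c("6", "7")
  sum(!isWeekend & !(days %in% holidays))
}

from <- as.Date("2018-03-26")
to <- as.Date("2018-04-18")        # assumed expiry date, for illustration
holidays <- as.Date("2018-03-30")  # Good Friday, from the holiday list above

as.numeric(to - from)              # 23 calendar days
countBizDays(from, to, holidays)   # 16 business days
```

Seven days out of 23 is a large error in a time-to-expiry figure, which is why the switch to business days matters for the weights that follow.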

So, now that we have the updated expiry structure, we can write a function that will correctly replicate the four main volatility ETNs–XIV, VXX, ZIV, and VXZ.

Here’s the English explanation:

VXX is made up of two contracts, the front month and the back month, and the front month has a certain number of trading days (AKA business days) until expiry, say, 17. During that timeframe, the front month (let’s call it M1) goes from being the entire allocation of funds to being none of it as it approaches expiry. That is, as the front contract approaches expiry, the second contract gradually receives more and more weight, until, at expiry of the front month contract, the second month contract contains all of the funds–just as it *becomes* the front month contract. So, say you have 17 days to expiry on the front month. At the expiry of the previous contract, the second month will have a weight of 17/17–100%, as it becomes the front month. Then, the next day, that contract, now the front month, will have a weight of 16/17 at settle, then 15/17, and so on. That numerator is called dr, and the denominator is called dt.
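The weight schedule described above can be written out in a few lines. This is only a sketch of the 17-day example from the text; the variable names wtFront and wtBack are mine:

```r
dt <- 17               # business days in the roll period (the 17 from the text)
dr <- dt:0             # days remaining at each successive settle

wtFront <- dr / dt     # front-month (M1) weight: 17/17, 16/17, 15/17, ...
wtBack <- 1 - wtFront  # back-month (M2) weight picks up the remainder

# first three settles of the roll period
round(head(cbind(dr, wtFront, wtBack), 3), 4)
```

The front-month weight falls linearly from 1 to 0 over the period, and the two weights always sum to 1.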

However, beyond this, there’s a second mechanism that’s responsible for the VXX looking like it does as compared to a basic futures contract (that is, the decay responsible for short volatility’s profits), and that is the “instantaneous” rebalancing. That is, the returns for a given day are today’s settles multiplied by yesterday’s weights, over yesterday’s settles multiplied by yesterday’s weights, minus one. Writing w_t-1 for dr/dt as of yesterday’s settle, that is (S_1_t * w_t-1 + S_2_t * (1 - w_t-1)) / (S_1_t-1 * w_t-1 + S_2_t-1 * (1 - w_t-1)) – 1 (I could use a tutorial on LaTeX). So, when you move forward a day, today’s weights become the t-1 weights.

Yet, when were the assets able to be rebalanced? Well, in the ETNs such as VXX and VXZ, the “hand-waving” is that it happens instantaneously. That is, the weight for the front month was 93%, the return was realized at settlement (that is, from settle to settle), and immediately after that return was realized, the front month’s weight shifts from 93% to, say, 88%. So, say Credit Suisse (which issues XIV and ZIV) has $10,000 (just to keep the arithmetic and number of zeroes tolerable, obviously there are a lot more in reality) worth of XIV outstanding immediately after realizing returns. It will sell $500 of its $9300 in the front month and immediately move it to the second month, so it will go from $9300 in M1 and $700 in M2 to $8800 in M1 and $1200 in M2. When did those $500 move? Immediately, instantaneously, and if you like, you can apply Clarke’s Third Law and call it “magically”.
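Here is that one-day calculation as a small worked example. The settle prices are made up for illustration (they are not market data); the 93% and 88% weights and the $10,000 notional are the ones from the paragraph above:

```r
# hypothetical settles (yesterday, today), not real market data
m1 <- c(15.0, 14.5)   # front month
m2 <- c(16.0, 15.8)   # second month

wPrev <- 0.93         # yesterday's front-month weight
wNew <- 0.88          # weight after the "instantaneous" rebalance

# today's and yesterday's settles, both valued with yesterday's weights
valToday <- m1[2] * wPrev + m2[2] * (1 - wPrev)
valYesterday <- m1[1] * wPrev + m2[1] * (1 - wPrev)

vxxRet <- valToday / valYesterday - 1
xivRet <- -1 * vxxRet   # the inverse product, as in the comparison later in the post

# dollars shifted from M1 to M2 right after the return is realized
notional <- 10000
moved <- notional * (wPrev - wNew)
c(vxxRet = vxxRet, xivRet = xivRet, moved = moved)
```

Only after the return is locked in with the old weights does the new weight take effect, which is the optimistic assumption flagged in the comments of the function below.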

The only exception is the day after roll day, in which the second month simply becomes the front month as the previous front month expires, so what was a 100% weight on the second month will now be a 100% weight on the front month, so there’s some extra code that needs to be written to make that distinction.

That’s the way it works for VXX and XIV. What’s the difference for VXZ and ZIV? It’s really simple–VXZ uses the exact same weightings (that is, the time remaining on the front month vs. how many days exist for that contract to be the front month), but instead of M1 and M2, it applies them to M4, M5, M6, and M7, with M4 taking dr/dt, M5 and M6 always being 1, and M7 being 1-dr/dt.
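One thing worth noticing about the VXZ weights: they always add up to 3, so dividing by 3 gives each contract's share of the basket. A tiny sketch, with dr chosen arbitrarily for illustration:

```r
dt <- 17
dr <- 12   # any value from 0 to dt gives the same total

# raw weights on the fourth through seventh month contracts
w <- c(M4 = dr / dt, M5 = 1, M6 = 1, M7 = 1 - dr / dt)

sum(w)       # 3, regardless of dr
w / sum(w)   # share of the basket held in each contract
```

Because the return is a ratio of two weighted sums using the same weights, that constant factor of 3 cancels, which is why the code never needs to normalize.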

In any case, here’s the code.

syntheticXIV <- function(termStructure, expiryStructure) {
  # find expiry days
  zeroDays <- which(expiryStructure$C1 == 0)
  # dt = days in contract period, set after expiry day of previous contract
  dt <- zeroDays + 1
  dtXts <- expiryStructure$C1[dt,]
  # create dr (days remaining) and dt structure
  drDt <- cbind(expiryStructure[,1], dtXts)
  colnames(drDt) <- c("dr", "dt")
  drDt$dt <- na.locf(drDt$dt)
  # add one more to dt to account for zero day
  drDt$dt <- drDt$dt + 1
  drDt <- na.omit(drDt)
  # assign weights for front month and back month based on dr and dt
  wtC1 <- drDt$dr/drDt$dt
  wtC2 <- 1-wtC1
  # realize returns with old weights, "instantaneously" shift to new weights after realizing returns at settle
  # assumptions are a bit optimistic, I think
  valToday <- termStructure[,1] * lag(wtC1) + termStructure[,2] * lag(wtC2)
  valYesterday <- lag(termStructure[,1]) * lag(wtC1) + lag(termStructure[,2]) * lag(wtC2)
  syntheticRets <- (valToday/valYesterday) - 1
  # on the day after roll, C2 becomes C1, so reflect that in returns
  zeroes <- which(drDt$dr == 0) + 1 
  zeroRets <- termStructure[,1]/lag(termStructure[,2]) - 1
  # override usual returns with returns that reflect back month becoming front month after roll day
  syntheticRets[index(syntheticRets)[zeroes]] <- zeroRets[index(syntheticRets)[zeroes]]
  syntheticRets <- na.omit(syntheticRets)
  # vxxRets are syntheticRets
  vxxRets <- syntheticRets
  # repeat same process for vxz -- except it's dr/dt * 4th contract + 5th + 6th + 1-dr/dt * 7th contract
  vxzToday <- termStructure[,4] * lag(wtC1) + termStructure[,5] + termStructure[,6] + termStructure[,7] * lag(wtC2)
  vxzYesterday <- lag(termStructure[,4]) * lag(wtC1) + lag(termStructure[, 5]) + lag(termStructure[,6]) + lag(termStructure[,7]) * lag(wtC2)
  syntheticVxz <- (vxzToday/vxzYesterday) - 1
  # on zero expiries, next day will be equal (4+5+6)/lag(5+6+7) - 1
  zeroVxz <- (termStructure[,4] + termStructure[,5] + termStructure[,6])/
    lag(termStructure[,5] + termStructure[,6] + termStructure[,7]) - 1
  syntheticVxz[index(syntheticVxz)[zeroes]] <- zeroVxz[index(syntheticVxz)[zeroes]]
  syntheticVxz <- na.omit(syntheticVxz)
  vxzRets <- syntheticVxz
  # write out weights for actual execution
  if(last(drDt$dr!=0)) {
    print(paste("Previous front-month weight was", round(last(drDt$dr)/last(drDt$dt), 5)))
    print(paste("Front-month weight at settle today will be", round((last(drDt$dr)-1)/last(drDt$dt), 5)))
    # flag the final day of the roll period, when the new front-month weight reaches zero
    if(as.numeric(last(drDt$dr)) - 1 == 0) {
      print("Front month will be zero at end of day. Second month becomes front month.")
    }
  } else {
    print("Previous front-month weight was zero. Second month became front month.")
    print(paste("New front month weights at settle will be", round(last(expiryStructure[,2]-1)/last(expiryStructure[,2]), 5)))
  }
  return(list(vxxRets, vxzRets))
}

So, a big thank you goes out to Michael Kapler of Systematic Investor Toolbox for originally doing the replication and providing his code. My code essentially does the same thing in, hopefully, a more commented way.

So, ultimately, does it work? Well, using my updated term structure code, I can test that.

While I’m not going to paste my entire term structure code (again, available here, just update the script with my updates from this post), here’s how you’d run the new function:

> out <- syntheticXIV(termStructure, expiryStructure)
[1] "Previous front-month weight was 0.17647"
[1] "Front-month weight at settle today will be 0.11765"

And since it returns both the vxx returns and the vxz returns, we can compare them both.

compareXIV <- na.omit(cbind(xivRets, out[[1]] * -1))
colnames(compareXIV) <- c("XIV returns", "Replication returns")

With the result:


Basically, a perfect match.

Let’s do the same thing, with ZIV.

compareZIV <- na.omit(cbind(ZIVrets, out[[2]]*-1))
colnames(compareZIV) <- c("ZIV returns", "Replication returns")


So, rebuilding from the futures does a tiny bit better than the ETN. But the trajectory is largely identical.

That concludes this post. I hope it has shed some light on how these volatility ETNs work, and how to obtain them directly from the futures data published by the CBOE, which are the inputs to my term structure algorithm.

This also means that institutions interested in trading my strategy can obtain leverage by trading the futures-composite replicated variants of these ETNs, at greater volume.

Thanks for reading.

NOTES: For those interested in a retail subscription strategy to trading volatility, do not hesitate to subscribe to my volatility-trading strategy. For those interested in employing me full-time or for long-term consulting projects, I can be reached on my LinkedIn, or my email: