Last Post For A While, And Two Premium (Cheap) Databases

This will be my last post on this blog for an indefinite length of time. I will also include an algorithm to query Quandl’s SCF database, which is an update on my attempt to use free futures data from Quandl’s CHRIS database, which suffered from data integrity issues, even after attempts to clean it. Also provided is a small tutorial on using Quandl’s EOD database, for those that take issue with Yahoo’s daily data.

So, first off, the news…as some of you may have already heard from meeting me in person at R/Finance 2015 (which was terrific…interesting presentations, good people, good food, good drinks), effective June 8th, I will be starting employment in the New York area as a quantitative research analyst, and part of the agreement is that this blog becomes an archive of my work, so that I may focus my full energies on my full-time position. That is, it’s not going anywhere, so for those that are recent followers, you now have a great deal of time to catch up on the entire archive, which including this post, will be 62 posts. Topics covered include:

Quantstrat — its basics, and certain strategies coded using it, namely those based off of John Ehlers’s digital signal processing algorithms, along with custom order-sizing functions. A small aside on pairs trading is included as well.
Asset rotation — flexible asset allocation, elastic asset allocation, and most recently, classic asset allocation (aka the constrained critical line algorithm).
Seeking Alpha ideas — both Logical Invest and Harry Long’s work, along with Cliff Smith’s Quarterly Tactical Strategy (QTS). The Logical Invest algorithms do what they set out to do, but in some cases, are dependent on dividends to drive returns. By contrast, Harry Long’s ideas are more thought processes and proofs of concept, as opposed to complete strategies, often depending on ETFs with inception dates after the financial crisis passed (which I used some creativity for to backtest to pre-crisis timelines). I’ve also collaborated with Harry Long privately, and what I’ve seen privately has better risk/reward than anything he has revealed in public, and I find it to be impressive given the low turnover rate of such portfolios.
Volatility trading — XIV and VXX, namely, and a strategy around these two instruments that has done well out of sample.
Other statistical ideas, such as robustness heuristics and change point detection.

Topics I would have liked to have covered but didn’t roll around to:

Most Japanese trading methods — Ichimoku and Heiken Ashi, among other things. Both are in my IKTrading package, I just never rolled around to showing them off. I did cover a hammer trading system which did not perform as I would have liked it to.
Larry Connors’s mean reversion strategies — he has a book on trading ETFs, and another one he wrote before that. The material provided on this blog is sufficient for anyone to use to code those strategies.
The PortfolioAnalytics package — what quantstrat is to signal-based individual instrument trading strategies, PortfolioAnalytics is this (and a lot more) to portfolio management strategies. Although strategies such as equal-weight ranking perform well by some standards, this is only the tip of the iceberg. PortfolioAnalytics is, to my awareness, cutting edge portfolio management technology that can run the gauntlet from quick classic quadratic optimization to cutting-edge random-search global optimization portfolios (albeit those take more time to compute).

Now, onto the second part of this post, which is a pair of premium databases. They’re available from Quandl, and cost $50/month. As far as I’ve been able to tell, the futures database (SCF) data quality is vastly better than the CHRIS database, which can miss (or corrupt) chunks of data. The good news, however, is that free users can actually query these databases (or maybe all databases total, not sure) 150 times in a 60 day period. The futures script sends out 40 of these 150 queries, which may be all that is necessary if one intends to use it for some form of monthly turnover trading strategy.

Here’s the script for the SCF (futures) database. There are two caveats here:

1) The prices are on a per-contract rate. Notional values in futures trading, to my understanding, are vastly larger than one contract, to the point that getting integer quantities is no small assumption.

2) According to Alexios Ghalanos (AND THIS IS REALLY IMPORTANT), R’s GARCH guru, and one of the most prominent quants in the R/Finance community, for some providers, the Open, High, and Low values in futures data may not be based off of U.S. traditional pit trading hours in the same way that OHLC in equities/ETFs are based off of the 9:30 AM – 4:00 PM hours, but rather, extended trading hours. This means that there’s very low liquidity around open in futures, and that the high and low are also based off of these low-liquidity times as well. I am unsure if Quandl’s SCF database uses extended hours open-high-low data (low liquidity), or traditional pit hours (high liquidity), and I am hoping a representative from Quandl will clarify this in the comments section for this post. In any case, I just wanted to make sure that readers are aware of this issue.

In any case, here’s the data fetch for the Stevens Continuous Futures (SCF) database from Quandl. All of these are for front-month contracts, unadjusted prices on open interest cross. Note that in order for this script to work, you must supply quandl with your authorization token, which takes the form of something like this:


authCode <- "yourTokenHere"

quandlSCF <- function(code, authCode, from = NA, to = NA) {
  dataCode <- paste0("SCF/", code)
  out <- Quandl(dataCode, authCode = authCode)
  out <- xts(out[, -1],$Date)
  colnames(out)[4] <- "Close"
  colnames(out)[6] <- "PrevDayOpInt"
  if(! {
    out <- out[paste0(from, "::")]
  if(! {
    out <- out[paste0("::", to)]

#Front open-interest cross
from <- NA
to <- NA

CME_CL1 <- quandlSCF("CME_CL1_ON", authCode = authCode, from = from, to = to) #crude
CME_NG1 <- quandlSCF("CME_NG1_ON", authCode = authCode, from = from, to = to) #natgas
CME_HO1 <- quandlSCF("CME_HO1_ON", authCode = authCode, from = from, to = to) #heatOil
CME_RB1 <- quandlSCF("CME_RB1_ON", authCode = authCode, from = from, to = to) #RBob
ICE_B1 <- quandlSCF("ICE_B1_ON", authCode = authCode, from = from, to = to) #Brent
ICE_G1 <- quandlSCF("ICE_G1_ON", authCode = authCode, from = from, to = to) #GasOil

CME_C1 <- quandlSCF("CME_C1_ON", authCode = authCode, from = from, to = to) #Chicago Corn
CME_S1 <- quandlSCF("CME_S1_ON", authCode = authCode, from = from, to = to) #Chicago Soybeans
CME_W1 <- quandlSCF("CME_W1_ON", authCode = authCode, from = from, to = to) #Chicago Wheat
CME_SM1 <- quandlSCF("CME_SM1_ON", authCode = authCode, from = from, to = to) #Chicago Soybean Meal
CME_KW1 <- quandlSCF("CME_KW1_ON", authCode = authCode, from = from, to = to) #Kansas City Wheat
CME_BO1 <- quandlSCF("CME_BO1_ON", authCode = authCode, from = from, to = to) #Chicago Soybean Oil

ICE_SB1 <- quandlSCF("ICE_SB1_ON", authCode = authCode, from = from, to = to) #Sugar No. 11
ICE_KC1 <- quandlSCF("ICE_KC1_ON", authCode = authCode, from = from, to = to) #Coffee
ICE_CC1 <- quandlSCF("ICE_CC1_ON", authCode = authCode, from = from, to = to) #Cocoa
ICE_CT1 <- quandlSCF("ICE_CT1_ON", authCode = authCode, from = from, to = to) #Cotton

#Other Ags
CME_LC1 <- quandlSCF("CME_LC1_ON", authCode = authCode, from = from, to = to) #Live Cattle
CME_LN1 <- quandlSCF("CME_LN1_ON", authCode = authCode, from = from, to = to) #Lean Hogs

#Precious Metals
CME_GC1 <- quandlSCF("CME_GC1_ON", authCode = authCode, from = from, to = to) #Gold
CME_SI1 <- quandlSCF("CME_SI1_ON", authCode = authCode, from = from, to = to) #Silver
CME_PL1 <- quandlSCF("CME_PL1_ON", authCode = authCode, from = from, to = to) #Platinum
CME_PA1 <- quandlSCF("CME_PA1_ON", authCode = authCode, from = from, to = to) #Palladium

CME_HG1 <- quandlSCF("CME_HG1_ON", authCode = authCode, from = from, to = to) #Copper

CME_AD1 <- quandlSCF("CME_AD1_ON", authCode = authCode, from = from, to = to) #Ozzie
CME_CD1 <- quandlSCF("CME_CD1_ON", authCode = authCode, from = from, to = to) #Canadian Dollar
CME_SF1 <- quandlSCF("CME_SF1_ON", authCode = authCode, from = from, to = to) #Swiss Franc
CME_EC1 <- quandlSCF("CME_EC1_ON", authCode = authCode, from = from, to = to) #Euro
CME_BP1 <- quandlSCF("CME_BP1_ON", authCode = authCode, from = from, to = to) #Pound
CME_JY1 <- quandlSCF("CME_JY1_ON", authCode = authCode, from = from, to = to) #Yen
ICE_DX1 <- quandlSCF("ICE_DX1_ON", authCode = authCode, from = from, to = to) #Dollar Index

CME_ES1 <- quandlSCF("CME_ES1_ON", authCode = authCode, from = from, to = to) #Emini
CME_MD1 <- quandlSCF("CME_MD1_ON", authCode = authCode, from = from, to = to) #Midcap 400
CME_NQ1 <- quandlSCF("CME_NQ1_ON", authCode = authCode, from = from, to = to) #Nasdaq 100
ICE_RF1 <- quandlSCF("ICE_RF1_ON", authCode = authCode, from = from, to = to) #Russell Smallcap
CME_NK1 <- quandlSCF("CME_NK1_ON", authCode = authCode, from = from, to = to) #Nikkei

CME_FF1  <- quandlSCF("CME_FF1_ON", authCode = authCode, from = from, to = to) #30-day fed funds
CME_ED1 <- quandlSCF("CME_ED1_ON", authCode = authCode, from = from, to = to) #3 Mo. Eurodollar/TED Spread
CME_FV1  <- quandlSCF("CME_FV1_ON", authCode = authCode, from = from, to = to) #Five Year TNote
CME_TY1  <- quandlSCF("CME_TY1_ON", authCode = authCode, from = from, to = to) #Ten Year Note
CME_US1  <- quandlSCF("CME_US1_ON", authCode = authCode, from = from, to = to) #30 year bond

In this case, I just can’t give away my token. You’ll have to replace that with your own, which every account has. However, once again, individuals not subscribed to these databases need to pay $50/month.

Lastly, I’d like to show the Quandl EOD database. This is identical in functionality to Yahoo’s, but may be (hopefully!) more accurate. I have never used this database on this blog because the number one rule has always been that readers must be able to replicate all analysis for free, but for those who doubt the quality of Yahoo’s data, they may wish to look at Quandl’s EOD database.

This is how it works, with an example for SPY.

out <- Quandl("EOD/SPY", start_date="1999-12-31", end_date="2005-12-31", type = "xts")

And here’s some output.

> head(out)
             Open   High    Low  Close   Volume Dividend Split Adj_Open Adj_High  Adj_Low Adj_Close Adj_Volume
1999-12-31 146.80 147.50 146.30 146.90  3172700        0     1 110.8666 111.3952 110.4890  110.9421    3172700
2000-01-03 148.25 148.25 143.88 145.44  8164300        0     1 111.9701 111.9701 108.6695  109.8477    8164300
2000-01-04 143.50 144.10 139.60 139.80  8089800        0     1 108.3828 108.8359 105.4372  105.5882    8089800
2000-01-05 139.90 141.20 137.30 140.80 12177900        0     1 105.6631 106.6449 103.6993  106.3428   12177900
2000-01-06 139.60 141.50 137.80 137.80  6227200        0     1 105.4327 106.8677 104.0733  104.0733    6227200
2000-01-07 140.30 145.80 140.10 145.80  8066500        0     1 105.9690 110.1231 105.8179  110.1231    8066500

To note, this data is not automatically assigned to “SPY” as quantmod’s “getSymbols” function fetching from Yahoo would automatically do. Also, note that when calling the Quandl function to its EOD database, you automatically obtain both adjusted and unadjusted prices. One aspect that I am not sure is as easily done through Quandl’s API is how easy it is to adjust prices for splits but not dividends. But, for what it’s worth, there it is. So for those that take contention with the quality of Yahoo data, you may wish to look at Quandl’s EOD database for $50/month.

So…that’s it. From this point on, this blog is an archive of my work that will stay up; it’s not going anywhere. However, I won’t be updating it or answering questions on this blog. For those that have any questions about functionality, I highly recommend posting questions to the R-SIG Finance mailing list. It’s been a pleasure sharing my thoughts and work with you, and I’m glad I’ve garnered the attention of many intelligent individuals, from those that have provided me with data, to those that have built upon my work, to those that have hired me for consulting (and now a full-time) opportunity. I also hope that some of my work displayed here made it to other trading and/or asset management firms. I am very grateful for all of the feedback, comments, data, and opportunities I’ve received along the way.

Once again, thank you so much for reading. It’s been a pleasure.

12 thoughts on “Last Post For A While, And Two Premium (Cheap) Databases

  1. It’s a shame to see you go Ilya. It’s been a pleasure following your work. Thank you for all the hard work you’ve shared with us. Best of luck in the future career! — Mike

  2. hi there,

    I started to read your blog a month ago… I really appreciated it and it gave me the taste to try R.

    I have just downloaded it from CRAN and will give it a try… Hope to be able to reproduce your results and understand your coding.

    Thanks for your time and good luck for your new job !!


  3. Pingback: Quantocracy's Daily Wrap for 06/08/2015 | Quantocracy

  4. Good luck in future!
    It was tremendously helpful that you kept the code simple and gave commentary to orient the reader to the logic.
    Have you considered writing a book on quant finance in R – basically an intro to R for quants (I know you wrote on chapter for another book)?
    Perhaps your new employer would approve if they can vet it for any corporate proprietary issues.

  5. Thanks for all your posting and good work Ilha. I’ve started working with R after I’ve read EAA from your blog. Really appreciate. All the Best – Florent

  6. Good luck and a lot of success in your new job.
    Hope you earn loads in a short time so you can retire and get back to sharing your thoughts / blogging again soon😉
    No seriously – got a lot out of following, thanks for sharing …

  7. Sad news.. Thank you so much for your contributions! good luck in your new job and great post as always!

  8. Found your Ichimoku code on your Github – what happened to the Ichomoku project?

    If you do decide to post it (which I hope you do), may I suggest conducting tests in G10 FX pairs.

    More specifically, comparing a portfolio of G10 Yen crosses (EURJPY, GBPJPY, AUDJPY, USDJPY etc) to just G10 majors (EURUSD, GBPUSD, AUDUSD etc).

    The reason I ask is because JPY traders at banks and macro hedge funds closely monitor Ichimoku levels, even if they don’t believe in technical analysis at all. Most also refrain from applying the indicator to other currencies and asset classes.

    This suggested test be able to verify if such interest is warranted. It could also be an interesting way to for your blog to make an entrance to FX land.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s