The data and code from my papers posted below may be used for non-commercial purposes free of charge. They are provided as is, without any guarantee of correctness. Please reference the relevant paper for construction details. Email me if you encounter any issues.

Text Selection (with Bryan Kelly and Alan Moreira)

Journal of Business & Economic Statistics, 2021, Vol 39, Issue 4, pp. 859–879 | Citation

  • HurdleDMR.jl: Code for our HurdleDMR package for Julia, which can also be called from many other programming languages such as Python and R. The package supports computationally efficient distributed estimation of the multiple hurdles over parallel processes, generation of sufficient reduction projections, and inverse regressions with selected text. It allows for elastic-net-type convex combinations of L1 (Lasso) and L2 (Ridge) regularization as in glmnet (Friedman et al., 2010), and for concave regularization paths as in gamlr (Taddy, 2017).
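
To make the "convex combination of L1 and L2" concrete, here is a minimal Python sketch of the elastic net penalty in the glmnet parametrization. It is an illustration only and not the HurdleDMR.jl API: the function name `elastic_net_penalty` and the example coefficients are assumptions made for this sketch.

```python
def elastic_net_penalty(beta, lam, alpha):
    """Elastic net penalty in the glmnet parametrization (illustrative helper).

    penalty = lam * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2)
    alpha = 1 gives the Lasso penalty, alpha = 0 gives the Ridge penalty.
    """
    l1 = sum(abs(b) for b in beta)      # L1 norm of the coefficients
    l2_sq = sum(b * b for b in beta)    # squared L2 norm of the coefficients
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_sq)


# Hypothetical coefficient vector, used only to show the mixing behaviour.
beta = [1.0, -2.0, 0.0]
lasso = elastic_net_penalty(beta, lam=0.5, alpha=1.0)  # pure L1
ridge = elastic_net_penalty(beta, lam=0.5, alpha=0.0)  # pure L2
mix = elastic_net_penalty(beta, lam=0.5, alpha=0.5)    # equal-weight mix
print(lasso, ridge, mix)
```

With `alpha` strictly between 0 and 1 the penalty interpolates between the two extremes. The concave regularization paths of gamlr are a different device and are not covered by this sketch.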

Intermediary Asset Pricing: New Evidence from Many Asset Classes (with Zhiguo He and Bryan Kelly)

Journal of Financial Economics, 2017, Vol 126, Issue 1, pp. 1–35 (Lead article) | Citation

News Implied Volatility and Disaster Concerns (with Alan Moreira)

Journal of Financial Economics, 2017, Vol 123, Issue 1, pp. 137–162 | Citation

  • NVIX, 1889-07 to 2016-03. Also includes a decomposition into categories.

  • Phrase counts (ngram frequencies) of Wall Street Journal front-page titles and abstracts. See readme.txt inside for details and replication code.
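
For readers unfamiliar with the term, a phrase count is the number of times each run of n consecutive words occurs in a text. The sketch below shows the idea in Python. It is not the replication code referred to in readme.txt: the function name `ngram_counts`, the lowercase and whitespace tokenization, and the sample headline are all assumptions made for illustration.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count n-word phrases in `text` (illustrative helper).

    Tokenization here is a simple lowercase + whitespace split, which is an
    assumption of this sketch and may differ from the paper's procedure.
    """
    tokens = text.lower().split()
    phrases = (" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return Counter(phrases)


# Made-up headline, not taken from the actual data set.
headline = "stock market falls as war fears rise and stock market volume surges"
counts = ngram_counts(headline, 2)
print(counts.most_common(3))
```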

Business News and Business Cycles (with Leland Bybee, Bryan Kelly and Dacheng Xiu)

Working Paper | Citation

  • An interactive website that lets users visualize and inspect a wide variety of features from our estimated topic model. It also lets researchers download our WSJ news attention time series for use in their own projects.

