Data & Code
The data and code from my papers posted below may be used for non-commercial purposes free of charge. They are provided as is, without any guarantee of correctness. Please reference the relevant paper for construction details. Email me if you encounter any issues.
Does Finance Benefit Society? A Language Embedding Approach (with Manish Jha and Hongyi Liu)
Review of Financial Studies, forthcoming | Citation
Finance sentiment. An annual panel for eight large countries covering 1870 to 2009, based on Google Books Ngram data.
Liquidity and the Strategic Value of Information (with Ohad Kadan)
Review of Finance, forthcoming | Citation
Information values, 2003-09 to 2020-12. Based on high-frequency stock data and reported for a panel identified by day and permno.
Business News and Business Cycles (with Leland Bybee, Bryan Kelly and Dacheng Xiu)
Journal of Finance, 2024, Volume 79, Issue 5, pp. 3105–3147 | Citation
structureofnews.com An interactive website that allows users to visualize and inspect a wide variety of features from our estimated topic model. It also allows researchers to download our WSJ news attention time series for use in their own projects.
The Partisanship of Financial Regulators (with Joseph Engelberg, Matthew Henriksson and Jared Williams)
Review of Financial Studies, 2023, Vol 36, Issue 11, pp. 4373–4416 | Citation
dataverse.harvard.edu stores the data and code for the paper.
Text Selection (with Bryan Kelly and Alan Moreira)
Journal of Business & Economic Statistics, 2021, Vol 39, Issue 4, pp. 859–879 | Citation
HurdleDMR.jl Code for our HurdleDMR package for Julia. It can be called from many other programming languages, such as Python and R. The package allows for computationally efficient distributed estimation of the multiple hurdles over parallel processes, generating sufficient reduction projections, and inverse regressions with selected text. It allows for elastic-net-type convex combinations of L1 (Lasso) and L2 (Ridge) regularization as in glmnet (Friedman et al., 2010), and for concave regularization paths as in gamlr (Taddy, 2017).
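As a rough illustration of what an elastic-net-type convex combination of L1 and L2 penalties means, the sketch below computes the penalty in the glmnet parameterization, lam * (alpha * ||b||_1 + (1 - alpha) / 2 * ||b||_2^2). It is not part of HurdleDMR.jl and does not reflect its API; the function name and arguments are hypothetical and chosen for illustration only.

```python
def elastic_net_penalty(coefs, lam, alpha):
    """Elastic net penalty in the glmnet parameterization (illustrative only).

    coefs : iterable of coefficient values
    lam   : overall penalty strength (lambda >= 0)
    alpha : mixing weight in [0, 1]; 1 gives pure Lasso (L1),
            0 gives pure Ridge (L2)
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    coefs = list(coefs)
    l1 = sum(abs(b) for b in coefs)      # ||b||_1
    l2_sq = sum(b * b for b in coefs)    # ||b||_2^2
    # Convex combination of the L1 term and half the squared L2 term.
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_sq)


if __name__ == "__main__":
    beta = [1.0, -2.0]
    print(elastic_net_penalty(beta, lam=1.0, alpha=1.0))  # pure Lasso
    print(elastic_net_penalty(beta, lam=1.0, alpha=0.0))  # pure Ridge
    print(elastic_net_penalty(beta, lam=2.0, alpha=0.5))  # equal mix
```

Setting alpha between 0 and 1 trades off the sparsity induced by the L1 term against the shrinkage of the L2 term.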
Intermediary Asset Pricing: New Evidence from Many Asset Classes (with Zhiguo He and Bryan Kelly)
Journal of Financial Economics, 2017, Vol 126, Issue 1, pp. 1–35 (Lead article) | Citation
Intermediary capital risk factor, 1970Q1–2018Q3. Available at quarterly and monthly frequency, and daily from 2000-01-01. Also includes the portfolio returns used in our cross-sectional tests. See readme.txt inside for details and replication code.
News Implied Volatility and Disaster Concerns (with Alan Moreira)
Journal of Financial Economics, 2017, Vol 123, Issue 1, pp. 137–162 | Citation
NVIX, 1889-07 to 2016-03. Also includes a decomposition into categories.
Phrase counts (ngram frequencies) of Wall Street Journal front-page titles and abstracts. See readme.txt inside for details and replication code.