HurdleDMR.jl is a Julia implementation of the Hurdle Distributed Multinomial Regression (HDMR), as described in:
Bryan Kelly, Asaf Manela & Alan Moreira (2021). Text Selection, Journal of Business & Economic Statistics (ungated preprint).
It includes a Julia implementation of the Distributed Multinomial Regression (DMR) model of Taddy (2015).
This tutorial explains how to use this package from Python via the PyJulia package.
First, install Julia itself. The easiest way to do that is to get the latest stable release from the official download page. An alternative is to install JuliaPro.
Once installed, open Julia in a terminal (or in Juno), press ] to activate the package manager, and add the following packages:
pkg> add HurdleDMR GLM Lasso
Next, install PyJulia; see the PyJulia documentation for installation instructions.
from julia.api import Julia
jl = Julia(compiled_modules=False)
The jl object lets us evaluate Julia code from Python.
The data should either be an n-by-p covars matrix or a DataFrame containing the covariates, and a (sparse) n-by-d counts matrix.
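As a minimal sketch of the shapes involved, the following pure-Python snippet builds a tiny n-by-d count matrix in compressed sparse column (CSC) form from three made-up token lists. The documents, the vocabulary, and the helper name `counts_to_csc` are illustrative assumptions and are not part of HurdleDMR or sotu.jl.

```python
from collections import Counter

def counts_to_csc(docs, vocab):
    """Return (data, indices, indptr) CSC arrays for an n-by-d count matrix.

    Rows are documents, columns are vocabulary terms.
    """
    per_doc = [Counter(doc) for doc in docs]
    data, indices, indptr = [], [], [0]
    for term in vocab:                       # one column per term
        for row, counter in enumerate(per_doc):
            c = counter.get(term, 0)
            if c > 0:                        # store only non-zero counts
                data.append(c)
                indices.append(row)
        indptr.append(len(data))             # column boundary
    return data, indices, indptr

# Hypothetical toy corpus: n = 3 documents, d = 4 terms
docs = [["tax", "cut", "tax"], ["war", "peace"], ["tax", "war", "war", "war"]]
vocab = ["cut", "peace", "tax", "war"]
data, indices, indptr = counts_to_csc(docs, vocab)
```

These three arrays are the form that `scipy.sparse.csc_matrix((data, indices, indptr), shape=(n, d))` accepts, so a matrix built this way can stand in for the counts matrix described above.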
For illustration, we'll analyse the text of the State of the Union addresses, which are delivered roughly annually, and relate it to stock market returns.
The sotu.jl script compiles stock market excess returns and the State of the Union Address texts into a matching DataFrame covarsdf and a sparse document-term matrix counts.
jl.eval('import Pkg, PyCall')
jl.eval('Pkg.add(["Lasso", "CSV", "DataDeps", "DataFrames", "Pandas", "FamaFrenchData", "TextAnalysis", "SparseArrays", "PyCall"])')
covarsdf, counts, terms = jl.eval('include("sotu.jl")')
If it throws any errors about missing packages, add them and run it again. The following code block converts the counts matrix and DataFrame into Python types, which is probably what you start with if you work in Python.
import numpy as np
from scipy import sparse
import pandas as pd
from julia import Main, Pandas, DataFrames, SparseArrays, Lasso
pycounts = sparse.csc_matrix(counts)
pycovarsdf = Pandas.DataFrame(covarsdf)
pycounts now holds the sparse matrix of bigram counts from the text:
pycounts
<92x163 sparse matrix of type '<class 'numpy.int64'>' with 3376 stored elements in Compressed Sparse Column format>
pycovarsdf contains the non-text data we'll use for this example:
pycovarsdf
| | Rem | Date | President |
|---|---|---|---|
| 0 | 29.47 | 1927 | Calvin Coolidge | 
| 1 | 35.39 | 1928 | Calvin Coolidge | 
| 2 | -19.54 | 1929 | Herbert Hoover | 
| 3 | -31.23 | 1930 | Herbert Hoover | 
| 4 | -45.11 | 1931 | Herbert Hoover | 
| ... | ... | ... | ... | 
| 87 | 0.08 | 2015 | Barack Obama | 
| 88 | 13.30 | 2016 | Barack Obama | 
| 89 | 21.51 | 2017 | Donald J. Trump | 
| 90 | -6.93 | 2018 | Donald J. Trump | 
| 91 | 28.28 | 2019 | Donald J. Trump | 
92 rows × 3 columns
jl.eval("using Distributed")
from julia.Distributed import addprocs
addprocs(4)
from julia import HurdleDMR as hd
jl.eval("@everywhere using HurdleDMR")
From now on, the HurdleDMR package's functions can be called from Python through the hd alias.
The Distributed Multinomial Regression (DMR) model of Taddy (2015) is a highly scalable
approximation to the Multinomial using distributed (independent, parallel)
Poisson regressions, one for each of the d categories (columns) of a large counts matrix,
on the covarsdf.
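To make the idea of one independent Poisson regression per column concrete, here is a minimal pure-Python sketch. It assumes a single covariate, uses the log of each row's total count as a fixed offset, and fits each column by plain Newton's method with no regularization. The package itself fits regularized paths, so this illustrates the factorization only, not the package's estimator. The function name `fit_poisson_column` and the toy data are hypothetical.

```python
import math

def fit_poisson_column(c, v, m, iters=50):
    """Fit c_i ~ Poisson(m_i * exp(a + b * v_i)) by Newton's method.

    c: counts for one category, v: covariate, m: row totals (offset).
    """
    a = math.log(sum(c) / sum(m))    # start at the intercept-only MLE
    b = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for ci, vi, mi in zip(c, v, m):
            mu = mi * math.exp(a + b * vi)
            g0 += ci - mu            # score for the intercept
            g1 += (ci - mu) * vi     # score for the slope
            h00 += mu                # information matrix entries
            h01 += mu * vi
            h11 += mu * vi * vi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = g
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < 1e-12:
            break
    return a, b

# Hypothetical toy data: n = 4 observations, d = 2 categories
v = [-1.0, 0.0, 1.0, 2.0]
counts = [[0, 3], [1, 2], [2, 1], [5, 1]]
m = [sum(row) for row in counts]     # total count per observation

# Each column is fitted on its own, which is what makes the work parallelizable
coefs = [fit_poisson_column([row[j] for row in counts], v, m) for j in range(2)]
```

Because no column's fit depends on any other column, the list comprehension on the last line could be farmed out to separate workers without changing the result.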
To fit a DMR:
covars = np.asmatrix(pycovarsdf[['Rem']])
jlcounts = Main.scipyCSC_to_julia(pycounts)
m = hd.dmr(covars, jlcounts)
or with a dataframe and formula, by first converting the pandas dataframe to julia
jlcovarsdf = Main.pd_to_df(pycovarsdf)
and then fitting a DMR model that uses this DataFrame and a specified model
mf = jl.eval('@model(c ~ Rem)')
m = hd.fit(hd.DMR, mf, jlcovarsdf, jlcounts)
We can get the coefficients matrix for each variable + intercept as usual with
hd.coef(m)
array([[-5.43397777e+00, -5.16903423e+00, -5.48346231e+00,
        -5.87732380e+00, -5.87379264e+00, -5.77114439e+00,
        -3.23543281e+00, -5.38167962e+00, -4.41075981e+00,
        -5.67893791e+00, -5.60338003e+00, -5.01658441e+00,
        -3.90873484e+00, -5.72061820e+00, -5.73630459e+00,
        -5.72462434e+00, -5.66892541e+00, -5.38138811e+00,
        -4.80727621e+00, -5.66217618e+00, -5.48220062e+00,
        -5.92779117e+00, -5.84191010e+00, -5.65463036e+00,
        -5.35306350e+00, -5.72462432e+00, -5.36394515e+00,
        -5.26031876e+00, -5.25619797e+00, -5.86001501e+00,
        -4.94219746e+00, -5.49505171e+00, -5.49520004e+00,
        -5.47838157e+00, -5.55757029e+00, -5.08246018e+00,
        -5.32896413e+00, -4.32989185e+00, -5.75678995e+00,
        -5.56106908e+00, -5.31207341e+00, -6.05074799e+00,
        -5.68017255e+00, -5.15241608e+00, -5.67987417e+00,
        -3.34760668e+00, -5.78230946e+00, -5.49833280e+00,
        -5.17808067e+00, -5.01110532e+00, -5.83195551e+00,
        -5.72462437e+00, -4.52520737e+00, -5.70624518e+00,
        -4.88327336e+00, -4.68373288e+00, -5.77114439e+00,
        -5.58027939e+00, -5.90220359e+00, -4.80265229e+00,
        -5.91702811e+00, -5.55735568e+00, -6.19074690e+00,
        -3.54266727e+00, -5.51982996e+00, -4.55074630e+00,
        -5.77114437e+00, -5.86395678e+00, -5.43251989e+00,
        -5.75664743e+00, -4.72398401e+00, -5.63761299e+00,
        -5.77114439e+00, -5.44805727e+00, -5.81271043e+00,
        -5.12678737e+00, -4.78644376e+00, -5.81993452e+00,
        -4.74612379e+00, -5.45874905e+00, -4.99776842e+00,
        -5.87327975e+00, -5.47548198e+00, -4.18093783e+00,
        -5.73882656e+00, -5.49056907e+00, -5.46024068e+00,
        -5.00057348e+00, -3.95505268e+00, -5.14792331e+00,
        -5.48346231e+00, -5.08128133e+00, -5.92836924e+00,
        -5.17104296e+00, -5.55757029e+00, -5.77391562e+00,
        -5.35796560e+00, -5.63761299e+00, -4.69712240e+00,
        -5.63211106e+00, -5.42427706e+00, -5.87968708e+00,
        -4.61097420e+00, -5.93428478e+00, -5.07799720e+00,
        -5.60956664e+00, -5.01581516e+00, -5.89983014e+00,
        -4.79441853e+00, -5.49459886e+00, -4.89527587e+00,
        -5.82759147e+00, -5.41858729e+00, -5.45381989e+00,
        -5.66249212e+00, -5.17934216e+00, -4.73283340e+00,
        -5.63757711e+00, -5.77114439e+00, -5.27381269e+00,
        -5.68017261e+00, -5.21108047e+00, -5.63761299e+00,
        -5.64802682e+00, -5.46854594e+00, -6.65868772e+00,
        -5.49084712e+00, -5.46505534e+00, -5.54483789e+00,
        -3.63978204e+00, -4.12431894e+00, -4.61006815e+00,
        -5.50004335e+00, -5.63761299e+00, -5.79260468e+00,
        -4.92384653e+00, -5.64729597e+00, -5.15030690e+00,
        -5.20124622e+00, -5.42521945e+00, -5.72478380e+00,
        -5.39647414e+00, -5.38167962e+00, -5.65937576e+00,
        -5.05743712e+00, -6.06130047e+00, -5.91164510e+00,
        -5.52027898e+00, -5.80784167e+00, -4.70837445e+00,
        -3.93442042e+00, -4.64127955e+00, -5.28930630e+00,
        -5.23214784e+00, -5.82941615e+00, -5.28930630e+00,
        -5.76087053e+00, -5.41453480e+00, -4.72731232e+00,
        -5.64625805e+00, -4.92496560e+00, -5.37899827e+00,
        -4.43037269e+00],
       [-3.46050614e-02, -1.74707112e-02,  0.00000000e+00,
         1.18822899e-02,  6.71112588e-03,  0.00000000e+00,
         7.54578302e-03,  0.00000000e+00,  9.77411513e-03,
        -8.73770627e-03,  1.60489269e-02,  1.58189543e-02,
        -2.01457689e-02,  9.70362656e-03,  1.64078751e-03,
         0.00000000e+00, -2.48036108e-02,  4.15558393e-03,
        -1.27127034e-02, -2.05595816e-02,  1.41706128e-02,
         1.20342463e-02,  4.45846967e-02, -4.21705362e-03,
         4.44812164e-03,  0.00000000e+00,  1.95323399e-03,
         0.00000000e+00, -1.71163321e-02,  5.17089958e-03,
         1.43055616e-02, -1.64117460e-02, -2.06555071e-02,
         1.38451689e-02,  0.00000000e+00, -3.37120243e-02,
        -3.34766759e-02,  1.30609328e-02, -1.93213042e-02,
         9.17275115e-03, -6.77303177e-03,  2.17234198e-02,
         0.00000000e+00,  6.32453335e-03,  5.42105458e-03,
         0.00000000e+00, -4.39187897e-02,  6.28491153e-03,
         0.00000000e+00,  3.25059752e-03,  1.19867150e-02,
         0.00000000e+00,  0.00000000e+00,  3.49860505e-03,
         2.14196942e-02,  8.68757783e-03,  0.00000000e+00,
         7.41680122e-03,  2.90892879e-02,  6.00396353e-03,
         1.10462715e-02, -3.14811982e-05,  3.59782607e-02,
         0.00000000e+00,  0.00000000e+00,  1.64960398e-03,
         0.00000000e+00,  1.06444837e-02,  9.66303352e-03,
        -1.83167471e-02, -2.72837176e-02,  0.00000000e+00,
         0.00000000e+00, -6.22184174e-03,  1.01947219e-02,
         0.00000000e+00, -1.36163229e-02,  0.00000000e+00,
        -1.60397582e-02, -2.18731069e-02,  4.25066543e-03,
         6.65541741e-03,  2.25914274e-02,  0.00000000e+00,
         7.22641807e-03,  1.21274686e-02,  1.49101753e-02,
         2.02602482e-02, -3.28652121e-02,  8.40024139e-03,
         0.00000000e+00,  4.75368354e-04,  1.62268877e-02,
         1.47840622e-02,  0.00000000e+00, -8.82994898e-03,
        -1.24201105e-02,  0.00000000e+00, -1.31768296e-02,
         1.55333021e-02,  8.84734838e-03,  7.34306538e-03,
         0.00000000e+00,  3.70306953e-02,  0.00000000e+00,
         1.03525527e-02, -1.60090497e-02,  9.39900637e-03,
         2.96380748e-03,  5.86972594e-03, -9.75849129e-03,
         1.09079176e-03,  5.94202777e-04,  5.08658575e-03,
        -1.59549981e-02,  3.64227803e-03, -1.28711247e-02,
        -4.20605341e-02,  0.00000000e+00,  1.92012447e-02,
         0.00000000e+00, -9.77341390e-03,  0.00000000e+00,
         1.04211285e-02, -2.33055066e-03,  6.56984182e-02,
        -3.19752796e-02, -1.16957731e-02, -1.06568091e-02,
         8.45653180e-04,  0.00000000e+00,  1.67539352e-02,
         1.56526037e-02,  0.00000000e+00,  8.20114774e-03,
         0.00000000e+00,  3.47016417e-02,  6.09120335e-03,
         1.12375743e-02,  1.51493793e-03,  2.33525441e-05,
        -1.06374935e-02,  0.00000000e+00,  7.64148791e-03,
         6.11571120e-03,  2.86077858e-02,  2.72644850e-02,
         6.56989776e-05, -1.86475874e-03,  5.17837170e-04,
         1.39254051e-02,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00,  7.18571763e-03,  0.00000000e+00,
        -2.31880934e-02,  9.57928913e-06, -1.96783252e-02,
         1.22726025e-03,  7.42663230e-03,  7.35222600e-03,
         4.40557737e-03]])
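The printed matrix has two rows and one column per term. A quick way to read such a matrix is to rank terms by the size of their loading on the covariate. The sketch below does this on a made-up 2-by-4 matrix; it assumes that the first row holds the intercepts and the second row the loadings on Rem, as the magnitudes above suggest. The helper `top_terms` and the term labels are hypothetical.

```python
def top_terms(coefs, terms, row=1, k=3):
    """Return the k (term, coefficient) pairs with the largest |coefficient| in `row`."""
    pairs = zip(terms, coefs[row])
    return sorted(pairs, key=lambda tc: abs(tc[1]), reverse=True)[:k]

# Hypothetical coefficients: row 0 = intercepts, row 1 = loadings on the covariate
toy_coefs = [[-5.4, -5.2, -5.5, -5.9], [-0.035, 0.0, 0.012, 0.045]]
toy_terms = ["term_a", "term_b", "term_c", "term_d"]
ranked = top_terms(toy_coefs, toy_terms, k=2)
```

With the real output, the same call would take the array returned by `hd.coef(m)` converted to a list of lists and the `terms` vector returned by sotu.jl.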
By default we only return the AICc minimizing coefficients. To also get back the entire regularization paths, run
paths = hd.dmrpaths(covars, jlcounts)
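For intuition about what selecting a segment by AICc means, the sketch below scores a made-up path of three segments using the standard corrected AIC formula and picks the one with the lowest score. The log-likelihoods and degrees of freedom are invented, and the function names are hypothetical. How the package computes degrees of freedom along a path is not covered here.

```python
import math

def aicc(loglik, k, n):
    """Corrected AIC for a model with k free parameters fitted to n observations."""
    if n - k - 1 <= 0:
        return math.inf              # correction undefined; treat as worst
    return -2.0 * loglik + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)

def select_min_aicc(path, n):
    """path: list of (loglik, k) pairs, one per segment; returns the best index."""
    scores = [aicc(ll, k, n) for ll, k in path]
    return min(range(len(path)), key=scores.__getitem__)

# Hypothetical path: fit improves as more parameters enter the model
toy_path = [(-120.0, 1), (-112.0, 2), (-111.5, 3)]
best = select_min_aicc(toy_path, n=92)
```

Here the second segment wins: moving to the third segment gains only half a unit of log-likelihood, which does not cover the penalty for the extra parameter.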
We can now select, for example, the coefficients that minimize the 10-fold cross-validated MSE (this takes a little longer):
jl.eval("using Lasso: MinCVmse")
segselect = jl.eval("MinCVKfold{MinCVmse}(10)")
hd.coef(paths, segselect)
array([[-5.44837099e+00, -5.23214789e+00, -5.43167163e+00,
        -5.77114434e+00, -5.81993469e+00, -5.77114439e+00,
        -3.23543281e+00, -5.38167962e+00, -4.32703043e+00,
        -5.72462437e+00, -5.59797816e+00, -5.01658441e+00,
        -3.97147974e+00, -5.72061820e+00, -5.72462437e+00,
        -5.72462434e+00, -5.66357350e+00, -5.34993092e+00,
        -4.80799960e+00, -5.72462437e+00, -5.34993085e+00,
        -5.81993455e+00, -5.81453663e+00, -5.68017261e+00,
        -5.31915926e+00, -5.72462432e+00, -5.34993090e+00,
        -5.26031876e+00, -5.25735906e+00, -5.81993460e+00,
        -4.93049270e+00, -5.55757029e+00, -5.55757029e+00,
        -5.34993092e+00, -5.55757029e+00, -5.04121497e+00,
        -5.34993092e+00, -4.32989185e+00, -5.75678995e+00,
        -5.48346231e+00, -5.34993092e+00, -6.05074799e+00,
        -5.68017255e+00, -5.15241608e+00, -5.63761301e+00,
        -3.34760668e+00, -5.72462437e+00, -5.44837111e+00,
        -5.17808067e+00, -4.98702548e+00, -5.72462437e+00,
        -5.72462437e+00, -4.52520737e+00, -5.68017263e+00,
        -4.88327336e+00, -4.61097420e+00, -5.77114439e+00,
        -5.51983016e+00, -5.90220359e+00, -4.80265229e+00,
        -5.81993464e+00, -5.55757029e+00, -6.19074690e+00,
        -3.54266727e+00, -5.51982996e+00, -4.53900074e+00,
        -5.77114437e+00, -5.77114437e+00, -5.34993094e+00,
        -5.75664743e+00, -4.75647761e+00, -5.63761299e+00,
        -5.77114439e+00, -5.48346231e+00, -5.81271043e+00,
        -5.12678737e+00, -4.78666559e+00, -5.81993452e+00,
        -4.74612379e+00, -5.45874905e+00, -4.96551919e+00,
        -5.81993460e+00, -5.28821616e+00, -4.18093783e+00,
        -5.68017268e+00, -5.38167965e+00, -5.31915938e+00,
        -5.00057348e+00, -3.97938492e+00, -5.14792331e+00,
        -5.48346231e+00, -5.07799718e+00, -5.77114441e+00,
        -5.03147719e+00, -5.55757029e+00, -5.81993455e+00,
        -5.41446944e+00, -5.63761299e+00, -4.75522381e+00,
        -5.58818465e+00, -5.34993092e+00, -5.81993455e+00,
        -4.61097420e+00, -5.87336998e+00, -5.07799720e+00,
        -5.51982996e+00, -5.07799721e+00, -5.81993455e+00,
        -4.77261554e+00, -5.44837101e+00, -4.94446576e+00,
        -5.81993455e+00, -5.41446952e+00, -5.41446950e+00,
        -5.72462437e+00, -5.15210518e+00, -4.79031513e+00,
        -5.63757711e+00, -5.77114439e+00, -5.07799720e+00,
        -5.68017261e+00, -5.26031876e+00, -5.63761299e+00,
        -5.55757028e+00, -5.48346231e+00, -6.65868772e+00,
        -5.49084712e+00, -5.51982996e+00, -5.59679097e+00,
        -3.63388327e+00, -4.12431894e+00, -4.44621896e+00,
        -5.50004335e+00, -5.63761299e+00, -5.72462437e+00,
        -4.92384653e+00, -5.62390516e+00, -5.10209476e+00,
        -5.10209481e+00, -5.41446944e+00, -5.72462436e+00,
        -5.44837099e+00, -5.38167962e+00, -5.59679095e+00,
        -5.00900431e+00, -6.03452944e+00, -5.91164510e+00,
        -5.51982994e+00, -5.81993455e+00, -4.70479301e+00,
        -3.89547353e+00, -4.64127955e+00, -5.28930630e+00,
        -5.23214784e+00, -5.77114439e+00, -5.28930630e+00,
        -5.76087053e+00, -5.41446949e+00, -4.72731232e+00,
        -5.63761296e+00, -4.86442306e+00, -5.31915923e+00,
        -4.39682622e+00],
       [-5.11702767e-14,  0.00000000e+00, -1.06011701e-02,
         1.03671427e-09,  2.08669550e-08,  0.00000000e+00,
         7.54578302e-03,  0.00000000e+00,  2.36119436e-09,
        -9.88731961e-13,  1.56114648e-02,  1.58189543e-02,
         0.00000000e+00,  9.70362656e-03,  1.34008498e-13,
         0.00000000e+00, -2.18963598e-02,  1.56309688e-12,
        -1.23845463e-02,  0.00000000e+00,  1.65745168e-09,
         1.03066364e-12,  4.32741902e-02, -4.77293627e-11,
         2.14780384e-13,  0.00000000e+00,  2.19532301e-09,
         0.00000000e+00, -1.56208951e-02,  7.36310769e-09,
         1.33022190e-02, -2.74704950e-12,  0.00000000e+00,
         5.16210204e-12,  0.00000000e+00, -1.48538071e-02,
         0.00000000e+00,  1.30609328e-02, -1.93213042e-02,
         1.92872252e-13, -8.44803733e-13,  2.17234198e-02,
         0.00000000e+00,  6.32453335e-03,  1.64264292e-09,
         0.00000000e+00,  0.00000000e+00,  1.63721367e-08,
         0.00000000e+00,  7.07412379e-09,  5.36405647e-11,
         0.00000000e+00,  0.00000000e+00,  3.49474036e-09,
         2.14196942e-02,  6.18794502e-11,  0.00000000e+00,
         2.87549353e-08,  2.90892879e-02,  6.00396353e-03,
         1.27583031e-08, -4.21018143e-12,  3.59782607e-02,
         0.00000000e+00,  0.00000000e+00,  4.53881223e-09,
         0.00000000e+00,  3.84445874e-10,  2.73785490e-09,
        -1.83167471e-02, -2.53622170e-03,  0.00000000e+00,
         0.00000000e+00, -7.73121902e-12,  1.01947219e-02,
         0.00000000e+00, -1.34950093e-02,  0.00000000e+00,
        -1.60397582e-02, -2.18731069e-02,  6.31510862e-10,
         7.42387016e-09,  6.94984502e-03,  0.00000000e+00,
         1.07525162e-08,  4.50017829e-09,  1.66093949e-08,
         2.02602482e-02,  0.00000000e+00,  8.40024139e-03,
         0.00000000e+00,  2.23431028e-10,  3.91303814e-09,
         4.61585141e-10,  0.00000000e+00, -5.42464456e-12,
        -1.93136114e-12,  0.00000000e+00,  0.00000000e+00,
         1.17496917e-02,  1.56342460e-10,  7.22896284e-13,
         0.00000000e+00,  3.37341934e-02,  0.00000000e+00,
         2.57138329e-12,  0.00000000e+00,  5.70817145e-14,
         1.27255722e-09,  1.97677073e-09, -1.44690772e-09,
         4.89242963e-13,  1.10331165e-08,  8.87211078e-09,
         0.00000000e+00,  1.24805182e-12,  0.00000000e+00,
        -4.20605341e-02,  0.00000000e+00,  6.31211148e-10,
         0.00000000e+00, -2.91198235e-11,  0.00000000e+00,
         9.98255396e-10, -2.62522244e-11,  6.56984182e-02,
        -3.19752796e-02, -2.13914705e-11, -1.06099210e-09,
         1.48467723e-10,  0.00000000e+00,  5.24254655e-10,
         1.56526037e-02,  0.00000000e+00,  2.34346189e-13,
         0.00000000e+00,  3.34086547e-02,  2.26748814e-13,
         7.11918444e-09,  3.69427304e-13,  6.35599218e-10,
        -1.72102458e-11,  0.00000000e+00,  1.01231690e-09,
         1.56921706e-09,  2.69549662e-02,  2.72644850e-02,
         1.18917184e-11, -1.93123748e-11,  7.41036058e-09,
         1.04197458e-02,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00,  8.49320363e-12,  0.00000000e+00,
        -2.31880934e-02,  6.42521947e-09, -1.96783252e-02,
         2.78291316e-09,  2.83263663e-09,  7.62288364e-10,
         7.83015161e-10]])
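As a generic illustration of what 10-fold cross-validation involves, the sketch below partitions 92 observation indices into 10 shuffled folds and yields the train/test index pairs. It is not the package's implementation: the shuffling, the seed, and the helper name `kfold_indices` are assumptions made for the example.

```python
import random

def kfold_indices(n, k, seed=0):
    """Split range(n) into k shuffled, nearly equal folds; yield (train, test)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]    # fold sizes differ by at most 1
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

# 92 observations, as in the State of the Union example, and 10 folds
splits = list(kfold_indices(92, 10))
```

Each candidate segment of the path is then refitted on every `train` set and scored on the matching `test` set, which is why this selection rule takes longer than an information criterion.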
For highly sparse counts, as is often the case with text that is selected for
various reasons, the Hurdle Distributed Multinomial Regression (HDMR) model of
Kelly, Manela, and Moreira (2021), may be superior to the DMR. It approximates
a higher dispersion Multinomial using distributed (independent, parallel)
Hurdle regressions, one for each of the d categories (columns) of a large counts matrix,
on the covars. It allows potentially different sets of covariates to explain
category inclusion ($h=\mathbf{1}\{c>0\}$) and repetition ($c>0$) using the optional inpos and inzero keyword arguments.
Both the model for zeros and the model for positive counts are regularized by default,
using GammaLassoPath, picking the AICc optimal segment of the regularization
path.
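The split between inclusion and repetition can be sketched for a single column of counts as follows. This assumes the usual hurdle setup, in which the inclusion model sees a 0/1 indicator for every observation while the repetition model sees only the observations with a positive count. The helper name `hurdle_split` and the toy column are hypothetical.

```python
def hurdle_split(column):
    """Split one column of counts into the two hurdle-model targets."""
    # Inclusion target: h = 1 if the term appears at all, else 0
    h = [1 if c > 0 else 0 for c in column]
    # Repetition target: (row index, count) for rows where the term appears
    positives = [(i, c) for i, c in enumerate(column) if c > 0]
    return h, positives

# Hypothetical sparse column: the term appears in 3 of 7 documents
h, positives = hurdle_split([0, 3, 0, 1, 0, 0, 2])
```

For very sparse columns most of the information sits in `h`, which is one reason to let the two parts depend on different covariates.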
HDMR can be fitted:
m = hd.hdmr(covars, jlcounts)
We can get the coefficients matrix for each variable + intercept as usual, though now there are two sets of coefficients, one for the model of repetitions and one for the model of inclusions:
coefspos, coefszero = hd.coef(m)
print("Repetition coefficients:\n", coefspos)
print("Inclusion coefficients:\n", coefszero)
Repetition coefficients: [[-4.10095408e+00 -2.46311402e+00 -5.69913212e+00 -5.81114099e+00 -4.56226268e+00 -5.65314535e+00 -2.83891033e+00 -4.73221532e+00 -3.87792475e+00 -4.44134321e+00 -4.13831405e+00 -4.46968430e+00 -2.67566592e+00 -6.17378610e+00 -5.04407514e+00 -5.77337923e+00 -4.28880607e+00 -3.76699723e+00 -3.91687016e+00 -4.71836406e+00 -2.75060981e+00 -4.19269956e+00 -5.06164401e+00 -4.00593360e+00 -4.61806297e+00 -5.86929691e+00 -4.94799149e+00 -4.54616070e+00 -4.86192434e+00 -5.56922079e+00 -3.13386424e+00 -3.86610236e+00 -4.69262832e+00 -3.40511128e+00 -4.77222427e+00 -2.60553464e+00 -4.68293824e+00 -3.64773029e+00 -5.37909996e+00 -4.51931403e+00 -4.90640673e+00 -5.56500032e+00 -5.61682539e+00 -5.16216290e+00 -4.80322994e+00 -2.64920181e+00 -6.15832660e+00 -4.70545239e+00 -3.85762893e+00 -4.35065915e+00 -4.75062052e+00 -4.59123707e+00 -3.86745469e+00 -5.00864540e+00 -3.01405000e+00 -3.17470691e+00 -4.44428881e+00 -5.03173570e+00 -4.97399211e+00 -3.76643341e+00 -6.01534382e+00 -4.56666033e+00 -7.18881661e+00 -2.44788021e+00 -5.91079664e+00 -3.87810599e+00 -4.82742445e+00 0.00000000e+00 -5.55060042e+00 -5.86943072e+00 -3.41612793e+00 -5.85594178e+00 -6.17690623e+00 -3.91905703e+00 -5.91799740e+00 -4.30887974e+00 -3.99947493e+00 -5.37450650e+00 -3.95288236e+00 -6.25674225e+00 -4.31397317e+00 -5.75364589e+00 -3.17260422e+00 -3.44627696e+00 -4.82487927e+00 -4.44154577e+00 -4.54391387e+00 -3.85500437e+00 -3.20316983e+00 -5.33888660e+00 -5.32706678e+00 -4.87328710e+00 -5.77971643e+00 -4.54964988e+00 -5.11649566e+00 -4.45289275e+00 -4.54737992e+00 -5.18321023e+00 -3.41062601e+00 -4.93685264e+00 -3.98785742e+00 -4.76927120e+00 -4.20206794e+00 -4.90967838e+00 -5.36255551e+00 -3.81086054e+00 -4.29835894e+00 -4.76046306e+00 -4.22398831e+00 -5.39969649e+00 -5.31497770e+00 -4.52828914e+00 -4.98185347e+00 -4.79440398e+00 -3.56777058e+00 -4.33795665e+00 -3.52587420e+00 -5.36688795e+00 -5.78022270e+00 -4.22252764e+00 -5.01617547e+00 -4.44485471e+00 -4.63634580e+00 
-5.85315988e+00 -4.39091558e+00 -7.94204232e+00 -6.29954826e+00 -5.22280057e+00 -3.70348704e+00 -3.07611499e+00 -3.15787984e+00 -2.88282969e+00 -4.54007813e+00 -4.56580470e+00 -4.07546697e+00 -3.96898971e+00 -4.33892021e+00 -4.12850098e+00 -5.13836202e+00 -4.58166060e+00 -4.94441633e+00 -4.01942427e+00 -3.94093435e+00 0.00000000e+00 -4.50822799e+00 -5.65716116e+00 -3.79361101e+00 -2.99808459e+00 -5.00720111e+00 -3.63852335e+00 -3.01000047e+00 -3.77152285e+00 -4.40758993e+00 -4.91732312e+00 -3.01996764e+00 -3.81827965e+00 -5.19695206e+00 -4.20262645e+00 -3.97463660e+00 -5.28172697e+00 -4.68035784e+00 -4.97128452e+00 -3.99263610e+00] [-2.71333595e-02 -3.28806480e-02 -5.02005458e-02 0.00000000e+00 0.00000000e+00 -6.01720568e-03 1.33695994e-02 3.26247040e-02 2.04081290e-02 0.00000000e+00 9.13033462e-03 3.77466100e-02 -2.15231146e-03 0.00000000e+00 3.22352891e-02 9.95329677e-03 -4.85996540e-03 0.00000000e+00 -8.54519268e-03 -1.40169869e-03 0.00000000e+00 2.06323265e-02 6.86595712e-02 0.00000000e+00 3.35917302e-02 0.00000000e+00 -2.85472588e-03 -2.30781437e-02 0.00000000e+00 5.04232303e-02 -2.85224542e-05 -1.77081455e-02 -4.49159798e-02 0.00000000e+00 0.00000000e+00 -4.94288533e-12 -4.21584908e-02 9.98237493e-03 -1.52723589e-02 1.86236245e-02 3.89069632e-02 3.95232600e-02 3.99215298e-02 1.26144485e-02 3.26535138e-02 5.57072233e-03 -1.12790797e-01 2.44251569e-02 0.00000000e+00 2.41438293e-02 4.41021617e-02 8.08761361e-03 0.00000000e+00 2.26342728e-02 9.76240824e-03 1.41058627e-02 -1.08787244e-02 4.48834529e-02 4.56046526e-02 -4.73719159e-03 0.00000000e+00 0.00000000e+00 6.80897861e-02 2.79182201e-03 0.00000000e+00 3.00340273e-02 0.00000000e+00 0.00000000e+00 2.36599584e-02 -7.28436208e-02 -2.33082954e-02 -8.22989537e-02 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 8.56047107e-03 0.00000000e+00 -4.45510416e-03 -5.33048132e-02 0.00000000e+00 3.83225504e-02 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.35823704e-02 2.77907293e-02 9.07716625e-03 
-2.60128118e-01 2.97943718e-02 0.00000000e+00 0.00000000e+00 3.39521965e-02 4.38357729e-02 0.00000000e+00 0.00000000e+00 -1.96450414e-02 -9.70205907e-03 9.72862458e-03 6.49043589e-02 1.28865863e-02 0.00000000e+00 3.48550860e-03 6.60302368e-02 -2.58503877e-02 1.98217716e-02 0.00000000e+00 0.00000000e+00 1.37111491e-02 5.51622639e-02 -1.83966702e-02 0.00000000e+00 4.04861396e-02 3.07504933e-02 0.00000000e+00 4.09981319e-02 -1.67959239e-02 -6.32808368e-02 3.04990563e-02 3.95120056e-02 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 2.13231489e-01 -1.50072660e-01 -4.57501721e-02 3.05141185e-04 2.63336117e-03 1.30499186e-02 1.92136576e-02 3.22952670e-02 -2.23364121e-03 1.64447025e-02 -2.83709217e-04 3.43019083e-02 3.91092598e-02 5.03664059e-02 3.09465829e-02 -8.79465378e-03 -2.41938142e-02 0.00000000e+00 0.00000000e+00 5.22022284e-03 3.16578887e-02 1.12634616e-03 -3.61652014e-02 1.29618508e-02 1.36902615e-03 1.75686245e-02 6.90542167e-04 0.00000000e+00 0.00000000e+00 -3.34069290e-02 1.54168355e-02 -3.88999835e-02 1.13899729e-02 0.00000000e+00 1.08444937e-02 2.03516093e-02 1.63499657e-02 2.50923919e-02]] Inclusion coefficients: [[-5.91841847e-01 -9.31126067e-01 3.30163123e-02 -3.77458483e-01 -4.78606926e-01 -6.75867641e-02 3.20423878e+00 2.05630060e-01 9.63012435e-01 -5.91841847e-01 -9.56833515e-01 3.82168936e-01 3.95854018e-01 -3.48163357e-02 -3.28946838e-01 -1.62505868e-01 -7.90407397e-01 -3.45020538e-01 4.29928449e-01 -6.87688948e-01 -1.11938166e+00 -7.90407397e-01 -4.94693837e-01 -8.53276186e-01 1.08858988e-01 -1.60684409e-01 -5.33873182e-02 8.17912836e-02 2.61762473e-01 -3.58464942e-01 -1.01712931e-01 -8.79982841e-01 -4.04644809e-01 -5.71433231e-01 -3.60035703e-01 -2.63897256e+00 -5.01823012e-01 1.12053227e+00 -5.01823012e-01 -3.23351926e-01 2.01881344e-01 -6.42878566e-01 -6.96929692e-02 5.15354465e-01 -5.89831278e-01 1.88071328e+00 -5.91841847e-01 -4.76079700e-02 8.70457369e-02 4.29928449e-01 -7.90407397e-01 -5.13620519e-01 6.41181288e-01 
-4.92800564e-01 1.56031454e-01 1.89984203e-02 -6.87688948e-01 -1.16565650e-01 -7.77302257e-01 7.24412968e-01 -3.59359369e-01 -2.59758402e-01 -5.07006121e-01 1.02711823e+00 1.83309971e-01 3.75288433e-01 -6.87688948e-01 -5.68127586e-02 2.90205813e-01 -5.91841847e-01 -3.80369714e-02 9.82223266e-02 -2.82463955e-02 -7.97758283e-01 -3.11282917e-01 4.26550516e-01 7.38834750e-01 -5.01823012e-01 3.36293262e-01 2.36371320e-01 3.62083912e-01 -3.57443839e-01 -8.28000484e-01 1.17665966e+00 -3.43740311e-01 -2.72443319e-01 -9.86041419e-02 3.69808774e-01 -1.19962987e+00 6.80455051e-01 1.86824707e-01 5.29716802e-01 -3.48396830e-01 -4.53819273e-02 -1.56132658e-01 -6.87688948e-01 -4.43561800e-02 -1.73814721e-01 3.61450576e-01 -2.80644135e-01 -7.27424664e-01 -6.09050666e-01 9.04372774e-01 -1.02258418e+00 6.57775656e-01 -1.07491448e+00 2.74600259e-01 -5.58506454e-01 7.63512908e-01 -7.71711840e-03 1.02821704e+00 -4.29192605e-01 1.20701880e-01 -2.29122070e-01 -1.47397149e+00 -9.64906188e-02 2.38188103e-03 -4.43590479e-01 -1.46068845e-01 -1.34967552e-01 -2.47824360e-01 -4.78636641e-02 -8.28258362e-02 2.27792646e-02 5.48652686e-02 -1.96855081e+00 -5.01823012e-01 -2.86700679e-01 -3.71486190e-01 1.77456500e+00 7.88185242e-01 8.29912662e-02 -4.06471724e-01 -4.68429608e-01 -5.91841847e-01 1.90184638e-01 -4.90255034e-01 -5.20577514e-02 4.29545023e-01 -2.55832065e-01 -8.16989719e-01 -2.11043435e-01 -2.68985447e-01 2.25612230e-01 2.11492750e-01 -5.82382511e-01 -7.77381913e-01 -1.10102479e+00 -5.01823012e-01 8.47017924e-01 6.58170399e-01 2.56171830e-01 -4.65523861e-02 3.31805533e-01 -1.12500898e+00 -5.01823012e-01 -5.91841847e-01 -2.43609690e-01 7.23725018e-01 -2.97698794e-01 7.07424066e-01 1.54299606e-01 1.19445400e+00] [ 0.00000000e+00 3.29779223e-03 -1.66745479e-03 1.81565217e-02 6.53608077e-03 -1.70625569e-02 0.00000000e+00 -1.72906588e-02 7.36779024e-03 0.00000000e+00 3.02088496e-02 1.07929971e-02 -9.62277857e-03 5.76823966e-03 -1.20219116e-02 -2.73668176e-03 0.00000000e+00 1.00767310e-03 
0.00000000e+00 0.00000000e+00 9.83956611e-03 0.00000000e+00 3.23025196e-02 -6.18482466e-03 -3.14641152e-03 -2.96057701e-03 1.36333072e-02 5.80510470e-05 -2.98175426e-02 -7.43239316e-03 1.22006959e-02 -2.58683753e-03 -1.42210288e-03 1.49099863e-02 -7.20808444e-03 6.18734656e-04 0.00000000e+00 1.61827508e-02 0.00000000e+00 6.78232089e-03 -1.66224770e-02 2.06121960e-02 -1.66527502e-02 7.96569393e-03 9.02843309e-03 -1.05287293e-03 0.00000000e+00 7.31237175e-05 -8.64174230e-03 0.00000000e+00 0.00000000e+00 1.33200122e-03 1.07123841e-02 7.90921199e-03 5.23518022e-03 1.27192054e-02 0.00000000e+00 -8.85786396e-03 2.41794615e-02 1.67735893e-03 1.02044348e-02 6.29063184e-05 3.31516141e-02 -3.46849370e-05 -4.73467696e-03 0.00000000e+00 0.00000000e+00 7.96374761e-03 3.27210413e-03 0.00000000e+00 -9.97663270e-03 -1.02571811e-02 -1.14771201e-02 8.41077178e-04 1.86013674e-02 -1.41160012e-02 -2.98462810e-02 0.00000000e+00 -9.30096521e-03 -2.34203755e-02 7.37610267e-03 2.37279957e-03 2.12096734e-02 -1.20387838e-03 8.64365702e-04 1.53608031e-02 5.56332541e-03 1.70796076e-02 4.80632500e-03 1.01914238e-03 -1.40703233e-02 7.43116484e-04 9.18624215e-03 6.83378373e-03 -3.52591627e-03 0.00000000e+00 -2.99926423e-04 -1.37475335e-03 -2.33772021e-02 -6.98959909e-03 2.68002511e-02 1.07809713e-02 -2.34578091e-03 2.80651681e-02 3.58579887e-03 2.45148456e-02 -1.84273843e-02 1.38190870e-02 -2.92692069e-03 -4.74678937e-03 -1.19447281e-02 1.39602546e-03 -1.37543029e-02 4.71330141e-03 0.00000000e+00 -1.19379181e-02 8.41722231e-03 -3.03029888e-02 -1.63132257e-02 1.51629519e-02 -1.32548345e-03 -8.53780405e-03 -1.42380055e-02 1.23713468e-02 -4.34467414e-03 3.68819945e-02 0.00000000e+00 3.03460948e-03 -2.13112826e-02 0.00000000e+00 -2.16154798e-05 -7.94079010e-05 1.43799594e-02 5.52591143e-03 0.00000000e+00 7.73973308e-03 3.19946367e-02 -7.94211681e-03 4.42215848e-05 -3.88533586e-04 2.03851855e-02 -5.96005535e-03 -8.66754312e-03 4.07420933e-03 1.65404182e-02 2.79788296e-02 2.41851958e-02 1.82855048e-02 
0.00000000e+00 -1.37592725e-02 2.85405153e-02 6.85958270e-03 -4.76302409e-05 -8.67993400e-03 1.03411925e-02 0.00000000e+00 0.00000000e+00 -1.82784181e-03 -2.69919272e-02 4.19143585e-03 8.97167168e-03 5.41528199e-03 -3.36360806e-03]]
By default we only return the AICc minimizing coefficients. To get the coefficients that minimize, say, the BIC criterion, run
paths = hd.hdmrpaths(covars, jlcounts)
coefspos, coefszero = hd.coef(paths, Lasso.MinBIC())
coefspos # repetition coefs
array([[-4.10095408e+00, -2.46311402e+00, -5.69913212e+00,
        -5.81114099e+00, -4.56226268e+00, -5.65314535e+00,
        -2.83891033e+00, -4.73221532e+00, -3.87792475e+00,
        -4.44134321e+00, -4.13831405e+00, -4.46968430e+00,
        -2.67566592e+00, -6.17378610e+00, -5.04407514e+00,
        -5.77337923e+00, -4.28880607e+00, -3.76699723e+00,
        -3.91687016e+00, -4.71836406e+00, -2.75060981e+00,
        -4.19269956e+00, -5.06164401e+00, -4.04950441e+00,
        -4.61806297e+00, -5.86929691e+00, -4.94799149e+00,
        -4.50407402e+00, -4.86192434e+00, -5.56922079e+00,
        -3.13386424e+00, -3.86610236e+00, -4.69262832e+00,
        -3.40511128e+00, -4.77222427e+00, -4.03006730e+00,
        -4.68293824e+00, -3.64773029e+00, -5.37909996e+00,
        -4.51931403e+00, -4.90640673e+00, -5.56500032e+00,
        -5.61682539e+00, -5.16216290e+00, -4.80322994e+00,
        -2.64920181e+00, -6.15832660e+00, -4.70545239e+00,
        -3.85762893e+00, -4.35065915e+00, -4.75062052e+00,
        -4.59123707e+00, -3.86745469e+00, -5.00864540e+00,
        -3.01405000e+00, -3.17470691e+00, -4.44428881e+00,
        -5.03173570e+00, -4.97399211e+00, -3.76643341e+00,
        -6.01534382e+00, -4.56666033e+00, -7.18881661e+00,
        -2.44788021e+00, -5.91079664e+00, -3.87810599e+00,
        -4.82742445e+00,  0.00000000e+00, -5.55060042e+00,
        -5.86943072e+00, -3.41612793e+00, -5.85594178e+00,
        -6.17690623e+00, -3.91905703e+00, -5.91799740e+00,
        -4.30887974e+00, -3.99947493e+00, -5.37450650e+00,
        -3.95288236e+00, -6.25674225e+00, -4.31397317e+00,
        -5.75364589e+00, -3.17260422e+00, -3.44627696e+00,
        -4.82487927e+00, -4.44154577e+00, -4.54391387e+00,
        -3.85500437e+00, -3.20316983e+00, -5.33888660e+00,
        -5.32706678e+00, -4.87328710e+00, -5.77971643e+00,
        -4.54964988e+00, -5.11649566e+00, -4.45289275e+00,
        -4.54737992e+00, -5.18321023e+00, -3.41062601e+00,
        -4.93685264e+00, -3.98785742e+00, -4.76927120e+00,
        -4.20206794e+00, -4.90967838e+00, -5.36255551e+00,
        -3.81086054e+00, -4.29835894e+00, -4.76046306e+00,
        -4.22398831e+00, -5.39969649e+00, -5.31497770e+00,
        -4.52828914e+00, -4.98185347e+00, -4.79440398e+00,
        -3.56777058e+00, -4.33795665e+00, -3.50449695e+00,
        -5.36688795e+00, -5.78022270e+00, -4.22252764e+00,
        -5.01617547e+00, -4.44485471e+00, -4.63634580e+00,
        -5.85315988e+00, -4.39091558e+00, -7.94204232e+00,
        -6.29954826e+00, -5.22280057e+00, -3.70348704e+00,
        -3.07611499e+00, -3.15787984e+00, -2.88282969e+00,
        -4.54007813e+00, -4.56580470e+00, -4.07546697e+00,
        -3.96898971e+00, -4.33892021e+00, -4.12850098e+00,
        -5.13836202e+00, -4.58166060e+00, -4.94441633e+00,
        -4.01942427e+00, -3.94093435e+00,  0.00000000e+00,
        -4.50822799e+00, -5.65716116e+00, -3.79361101e+00,
        -2.99808459e+00, -5.00720111e+00, -3.63852335e+00,
        -3.01000047e+00, -3.77152285e+00, -4.40758993e+00,
        -4.91732312e+00, -3.01996764e+00, -3.81827965e+00,
        -5.19695206e+00, -4.20262645e+00, -3.97463660e+00,
        -5.28172697e+00, -4.68035784e+00, -4.97128452e+00,
        -3.99263610e+00],
       [-2.71333595e-02, -3.28806480e-02, -5.02005458e-02,
         0.00000000e+00,  0.00000000e+00, -6.01720568e-03,
         1.33695994e-02,  3.26247040e-02,  2.04081290e-02,
         0.00000000e+00,  9.13033462e-03,  3.77466100e-02,
        -2.15231146e-03,  0.00000000e+00,  3.22352891e-02,
         9.95329677e-03, -4.85996540e-03,  0.00000000e+00,
        -8.54519268e-03, -1.40169869e-03,  0.00000000e+00,
         2.06323265e-02,  6.86595712e-02,  3.33672914e-02,
         3.35917302e-02,  0.00000000e+00, -2.85472588e-03,
         0.00000000e+00,  0.00000000e+00,  5.04232303e-02,
        -2.85224542e-05, -1.77081455e-02, -4.49159798e-02,
         0.00000000e+00,  0.00000000e+00, -2.29893229e-01,
        -4.21584908e-02,  9.98237493e-03, -1.52723589e-02,
         1.86236245e-02,  3.89069632e-02,  3.95232600e-02,
         3.99215298e-02,  1.26144485e-02,  3.26535138e-02,
         5.57072233e-03, -1.12790797e-01,  2.44251569e-02,
         0.00000000e+00,  2.41438293e-02,  4.41021617e-02,
         8.08761361e-03,  0.00000000e+00,  2.26342728e-02,
         9.76240824e-03,  1.41058627e-02, -1.08787244e-02,
         4.48834529e-02,  4.56046526e-02, -4.73719159e-03,
         0.00000000e+00,  0.00000000e+00,  6.80897861e-02,
         2.79182201e-03,  0.00000000e+00,  3.00340273e-02,
         0.00000000e+00,  0.00000000e+00,  2.36599584e-02,
        -7.28436208e-02, -2.33082954e-02, -8.22989537e-02,
         0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00,  8.56047107e-03,  0.00000000e+00,
        -4.45510416e-03, -5.33048132e-02,  0.00000000e+00,
         3.83225504e-02,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00,  1.35823704e-02,  2.77907293e-02,
         9.07716625e-03, -2.60128118e-01,  2.97943718e-02,
         0.00000000e+00,  0.00000000e+00,  3.39521965e-02,
         4.38357729e-02,  0.00000000e+00,  0.00000000e+00,
        -1.96450414e-02, -9.70205907e-03,  9.72862458e-03,
         6.49043589e-02,  1.28865863e-02,  0.00000000e+00,
         3.48550860e-03,  6.60302368e-02, -2.58503877e-02,
         1.98217716e-02,  0.00000000e+00,  0.00000000e+00,
         1.37111491e-02,  5.51622639e-02, -1.83966702e-02,
         0.00000000e+00,  4.04861396e-02,  3.07504933e-02,
         0.00000000e+00,  4.09981319e-02,  0.00000000e+00,
        -6.32808368e-02,  3.04990563e-02,  3.95120056e-02,
         0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00,  0.00000000e+00,  2.13231489e-01,
        -1.50072660e-01, -4.57501721e-02,  3.05141185e-04,
         2.63336117e-03,  1.30499186e-02,  1.92136576e-02,
         3.22952670e-02, -2.23364121e-03,  1.64447025e-02,
        -2.83709217e-04,  3.43019083e-02,  3.91092598e-02,
         5.03664059e-02,  3.09465829e-02, -8.79465378e-03,
        -2.41938142e-02,  0.00000000e+00,  0.00000000e+00,
         5.22022284e-03,  3.16578887e-02,  1.12634616e-03,
        -3.61652014e-02,  1.29618508e-02,  1.36902615e-03,
         1.75686245e-02,  6.90542167e-04,  0.00000000e+00,
         0.00000000e+00, -3.34069290e-02,  1.54168355e-02,
        -3.88999835e-02,  1.13899729e-02,  0.00000000e+00,
         1.08444937e-02,  2.03516093e-02,  1.63499657e-02,
         2.50923919e-02]])
coefszero # inclusion coefs
array([[-5.91841847e-01, -9.31126067e-01,  3.30163123e-02,
        -3.77458483e-01, -4.78606926e-01, -6.75867641e-02,
         3.20423878e+00,  2.05630060e-01,  9.63012435e-01,
        -5.91841847e-01, -9.56833515e-01,  3.82168936e-01,
         3.95854018e-01, -3.48163357e-02, -3.28946838e-01,
        -1.62505868e-01, -7.90407397e-01, -3.45020538e-01,
         4.29928449e-01, -6.87688948e-01, -1.11938166e+00,
        -7.90407397e-01, -4.94693837e-01, -8.53276186e-01,
         1.08858988e-01, -1.60684409e-01, -5.33873182e-02,
         8.17912836e-02,  2.61762473e-01, -3.58464942e-01,
        -1.01712931e-01, -8.79982841e-01, -4.04644809e-01,
        -5.71433231e-01, -3.60035703e-01, -2.63897256e+00,
        -5.01823012e-01,  1.25918958e+00, -5.01823012e-01,
        -3.23351926e-01,  2.01881344e-01, -6.42878566e-01,
        -6.96929692e-02,  5.15354465e-01, -5.89831278e-01,
         1.88071328e+00, -5.91841847e-01, -4.76079700e-02,
         8.70457369e-02,  4.29928449e-01, -7.90407397e-01,
        -5.13620519e-01,  6.41181288e-01, -4.92800564e-01,
         1.56031454e-01,  1.89984203e-02, -6.87688948e-01,
        -1.16565650e-01, -7.77302257e-01,  7.24412968e-01,
        -3.59359369e-01, -2.59758402e-01, -5.07006121e-01,
         1.02711823e+00,  1.83309971e-01,  3.75288433e-01,
        -6.87688948e-01, -5.68127586e-02,  2.90205813e-01,
        -5.91841847e-01, -3.80369714e-02,  9.82223266e-02,
        -2.82463955e-02, -7.97758283e-01, -3.11282917e-01,
         4.26550516e-01,  7.38834750e-01, -5.01823012e-01,
         3.36293262e-01,  2.36371320e-01,  3.62083912e-01,
        -3.57443839e-01, -8.28000484e-01,  1.17665966e+00,
        -3.43740311e-01, -2.72443319e-01, -9.86041419e-02,
         3.69808774e-01, -1.19962987e+00,  6.80455051e-01,
         1.86824707e-01,  5.29716802e-01, -3.48396830e-01,
        -4.53819273e-02, -1.56132658e-01, -6.87688948e-01,
        -4.43561800e-02, -1.73814721e-01,  3.61450576e-01,
        -2.80644135e-01, -7.27424664e-01, -6.09050666e-01,
         9.04372774e-01, -1.02258418e+00,  6.57775656e-01,
        -1.07491448e+00,  2.74600259e-01, -5.58506454e-01,
         7.63512908e-01, -7.71711840e-03,  1.02821704e+00,
        -4.29192605e-01,  1.20701880e-01, -2.29122070e-01,
        -1.47397149e+00, -9.64906188e-02,  2.38188103e-03,
        -5.91841847e-01, -1.46068845e-01, -1.34967552e-01,
        -2.47824360e-01, -4.78636641e-02, -8.28258362e-02,
         2.27792646e-02,  5.48652686e-02, -1.96855081e+00,
        -5.01823012e-01, -2.86700679e-01, -5.01823012e-01,
         1.77456500e+00,  7.88185242e-01,  8.29912662e-02,
        -4.06471724e-01, -4.68429608e-01, -5.91841847e-01,
         1.90184638e-01, -4.90255034e-01, -5.20577514e-02,
         4.29545023e-01, -2.55832065e-01, -8.16989719e-01,
        -2.11043435e-01, -2.68985447e-01,  2.25612230e-01,
         2.11492750e-01, -5.82382511e-01, -7.77381913e-01,
        -1.10102479e+00, -5.01823012e-01,  8.47017924e-01,
         6.58170399e-01,  2.56171830e-01, -4.65523861e-02,
         3.31805533e-01, -1.12500898e+00, -5.01823012e-01,
        -5.91841847e-01, -2.43609690e-01,  7.23725018e-01,
        -2.97698794e-01,  7.07424066e-01,  1.54299606e-01,
         1.19445400e+00],
       [ 0.00000000e+00,  3.29779223e-03, -1.66745479e-03,
         1.81565217e-02,  6.53608077e-03, -1.70625569e-02,
         0.00000000e+00, -1.72906588e-02,  7.36779024e-03,
         0.00000000e+00,  3.02088496e-02,  1.07929971e-02,
        -9.62277857e-03,  5.76823966e-03, -1.20219116e-02,
        -2.73668176e-03,  0.00000000e+00,  1.00767310e-03,
         0.00000000e+00,  0.00000000e+00,  9.83956611e-03,
         0.00000000e+00,  3.23025196e-02, -6.18482466e-03,
        -3.14641152e-03, -2.96057701e-03,  1.36333072e-02,
         5.80510470e-05, -2.98175426e-02, -7.43239316e-03,
         1.22006959e-02, -2.58683753e-03, -1.42210288e-03,
         1.49099863e-02, -7.20808444e-03,  6.18734656e-04,
         0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         6.78232089e-03, -1.66224770e-02,  2.06121960e-02,
        -1.66527502e-02,  7.96569393e-03,  9.02843309e-03,
        -1.05287293e-03,  0.00000000e+00,  7.31237175e-05,
        -8.64174230e-03,  0.00000000e+00,  0.00000000e+00,
         1.33200122e-03,  1.07123841e-02,  7.90921199e-03,
         5.23518022e-03,  1.27192054e-02,  0.00000000e+00,
        -8.85786396e-03,  2.41794615e-02,  1.67735893e-03,
         1.02044348e-02,  6.29063184e-05,  3.31516141e-02,
        -3.46849370e-05, -4.73467696e-03,  0.00000000e+00,
         0.00000000e+00,  7.96374761e-03,  3.27210413e-03,
         0.00000000e+00, -9.97663270e-03, -1.02571811e-02,
        -1.14771201e-02,  8.41077178e-04,  1.86013674e-02,
        -1.41160012e-02, -2.98462810e-02,  0.00000000e+00,
        -9.30096521e-03, -2.34203755e-02,  7.37610267e-03,
         2.37279957e-03,  2.12096734e-02, -1.20387838e-03,
         8.64365702e-04,  1.53608031e-02,  5.56332541e-03,
         1.70796076e-02,  4.80632500e-03,  1.01914238e-03,
        -1.40703233e-02,  7.43116484e-04,  9.18624215e-03,
         6.83378373e-03, -3.52591627e-03,  0.00000000e+00,
        -2.99926423e-04, -1.37475335e-03, -2.33772021e-02,
        -6.98959909e-03,  2.68002511e-02,  1.07809713e-02,
        -2.34578091e-03,  2.80651681e-02,  3.58579887e-03,
         2.45148456e-02, -1.84273843e-02,  1.38190870e-02,
        -2.92692069e-03, -4.74678937e-03, -1.19447281e-02,
         1.39602546e-03, -1.37543029e-02,  4.71330141e-03,
         0.00000000e+00, -1.19379181e-02,  8.41722231e-03,
         0.00000000e+00, -1.63132257e-02,  1.51629519e-02,
        -1.32548345e-03, -8.53780405e-03, -1.42380055e-02,
         1.23713468e-02, -4.34467414e-03,  3.68819945e-02,
         0.00000000e+00,  3.03460948e-03,  0.00000000e+00,
         0.00000000e+00, -2.16154798e-05, -7.94079010e-05,
         1.43799594e-02,  5.52591143e-03,  0.00000000e+00,
         7.73973308e-03,  3.19946367e-02, -7.94211681e-03,
         4.42215848e-05, -3.88533586e-04,  2.03851855e-02,
        -5.96005535e-03, -8.66754312e-03,  4.07420933e-03,
         1.65404182e-02,  2.79788296e-02,  2.41851958e-02,
         1.82855048e-02,  0.00000000e+00, -1.37592725e-02,
         2.85405153e-02,  6.85958270e-03, -4.76302409e-05,
        -8.67993400e-03,  1.03411925e-02,  0.00000000e+00,
         0.00000000e+00, -1.82784181e-03, -2.69919272e-02,
         4.19143585e-03,  8.97167168e-03,  5.41528199e-03,
        -3.36360806e-03]])
A sufficient reduction projection summarizes the counts, much like a sufficient
statistic, and is useful for reducing the d-dimensional counts to a potentially
much lower-dimensional matrix z.
To get a sufficient reduction projection in the direction of Rem for the above
example:
hd.srproj(m,jlcounts,1,1)
array([[ 2.81751428e-02, -1.15931458e-04,  3.00000000e+01,
         2.20000000e+01],
       [ 1.64189960e-02, -8.05878853e-04,  2.80000000e+01,
         3.30000000e+01],
       [-1.01566985e-03, -5.93850167e-03,  4.40000000e+01,
         3.40000000e+01],
       [-2.35867899e-03, -1.15558569e-02,  1.00000000e+01,
         1.70000000e+01],
       [-4.79357759e-02, -9.79303560e-03,  1.70000000e+01,
         2.20000000e+01],
       [-2.38473991e-02, -4.09846822e-03,  1.30000000e+01,
         2.00000000e+01],
       [ 1.83287115e-02,  8.25493723e-04,  5.00000000e+00,
         1.40000000e+01],
       [ 1.11599797e-02,  2.22875151e-03,  6.00000000e+00,
         1.40000000e+01],
       [ 1.58260699e-02,  1.62366624e-03,  8.00000000e+00,
         1.90000000e+01],
       [ 1.26266272e-02, -6.98256013e-03,  4.00000000e+00,
         1.70000000e+01],
       [ 2.92400211e-02,  1.84979841e-03,  2.10000000e+01,
         1.90000000e+01],
       [ 1.06612616e-02,  1.76424265e-04,  1.40000000e+01,
         2.10000000e+01],
       [ 1.09150894e-02, -2.47691413e-03,  1.50000000e+01,
         2.50000000e+01],
       [ 1.33695994e-02, -3.12326057e-03,  1.00000000e+00,
         1.70000000e+01],
       [ 1.87217718e-02,  3.28904592e-03,  1.40000000e+01,
         1.30000000e+01],
       [ 1.82020623e-02,  1.56022811e-03,  1.70000000e+01,
         1.30000000e+01],
       [ 4.39639808e-02,  3.47301077e-03,  1.80000000e+01,
         2.40000000e+01],
       [ 2.45914486e-02,  3.15516765e-03,  2.60000000e+01,
         3.20000000e+01],
       [-6.09604853e-02,  1.28498890e-03,  4.67000000e+02,
         8.30000000e+01],
       [ 8.60279455e-03,  2.37546865e-03,  5.40000000e+01,
         4.70000000e+01],
       [ 8.78284439e-03,  4.55341989e-04,  4.00000000e+01,
         5.10000000e+01],
       [ 7.79595412e-03,  2.05751466e-03,  2.40000000e+01,
         3.60000000e+01],
       [ 1.21408706e-02,  3.18169882e-03,  3.00000000e+01,
         5.00000000e+01],
       [ 1.28508543e-02,  5.77350078e-03,  3.70000000e+01,
         2.50000000e+01],
       [ 1.05685102e-02,  3.27649057e-04,  3.20000000e+01,
         4.10000000e+01],
       [ 6.02651415e-03,  1.20542073e-03,  3.40000000e+01,
         4.60000000e+01],
       [ 1.33112517e-02,  4.73443183e-03,  4.50000000e+01,
         4.70000000e+01],
       [ 1.12804313e-02,  2.50873895e-03,  6.10000000e+01,
         5.70000000e+01],
       [ 1.07436988e-02,  2.30092366e-03,  5.40000000e+01,
         5.10000000e+01],
       [ 6.28964151e-03,  1.47745489e-03,  2.60000000e+01,
         4.00000000e+01],
       [ 1.84381016e-02,  5.55851368e-03,  2.70000000e+01,
         4.20000000e+01],
       [ 9.95141807e-03,  5.09137579e-04,  1.70000000e+01,
         4.10000000e+01],
       [ 1.21139531e-02,  1.17646978e-03,  2.30000000e+01,
         2.90000000e+01],
       [ 2.09605973e-02,  2.58136423e-03,  1.70000000e+01,
         2.40000000e+01],
       [ 1.49253323e-02,  2.96599675e-03,  2.20000000e+01,
         4.50000000e+01],
       [ 2.68143202e-02,  4.70718160e-03,  2.60000000e+01,
         3.60000000e+01],
       [ 8.64810046e-03,  8.12659154e-03,  7.00000000e+00,
         3.40000000e+01],
       [ 6.84631909e-03, -3.08115603e-04,  7.00000000e+00,
         2.40000000e+01],
       [ 7.94626956e-03, -3.55852720e-03,  9.00000000e+00,
         3.50000000e+01],
       [ 1.22113539e-02,  1.11778380e-03,  3.20000000e+01,
         4.40000000e+01],
       [ 1.11011659e-02,  6.46800547e-04,  1.40000000e+01,
         3.40000000e+01],
       [ 1.33493994e-03, -3.27214067e-03,  1.10000000e+01,
         3.30000000e+01],
       [ 6.83604778e-03, -3.89696216e-03,  3.20000000e+01,
         3.50000000e+01],
       [ 1.56219535e-03, -7.22388944e-04,  2.70000000e+01,
         3.10000000e+01],
       [ 5.47030861e-03, -2.09224923e-03,  1.30000000e+01,
         3.00000000e+01],
       [ 1.39268058e-03, -5.85377079e-03,  4.00000000e+00,
         1.60000000e+01],
       [-3.10677844e-03, -4.03658738e-03,  4.30000000e+01,
         4.20000000e+01],
       [ 1.41134966e-02,  3.45234486e-03,  2.50000000e+01,
         4.60000000e+01],
       [ 1.42938416e-02,  1.47854107e-03,  4.40000000e+01,
         4.90000000e+01],
       [ 5.54332923e-03, -2.59784237e-03,  2.00000000e+01,
         4.50000000e+01],
       [ 4.81283139e-03,  1.86618990e-03,  2.60000000e+01,
         4.10000000e+01],
       [ 9.41465019e-03, -2.18351208e-03,  2.40000000e+01,
         3.80000000e+01],
       [ 5.87476621e-03,  3.02019274e-03,  3.90000000e+01,
         2.40000000e+01],
       [-8.50593645e-03, -8.32258213e-04,  3.83000000e+02,
         1.03000000e+02],
       [ 2.20479111e-03, -3.48488882e-04,  2.80000000e+01,
         4.20000000e+01],
       [ 9.01832915e-03,  5.37287836e-04,  3.10000000e+01,
         5.60000000e+01],
       [ 4.19705263e-03,  8.82312039e-06,  1.80000000e+01,
         4.20000000e+01],
       [ 2.03876225e-02,  2.20186107e-03,  1.30000000e+01,
         3.30000000e+01],
       [ 1.15540108e-02,  2.77112917e-03,  1.50000000e+01,
         2.60000000e+01],
       [ 9.37722578e-03, -1.02116157e-03,  1.20000000e+01,
         2.60000000e+01],
       [ 1.20850930e-02,  1.54223358e-03,  1.80000000e+01,
         3.00000000e+01],
       [ 7.19561728e-03,  1.80547669e-03,  1.80000000e+01,
         3.80000000e+01],
       [ 1.35140685e-02,  1.99943003e-03,  1.50000000e+01,
         2.20000000e+01],
       [ 8.55519164e-03,  4.66630487e-03,  1.50000000e+01,
         3.40000000e+01],
       [ 1.02688777e-02,  1.33686885e-03,  1.20000000e+01,
         2.70000000e+01],
       [ 1.02517243e-02,  2.11140867e-03,  5.90000000e+01,
         5.60000000e+01],
       [ 1.95508382e-03,  1.41653798e-03,  8.20000000e+01,
         5.10000000e+01],
       [ 2.19413600e-02,  4.99284424e-03,  6.40000000e+01,
         5.50000000e+01],
       [ 1.10403022e-02,  1.46209219e-03,  4.40000000e+01,
         5.10000000e+01],
       [ 1.81575257e-02,  4.46701463e-03,  4.10000000e+01,
         5.00000000e+01],
       [ 1.81104138e-02,  1.28992251e-03,  5.80000000e+01,
         5.00000000e+01],
       [ 1.22410893e-02,  6.42356626e-03,  7.30000000e+01,
         4.90000000e+01],
       [-7.60519174e-03, -8.01749648e-04,  6.50000000e+01,
         4.80000000e+01],
       [ 7.77269394e-03, -5.64569917e-03,  8.00000000e+00,
         1.70000000e+01],
       [-2.20097764e-03, -2.53610747e-03,  1.80000000e+01,
         3.20000000e+01],
       [ 6.61417043e-02,  3.58924649e-03,  5.90000000e+01,
         4.30000000e+01],
       [ 6.88135662e-03,  2.50567533e-03,  3.80000000e+01,
         4.40000000e+01],
       [ 8.03907703e-04,  4.60597516e-03,  3.30000000e+01,
         3.00000000e+01],
       [ 1.03003734e-03,  2.64434689e-03,  2.50000000e+01,
         4.80000000e+01],
       [ 3.37361230e-03,  1.83894971e-03,  3.70000000e+01,
         2.80000000e+01],
       [-1.45189670e-02, -3.02287072e-03,  2.70000000e+01,
         3.50000000e+01],
       [ 1.02762951e-02,  4.12707930e-03,  3.40000000e+01,
         3.80000000e+01],
       [ 8.47248680e-03,  1.80469795e-03,  6.10000000e+01,
         3.90000000e+01],
       [ 4.20992899e-03,  7.25641724e-04,  3.10000000e+01,
         4.20000000e+01],
       [ 1.31395362e-02,  3.48773515e-03,  3.60000000e+01,
         4.70000000e+01],
       [ 8.61540731e-03,  4.56723472e-03,  5.00000000e+01,
         4.70000000e+01],
       [ 1.09674929e-02,  2.56158447e-03,  3.90000000e+01,
         4.80000000e+01],
       [ 7.86920644e-03,  2.23336329e-03,  3.20000000e+01,
         4.30000000e+01],
       [ 6.52102151e-03,  4.47297208e-03,  2.30000000e+01,
         4.30000000e+01],
       [ 7.03708630e-03,  9.27543239e-04,  2.10000000e+01,
         3.60000000e+01],
       [ 1.86969325e-02,  3.32530481e-03,  1.20000000e+01,
         2.70000000e+01],
       [ 9.40561809e-03,  3.20489166e-03,  1.10000000e+01,
         3.20000000e+01]])
Column 1 is zpos: the SR projection summarizing the information in repeated use of terms.
Column 2 is zzero: the SR projection summarizing the information in term inclusion.
Column 3 is m: the total number of excess counts.
Column 4 is ℓ: the total number of nonzero counts.
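As a minimal sketch of working with this output on the Python side, the four columns can be unpacked into separate NumPy vectors. The rows below are the first three rows of the output above, copied by hand; the variable names are illustrative and not part of the package.

```python
import numpy as np

# First three rows of the srproj output shown above, copied by hand.
z = np.array([
    [ 2.81751428e-02, -1.15931458e-04, 30.0, 22.0],
    [ 1.64189960e-02, -8.05878853e-04, 28.0, 33.0],
    [-1.01566985e-03, -5.93850167e-03, 44.0, 34.0],
])

# Transposing lets us unpack one vector per column (names are illustrative).
zpos, zzero, m_excess, ell = z.T

print("zpos :", zpos)      # projection for repeated use of terms
print("zzero:", zzero)     # projection for term inclusion
print("m    :", m_excess)  # total excess counts per document
print("ell  :", ell)       # total nonzero counts per document
```

With the full matrix returned by hd.srproj, the same unpacking gives one entry per document in each vector.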
Counts inverse regression (CIR) allows us to predict a covariate with the counts and other covariates. Here we use HDMR for the backward regression and another model for the forward regression. This can be accomplished with a single command, by fitting a CIR{HDMR,FM} where the forward model is FM <: RegressionModel.
jl.eval("using GLM: LinearModel")
spec = jl.eval("CIR{HDMR,LinearModel}")
mf = jl.eval("@model(h ~ President + Rem, c ~ President + Rem)")
cir = hd.fit(spec, mf, jlcovarsdf, jlcounts, "Rem", nocounts=True)
cir
<PyCall.jlwrap TableCountsRegressionModel{CIR{HDMR,LinearModel},DataFrame,SparseMatrixCSC{Float64,Int64}}
2-part model: [h ~ President + Rem, c ~ President + Rem]
Forward model coefficients for predicting Rem:
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
                                                    Coef.    Std. Error      t  Pr(>|t|)    Lower 95%      Upper 95%
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
(Intercept)                                    17.4837       11.3688      1.54    0.1285    -5.17963      40.147
President: PyObject 'Calvin Coolidge'          13.0088       13.4688      0.97    0.3374   -13.8407       39.8584
President: PyObject 'Donald J. Trump'          -2.14014      11.22       -0.19    0.8493   -24.5069       20.2266
President: PyObject 'Dwight D. Eisenhower'    -10.4126        8.31052    -1.25    0.2143   -26.9793        6.15409
President: PyObject 'Franklin D. Roosevelt'    -6.29263       9.24355    -0.68    0.4982   -24.7193       12.134
President: PyObject 'George H.W. Bush'        -22.2441       11.175      -1.99    0.0503   -44.5211        0.0329016
President: PyObject 'George W. Bush'          -20.2675        8.51922    -2.38    0.0200   -37.2502       -3.28473
President: PyObject 'Gerald R. Ford'           -5.36949      11.0539     -0.49    0.6286   -27.4051       16.6662
President: PyObject 'Harry S. Truman'          -9.36385       8.82709    -1.06    0.2923   -26.9603        8.23263
President: PyObject 'Herbert Hoover'          -36.1278       10.9276     -3.31    0.0015   -57.9116      -14.3439
President: PyObject 'Jimmy Carter'            -14.2189       10.3702     -1.37    0.1746   -34.8916        6.45379
President: PyObject 'John F. Kennedy'         -18.4067       11.4176     -1.61    0.1113   -41.1672        4.35383
President: PyObject 'Lyndon B. Johnson'       -18.0297        9.02763    -2.00    0.0496   -36.026        -0.0335
President: PyObject 'Richard Nixon'           -33.7795        9.89593    -3.41    0.0011   -53.5067      -14.0523
President: PyObject 'Ronald Reagan'           -16.7344        8.7381     -1.92    0.0595   -34.1535        0.684648
President: PyObject 'William J. Clinton'       -3.09842       8.2069     -0.38    0.7069   -19.4586       13.2617
zpos                                          477.131       327.085       1.46    0.1490  -174.901      1129.16
zzero                                        9002.65       2147.99        4.19    <1e-4   4720.72      13284.6
m                                              -0.0506128     0.0478887  -1.06    0.2941    -0.146077      0.0448514
ℓ                                               0.0367347     0.257807    0.14    0.8871    -0.477194      0.550663
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────>
Here nocounts=True means we also fit a benchmark model without counts.
The last few coefficients are due to text data.
zpos is the SR projection summarizing the information in repeated use of terms.
zzero is the SR projection summarizing the information in term inclusion.
m is the total number of excess counts.
ℓ is the total number of nonzero counts.
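To make the last two quantities concrete, here is a small sketch on a made-up dense count matrix. It assumes that "nonzero counts" means the number of distinct terms appearing in a document, and that "excess counts" means occurrences beyond the first of each term that appears. That reading is an assumption based on the descriptions above, not taken from the package source, although it is consistent with the srproj output, where m can be smaller than ℓ.

```python
import numpy as np

# Toy document-term counts: 3 documents by 4 terms (made-up numbers).
toy_counts = np.array([
    [0, 2, 1, 0],
    [3, 0, 0, 1],
    [0, 0, 0, 0],
])

# ell: number of distinct terms that appear in each document.
ell = (toy_counts > 0).sum(axis=1)

# m: occurrences beyond the first of each appearing term (assumed definition).
m_excess = toy_counts.sum(axis=1) - ell

print("ell:", ell)
print("m  :", m_excess)
```

In the first toy document two terms appear with three occurrences in total, so one occurrence is in excess.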
We can get the backward and forward model coefficients with
hd.coefbwd(cir)
(array([[-7.50577915e+00, -3.28272336e+00, -5.62136487e+00, ...,
          0.00000000e+00,  0.00000000e+00, -3.84224956e+00],
        [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00, ...,
          0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
        [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00, ...,
          0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
        ...,
        [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00, ...,
          0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
        [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00, ...,
          0.00000000e+00,  0.00000000e+00,  0.00000000e+00],
        [ 0.00000000e+00,  0.00000000e+00, -2.40875556e-02, ...,
          0.00000000e+00,  0.00000000e+00,  7.22724288e-03]]),
 array([[-0.59184185,  2.26270697,  0.04011288, ...,  0.        ,
          0.        ,  1.18279386],
        [ 0.        , -4.75743505,  0.        , ...,  0.        ,
          0.        ,  0.91905273],
        [ 0.        , -5.13624758,  0.        , ...,  0.        ,
          0.        ,  0.        ],
        ...,
        [ 0.        , -5.79083698,  0.        , ...,  0.        ,
          0.        , -0.57934297],
        [ 0.        , -6.30281465,  0.        , ...,  0.        ,
          0.        , -0.66564559],
        [ 0.        ,  0.        ,  0.        , ...,  0.        ,
          0.        ,  0.        ]]))
hd.coeffwd(cir)
array([ 1.74836878e+01,  1.30088423e+01, -2.14014450e+00, -1.04126228e+01,
       -6.29262698e+00, -2.22441239e+01, -2.02674778e+01, -5.36948700e+00,
       -9.36384785e+00, -3.61277714e+01, -1.42189166e+01, -1.84066861e+01,
       -1.80297494e+01, -3.37794987e+01, -1.67344371e+01, -3.09841666e+00,
        4.77130987e+02,  9.00265291e+03, -5.06128436e-02,  3.67346567e-02])
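The forward coefficient vector has the same order as the rows of the coefficient table printed earlier, so its last four entries are the text-based terms. A minimal sketch that labels them, with the values copied by hand from the output above and the labels taken from the table rows:

```python
# Forward coefficients copied from the hd.coeffwd(cir) output above.
coeffwd = [
    1.74836878e+01,  1.30088423e+01, -2.14014450e+00, -1.04126228e+01,
   -6.29262698e+00, -2.22441239e+01, -2.02674778e+01, -5.36948700e+00,
   -9.36384785e+00, -3.61277714e+01, -1.42189166e+01, -1.84066861e+01,
   -1.80297494e+01, -3.37794987e+01, -1.67344371e+01, -3.09841666e+00,
    4.77130987e+02,  9.00265291e+03, -5.06128436e-02,  3.67346567e-02,
]

# Labels follow the last four rows of the printed coefficient table.
labels = ["zpos", "zzero", "m", "ℓ"]
text_coefs = dict(zip(labels, coeffwd[-4:]))

for name, value in text_coefs.items():
    print(f"{name:>5}: {value:.6g}")
```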
The fitted model can be used to predict the target covariate (here Rem) with new data
jlcovarsnewdata = Main.pd_to_df(pycovarsdf.iloc[range(0,10), :])
jlcountsnewdata = Main.scipyCSC_to_julia(pycounts[range(0,10), :])
hd.predict(cir, jlcovarsnewdata, jlcountsnewdata)
[35.01327624412601, 29.846723755873985, -21.60376552127853, -24.869433502443925, -33.695746592945596, -25.101054383331906, 11.735827128269902, 19.331781366167014, 17.275545003481156, 10.8159318912295]
We can also predict with only the other covariates, which in this case is just a linear regression
hd.predict(cir, jlcovarsnewdata, jlcountsnewdata, nocounts=True)
[32.43000000000001, 32.43000000000001, -26.317499999999995, -26.317499999999995, -26.317499999999995, -26.317499999999995, 13.509166666666667, 13.509166666666667, 13.509166666666667, 13.509166666666667]
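One simple way to see what the text contributes is to difference the two sets of predictions. The sketch below uses the ten values printed above, copied by hand. Because the "new data" here are the first ten rows of the training sample, this illustrates the mechanics only and is not an out-of-sample evaluation.

```python
# Predictions copied from the two hd.predict outputs above.
with_counts = [
    35.01327624412601, 29.846723755873985, -21.60376552127853,
    -24.869433502443925, -33.695746592945596, -25.101054383331906,
    11.735827128269902, 19.331781366167014, 17.275545003481156,
    10.8159318912295,
]
without_counts = [
    32.43000000000001, 32.43000000000001, -26.317499999999995,
    -26.317499999999995, -26.317499999999995, -26.317499999999995,
    13.509166666666667, 13.509166666666667, 13.509166666666667,
    13.509166666666667,
]

# Per-observation shift in the prediction attributable to the text terms.
text_shift = [a - b for a, b in zip(with_counts, without_counts)]

for i, shift in enumerate(text_shift):
    print(f"row {i}: {shift:+.3f}")
```

The benchmark predictions are constant within runs of rows because, without counts, the forward model has only the President indicators to work with.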
Kelly, Manela, and Moreira (2021) show that the differences between DMR and HDMR can be substantial in some cases, especially when the counts data is highly sparse.
Please refer to the paper for additional details and example applications.