
Bayesian Multivariate Regression with Regularization

In the Bayesian context, regularization is performed by placing an appropriate prior distribution on the regression coefficients. Common choices include:

  • Laplace prior

  • Spike-and-slab

  • Horseshoe and hierarchical shrinkage

Traditionally, the horseshoe prior is defined as

$$
\theta_j \mid \lambda_j, \tau \sim N(0, \lambda_j^2\tau^2) \\
\lambda_j \sim C^+(0, 1)
$$

where $\tau$ is the global shrinkage parameter and $\lambda_j$ is the local shrinkage parameter.
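To get a feel for the shrinkage behavior, we can draw samples from this prior with NumPy. This is an illustrative simulation, not part of the model; the global scale $\tau$ is fixed at 1 here:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
tau = 1.0  # global shrinkage, fixed for illustration

# lambda_j ~ C+(0, 1): absolute value of a standard Cauchy draw
lam = np.abs(rng.standard_cauchy(n))
# theta_j | lambda_j, tau ~ N(0, lambda_j^2 * tau^2)
theta = rng.standard_normal(n) * lam * tau
```

Most draws sit near zero while a few are very large: the spike at zero shrinks noise coefficients, and the heavy Cauchy tail lets genuine signals escape shrinkage.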

# Data preparation
import arviz as az
import pandas as pd
import pymc as pm
import pytensor.tensor as at

test_scores = pd.read_csv(pm.get_data("test_scores.csv"), index_col=0)
X = test_scores.dropna().astype(float)
y = X.pop("score")
# Standardize predictors
X -= X.mean()
X /= X.std()
N, D = X.shape
D0 = int(D / 2)  # guess: half of the coefficients are non-zero

The regularized horseshoe prior for each regression coefficient $\beta_i$ is

$$
\begin{align*}
&\beta_i \sim N(0, \tau^2\tilde{\lambda}_i^2) \\
&\tau \sim \text{Half-StudentT}_2\!\left(\frac{D_0}{D - D_0} \cdot \frac{\sigma}{\sqrt{N}}\right) \\
&\tilde{\lambda}_i^2 = \frac{c^2\lambda_i^2}{c^2 + \tau^2\lambda_i^2} \\
&\lambda_i \sim \text{Half-StudentT}_5(1) \\
&c^2 \sim \text{InverseGamma}(1, 1)
\end{align*}
$$
  • $\sigma$ is the error standard deviation, which is given its own prior

  • $\tau$ follows a Half-StudentT distribution, just one choice among heavy-tailed distributions with non-zero probability density at zero

  • $D_0$ represents our guess of the number of non-zero coefficients. We set it to half of the total number of predictors; this guess does not need to be very accurate.

  • $c^2$ needs to be strictly positive but not necessarily long-tailed, hence the inverse-gamma prior.
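The effect of the regularization on the local scales can be seen directly: $\tilde{\lambda}_i^2$ is bounded both by $\lambda_i^2$ and by the slab-controlled limit $c^2/\tau^2$. A small NumPy sketch, with $\tau$ fixed to a hypothetical value rather than sampled as in the full model:

```python
import numpy as np

rng = np.random.default_rng(42)
D = 10
tau = 0.1  # fixed global scale, for illustration only

# lambda_i ~ Half-StudentT_5(1)
lam = np.abs(rng.standard_t(df=5, size=D))
# c^2 ~ InverseGamma(1, 1): reciprocal of a Gamma(1, 1) draw
c2 = 1.0 / rng.gamma(shape=1.0, scale=1.0)

# regularized local scales
lam_tilde2 = c2 * lam**2 / (c2 + tau**2 * lam**2)
```

For small $\lambda_i$ the regularized scale behaves like $\lambda_i^2$ (plain horseshoe behavior); for large $\lambda_i$ it saturates at $c^2/\tau^2$, so even "signal" coefficients are mildly shrunk by the slab.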

Finally, to make sampling in PyMC more efficient, we re-parameterize the coefficients in non-centered form:

$$
z_i \sim N(0, 1) \\
\beta_i = z_i \tau \tilde{\lambda}_i
$$
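In PyMC this is written with a standard-normal `z` and a `pm.Deterministic` for $\beta$. The distributional equivalence behind the trick can be checked quickly in NumPy (fixed, illustrative scale values):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
tau, lam_tilde = 0.5, 2.0  # fixed scales, illustrative values

# non-centered parameterization: sample N(0, 1), then scale
z = rng.standard_normal(n)
beta = z * tau * lam_tilde

# beta is distributed as N(0, tau^2 * lam_tilde^2), i.e. sd = tau * lam_tilde,
# but the sampler only ever explores the well-behaved standard normal z.
```

Because the sampler works with `z` instead of `beta` directly, the strong dependence between a coefficient and its own scale parameters is removed from the posterior geometry, which avoids the funnel pathologies that plague centered hierarchical models.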
