Mixture Model Parameter Inference

Visser & Speekenbrink: Sec 2.4, 2.5

Likelihood Ratio Test

The likelihood ratio test is used to test constraints on the parameter space. The test statistic is defined as

$$
\begin{align*}
& \Lambda = -2 \left( l(\hat{\theta}_r \mid y) - l(\hat{\theta}_u \mid y) \right) \\
& \Lambda \sim \chi^2(v) \\
& v = \dim(\hat{\theta}_u) - \dim(\hat{\theta}_r)
\end{align*}
$$
  • $\hat{\theta}_u$ contains the MLE parameters of the unconstrained model, and $\hat{\theta}_r$ the MLE parameters of the constrained model.

  • The asymptotic distribution holds under the following assumptions:

    • The constrained model's estimates are the true parameters

    • Certain regularity conditions are satisfied

    • The true parameter vector is an interior point of the parameter space (it cannot lie on the boundary, e.g., a standard deviation or a mixing probability equal to 0)

If the constrained model's estimates are the true parameters, then under some regularity conditions the statistic is asymptotically chi-squared distributed with $v$ degrees of freedom, the difference in the number of freely estimated parameters between the two models.

Code Example

In depmixS4, the likelihood ratio test can be performed using the llratio function.

library(depmixS4)

# Create and fit a 2-component mixture model
spmix2 <- mix(RT ~ 1, data = speed1, nstates = 2)
fspmix2 <- fit(spmix2, verbose = FALSE)

# Manually set some parameters to create a constrained model
pst <- getpars(fspmix2)
pst[c(4, 6)] <- 0.25
spmix2c <- setpars(spmix2, pst)
fspmix2c <- fit(spmix2c, equal = c(1, 1, 1, 2, 1, 2), method = "rsolnp")
# The equal argument specifies equality constraints:
## parameters with the same integer (> 1) in this vector are constrained to be equal,
## except 0 and 1, which indicate fixed and free parameters, respectively.
llratio(fspmix2, fspmix2c)
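
Equivalently, the statistic and its p-value can be computed by hand from the two fitted models. A minimal sketch, assuming the fspmix2 and fspmix2c objects from above and depmixS4's logLik and freepars methods:

# Likelihood ratio statistic, degrees of freedom, and p-value
Lambda <- as.numeric(-2 * (logLik(fspmix2c) - logLik(fspmix2)))
v <- freepars(fspmix2) - freepars(fspmix2c)
pchisq(Lambda, df = v, lower.tail = FALSE)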

Standard Errors and Hessian Matrix

Under regularity conditions, the maximum likelihood estimates asymptotically follow a multivariate Gaussian distribution:

$$
\begin{align*}
& \hat{\theta} \sim \text{Normal}(\theta^*, \Sigma_{\hat{\theta}}) \\
& \Sigma_{\hat{\theta}} = I(\theta^*)^{-1} / T \\
& J(\theta) = -\frac{1}{T} \left[ \frac{\partial^2}{\partial \theta_i \, \partial \theta_j} l(\theta \mid y_{1:T}) \right] \\
& \hat{\Sigma}_{\hat{\theta}} = J(\hat{\theta})^{-1} / T = \left[ \frac{\partial^2}{\partial \theta_i \, \partial \theta_j} \left( -l(\theta \mid y_{1:T}) \right) \Big|_{\theta = \hat{\theta}} \right]^{-1} = H^{-1}
\end{align*}
$$
  • $T$ is the number of observations

  • $I(\theta^*)$ is the Fisher information matrix. It can be approximated by the observed Fisher information matrix $J(\hat{\theta})$

  • The last line shows that the variance-covariance matrix of the multivariate Gaussian can be approximated by the inverse of the Hessian matrix $H$ of the negative log-likelihood evaluated at the estimated parameters $\hat{\theta}$

This means that if we know the Hessian matrix $H$, the standard errors of the parameter estimates can be obtained as the square roots of the diagonal elements of its inverse.

Obtain Hessian as Part of Numerical Optimization

If the log-likelihood is optimized using direct numerical optimization, the Hessian can be estimated during the optimization. For example:

# estimate; GaussMix2 is a function that takes parameter values and y,
# and returns the negative log-likelihood
nlmEst <- nlm(GaussMix2, pstart, y = y, hessian = TRUE)

# obtain the Hessian
hess1 <- nlmEst$hessian
# standard errors: square roots of the diagonal of the inverse
nlmSEs <- sqrt(diag(solve(hess1)))
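
GaussMix2 is not defined above; a minimal sketch of such a function for a two-component Gaussian mixture, assuming the (hypothetical) parameter ordering c(p1, mu1, mu2, sd1, sd2) implied by the box constraints used below:

# Negative log-likelihood of a 2-component Gaussian mixture
# pars = c(p1, mu1, mu2, sd1, sd2) -- an assumed ordering
GaussMix2 <- function(pars, y) {
  dens <- pars[1] * dnorm(y, pars[2], pars[4]) +
    (1 - pars[1]) * dnorm(y, pars[3], pars[5])
  -sum(log(dens))
}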

Obtain Hessian via Finite Difference Approximation

require(numDeriv)
# fit by direct numerical optimization with box constraints on the parameters
opt <- optim(pstart, fn = GaussMix2, y = speed1$RT, lower = c(0, -Inf, -Inf, 0, 0), upper = c(1, rep(Inf, 4)), method = "L-BFGS-B")
# finite-difference approximation of the Hessian at the optimum
hess2 <- hessian(GaussMix2, opt$par, y = speed1$RT)

require(nlme)
hess3 <- fdHess(opt$par, GaussMix2, y = speed1$RT)$Hessian
# the first argument can be the estimated parameters from optim
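
Either approximation can then be turned into standard errors as before. A sketch, assuming the Hessian is positive definite at the optimum:

# standard errors from a finite-difference Hessian
fdSEs <- sqrt(diag(solve(hess2)))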

For fitted models, depmixS4 provides vcov(), confint(), and standardError() methods for the variance-covariance matrix, confidence intervals, and standard errors.
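
A brief usage sketch with the fitted model from above (output formats may vary across depmixS4 versions):

standardError(fspmix2)  # parameter estimates with standard errors
vcov(fspmix2)           # variance-covariance matrix of the estimates
confint(fspmix2)        # Wald-type confidence intervals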

Obtain Standard Errors via Parametric Bootstrap

If it is possible to simulate data from the fitted mixture model, the variance-covariance matrix (or standard errors and confidence intervals) can be obtained via a parametric bootstrap:

  1. Fit a mixture model to the data and obtain the estimates

  2. Repeat many times:

    1. Use the fitted model to simulate data, by first drawing a component and then drawing a data point from that component's distribution

    2. Fit the mixture model to the simulated data and store the estimates

  3. Order the estimates from the simulations to obtain sample quantiles (see the summary sketch after the code below)

require(boot)

# ran.gen: simulate a new dataset from the fitted model
speed.rg <- function(data, mle) {
    simulate(data)
}

# statistic: refit the model to the simulated data and return the parameter vector
speed.fun <- function(data) {
    getpars(fit(data, verbose = FALSE, emcontrol = em.control(random.start = FALSE)))
}

speed_boot_par <- boot(fspmix2, speed.fun, R = 1000, sim = "parametric", ran.gen = speed.rg)
  • boot's first argument is the observed data, but in this case we pass the fitted fspmix2 object.

  • boot's second argument is a function that takes data and returns a vector of parameter estimates.

  • boot's ran.gen argument requires a function of two arguments: the first should be the observed data and the second can carry any other information (such as MLEs). The function should return the simulated dataset. Since the "data" being passed here is the fitted model rather than the actual observations, simulate(data) can generate new data directly, and no mle value needs to be supplied in the boot call.
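
The bootstrap replicates can then be summarized into standard errors and percentile confidence intervals. A sketch, relying on speed_boot_par$t holding one row of parameter estimates per replicate:

# bootstrap standard errors: SD of each parameter across replicates
bootSEs <- apply(speed_boot_par$t, 2, sd)
# 95% percentile intervals from the ordered estimates
bootCIs <- apply(speed_boot_par$t, 2, quantile, probs = c(0.025, 0.975))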

Linear Constraint Correction of Hessian

When the parameters are estimated subject to linear constraints, the inverse Hessian is corrected as follows:

$$
\begin{align*}
& \hat{C} = \hat{D}^{-1} - \hat{D}^{-1} A^{T} \left[ A \hat{D}^{-1} A^{T} \right]^{-1} A \hat{D}^{-1} \\
& \hat{D} = \hat{H} + A^{T} A
\end{align*}
$$
  • $\hat{H}$ is the approximated Hessian

  • $A$ is the $c \times p$ Jacobian matrix of the linear constraints ($c$ is the number of constraints and $p$ the number of parameters)
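
A direct translation of the correction into R (a sketch; hessApprox and A are placeholders for the approximated Hessian and the constraint Jacobian):

# linear-constraint-corrected variance-covariance matrix
constrainedVcov <- function(hessApprox, A) {
  D    <- hessApprox + t(A) %*% A   # D-hat = H-hat + A^T A
  Dinv <- solve(D)
  Dinv - Dinv %*% t(A) %*% solve(A %*% Dinv %*% t(A)) %*% A %*% Dinv
}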
