
# How to Perform a Likelihood Ratio Test in R

A likelihood ratio test compares the goodness of fit of two nested regression models.

A nested model is simply one that contains a subset of the predictor variables in the overall regression model.

For example, suppose we have the following regression model with four predictor variables:

Y = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + β₄x₄ + ε

One example of a nested model would be the following model with only two of the original predictor variables:

Y = β₀ + β₁x₁ + β₂x₂ + ε

To determine if these two models are significantly different, we can perform a likelihood ratio test which uses the following null and alternative hypotheses:

H0: The full model and the nested model fit the data equally well. Thus, you should use the nested model.

HA: The full model fits the data significantly better than the nested model. Thus, you should use the full model.

If the p-value of the test is below a certain significance level (e.g. 0.05), then we can reject the null hypothesis and conclude that the full model offers a significantly better fit.
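For reference, the statistic behind this test is usually written as follows. This is a sketch in our own notation, which the article itself does not define: ℓ(·) denotes the maximized log-likelihood of a fitted model and k the number of extra parameters in the full model.

```latex
\lambda_{LR} = -2\,\bigl[\ell(\text{nested}) - \ell(\text{full})\bigr]
             = 2\,\bigl[\ell(\text{full}) - \ell(\text{nested})\bigr],
\qquad
\lambda_{LR} \;\overset{\text{approx.}}{\sim}\; \chi^2_{k} \quad \text{under } H_0
```

The p-value is the upper-tail probability of this chi-squared distribution at the observed value of the statistic.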

The following example shows how to perform a likelihood ratio test in R.

### Example: Likelihood Ratio Test in R

We will fit the following two regression models in R using data from the built-in mtcars dataset:

Full model: mpg = β₀ + β₁disp + β₂carb + β₃hp + β₄cyl

Reduced model: mpg = β₀ + β₁disp + β₂carb

We will use the lrtest() function from the lmtest package to perform a likelihood ratio test on these two models:

library(lmtest)

#fit full model
model_full <- lm(mpg ~ disp + carb + hp + cyl, data = mtcars)

#fit reduced model
model_reduced <- lm(mpg ~ disp + carb, data = mtcars)

#perform likelihood ratio test for differences in models
lrtest(model_full, model_reduced)

Likelihood ratio test

Model 1: mpg ~ disp + carb + hp + cyl
Model 2: mpg ~ disp + carb
  #Df  LogLik Df  Chisq Pr(>Chisq)
1   6 -77.558
2   4 -78.603 -2 2.0902     0.3517

From the output we can see that the chi-squared test statistic is 2.0902 and the corresponding p-value is 0.3517.

Since this p-value is not less than 0.05, we fail to reject the null hypothesis.

This means we do not have sufficient evidence that the full model fits the data better than the nested model. Thus, we should use the nested model because the additional predictor variables in the full model don't offer a significant improvement in fit.
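The numbers reported by lrtest() can be cross-checked by hand. The following is a minimal sketch using only base R; the variable names (ll_full, lr_stat, df_diff, and so on) are our own and not part of the original example. It should reproduce the test statistic and p-value shown in the output above.

```r
# Refit the two models from the example (mtcars ships with base R)
model_full    <- lm(mpg ~ disp + carb + hp + cyl, data = mtcars)
model_reduced <- lm(mpg ~ disp + carb, data = mtcars)

# Maximized log-likelihood of each fitted model
ll_full    <- logLik(model_full)
ll_reduced <- logLik(model_reduced)

# Test statistic: twice the difference in log-likelihoods
lr_stat <- 2 * (as.numeric(ll_full) - as.numeric(ll_reduced))

# Degrees of freedom: difference in the number of estimated parameters
df_diff <- attr(ll_full, "df") - attr(ll_reduced, "df")

# p-value: upper-tail probability of the chi-squared distribution
p_value <- pchisq(lr_stat, df = df_diff, lower.tail = FALSE)

print(round(lr_stat, 4))
print(df_diff)
print(round(p_value, 4))
```

Note that the parameter count stored by logLik() includes the residual variance, which is why the output lists 6 and 4 under #Df rather than 5 and 3.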

We could then carry out another likelihood ratio test to determine if a model with only one predictor variable is significantly different from a model with the two predictors:

library(lmtest)

#fit full model
model_full <- lm(mpg ~ disp + carb, data = mtcars)

#fit reduced model
model_reduced <- lm(mpg ~ disp, data = mtcars)

#perform likelihood ratio test for differences in models
lrtest(model_full, model_reduced)

Likelihood ratio test

Model 1: mpg ~ disp + carb
Model 2: mpg ~ disp
  #Df  LogLik Df  Chisq Pr(>Chisq)
1   4 -78.603
2   3 -82.105 -1 7.0034   0.008136 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

From the output we can see that the p-value of the likelihood ratio test is 0.008136. Since this is less than 0.05, we would reject the null hypothesis.

We would therefore conclude that the model with two predictors offers a significant improvement in fit over the model with just one predictor.

Our final model would be:

mpg = β₀ + β₁disp + β₂carb
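Both comparisons can also be run in a single call. To our understanding of the lmtest documentation, lrtest() accepts more than two fitted models and compares each one with the model listed before it. The sketch below assumes that behavior; the object names (m_four, m_two, m_one, lr_sequence) are our own.

```r
library(lmtest)

# Fit the three nested models, from largest to smallest
m_four <- lm(mpg ~ disp + carb + hp + cyl, data = mtcars)
m_two  <- lm(mpg ~ disp + carb, data = mtcars)
m_one  <- lm(mpg ~ disp, data = mtcars)

# Consecutive comparisons: m_four vs. m_two, then m_two vs. m_one
lr_sequence <- lrtest(m_four, m_two, m_one)
print(lr_sequence)
```

Listing the models from largest to smallest keeps each row of the output aligned with one of the two tests carried out above.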