
One common error you may encounter in R is:

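    Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
      contrasts can be applied only to factors with 2 or more levels

This error occurs when you attempt to fit a regression model using a predictor variable that is a factor or character vector with only one unique value.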

This tutorial shares the exact steps you can use to troubleshoot this error.

**Example: How to Fix ‘contrasts can be applied only to factors with 2 or more levels’**

Suppose we have the following data frame in R:

    #create data frame
    df <- data.frame(var1=c(1, 3, 3, 4, 5),
                     var2=as.factor(4),
                     var3=c(7, 7, 8, 3, 2),
                     var4=c(1, 1, 2, 8, 9))

    #view data frame
    df

      var1 var2 var3 var4
    1    1    4    7    1
    2    3    4    7    1
    3    3    4    8    2
    4    4    4    3    8
    5    5    4    2    9

Notice that the predictor variable **var2** is a factor and only has one unique value.

If we attempt to fit a multiple linear regression model using **var2** as one of the predictor variables, we'll get the following error:

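    #attempt to fit regression model
    model <- lm(var4 ~ var1 + var2 + var3, data=df)

    Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
      contrasts can be applied only to factors with 2 or more levels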

We get this error because **var2** has only one unique value: 4. To include a factor in a regression model, R encodes it as contrasts (dummy variables) that compare its levels, and a factor with a single level leaves nothing to compare.
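As a quick check of this explanation, the sketch below recreates the example data frame, counts the levels of **var2** with `nlevels()`, and wraps the failing `lm()` call in `tryCatch()` so the error message is captured rather than stopping the script. The variable name `msg` is only illustrative, and the message text assumes an English R locale.

```r
# Recreate the example data frame so this block runs on its own
df <- data.frame(var1 = c(1, 3, 3, 4, 5),
                 var2 = as.factor(4),
                 var3 = c(7, 7, 8, 3, 2),
                 var4 = c(1, 1, 2, 8, 9))

# Number of levels in var2; contrasts need at least 2
nlevels(df$var2)

# Capture the error message instead of halting (msg is a hypothetical name)
msg <- tryCatch({
  lm(var4 ~ var1 + var2 + var3, data = df)
  "no error"
}, error = function(e) conditionMessage(e))

msg
```

Here `nlevels(df$var2)` returns 1, which is why the model cannot be fit with **var2** included.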

We can actually use the following syntax to count the number of unique values for each variable in our data frame:

    #count unique values for each variable
    sapply(lapply(df, unique), length)

    var1 var2 var3 var4 
       4    1    4    4 
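Since the error only concerns factor and character predictors, it can help to narrow the count to those columns. The following is a minimal sketch of one way to do that; the names `is_categorical`, `n_unique`, and `single_level` are hypothetical and not part of the original example.

```r
# Recreate the example data frame so this block runs on its own
df <- data.frame(var1 = c(1, 3, 3, 4, 5),
                 var2 = as.factor(4),
                 var3 = c(7, 7, 8, 3, 2),
                 var4 = c(1, 1, 2, 8, 9))

# Flag columns that are factors or character vectors
is_categorical <- sapply(df, function(x) is.factor(x) || is.character(x))

# Count distinct non-missing values in each column
n_unique <- sapply(df, function(x) length(unique(x[!is.na(x)])))

# Names of categorical columns with fewer than 2 distinct values
single_level <- names(df)[is_categorical & n_unique < 2]
single_level
```

For this data frame the result is `"var2"`. Numeric columns are skipped on purpose, because a constant numeric predictor does not trigger this particular error.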

And we can use the lapply() function to display each of the unique values for each variable:

    #display unique values for each variable
    lapply(df[c('var1', 'var2', 'var3')], unique)

    $var1
    [1] 1 3 4 5

    $var2
    [1] 4
    Levels: 4

    $var3
    [1] 7 8 3 2

We can see that **var2** is the only variable that has one unique value. Thus, we can fix this error by simply dropping **var2** from the regression model:

    #fit regression model without using var2 as a predictor variable
    model <- lm(var4 ~ var1 + var3, data=df)

    #view model summary
    summary(model)

    Call:
    lm(formula = var4 ~ var1 + var3, data = df)

    Residuals:
           1        2        3        4        5 
     0.02326 -1.23256  0.91860  0.53488 -0.24419 

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)  
    (Intercept)   8.4070     3.6317   2.315   0.1466  
    var1          0.6279     0.6191   1.014   0.4172  
    var3         -1.1512     0.3399  -3.387   0.0772 .
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 1.164 on 2 degrees of freedom
    Multiple R-squared:  0.9569,	Adjusted R-squared:  0.9137 
    F-statistic: 22.18 on 2 and 2 DF,  p-value: 0.04314

By dropping **var2** from the regression model, we no longer encounter the error from earlier.
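When a data frame has many columns, dropping single-valued predictors by hand can become tedious. The sketch below shows one possible way to automate the same fix: it keeps only predictors with at least two distinct values and builds the formula with `reformulate()` from base R. The names `response`, `predictors`, and `keep` are hypothetical, and this approach assumes every remaining column should enter the model as a predictor.

```r
# Recreate the example data frame so this block runs on its own
df <- data.frame(var1 = c(1, 3, 3, 4, 5),
                 var2 = as.factor(4),
                 var3 = c(7, 7, 8, 3, 2),
                 var4 = c(1, 1, 2, 8, 9))

# Response variable and candidate predictors (illustrative names)
response   <- "var4"
predictors <- setdiff(names(df), response)

# Keep only predictors with at least 2 distinct values
has_variation <- sapply(df[predictors], function(x) length(unique(x)) >= 2)
keep <- predictors[has_variation]

# Build the formula var4 ~ var1 + var3 and fit the model
model <- lm(reformulate(keep, response = response), data = df)
coef(model)
```

This fits the same model as above, with **var1** and **var3** as the only predictors.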

**Additional Resources**

How to Perform Simple Linear Regression in R

How to Perform Multiple Linear Regression in R

How to Perform Logistic Regression in R