
An F-test produces an **F-statistic**. To find the **p-value** associated with an F-statistic in R, you can use the following command:

**pf(fstat, df1, df2, lower.tail = FALSE)**

- **fstat** – the value of the F-statistic
- **df1** – degrees of freedom 1 (numerator)
- **df2** – degrees of freedom 2 (denominator)
- **lower.tail** – whether or not to return the probability associated with the lower tail of the F distribution. This is TRUE by default, so set it to FALSE to get the upper-tail probability, which is the p-value.
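To make the role of `lower.tail` concrete, here is a minimal sketch comparing the upper-tail probability with one minus the lower-tail probability. The F-statistic and degrees of freedom are arbitrary values chosen for illustration, not results from a real test.

```r
# Illustrative inputs (assumptions chosen for this sketch)
fstat <- 2.5
df1 <- 4
df2 <- 20

# Upper-tail probability P(F > fstat): the p-value of an F-test
upper <- pf(fstat, df1, df2, lower.tail = FALSE)

# Lower-tail probability P(F <= fstat): the default behaviour of pf()
lower <- pf(fstat, df1, df2)

# The two tails sum to 1, so both routes should give the same p-value
all.equal(upper, 1 - lower)
```

Using `lower.tail = FALSE` directly is generally preferable to computing `1 - pf(...)`, since it avoids a subtraction that can lose precision when the p-value is very small.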

For example, here is how to find the p-value associated with an F-statistic of 5, with degrees of freedom 1 = 3 and degrees of freedom 2 = 14:

    pf(5, 3, 14, lower.tail = FALSE)

    #[1] 0.01457807
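As a sanity check on this result, the sketch below uses `qf()`, the quantile function of the F distribution, to map the p-value back to the original F-statistic and to compare the statistic against a critical value. The 5% significance level is an assumption made for illustration only.

```r
# Values from the example: F = 5 with 3 and 14 degrees of freedom
fstat <- 5
df1 <- 3
df2 <- 14

# p-value for the observed F-statistic
p_value <- pf(fstat, df1, df2, lower.tail = FALSE)

# qf() with lower.tail = FALSE maps an upper-tail probability back to an F value
recovered <- qf(p_value, df1, df2, lower.tail = FALSE)

# Critical value at an assumed 5% significance level
alpha <- 0.05
critical <- qf(alpha, df1, df2, lower.tail = FALSE)

# The p-value falls below alpha exactly when the F-statistic exceeds the critical value
c(p_below_alpha = p_value < alpha, f_above_critical = fstat > critical)
```

Both comparisons should agree, because the p-value and critical-value approaches are two views of the same test.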

One of the most common uses of an F-test is for testing the overall significance of a regression model. In the following example, we show how to calculate the p-value of the F-statistic for a regression model.

**Example: Calculating p-value from F-statistic**

Suppose we have a dataset that shows the total number of hours studied, total prep exams taken, and final exam score received for 12 different students:

    #create dataset
    data <- data.frame(study_hours = c(3, 7, 16, 14, 12, 7, ...),
                       prep_exams = c(2, 6, 5, 2, 7, 4, 4, 2, 8, 4, 1, 3),
                       final_score = c(76, 88, 96, 90, 98, 80, 86, 89, 68, 75, 72, 76))

    #view first six rows of dataset
    head(data)

    #  study_hours prep_exams final_score
    #1           3          2          76
    #2           7          6          88
    #3          16          5          96
    #4          14          2          90
    #5          12          7          98
    #6           7          4          80

Next, we can fit a linear regression model to this data using *study hours* and *prep exams* as the predictor variables and *final score* as the response variable. Then, we can view the output of the model:

    #fit regression model
    model <- lm(final_score ~ study_hours + prep_exams, data = data)

    #view output of the model
    summary(model)

    #Call:
    #lm(formula = final_score ~ study_hours + prep_exams, data = data)
    #
    #Residuals:
    #    Min      1Q  Median      3Q     Max
    #-13.128  -5.319   2.168   3.458   9.341
    #
    #Coefficients:
    #            Estimate Std. Error t value Pr(>|t|)
    #(Intercept)   66.990      6.211  10.785  1.9e-06 ***
    #study_hours    1.300      0.417   3.117   0.0124 *
    #prep_exams     1.117      1.025   1.090   0.3041
    #---
    #Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    #
    #Residual standard error: 7.327 on 9 degrees of freedom
    #Multiple R-squared:  0.5308, Adjusted R-squared:  0.4265
    #F-statistic: 5.091 on 2 and 9 DF,  p-value: 0.0332

On the very last line of the output we can see that the F-statistic for the overall regression model is **5.091**. This F-statistic has 2 degrees of freedom for the numerator and 9 degrees of freedom for the denominator. R automatically calculates that the p-value for this F-statistic is **0.0332**.
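The F-statistic on that last line can also be reproduced from the R-squared value shown just above it. The sketch below assumes the standard relationship for a model with an intercept, F = (R² / k) / ((1 − R²) / (n − k − 1)), and uses the rounded figures from the summary output, so the result only matches to about two decimals.

```r
# Values read from the summary output
r_squared <- 0.5308   # Multiple R-squared (rounded in the output)
k <- 2                # number of predictors = numerator degrees of freedom
df_resid <- 9         # residual degrees of freedom = n - k - 1 = 12 - 2 - 1

# Overall F-statistic for a model that includes an intercept
f_stat <- (r_squared / k) / ((1 - r_squared) / df_resid)
round(f_stat, 2)
```

The result should be roughly 5.09, in line with the reported 5.091 once the rounding of R-squared is taken into account.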

To calculate the p-value for this F-statistic ourselves, we can use the following code:

    pf(5.091, 2, 9, lower.tail = FALSE)

    #[1] 0.0331947

Notice that we get the same answer (but with more decimals displayed) as the linear regression output above.
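Rather than copying the numbers by hand, the F-statistic and its degrees of freedom can be pulled straight from the fitted model. The sketch below is one way to do this; it uses the built-in `mtcars` dataset and a hypothetical model (`mpg ~ wt + hp`) as a stand-in, so the numbers differ from the exam-score example.

```r
# mtcars ships with base R; it is used here only as a stand-in dataset (an assumption)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# summary()$fstatistic is a named numeric vector: value, numdf, dendf
fs <- summary(fit)$fstatistic

# Upper-tail probability of the F distribution gives the overall p-value
p_value <- pf(fs[["value"]], fs[["numdf"]], fs[["dendf"]], lower.tail = FALSE)
p_value
```

This avoids the small rounding differences that come from retyping the printed F-statistic, since `fs` holds the unrounded value.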