
The **Student t distribution** is one of the most commonly used distributions in statistics. This tutorial explains how to work with the Student t distribution in R using the functions **dt()**, **qt()**, **pt()**, and **rt()**.

**dt**

The function **dt** returns the value of the probability density function (pdf) of the Student t distribution at a given value *x* with *df* degrees of freedom. The syntax for using dt is as follows:

**dt(x, df)**

The following code illustrates a few examples of **dt** in action:

    #find the value of the Student t distribution pdf at x = 0 with 20 degrees of freedom
    dt(x = 0, df = 20)
    #[1] 0.3939886

    #by default, R assumes the first argument is x and the second argument is df
    dt(0, 20)
    #[1] 0.3939886

    #find the value of the Student t distribution pdf at x = 1 with 30 degrees of freedom
    dt(1, 30)
    #[1] 0.2379933
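To see where these values come from, the output of dt can be reproduced from the closed-form density of the Student t distribution. The sketch below is only an illustration: the helper name `t_pdf` is hypothetical and is not part of R.

```r
# Hypothetical helper (not part of base R): closed-form Student t density
t_pdf <- function(x, df) {
  gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2)) *
    (1 + x^2 / df)^(-(df + 1) / 2)
}

# Evaluate the hand-written formula and dt() at the same points
x_vals <- c(-2, 0, 1)
manual <- t_pdf(x_vals, df = 20)
builtin <- dt(x_vals, df = 20)

# TRUE if the two agree up to floating-point tolerance
isTRUE(all.equal(manual, builtin))
```

In practice dt is the better choice, since the hand-written formula can overflow in gamma() for large degrees of freedom.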

When you're solving probability questions with the Student t distribution, you'll usually use **pt** rather than **dt**. One useful application of **dt**, however, is creating a plot of the Student t distribution in R. The following code illustrates how to do so:

    #create a sequence of 100 equally spaced numbers between -4 and 4
    x <- seq(-4, 4, length.out = 100)

    #create a vector of values that shows the height of the probability distribution
    #for each value in x, using 20 degrees of freedom
    y <- dt(x, df = 20)

    #plot x and y as a scatterplot with connected lines (type = "l") and add
    #an x-axis with custom labels
    plot(x, y, type = "l", lwd = 2, axes = FALSE, xlab = "", ylab = "")
    axis(1, at = -3:3, labels = c("-3s", "-2s", "-1s", "mean", "1s", "2s", "3s"))

This generates the following plot:

**pt**

The function **pt** returns the value of the cumulative distribution function (cdf) of the Student t distribution at a given value *x* with *df* degrees of freedom. The syntax for using pt is as follows:

**pt(x, df)**

Put simply, **pt** returns the area to the left of a given value *x* in the Student t distribution. If you're interested in the area to the right of a given value *x*, you can simply add the argument **lower.tail = FALSE**:

**pt(x, df, lower.tail = FALSE)**
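One way to make the "area to the left" interpretation concrete is to integrate the density numerically and compare the result with pt. The following is a rough sketch using base R's integrate(); the variable names and the tolerance of 1e-4 are arbitrary choices for illustration.

```r
# Area to the left of -0.785 with 14 degrees of freedom, computed two ways
area_cdf <- pt(-0.785, df = 14)
area_int <- integrate(dt, lower = -Inf, upper = -0.785, df = 14)$value

# The left-tail and right-tail areas should always sum to 1
total <- pt(-0.785, df = 14) + pt(-0.785, df = 14, lower.tail = FALSE)

# Both comparisons should be TRUE (the integral is only approximate)
isTRUE(all.equal(area_cdf, area_int, tolerance = 1e-4))
isTRUE(all.equal(total, 1))
```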

The following examples illustrate how to solve some probability questions using pt.

**Example 1:** *Find the area to the left of a t-statistic with a value of -0.785 and 14 degrees of freedom.*

    pt(-0.785, 14)
    #[1] 0.2227675

**Example 2:** *Find the area to the right of a t-statistic with a value of -0.785 and 14 degrees of freedom.*

    #the following approaches produce equivalent results

    #1 - area to the left
    1 - pt(-0.785, 14)
    #[1] 0.7772325

    #area to the right
    pt(-0.785, 14, lower.tail = FALSE)
    #[1] 0.7772325

**Example 3:** *Find the total area in a Student t distribution with 14 degrees of freedom that lies to the left of -0.785 or to the right of 0.785.*

    pt(-0.785, 14) + pt(0.785, 14, lower.tail = FALSE)
    #[1] 0.4455351
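Because the Student t distribution is symmetric around zero, the two tail areas in Example 3 are equal, so the same total can be obtained by doubling one tail. The short sketch below shows this; the variable names are illustrative only.

```r
t_stat <- 0.785
df <- 14

# Sum of the two tail areas, as in Example 3
two_tail_sum <- pt(-t_stat, df) + pt(t_stat, df, lower.tail = FALSE)

# Equivalent shortcut: double the lower tail at -|t|
two_tail_double <- 2 * pt(-abs(t_stat), df)

# TRUE: both approaches give the same area
isTRUE(all.equal(two_tail_sum, two_tail_double))
```

This doubled-tail form is the usual way a two-sided p-value is computed from a t-statistic.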

**qt**

The function **qt** returns the value of the inverse cumulative distribution function (the quantile function) of the Student t distribution for a given probability *p* and degrees of freedom *df*. The syntax for using qt is as follows:

**qt(p, df)**

Put simply, you can use **qt** to find the t-score of the *p*th quantile of the Student t distribution.

The following code illustrates a few examples of **qt** in action:

    #find the t-score of the 99th percentile of the Student t distribution with df = 20
    qt(.99, df = 20)
    # [1] 2.527977

    #find the t-score of the 95th percentile of the Student t distribution with df = 20
    qt(.95, df = 20)
    # [1] 1.724718

    #find the t-score of the 90th percentile of the Student t distribution with df = 20
    qt(.9, df = 20)
    # [1] 1.325341

Note that the critical values found by **qt** will match the critical values found in the t-Distribution table as well as the critical values that can be found by the Inverse t-Distribution Calculator.
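Since qt is the inverse of pt, passing a quantile back through pt should recover the original probability. The sketch below checks this and also shows how a two-sided critical value might be obtained; the 5% significance level (`alpha`) is an assumption chosen purely for illustration.

```r
df <- 20

# Round trip: qt() followed by pt() recovers the probability
q <- qt(0.95, df)
round_trip <- pt(q, df)
isTRUE(all.equal(round_trip, 0.95))

# Two-sided critical value: put alpha / 2 in each tail
alpha <- 0.05  # assumed significance level, for illustration only
crit <- qt(1 - alpha / 2, df)

# By symmetry, the lower critical value is the negative of the upper one
isTRUE(all.equal(qt(alpha / 2, df), -crit))
```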

**rt**

The function **rt** generates a vector of random values that follow a Student t distribution, given a vector length *n* and degrees of freedom *df*. The syntax for using rt is as follows:

**rt(n, df)**

The following code illustrates a few examples of **rt** in action:

    #generate a vector of 5 random values that follow a Student t distribution
    #with df = 20 (your values will differ, since they are random)
    rt(n = 5, df = 20)
    #[1] -1.7422445  0.9560782  0.6635823  1.2122289 -0.7052825

    #generate a vector of 1000 random values that follow a Student t distribution
    #with df = 40
    narrowDistribution <- rt(1000, 40)

    #generate a vector of 1000 random values that follow a Student t distribution
    #with df = 5
    wideDistribution <- rt(1000, 5)

    #generate two histograms to view these two distributions side by side,
    #requesting roughly 50 bins in each histogram
    par(mfrow = c(1, 2)) #one row, two columns
    hist(narrowDistribution, breaks = 50, xlim = c(-6, 4))
    hist(wideDistribution, breaks = 50, xlim = c(-6, 4))

This generates the following histograms:

Notice how the wide distribution is more spread out than the narrow distribution. This is because we set the degrees of freedom to 5 for the wide distribution, compared to 40 for the narrow distribution. The fewer the degrees of freedom, the wider the Student t distribution will be.
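The difference in spread can also be checked numerically. For *df* greater than 2, the variance of the Student t distribution is df / (df - 2), so the sample variances of simulated values should land near 40/38 and 5/3. The sketch below is illustrative only: the seed, the sample size of 10,000, and the variable names are arbitrary choices, not part of the example above.

```r
set.seed(1)  # arbitrary seed so the random draws are reproducible

# Larger samples than above so the sample variances are more stable
narrow <- rt(10000, df = 40)
wide <- rt(10000, df = 5)

# Compare each sample variance with the theoretical value df / (df - 2)
c(sample = var(narrow), theory = 40 / (40 - 2))
c(sample = var(wide), theory = 5 / (5 - 2))

# The df = 5 sample should show the larger variance
var(wide) > var(narrow)
```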

**Further Reading:**

A Guide to dnorm, pnorm, qnorm, and rnorm in R

A Guide to dbinom, pbinom, qbinom, and rbinom in R