A **pairs plot** is a matrix of scatterplots that lets you understand the pairwise relationship between different variables in a dataset.

Fortunately, it's easy to create a pairs plot in R by using the pairs() function. This tutorial provides several examples of how to use this function in practice.

**Example 1: Pairs Plot of All Variables**

The following code illustrates how to create a basic pairs plot for all variables in a data frame in R:

    #make this example reproducible
    set.seed(0)

    #create data frame
    var1

    #create pairs plot
    pairs(df)
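The data frame definition is truncated in the listing above. As a minimal sketch of a comparable setup, the code below builds a three-column data frame in which var2 depends on var1 and var3 is independent noise. The sample size, distributions, and noise levels are assumptions chosen for illustration, not the original values.

```r
# make this example reproducible
set.seed(0)

# assumed data: sample size and noise levels are illustrative, not the original values
n <- 1000
var1 <- rnorm(n)                            # standard normal values
var2 <- var1 + rnorm(n, mean = 0, sd = 2)   # var1 plus noise, so positively correlated with var1
var3 <- rnorm(n, mean = 0, sd = 5)          # generated independently of var1

# combine the three variables into a data frame
df <- data.frame(var1, var2, var3)

# create pairs plot of all variables
pairs(df)
```

Because the values are simulated, correlations computed from this sketch will not exactly match the figures quoted elsewhere in this tutorial.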

The way to interpret the matrix is as follows:

- The variable names are shown in the boxes along the diagonal.
- All other boxes display a scatterplot of the relationship between each pairwise combination of variables. For example, the box in the top right corner of the matrix displays a scatterplot of values for **var1** and **var3**. The box in the middle left displays a scatterplot of values for **var1** and **var2**, and so on.

This single plot gives us an idea of the relationship between each pair of variables in our dataset. For example, **var1** and **var2** seem to be positively correlated while **var1** and **var3** seem to have little to no correlation.
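A visual impression like this can be checked numerically with the base R cor() function. The sketch below uses simulated stand-in data (the distributions are assumptions, not the original dataset) and prints the Pearson correlation matrix.

```r
# assumed stand-in data; the original data frame definition is not available
set.seed(0)
var1 <- rnorm(1000)
var2 <- var1 + rnorm(1000, sd = 2)   # related to var1
var3 <- rnorm(1000, sd = 5)          # unrelated to var1
df <- data.frame(var1, var2, var3)

# Pearson correlation for every pair of columns, rounded to 3 decimals
cor_matrix <- round(cor(df), 3)
print(cor_matrix)
```

Values near 0 in the matrix correspond to the shapeless clouds in the pairs plot, while larger positive values correspond to upward-sloping clouds.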

**Example 2: Pairs Plot of Specific Variables**

The following code illustrates how to create a basic pairs plot for just the first two variables in a dataset:

    #create pairs plot for var1 and var2 only
    pairs(df[, 1:2])
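Columns can also be selected by name rather than by position, which is less fragile if the column order changes. The sketch below shows two equivalent ways to do this, using name-based indexing and the formula interface of pairs(). The data are simulated stand-ins (an assumption), and the name subset_df is introduced here only for illustration.

```r
# assumed stand-in data with the same column names as in this tutorial
set.seed(0)
var1 <- rnorm(1000)
var2 <- var1 + rnorm(1000, sd = 2)
var3 <- rnorm(1000, sd = 5)
df <- data.frame(var1, var2, var3)

# select columns by name, then plot
subset_df <- df[, c("var1", "var3")]
pairs(subset_df)

# same selection using the formula interface of pairs()
pairs(~ var1 + var3, data = df)
```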

**Example 3: Modify the Aesthetics of a Pairs Plot**

The following code illustrates how to modify the aesthetics of a pairs plot, including the title, the color, and the labels:

    pairs(df,
          col = 'blue', #modify color
          labels = c('First', 'Second', 'Third'), #modify labels
          main = 'Custom Title') #modify title
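Other graphical arguments can be adjusted in the same way. As a sketch, the call below changes the point symbol and size, uses a semi-transparent color to reduce overplotting, and hides the lower triangle, which only mirrors the upper one. The data are simulated stand-ins and the specific values are arbitrary choices for illustration.

```r
# assumed stand-in data
set.seed(0)
var1 <- rnorm(1000)
var2 <- var1 + rnorm(1000, sd = 2)
var3 <- rnorm(1000, sd = 5)
df <- data.frame(var1, var2, var3)

# semi-transparent blue so dense regions appear darker
point_col <- rgb(0, 0, 1, alpha = 0.3)

pairs(df,
      pch = 19,            # solid circles
      cex = 0.5,           # smaller points
      col = point_col,     # modify color
      lower.panel = NULL,  # hide the mirrored lower triangle
      main = 'Upper Triangle Only')
```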

**Example 4: Obtaining Correlations with ggpairs**

You can also obtain the Pearson correlation coefficient between variables by using the **ggpairs()** function from the GGally package. The following code illustrates how to use this function:

    #install necessary packages
    install.packages('ggplot2')
    install.packages('GGally')

    #load packages
    library(ggplot2)
    library(GGally)

    #create pairs plot
    ggpairs(df)

The way to interpret this matrix is as follows:

- The variable names are displayed on the outer edges of the matrix.
- The boxes along the diagonal display the density plot for each variable.
- The boxes in the lower left of the matrix display a scatterplot for each pair of variables.
- The boxes in the upper right of the matrix display the Pearson correlation coefficient for each pair of variables. For example, the correlation between var1 and var2 is **0.425**.

The benefit of using **ggpairs()** over the base R function **pairs()** is that you can obtain more information about the variables. Specifically, you can see the correlation coefficient between each pairwise combination of variables as well as a density plot for each individual variable.
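If installing GGally is not an option, part of this output can be approximated in base R by passing a custom panel function to pairs(). The sketch below is one possible approach, not a replacement for ggpairs(): panel_cor is a hypothetical helper name, the data are simulated stand-ins, and no density plots are drawn on the diagonal.

```r
# assumed stand-in data
set.seed(0)
var1 <- rnorm(1000)
var2 <- var1 + rnorm(1000, sd = 2)
var3 <- rnorm(1000, sd = 5)
df <- data.frame(var1, var2, var3)

# hypothetical helper: prints the Pearson correlation in the middle of a panel
panel_cor <- function(x, y, ...) {
  # switch the panel to unit coordinates, restoring the old ones on exit
  op <- par(usr = c(0, 1, 0, 1))
  on.exit(par(op))
  text(0.5, 0.5, sprintf("%.3f", cor(x, y)))
}

# scatterplots in the lower triangle, correlations in the upper triangle
pairs(df, upper.panel = panel_cor)
```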

*You can find the complete documentation for the ggpairs() function by running ?ggpairs in R after loading the GGally package.*