
Two terms that are sometimes used interchangeably are **correlation** and **association**. However, in the field of statistics these two terms have slightly different meanings.

In particular, when we use the word **correlation** we’re typically talking about the Pearson correlation coefficient. This is a measure of the linear association between two random variables *X* and *Y*. It has a value between -1 and 1 where:

- -1 indicates a perfectly negative linear correlation between two variables
- 0 indicates no linear correlation between two variables
- 1 indicates a perfectly positive linear correlation between two variables
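To make the endpoints of this scale concrete, here is a minimal sketch that computes the Pearson correlation coefficient from its textbook definition. The function name `pearson_r` and the data are made up for illustration; they are not from any particular library.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Illustrative data (hypothetical values chosen for this example).
x = [1, 2, 3, 4, 5]
y_up = [2 * v + 1 for v in x]      # points on an increasing straight line
y_down = [-3 * v + 10 for v in x]  # points on a decreasing straight line

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
print(r_up)    # 1.0
print(r_down)  # -1.0
```

Because every point falls exactly on a straight line in both cases, the coefficient reaches the extremes of its range. Real data almost never does this, so values strictly between -1 and 1 are the norm.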

Conversely, when statisticians use the word **association** they can be talking about *any* relationship between two variables, whether it’s linear *or* non-linear.

To illustrate this idea, consider the following examples.

**Visualizing Correlation vs. Association with Scatterplots**

We describe the correlation between two random variables in terms of two properties:

**1. Direction**

- **Positive:** Two random variables have a positive correlation if *Y* tends to increase as *X* increases.
- **Negative:** Two random variables have a negative correlation if *Y* tends to decrease as *X* increases.

**2. Strength**

- **Weak:** Two random variables have a weak correlation if the points in a scatterplot are loosely scattered.
- **Strong:** Two random variables have a strong correlation if the points in a scatterplot are tightly packed together.

The following scatterplots illustrate examples of each type of correlation:

In contrast to correlation, the word **association** refers to *any* relationship between two random variables, whether linear *or* non-linear.

The following scatterplots illustrate some examples:

The scatterplot in the top left corner illustrates a quadratic relationship between two random variables, which means there *is* an association between the two variables but it’s not a linear one.

If we calculated the correlation between the two variables, it would likely be close to zero because there is no linear relationship between them.

However, just knowing that the correlation between the two variables is zero can be misleading because it hides the fact that a non-linear relationship exists between them.

**Correlation vs. Association: A Summary**

The terms correlation and association have the following similarities and differences:

**Similarities:**

- Both terms are used to describe whether or not there is a relationship between two random variables.
- Scatterplots can be used to analyze both the correlation and the association between two random variables.

**Differences:**

- Correlation can only tell us if two random variables have a linear relationship, while association can tell us if two random variables have a linear *or* non-linear relationship.
- Correlation quantifies the relationship between two random variables by using a number between -1 and 1, but association does not use a specific number to quantify a relationship.
