A scatterplot is a type of plot that we can use to display the relationship between two variables. It helps us visualize both the direction (positive or negative) and the strength (weak, moderate, strong) of the relationship between the two variables.
This tutorial explains how to create and modify scatterplots in Stata.
How to Create Scatterplots in Stata
We’ll use a dataset called auto to illustrate how to create and modify scatterplots in Stata.
First, load the data by typing the following into the Command box:
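As a minimal sketch of this step, assuming auto is the example dataset that ships with Stata, the sysuse command can load it. The clear option is an addition here; it drops any data already in memory:

```stata
* Load the example auto dataset shipped with Stata
* (clear discards whatever data is currently in memory)
sysuse auto, clear
```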
We can get a quick look at the dataset by typing the following into the Command box:
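Two standard commands give this kind of overview; which one was intended here is an assumption. The sketch below assumes the auto dataset is already in memory:

```stata
* List each variable's name, storage type, and label
describe

* Show the number of observations, mean, standard deviation,
* minimum, and maximum for each variable
summarize
```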
We can see that there are 12 total variables in the dataset.
We can create a scatterplot for the variables weight and length by using the scatter command. The first variable you type will go along the y-axis and the second variable will go along the x-axis:
scatter weight length
We can see that there is a strong positive correlation between weight and length. That is, cars that weigh more also tend to be longer.
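To put a number on what the plot suggests, one option is to compute the correlation coefficient directly. This is a sketch that goes slightly beyond the plot itself, using Stata's correlate and pwcorr commands:

```stata
sysuse auto, clear

* Pearson correlation between weight and length
correlate weight length

* Same coefficient, plus its p-value (sig) and the number of observations (obs)
pwcorr weight length, sig obs
```

A coefficient close to 1 would agree with the strong positive pattern seen in the scatterplot.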
Scatterplot with a Regression Line
You can overlay a simple linear regression line on the scatterplot by separating the two plots with two “pipe” symbols || and using lfit, which draws the fitted line:
scatter weight length || lfit weight length
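The same overlay can be written in a few other ways. The sketch below shows the equivalent parenthesized twoway syntax, a variant that uses lfitci to add a confidence band, and the regress command for anyone who wants to see the intercept and slope of the line that lfit draws:

```stata
sysuse auto, clear

* Same overlay, written with twoway and parentheses instead of ||
twoway (scatter weight length) (lfit weight length)

* Fitted line with a 95% confidence band, drawn behind the points
twoway (lfitci weight length) (scatter weight length)

* Show the intercept and slope of the fitted line
regress weight length
```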
Scatterplot with Multiple Variables
You can create a scatterplot with more than two variables by simply typing more variables after the scatter command. Note that the last variable you type will be used for the x-axis.
For example, the following command tells Stata to create a scatterplot using length as the x-axis variable and weight and displacement as the y-axis variables:
scatter weight displacement length
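Because weight and displacement are measured in different units, sharing one y-axis can squash one of the two series. One possible refinement, sketched here, is to give displacement its own y-axis with the yaxis() option and to label the legend explicitly. The /// line continuation works in a do-file; in the Command box, type the command on a single line without it:

```stata
sysuse auto, clear

* Plot weight on the first y-axis and displacement on a second y-axis,
* with length on the shared x-axis
twoway (scatter weight length, yaxis(1)) ///
       (scatter displacement length, yaxis(2)), ///
       legend(order(1 "Weight (lbs.)" 2 "Displacement (cu. in.)"))
```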
How to Modify Scatterplots in Stata
We can use several different options, placed after a comma at the end of the scatter command, to modify the appearance of a scatterplot.
Adding a Title
We can add a title to the plot using the title() option:
scatter weight length, title("Weight vs. Length")
Adding a Subtitle
We can also add a subtitle underneath the title using the subtitle() option:
scatter weight length, title("Weight vs. Length") subtitle("n = 74 cars")
Adding a Comment
We can also add a note or comment at the bottom of the graph by using the note() option:
scatter weight length, note("Source: 1978 Automobile Data")
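These options can be combined in a single command. The sketch below puts the title, subtitle, and note from the examples above on one graph. It uses /// to continue the command across lines, which works in a do-file; in the Command box, type everything on one line instead:

```stata
sysuse auto, clear

* Title, subtitle, and note on the same scatterplot
scatter weight length, ///
    title("Weight vs. Length") ///
    subtitle("n = 74 cars") ///
    note("Source: 1978 Automobile Data")
```

Note that the text inside each option uses straight double quotes; curly quotes pasted from a word processor or web page will cause an error.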
We can change the color of the points in the scatterplot by using the mcolor() option:
scatter weight length, mcolor(green)
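Named colors are not the only choice. As a short sketch, graph query colorstyle prints the color names available in your installation, and mcolor() also accepts an RGB triplet in quotes (the particular shade of green below is just an illustrative value):

```stata
sysuse auto, clear

* Print the color names available in this Stata installation
graph query colorstyle

* Specify the marker color as an RGB triplet instead of a name
scatter weight length, mcolor("0 128 0")
```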
A full list of available colors can be found in the Stata Documentation for colors.
We can also change the shape of the points in the scatterplot by using the msymbol() option. For example, the symbol “D” corresponds to diamonds:
scatter weight length, msymbol(D)
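Marker options can be mixed freely. The following sketch lists the available symbol names with graph query symbolstyle, then combines a hollow diamond (Dh) with the msize() and mcolor() options. The specific size and color are arbitrary choices for illustration:

```stata
sysuse auto, clear

* Print the marker symbol names available in this Stata installation
graph query symbolstyle

* Small, hollow, green diamonds
scatter weight length, msymbol(Dh) msize(small) mcolor(green)
```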
A full list of available shapes can be found in the Stata Documentation for shapes.