A **box plot** is a type of plot that visualizes the five-number summary of a dataset, which includes:

- The minimum
- The first quartile
- The median
- The third quartile
- The maximum

This tutorial explains how to create and modify box plots in Stata.

**Example: Box Plots in Stata**

We’ll use a dataset called *auto* to illustrate how to create and modify box plots in Stata.

First, load the data by typing the following into the Command box and pressing *Enter*:

use http://www.stata-press.com/data/r13/auto
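Before plotting, it can help to look at the five numbers a box plot is built from. The following is a minimal sketch that assumes the copy of *auto* shipped with Stata (loaded with **sysuse**) is an acceptable stand-in for the web copy above; if the data are already loaded, the first command can be skipped.

```stata
* Load the copy of the auto dataset that ships with Stata
* (assumption: used here in place of the web copy; skip if data are already loaded)
sysuse auto, clear

* Five-number summary of mpg: minimum, first quartile, median, third quartile, maximum
tabstat mpg, statistics(min p25 p50 p75 max)
```

Note that **graph box** draws its whiskers to the adjacent values and plots more extreme observations as individual points, so the minimum or maximum may appear as a separate marker rather than as the end of a whisker.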

**Vertical Box Plots**

We can create a vertical box plot for the variable *mpg* by using the **graph box** command:

graph box mpg

**Horizontal Box Plots**

Alternatively, we can create a horizontal box plot by using the **graph hbox** command:

graph hbox mpg

**Box Plots by Category**

We can also create several box plots grouped by a single categorical variable using the **over()** option. For example, the following command creates box plots that show the distribution of *mpg* for each level of the categorical variable *foreign*, which indicates whether a car is foreign or domestic:

graph box mpg, over(foreign)
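The **over()** option is not limited to vertical plots. As a small sketch, the same grouping can be combined with **graph hbox**, which can be easier to read when category labels are long:

```stata
* Horizontal box plots of mpg, one per level of foreign
* (assumes the auto dataset is already loaded)
graph hbox mpg, over(foreign)
```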

**Multiple Box Plots by Category**

We can also create box plots for more than one variable, grouped by a categorical variable. For example, the following command creates box plots for the variables *headroom* and *gear_ratio* for each level of the categorical variable *foreign*:

graph box headroom gear_ratio, over(foreign)

**Modifying the Appearance of Box Plots**

We can use several different options to modify the appearance of the box plots.

We can add a title to the plot using the **title()** option:

graph box mpg, title("Distribution of mpg")

We can also add a subtitle underneath the title using the **subtitle()** option:

graph box mpg, title("Distribution of mpg") subtitle("(sample size = 74 cars)")

We can also add a note or comment at the bottom of the graph by using the **note()** option:

graph box mpg, note("Source: 1978 Automobile Data")

Lastly, we can change the color of a box by using the **box(#, color(color_choice))** option, where # is the position of the variable in the command (1 for the first variable listed):

graph box mpg, box(1, color(green))

A full list of available colors can be found in the Stata documentation, which you can open by typing *help colorstyle*.
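The options above can be combined in a single command. The following is a rough sketch, meant to be run from a do-file because the `///` line continuation does not work in the Command box. The output file name *mpg_boxplot.png* is a placeholder chosen for illustration, and loading the data with **sysuse** assumes the shipped copy of *auto* is acceptable.

```stata
* Load the auto dataset shipped with Stata (skip if the data are already loaded)
sysuse auto, clear

* Combine grouping, title, subtitle, note, and box color in one command
graph box mpg, over(foreign) ///
    title("Distribution of mpg") ///
    subtitle("(sample size = 74 cars)") ///
    note("Source: 1978 Automobile Data") ///
    box(1, color(green))

* Save the current graph as a PNG file; the file name is a placeholder
graph export "mpg_boxplot.png", replace
```

When plotting more than one variable, each box can be colored separately by repeating the option, for example **box(1, color(green)) box(2, color(orange))**.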