# How to Create and Modify Box Plots in Stata

A box plot is a type of plot that we can use to visualize the five-number summary of a dataset, which includes:

• The minimum
• The first quartile
• The median
• The third quartile
• The maximum

This tutorial explains how to create and modify box plots in Stata.

## Example: Box Plots in Stata

We’ll use a dataset called auto to illustrate how to create and modify box plots in Stata.

First, load the data by typing the following into the Command box and pressing Enter:

use http://www.stata-press.com/data/r13/auto
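Before plotting, it can help to look at the five-number summary that the box plot is meant to display. The following is a minimal sketch, assuming the auto data loaded above is in memory; it uses the tabstat command to list the minimum, quartiles, and maximum of mpg:

```stata
* Assumes the auto dataset from the use command above is already loaded.
* Lists the five-number summary of mpg: min, Q1, median, Q3, max.
tabstat mpg, statistics(min p25 p50 p75 max)
```

When comparing these numbers with the plots below, note that Stata draws the whiskers out to the most extreme values lying within 1.5 times the interquartile range of the box, and plots anything beyond that as individual points. The minimum and maximum therefore may appear as separate points rather than as the whisker ends.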

### Vertical Box Plots

We can create a vertical box plot for the variable mpg by using the graph box command:

graph box mpg

### Horizontal Box Plots

Alternatively, we can create a horizontal box plot by using the graph hbox command:

graph hbox mpg

### Box Plots by Category

We can also create several box plots based on a single categorical variable using the over() option. For example, the following command creates box plots that show the distribution of mpg for each level of the categorical variable foreign, which indicates whether a car is foreign or domestic:

graph box mpg, over(foreign)

### Multiple Box Plots by Category

We can also create box plots for more than one variable based on a categorical variable. For example, the following command creates box plots for the variables headroom and gear_ratio, grouped by the categorical variable foreign:

graph box headroom gear_ratio, over(foreign)

### Modifying the Appearance of Box Plots

We can use several different options to modify the appearance of the box plots.

We can add a title to the plot using the title() option:

graph box mpg, title("Distribution of mpg")

We can also add a subtitle underneath the title using the subtitle() option:

graph box mpg, title("Distribution of mpg") subtitle("(sample size = 74 cars)")

We can also add a note or comment at the bottom of the graph by using the note() option:

graph box mpg, note("Source: 1978 Automobile Data")

Lastly, we can change the color of a box by using the box(#, color(color_choice)) option, where # is the position of the variable in the command:

graph box mpg, box(1, color(green))

A full list of available colors can be found in the Stata documentation.
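The options above can also be combined in a single command. The following is a sketch rather than a recommended layout, assuming the auto dataset is still loaded; it reuses the title, subtitle, note, and color from the earlier examples and groups the boxes by foreign:

```stata
* Assumes the auto dataset is already loaded.
* The /// line continuations only work in a do-file, not in the Command box.
graph box mpg, over(foreign) ///
    title("Distribution of mpg") ///
    subtitle("(sample size = 74 cars)") ///
    note("Source: 1978 Automobile Data") ///
    box(1, color(green))
```

To run this from the Command box instead, type the whole command on one line and drop the /// markers.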