
The **median absolute deviation** measures the spread of observations in a dataset.

It's a particularly useful metric because it's less affected by outliers than other measures of dispersion like standard deviation and variance.

The formula to calculate median absolute deviation, often abbreviated MAD, is as follows:

$$\text{MAD} = \operatorname{median}\left(|x_i - x_m|\right)$$

where:

- $x_i$: The i-th value in the dataset
- $x_m$: The median value in the dataset
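To make the formula concrete, here is a minimal sketch that applies it directly with NumPy. The five values are made up for illustration and include one deliberate outlier (100); the variable names are our own and are not part of any library.

```python
import numpy as np

# hypothetical data with one obvious outlier (100)
values = np.array([2, 4, 6, 8, 100])

# step 1: find the median of the data (x_m)
x_m = np.median(values)

# step 2: take the absolute deviation of each value from the median
abs_dev = np.abs(values - x_m)

# step 3: the MAD is the median of those absolute deviations
mad = np.median(abs_dev)

# standard deviation of the same data, for comparison
sd = np.std(values)

print(mad)
print(sd)
```

Here the median is 6, the absolute deviations are 4, 2, 0, 2, and 94, and their median is 2. The standard deviation of the same values is about 38, which shows how strongly a single outlier inflates it while the MAD barely moves.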

The following examples show how to calculate the median absolute deviation in Python by using the **mad** function from statsmodels.

**Example 1: Calculate MAD for an Array**

The following code shows how to calculate the median absolute deviation for a single NumPy array in Python:

    import numpy as np
    from statsmodels import robust

    #define data
    data = np.array([1, 4, 4, 7, 12, 13, 16, 19, 22, 24])

    #calculate MAD
    robust.mad(data)

    11.1195

The median absolute deviation for the dataset turns out to be **11.1195**.

It's important to note that, by default, the **mad** function divides the raw median absolute deviation by a constant of roughly 0.6745. This scaling turns the result into a robust estimate of the standard deviation, assuming the data are normally distributed.

To avoid using this scaling factor, simply set c = 1 as follows:

    #calculate MAD without scaling factor
    robust.mad(data, c=1)

    7.5
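To see how the two results relate, the sketch below reproduces both numbers using only Python's standard library. It assumes that the default scaling constant is the 0.75 quantile of the standard normal distribution, which is how we understand the statsmodels default; the variable names are our own.

```python
from statistics import NormalDist, median

# same values as the NumPy array in Example 1
data = [1, 4, 4, 7, 12, 13, 16, 19, 22, 24]

# raw MAD: median of the absolute deviations from the median
center = median(data)
raw_mad = median(abs(x - center) for x in data)

# assumed default constant: 0.75 quantile of the standard normal (about 0.6745)
c = NormalDist().inv_cdf(0.75)

# dividing the raw MAD by c gives the scaled value
scaled_mad = raw_mad / c

print(raw_mad)               # matches robust.mad(data, c=1)
print(round(scaled_mad, 4))  # matches robust.mad(data)
```

The raw value is 7.5, and dividing it by the constant gives 11.1195, which agrees with the two outputs above.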

**Example 2: Calculate MAD for a DataFrame**

The following code shows how to calculate MAD for a single column in a pandas DataFrame:

    import pandas as pd

    #make this example reproducible
    np.random.seed(1)

    #create pandas DataFrame
    data = pd.DataFrame(np.random.randint(0, 10, size=(5, 3)), columns=['A', 'B', 'C'])

    #view DataFrame
    data

       A  B  C
    0  5  8  9
    1  5  0  0
    2  1  7  6
    3  9  2  4
    4  5  2  4

    #calculate MAD for column B
    data[['B']].apply(robust.mad)

    B    2.965204
    dtype: float64

The median absolute deviation for column *B* turns out to be **2.965204**.

We can use similar syntax to calculate MAD for multiple columns in the pandas DataFrame:

    #calculate MAD for all columns
    data[['A', 'B', 'C']].apply(robust.mad)

    A    0.000000
    B    2.965204
    C    2.965204
    dtype: float64

The median absolute deviation is **0** for column A, **2.965204** for column B, and **2.965204** for column C.
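As a cross-check, the same per-column values can be sketched with pandas alone. The DataFrame below hard-codes the values displayed in Example 2 rather than regenerating them, and the constant 0.6745 is assumed to approximate the default scaling factor used by **mad**; the names `df`, `raw_mad`, and `scaled_mad` are our own.

```python
import pandas as pd

# values copied from the DataFrame shown in Example 2
df = pd.DataFrame({'A': [5, 5, 1, 9, 5],
                   'B': [8, 0, 7, 2, 2],
                   'C': [9, 0, 6, 4, 4]})

# raw MAD per column: median of absolute deviations from each column's median
raw_mad = (df - df.median()).abs().median()

# divide by the assumed scaling constant (about 0.6745) to match the scaled output
scaled_mad = raw_mad / 0.6745

print(raw_mad)
print(scaled_mad.round(3))
```

The raw MAD is 0 for column A and 2 for columns B and C, and dividing by the constant gives roughly 2.965 for B and C, in line with the output above.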
