
# How to Calculate Median Absolute Deviation in Python

The median absolute deviation measures the spread of observations in a dataset.

It's a particularly useful metric because it's less affected by outliers than other measures of dispersion like standard deviation and variance.

The formula to calculate median absolute deviation, often abbreviated MAD, is as follows:

MAD = median(|xi - xm|)

where:

• xi: The ith value in the dataset
• xm: The median value in the dataset
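To make the formula concrete, here is a minimal sketch that computes the unscaled MAD directly with NumPy. The helper name raw_mad and the second array with an artificial outlier are illustrative assumptions, not part of statsmodels.

```python
import numpy as np

def raw_mad(values):
    """Unscaled MAD: median(|xi - xm|). Hypothetical helper for illustration."""
    values = np.asarray(values, dtype=float)
    xm = np.median(values)                  # median of the dataset
    return np.median(np.abs(values - xm))   # median of the absolute deviations

data = np.array([1, 4, 4, 7, 12, 13, 16, 19, 22, 24])

# Same data, but with the last value replaced by an extreme outlier (made up)
with_outlier = np.array([1, 4, 4, 7, 12, 13, 16, 19, 22, 2400])

mad_clean = raw_mad(data)
mad_outlier = raw_mad(with_outlier)
std_clean = np.std(data)
std_outlier = np.std(with_outlier)

print(mad_clean, mad_outlier)    # 7.5 7.5
print(std_clean, std_outlier)    # the standard deviation grows sharply
```

In this sketch the MAD stays at 7.5 even after the outlier is introduced, while the standard deviation increases substantially, which is the robustness property described above.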

The following examples show how to calculate the median absolute deviation in Python by using the mad function from statsmodels.

### Example 1: Calculate MAD for an Array

The following code shows how to calculate the median absolute deviation for a single NumPy array in Python:

```
import numpy as np
from statsmodels import robust

#define data
data = np.array([1, 4, 4, 7, 12, 13, 16, 19, 22, 24])

#calculate MAD
robust.mad(data)

11.1195
```

The median absolute deviation for the dataset turns out to be 11.1195.

It's important to note that, by default, the mad function divides the median absolute deviation by a constant of roughly 0.6745. This turns the result into a robust estimate of the standard deviation under the assumption of a normal distribution.

To avoid using this scaling factor, simply set c = 1 as follows:

```
#calculate MAD without scaling factor
robust.mad(data, c=1)

7.5
```
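The two results above are linked by the scaling constant. The sketch below reproduces both numbers without statsmodels, assuming the default constant is the 0.75 quantile of the standard normal distribution (obtained here from Python's built-in statistics module).

```python
from statistics import NormalDist
import numpy as np

data = np.array([1, 4, 4, 7, 12, 13, 16, 19, 22, 24])

# 0.75 quantile of the standard normal distribution (assumed default for c)
c = NormalDist().inv_cdf(0.75)

# Unscaled MAD: median of absolute deviations from the median
raw = np.median(np.abs(data - np.median(data)))

# Scaled MAD: the unscaled value divided by c
scaled = raw / c

print(round(c, 4))        # 0.6745
print(raw)                # 7.5
print(round(scaled, 4))   # 11.1195
```

Dividing by c rather than multiplying is why the scaled value (11.1195) is larger than the unscaled one (7.5).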

### Example 2: Calculate MAD for a DataFrame

The following code shows how to calculate MAD for a single column in a pandas DataFrame:

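```
import numpy as np
import pandas as pd
from statsmodels import robust

#make this example reproducible
np.random.seed(1)

#create pandas DataFrame
data = pd.DataFrame(np.random.randint(0, 10, size=(5, 3)), columns=['A', 'B', 'C'])

#view DataFrame
data

   A  B  C
0  5  8  9
1  5  0  0
2  1  7  6
3  9  2  4
4  5  2  4

#calculate MAD for column B
data[['B']].apply(robust.mad)

B    2.965204
dtype: float64
```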

The median absolute deviation for column B turns out to be 2.965204.
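As a cross-check, the same value can be reproduced with pandas methods alone. This is a minimal sketch: the DataFrame values are typed in directly from the output printed above rather than regenerated from the random seed, and the names df, raw_mad_b and scaled_mad_b are illustrative.

```python
from statistics import NormalDist
import pandas as pd

# Same values as the DataFrame printed above, entered by hand
df = pd.DataFrame({'A': [5, 5, 1, 9, 5],
                   'B': [8, 0, 7, 2, 2],
                   'C': [9, 0, 6, 4, 4]})

col = df['B']

# Unscaled MAD: median of absolute deviations from the column median
raw_mad_b = (col - col.median()).abs().median()

# Apply the same normal-consistency constant (roughly 0.6745) as before
scaled_mad_b = raw_mad_b / NormalDist().inv_cdf(0.75)

print(raw_mad_b)                 # 2.0
print(round(scaled_mad_b, 6))    # 2.965204
```

Note that the older pandas mad() method computed the mean absolute deviation rather than the median version, and it was removed in pandas 2.0, so it is not a substitute for the calculation shown here.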

We can use similar syntax to calculate MAD for multiple columns in the pandas DataFrame:

```#calculate MAD for all columns