When we'd like to test whether or not a single variable is normally distributed, we can create a Q-Q plot to visualize the distribution or we can perform a formal statistical test like an Anderson-Darling Test or a Jarque-Bera Test.
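As a point of comparison for the single-variable case, here is a minimal sketch using SciPy. The sample `x`, the seed, and the sample size are assumptions chosen for illustration. `scipy.stats.probplot` returns the points a Q-Q plot would draw (no plotting library is needed to compute them), and `scipy.stats.jarque_bera` performs the Jarque-Bera test.

```python
import numpy as np
from scipy import stats

# Hypothetical sample: 200 draws from a standard normal distribution
rng = np.random.default_rng(0)
x = rng.normal(size=200)

# Q-Q plot data: theoretical normal quantiles vs. the ordered sample values,
# plus the least-squares line through them and its correlation coefficient r
(theoretical_q, ordered_x), (slope, intercept, r) = stats.probplot(x, dist="norm")
print(f"Q-Q correlation r = {r:.4f}")

# Jarque-Bera test: the null hypothesis is that the sample has the
# skewness and kurtosis of a normal distribution
jb_stat, jb_pvalue = stats.jarque_bera(x)
print(f"Jarque-Bera statistic = {jb_stat:.4f}, p-value = {jb_pvalue:.4f}")
```

A Q-Q correlation close to 1 and a Jarque-Bera p-value above the chosen significance level are both consistent with normality for that one variable.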

However, when we'd like to test whether or not *several* variables are normally distributed as a group, we must perform a **multivariate normality test**.

This tutorial explains how to perform the Henze-Zirkler multivariate normality test for a given dataset in Python.

**Related:** If we'd like to identify outliers in a multivariate setting, we can use the Mahalanobis distance.
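The following is a rough sketch of that idea, not a full treatment. The data, the planted outlier in row 0, and the 0.999 chi-square cutoff are all assumptions made for illustration. It computes the squared Mahalanobis distance of each row from the sample mean and flags rows whose distance exceeds a chi-square quantile with degrees of freedom equal to the number of variables.

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical dataset: 50 observations of 3 variables
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X[0] = [6.0, -6.0, 6.0]  # planted outlier, for illustration only

# Squared Mahalanobis distance of each row from the sample mean
center = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
diff = X - center
d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

# Flag rows beyond the 0.999 quantile of a chi-square distribution
# with p = 3 degrees of freedom (the cutoff level is an assumption)
cutoff = chi2.ppf(0.999, df=X.shape[1])
outliers = np.flatnonzero(d2 > cutoff)
print("Flagged rows:", outliers)
```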

**Example: Henze-Zirkler Multivariate Normality Test in Python**

The **Henze-Zirkler Multivariate Normality Test** determines whether or not a group of variables follows a multivariate normal distribution. The null and alternative hypotheses for the test are as follows:

$H_{0}$ (null): The variables follow a multivariate normal distribution.

$H_{a}$ (alternative): The variables *do not* follow a multivariate normal distribution.

To perform this test in Python we can use the multivariate_normality() function from the pingouin library.

First, we need to install pingouin:

pip install pingouin

Next, we can import the **multivariate_normality()** function and use it to perform a Multivariate Test for Normality for a given dataset:

    #import necessary packages
    from pingouin import multivariate_normality
    import pandas as pd
    import numpy as np

    #create a dataset with three variables x1, x2, and x3
    df = pd.DataFrame({'x1': np.random.normal(size=50),
                       'x2': np.random.normal(size=50),
                       'x3': np.random.normal(size=50)})

    #perform the Henze-Zirkler Multivariate Normality Test
    multivariate_normality(df, alpha=.05)

    HZResults(hz=0.5956866563391165, pval=0.6461804077893423, normal=True)

The results of the test are as follows (the data are generated randomly without a seed, so your exact values will differ):

**H-Z Test Statistic:** 0.59569

**p-value:** 0.64618

Since the p-value of the test is not less than our specified alpha value of .05, we fail to reject the null hypothesis. We do not have sufficient evidence to say that the dataset departs from a multivariate normal distribution, so it can be treated as multivariate normal.

**Related:** Learn how the Henze-Zirkler test is used in real-life medical applications in this research paper.