
The **Shapiro-Wilk test** is a test of normality. It is used to determine whether or not a sample comes from a normal distribution.

To perform a Shapiro-Wilk test in Python, we can use the **scipy.stats.shapiro()** function, which uses the following syntax:

**scipy.stats.shapiro(x)**

where:

**x:** An array of sample data.

This function returns a test statistic and a corresponding p-value.

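The null hypothesis of the test is that the sample comes from a normal distribution. If the p-value is below the chosen significance level (commonly .05), we reject the null hypothesis: we have sufficient evidence to say that the sample data does not come from a normal distribution.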
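The decision rule described above can be sketched as a small helper. This is a minimal illustration, not part of SciPy: the function name `shapiro_decision`, the `alpha` parameter, and the hand-written sample are all assumptions made for the example. The only library call it relies on is `scipy.stats.shapiro()`, whose result can be unpacked into a statistic and a p-value.

```python
import numpy as np
from scipy.stats import shapiro


def shapiro_decision(sample, alpha=0.05):
    """Run a Shapiro-Wilk test and report whether normality is rejected.

    `shapiro_decision` and `alpha` are illustrative names, not SciPy API.
    """
    # The result unpacks like a 2-tuple: (test statistic, p-value)
    statistic, p_value = shapiro(sample)
    # Reject the null hypothesis of normality when p is below alpha
    reject_normality = bool(p_value < alpha)
    return statistic, p_value, reject_normality


# Hypothetical hand-written sample, used only to show the call
sample = np.array([2.1, 2.4, 1.9, 2.8, 2.2, 2.5, 2.0, 2.6, 2.3, 2.7])
statistic, p_value, reject = shapiro_decision(sample)
print(f"W = {statistic:.4f}, p = {p_value:.4f}, reject normality: {reject}")
```

Returning the statistic and p-value alongside the decision keeps the raw numbers available, so a different significance level can be applied later without rerunning the test.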

This tutorial shows a couple of examples of how to use this function in practice.

**Example 1: Shapiro-Wilk Test on Normally Distributed Data**

Suppose we have the following sample data:

    from numpy.random import seed
    from numpy.random import randn

    #set seed (i.e. make this example reproducible)
    seed(0)

    #generate dataset of 100 random values that follow a standard normal distribution
    data = randn(100)

The following code shows how to perform a Shapiro-Wilk test on this sample of 100 data values to determine if it came from a normal distribution:

    from scipy.stats import shapiro

    #perform Shapiro-Wilk test
    shapiro(data)

    ShapiroResult(statistic=0.9926937818527222, pvalue=0.8689165711402893)

From the output we can see that the test statistic is **0.9927** and the corresponding p-value is **0.8689**.

Since the p-value is not less than .05, we fail to reject the null hypothesis. We do not have sufficient evidence to say that the sample data does not come from a normal distribution.

This result shouldn't be surprising since we generated the sample data using the **randn()** function, which generates random values that follow a standard normal distribution.

**Example 2: Shapiro-Wilk Test on Non-Normally Distributed Data**

Now suppose we have the following sample data:

    from numpy.random import seed
    from numpy.random import poisson

    #set seed (i.e. make this example reproducible)
    seed(0)

    #generate dataset of 100 values that follow a Poisson distribution with mean=5
    data = poisson(5, 100)

The following code shows how to perform a Shapiro-Wilk test on this sample of 100 data values to determine if it came from a normal distribution:

    from scipy.stats import shapiro

    #perform Shapiro-Wilk test
    shapiro(data)

    ShapiroResult(statistic=0.9581913948059082, pvalue=0.002994443289935589)

From the output we can see that the test statistic is **0.9582** and the corresponding p-value is **0.00299**.

Since the p-value is less than .05, we reject the null hypothesis. We have sufficient evidence to say that the sample data does not come from a normal distribution.

This result also shouldn't be surprising since we generated the sample data using the **poisson()** function, which generates random values that follow a Poisson distribution.
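For a side-by-side view, the two examples can be combined into one short script. This is only a sketch: the names `ALPHA`, `samples`, and `results` are introduced here for illustration, and the seed is reset before each draw so that the data should match the samples used in the examples. Exact digits may vary slightly between SciPy versions.

```python
from numpy.random import seed, randn, poisson
from scipy.stats import shapiro

ALPHA = 0.05  # significance level used in both examples

# Recreate both samples, resetting the seed before each draw
seed(0)
normal_data = randn(100)
seed(0)
poisson_data = poisson(5, 100)

samples = [("normal", normal_data), ("poisson", poisson_data)]

results = {}
for label, data in samples:
    statistic, p_value = shapiro(data)
    reject = bool(p_value < ALPHA)  # True means normality is rejected
    results[label] = (statistic, p_value, reject)
    print(f"{label}: W = {statistic:.4f}, p = {p_value:.4f}, reject = {reject}")
```

Running the same test on both samples in a loop makes the contrast easy to see: the test fails to reject normality for the sample drawn with **randn()** and rejects it for the sample drawn with **poisson()**.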

**Additional Resources**

The following tutorials explain how to perform normality tests in R and Python:

How to Perform a Shapiro-Wilk Test in R

How to Perform an Anderson-Darling Test in Python

How to Perform a Kolmogorov-Smirnov Test in Python