
The **Kolmogorov-Smirnov test** is used to test whether or not a sample comes from a certain distribution, or whether or not two samples come from the same distribution.

To perform a one-sample or two-sample Kolmogorov-Smirnov test in R we can use the ks.test() function.

This tutorial shows examples of how to use this function in practice.

**Example 1: One Sample Kolmogorov-Smirnov Test**

Suppose we have the following sample data:

    #make this example reproducible
    set.seed(0)

    #generate dataset of 20 values that follow a Poisson distribution with mean=5
    data <- rpois(n=20, lambda=5)

**Related:** A Guide to dpois, ppois, qpois, and rpois in R

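The following code shows how to perform a Kolmogorov-Smirnov test on this sample of 20 data values to determine if it came from a normal distribution. Because "pnorm" is passed without a mean or standard deviation, the sample is compared against the standard normal distribution (mean 0, standard deviation 1):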

    #perform Kolmogorov-Smirnov test
    ks.test(data, "pnorm")

        One-sample Kolmogorov-Smirnov test

    data:  data
    D = 0.97725, p-value < 2.2e-16
    alternative hypothesis: two-sided

From the output we can see that the test statistic is **0.97725** and the corresponding p-value is less than **2.2e-16**. Since the p-value is less than .05, we reject the null hypothesis. We have sufficient evidence to say that the sample data does not come from the standard normal distribution.
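To make the test statistic less of a black box, here is a minimal sketch that recomputes D by hand. It assumes the sample is generated as in Example 1 and uses the usual definition of the one-sample statistic: the largest vertical gap between the empirical CDF of the sample and the hypothesized CDF. The variable names (`x`, `Fx`, `D_manual`) are chosen for illustration only.

```r
# Recompute the one-sample KS statistic by hand (illustrative sketch)
set.seed(0)
x <- rpois(n = 20, lambda = 5)   # sample generated as in Example 1
n <- length(x)

# Hypothesized CDF (standard normal) evaluated at the sorted sample values
Fx <- pnorm(sort(x))

# Largest gap between the empirical CDF and the hypothesized CDF,
# measured at the top and at the bottom of each step of the empirical CDF
d_plus   <- max((1:n) / n - Fx)
d_minus  <- max(Fx - (0:(n - 1)) / n)
D_manual <- max(d_plus, d_minus)

# ks.test() may warn about ties here because Poisson data are discrete
res <- suppressWarnings(ks.test(x, "pnorm"))

# The two values should agree
D_manual
unname(res$statistic)
```

A value of D close to 1 means that, at some point, almost all of the probability under one CDF lies on the opposite side from almost all of the sample.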

This result shouldn't be surprising since we generated the sample data using the **rpois()** function, which generates random values that follow a Poisson distribution.
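The call in Example 1 names "pnorm" without parameters, so the comparison is against the standard normal. Additional named arguments to ks.test() are passed on to the distribution function, which allows a different normal distribution to be tested. The sketch below uses a mean of 5 and a standard deviation of sqrt(5); these values are an assumption for illustration (they are the mean and standard deviation of a Poisson distribution with lambda=5) and do not appear in the original example.

```r
# Compare the same sample against two different normal distributions (sketch)
set.seed(0)
x <- rpois(n = 20, lambda = 5)

# No parameters: compare with the standard normal, N(0, 1)
res_std <- suppressWarnings(ks.test(x, "pnorm"))

# Assumed parameters for illustration: mean 5 and sd sqrt(5),
# the mean and standard deviation of a Poisson(5) distribution
res_alt <- suppressWarnings(ks.test(x, "pnorm", mean = 5, sd = sqrt(5)))

# Inspect the two test statistics side by side
res_std$statistic
res_alt$statistic
```

Keep in mind that the Kolmogorov-Smirnov test is designed for continuous distributions. Discrete data such as Poisson counts contain ties, so the reported p-values should be read as approximate.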

**Example 2: Two Sample Kolmogorov-Smirnov Test**

Suppose we have the following two sample datasets:

    #make this example reproducible
    set.seed(0)

    #generate two datasets
    data1 <- rpois(n=20, lambda=5)
    data2 <- rnorm(100)

The following code shows how to perform a Kolmogorov-Smirnov test on these two samples to determine if they came from the same distribution:

    #perform Kolmogorov-Smirnov test
    ks.test(data1, data2)

        Two-sample Kolmogorov-Smirnov test

    data:  data1 and data2
    D = 0.99, p-value = 1.299e-14
    alternative hypothesis: two-sided

From the output we can see that the test statistic is **0.99** and the corresponding p-value is **1.299e-14**. Since the p-value is less than .05, we reject the null hypothesis. We have sufficient evidence to say that the two sample datasets do not come from the same distribution.

This result also shouldn't be surprising since we generated values for the first sample using the Poisson distribution and values for the second sample using the normal distribution.
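For the two-sample case, the statistic D is the largest vertical gap between the two empirical CDFs. The sketch below recomputes it with ecdf() and compares it with the value reported by ks.test(). It assumes the first sample is 20 Poisson draws with lambda=5 and the second is 100 standard normal draws; the helper names (`F1`, `F2`, `pooled`, `D_manual`) are illustrative.

```r
# Recompute the two-sample KS statistic from the empirical CDFs (sketch)
set.seed(0)
data1 <- rpois(n = 20, lambda = 5)   # assumed: 20 Poisson(5) values
data2 <- rnorm(100)                  # assumed: 100 standard normal values

# Empirical CDF of each sample, returned as a function
F1 <- ecdf(data1)
F2 <- ecdf(data2)

# Both empirical CDFs only change at observed values, so it is enough
# to compare them at every point of the pooled sample
pooled   <- c(data1, data2)
D_manual <- max(abs(F1(pooled) - F2(pooled)))

# ks.test() may warn about ties because data1 is discrete
res <- suppressWarnings(ks.test(data1, data2))

# The two values should agree
D_manual
unname(res$statistic)
```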

**Additional Resources**

How to Perform a Shapiro-Wilk Test in R

How to Perform an Anderson-Darling Test in R

How to Perform Multivariate Normality Tests in R