Statistics in Python

Allwin Raju
5 min read · May 17, 2021

Python ships with a standard-library module called statistics. It provides functions for calculating mathematical statistics of numeric (real-valued) data.

Unless explicitly noted, these functions support int, float, Decimal and Fraction.

Some of the most commonly used functions in this module are:

  1. mean()
  2. fmean()
  3. geometric_mean()
  4. median()
  5. median_low()
  6. median_high()
  7. mode()
  8. multimode()

Let us look at the above methods one by one with an example for each.

1. mean()

The first method is the mean(). This method returns the sample arithmetic mean of data which can be a sequence or iterable.

The arithmetic mean is the sum of the data divided by the number of data points. To find the average without this module, we have to do the work ourselves: sum the elements of the iterable, count them, and divide the sum by the count.

The following code demonstrates how to find the mean without the statistics module.
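
A minimal sketch of the manual approach. The list name `numbers` and its values are assumptions chosen for illustration:

```python
# Illustrative sample data (an assumption for this sketch).
numbers = [1, 2, 3, 4, 5]

total = sum(numbers)      # sum of the elements
count = len(numbers)      # number of elements
average = total / count   # true division always yields a float

print(average)  # 3.0
```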

finding the average in python

Let us see how to find the average using the mean() method from the statistics module. We have to import the mean() method from the statistics module and pass the iterable to this method.
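
As a sketch, the same calculation with mean() might look like this. The sample values are again assumptions, not data from the article:

```python
from statistics import mean

# Illustrative sample data (an assumption for this sketch).
numbers = [1, 2, 3, 4, 5]

result = mean(numbers)
print(result)  # 3

# With integer input whose mean is not a whole number, a float comes back.
other = mean([1, 2, 3, 4])
print(other)  # 2.5
```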

finding average using the mean() method

Note that the output is an integer, because the input values are integers and their mean is a whole number. If the mean is not a whole number, the output is a float.

2. fmean()

This method converts the data to floats and then computes the mean. It runs faster than mean() and always returns a float. The data may be a sequence or iterable. If the input is empty, a StatisticsError is raised.

This method is available in Python 3.8+.
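
A short sketch of fmean(), reusing the same illustrative list as an assumption, and also showing the error raised for empty input:

```python
from statistics import StatisticsError, fmean

# Illustrative sample data (an assumption for this sketch).
numbers = [1, 2, 3, 4, 5]

result = fmean(numbers)
print(result)  # 3.0

# Empty input raises StatisticsError rather than returning a value.
try:
    fmean([])
    raised = False
except StatisticsError:
    raised = True
print(raised)  # True
```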

fmean() in python

Note that the same example from above yields a float value with the fmean() method.

3. geometric_mean()

This method will convert data to floats and compute the geometric mean.

In mathematics, the geometric mean is a mean or average, which indicates the central tendency or typical value of a set of numbers by using the product of their values (as opposed to the arithmetic mean which uses their sum).

The calculation is easy to follow with simple numbers such as 2 and 8. Their product is 16, and its square root (the ½ power, since there are only 2 numbers) is 4, so the geometric mean of 2 and 8 is 4.
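
The 2-and-8 example can be sketched as follows. Because the result is computed in floating point, the sketch rounds it and compares it with a tolerance instead of assuming an exact 4.0:

```python
from math import isclose, sqrt
from statistics import geometric_mean

values = [2, 8]

result = geometric_mean(values)
print(round(result, 6))  # 4.0

# Cross-check against the hand calculation: square root of the product.
by_hand = sqrt(2 * 8)
print(isclose(result, by_hand))  # True
```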

geometric mean in python

If data is empty, StatisticsError will be raised.

4. median()

The median() method will return the median (middle value) of numeric data, using the common “mean of the middle two” method.

When the number of values is odd, the middle value is returned.
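
A minimal sketch of the odd case, with sample values assumed for illustration:

```python
from statistics import median

# Five values (odd count); illustrative data.
odd_data = [1, 3, 5, 7, 9]

odd_median = median(odd_data)
print(odd_median)  # 5
```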

The median() method with an odd number of values

When the number of values is even, the median is interpolated by taking the average of the two middle values.
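
A minimal sketch of the even case, with sample values assumed for illustration:

```python
from statistics import median

# Four values (even count); illustrative data.
even_data = [1, 3, 5, 7]

# The two middle values are 3 and 5, so the result is their average.
even_median = median(even_data)
print(even_median)  # 4.0
```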

the median() method with an even number of values

If data is empty, StatisticsError is raised. data can be a sequence or iterable.

5. median_low()

For data with an odd number of values, median_low() behaves exactly like median(): the middle element is returned as usual.

But if the number of values is even, then instead of averaging the two middle values it returns the smaller of the two.
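
A sketch of median_low() on an even-length list. The data are an assumption, chosen so that the two middle values are 3 and 4:

```python
from statistics import median, median_low

# Six values (even count); illustrative data.
data = [1, 2, 3, 4, 5, 6]

low = median_low(data)
print(low)           # 3
print(median(data))  # 3.5, for comparison
```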

median low in python

The two middle values, 3 and 4, are considered, and 3, the smaller of the two, is returned.

6. median_high()

This method is similar to median_low(), but instead of returning the smaller of the two middle values, it returns the larger.
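
A sketch of median_high() on an illustrative even-length list (the data are an assumption):

```python
from statistics import median_high

# Six values (even count); illustrative data.
data = [1, 2, 3, 4, 5, 6]

# The two middle values are 3 and 4; the larger one is returned.
high = median_high(data)
print(high)  # 4
```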

median high in python

7. mode()

The mode() method will return the most frequent data point from discrete or nominal data. The mode (when it exists) is the most typical value and serves as a measure of central location.

If there are multiple modes with the same frequency, mode() returns the first one encountered in the data (this is the behaviour in Python 3.8+; earlier versions raise StatisticsError in that case). If the input data is empty, StatisticsError is raised.
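
A minimal sketch of mode() on numeric data. The list is an assumption in which 3 occurs most often:

```python
from statistics import mode

# Illustrative data: 3 appears three times, every other value once.
data = [1, 2, 3, 3, 3, 4]

most_common = mode(data)
print(most_common)  # 3
```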

mode method in statistics

3 is the most frequent element from the above list. The mode method also supports non-numerical data such as strings.
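
For strings, a sketch could look like this, with an illustrative list in which 'a' is the most frequent item:

```python
from statistics import mode

# Illustrative data: 'a' appears three times.
letters = ["a", "b", "a", "c", "a"]

most_common_letter = mode(letters)
print(most_common_letter)  # a
```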

mode with a list of strings

‘a’ is the most frequent element in the above list.

8. multimode()

Unlike mode(), if several elements share the highest frequency, multimode() returns all of them as a list.
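
A sketch of multimode(), assuming a sample list in which 1 and 2 each occur twice:

```python
from statistics import multimode

# Illustrative data: 1 and 2 both appear twice, 3 appears once.
data = [1, 1, 2, 2, 3]

modes = multimode(data)
print(modes)  # [1, 2]
```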

multimode in python

The modes appear in the list in the order in which they are first encountered in the data. In the above example, 1 and 2 both have a frequency of 2, so both are returned in the list.

This method also works with a list of strings.
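
The string case can be sketched the same way, again with assumed sample data:

```python
from statistics import multimode

# Illustrative data: 'a' and 'b' both appear twice, 'c' once.
letters = ["a", "a", "b", "b", "c"]

letter_modes = multimode(letters)
print(letter_modes)  # ['a', 'b']
```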

multimode with a list of strings

Conclusion

Hope this article is helpful. Happy coding!
