Introduction

Measures of central tendency are statistical metrics that describe the center point or typical value of a dataset. Each one summarizes the entire distribution with a single value. The three most common measures are the mean, median, and mode.

Key Concepts

  1. Mean

The mean, often referred to as the average, is calculated by summing all the values in a dataset and then dividing by the number of values.

Formula: \[ \text{Mean} (\mu) = \frac{\sum_{i=1}^{n} x_i}{n} \]

Example: Consider the dataset: 4, 8, 6, 5, 3, 7 \[ \text{Mean} = \frac{4 + 8 + 6 + 5 + 3 + 7}{6} = \frac{33}{6} = 5.5 \]
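The formula maps directly onto code. The following is a minimal Python sketch; the helper name `mean` is chosen here for illustration and is not part of the original material.

```python
def mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    if not values:
        # The mean is undefined for an empty dataset (division by zero).
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


data = [4, 8, 6, 5, 3, 7]
result = mean(data)
print(result)  # 5.5
```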

  2. Median

The median is the middle value of a dataset when it is ordered in ascending or descending order. If the dataset has an even number of observations, the median is the average of the two middle numbers.

Steps to Calculate Median:

  1. Arrange the data in ascending order.
  2. If the number of observations (n) is odd, the median is the middle number.
  3. If n is even, the median is the average of the two middle numbers.

Example: Consider the dataset: 4, 8, 6, 5, 3, 7

  1. Arrange in ascending order: 3, 4, 5, 6, 7, 8
  2. Number of observations (n) = 6 (even)
  3. Median = \(\frac{5 + 6}{2} = 5.5\)
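The same steps can be sketched in Python. The function name `median` is an illustrative choice, and the sketch assumes numeric input.

```python
def median(values):
    """Return the median by sorting and picking the middle value(s)."""
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)      # Step 1: arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                # Step 2: odd n, take the middle number
        return ordered[mid]
    # Step 3: even n, average the two middle numbers
    return (ordered[mid - 1] + ordered[mid]) / 2


result = median([4, 8, 6, 5, 3, 7])
print(result)  # 5.5
```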

  3. Mode

The mode is the value that appears most frequently in a dataset. A dataset may have one mode, more than one mode, or no mode at all.

Example: Consider the dataset: 4, 8, 6, 5, 3, 7, 8

  • Mode = 8 (since 8 appears twice, more frequently than any other number)
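Because a dataset can have one mode, several, or none, a function that returns a list covers all three cases. This is a minimal Python sketch using `collections.Counter`; the name `modes` and the choice to return an empty list when every value is unique are assumptions made here to match the convention used in this section.

```python
from collections import Counter


def modes(values):
    """Return all most-frequent values, or [] if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Every value appears once: treated as "no mode" in this section.
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([4, 8, 6, 5, 3, 7, 8]))  # [8]
print(modes([1, 1, 2, 2, 3]))        # [1, 2]
print(modes([10, 12, 14]))           # []
```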

Comparison of Measures

| Measure | Calculation Method | Best Used When | Example |
|---------|--------------------|----------------|---------|
| Mean | Sum of all values divided by the number of values | Data is symmetrically distributed without outliers | Average test scores |
| Median | Middle value of ordered data | Data has outliers or is skewed | Median household income |
| Mode | Most frequently occurring value | Data has repeating values | Most common shoe size in a store |

Practical Examples

Example 1: Calculating Mean, Median, and Mode

Consider the following dataset representing the number of books read by students in a month: 2, 3, 3, 5, 7, 8, 9

Mean: \[ \text{Mean} = \frac{2 + 3 + 3 + 5 + 7 + 8 + 9}{7} = \frac{37}{7} \approx 5.29 \]

Median:

  • Ordered dataset: 2, 3, 3, 5, 7, 8, 9
  • Number of observations (n) = 7 (odd)
  • Median = 5 (middle value)

Mode:

  • Mode = 3 (appears twice)
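The hand calculations in Example 1 can be cross-checked with Python's standard `statistics` module. The variable names below are illustrative, and `multimode` requires Python 3.8 or later.

```python
import statistics

books_read = [2, 3, 3, 5, 7, 8, 9]

mean_value = statistics.mean(books_read)
median_value = statistics.median(books_read)
mode_values = statistics.multimode(books_read)  # list of all most-common values

print(round(mean_value, 2))  # 5.29
print(median_value)          # 5
print(mode_values)           # [3]
```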

Example 2: Handling Outliers

Consider the dataset: 10, 12, 14, 16, 18, 100

Mean: \[ \text{Mean} = \frac{10 + 12 + 14 + 16 + 18 + 100}{6} = \frac{170}{6} \approx 28.33 \]

Median:

  • Ordered dataset: 10, 12, 14, 16, 18, 100
  • Number of observations (n) = 6 (even)
  • Median = \(\frac{14 + 16}{2} = 15\)

Mode:

  • No mode (all values are unique)

In this case, the median (15) represents the data better than the mean (28.33): the single outlier (100) pulls the mean above five of the six values.
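One way to see this sensitivity is to compute both measures with and without the outlier. The sketch below uses the standard `statistics` module on the dataset from Example 2; treating 100 as the value to drop is an assumption made purely for illustration.

```python
import statistics

values = [10, 12, 14, 16, 18, 100]
without_outlier = [v for v in values if v != 100]  # drop the outlier for comparison

mean_with = statistics.mean(values)
mean_without = statistics.mean(without_outlier)
median_with = statistics.median(values)
median_without = statistics.median(without_outlier)

# How far each measure moves when the outlier is included
mean_shift = mean_with - mean_without
median_shift = median_with - median_without

print(round(mean_shift, 2))  # 14.33
print(median_shift)          # 1.0
```

The outlier moves the mean by more than 14 units but the median by only 1, which is why the median is described as robust to extreme values.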

Exercises

Exercise 1: Calculate Mean, Median, and Mode

Given the dataset: 5, 7, 8, 5, 10, 12, 5, 8

  1. Calculate the mean.
  2. Determine the median.
  3. Identify the mode.

Solution:

  1. Mean: \[ \text{Mean} = \frac{5 + 7 + 8 + 5 + 10 + 12 + 5 + 8}{8} = \frac{60}{8} = 7.5 \]

  2. Median:

  • Ordered dataset: 5, 5, 5, 7, 8, 8, 10, 12
  • Number of observations (n) = 8 (even)
  • Median = \(\frac{7 + 8}{2} = 7.5\)
  3. Mode:
  • Mode = 5 (appears three times)

Exercise 2: Impact of Outliers

Given the dataset: 1, 2, 2, 3, 4, 100

  1. Calculate the mean.
  2. Determine the median.
  3. Identify the mode.

Solution:

  1. Mean: \[ \text{Mean} = \frac{1 + 2 + 2 + 3 + 4 + 100}{6} = \frac{112}{6} \approx 18.67 \]

  2. Median:

  • Ordered dataset: 1, 2, 2, 3, 4, 100
  • Number of observations (n) = 6 (even)
  • Median = \(\frac{2 + 3}{2} = 2.5\)
  3. Mode:
  • Mode = 2 (appears twice)

Conclusion

In this section, we covered the three primary measures of central tendency: mean, median, and mode. Each measure provides a different perspective on the central value of a dataset and is useful in different scenarios. Understanding these measures is fundamental for analyzing and summarizing data effectively. In the next section, we will explore measures of dispersion, which describe the spread of data around the central tendency.
