Descriptive statistics summarize the main features of a dataset. MATLAB provides built-in functions that compute them directly. This section covers the following key concepts:

  1. Measures of Central Tendency: Mean, Median, Mode
  2. Measures of Dispersion: Range, Variance, Standard Deviation
  3. Other Descriptive Statistics: Minimum, Maximum, Percentiles, Interquartile Range (IQR)
  4. Practical Examples
  5. Exercises

  1. Measures of Central Tendency

Mean

The mean (average) is the sum of all data points divided by the number of data points.

data = [1, 2, 3, 4, 5];
mean_value = mean(data);
disp(mean_value); % Output: 3
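
The definition above can be checked by hand. The following is a minimal sketch that recomputes the mean from sum and numel and then shows how mean treats a matrix. The matrix A is an illustrative assumption, not part of the original example.

```matlab
data = [1, 2, 3, 4, 5];

% Mean from the definition: sum of the values divided by their count
manual_mean = sum(data) / numel(data);   % 15 / 5 = 3
disp(manual_mean); % Output: 3

% For a matrix, mean works column by column unless a dimension is given
A = [1, 2; 3, 4];        % illustrative 2-by-2 matrix
col_means = mean(A);     % mean of each column: 2 and 3
row_means = mean(A, 2);  % mean of each row: 1.5 and 3.5 (column vector)
```

Passing the dimension explicitly, as in mean(A, 2), avoids surprises when the input may be either a row vector or a matrix.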

Median

The median is the middle value of the sorted dataset. When the dataset has an even number of values, the median is the mean of the two middle values.

data = [1, 2, 3, 4, 5];
median_value = median(data);
disp(median_value); % Output: 3
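
The example above has an odd number of values, so a single middle value exists. The sketch below covers the even case, where median averages the two middle values. The dataset data_even is an illustrative assumption.

```matlab
data_even = [1, 2, 3, 4];          % illustrative dataset with an even count
median_even = median(data_even);   % mean of the two middle values, 2 and 3
disp(median_even); % Output: 2.5

% The same result from the definition: sort, then average the middle pair
sorted = sort(data_even);
n = numel(sorted);
manual_median = (sorted(n/2) + sorted(n/2 + 1)) / 2;
disp(manual_median); % Output: 2.5
```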

Mode

The mode is the value that appears most frequently in a dataset. If several values tie for the highest frequency, mode returns the smallest of them.

data = [1, 2, 2, 3, 4];
mode_value = mode(data);
disp(mode_value); % Output: 2
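
When two or more values are equally frequent, mode returns only the smallest one. The sketch below uses an illustrative dataset (data_tied, an assumption for this example) to show that behavior, along with the additional outputs of mode that report the frequency and all tied values.

```matlab
data_tied = [3, 3, 1, 1, 2];   % 1 and 3 both appear twice
mode_value = mode(data_tied);  % smallest of the tied values
disp(mode_value); % Output: 1

% Second output: how often the mode occurs
% Third output: a cell array whose first element holds every tied value
[~, freq, all_modes] = mode(data_tied);
disp(freq);          % Output: 2
disp(all_modes{1});  % lists 1 and 3
```

Checking the third output is a simple way to detect a multimodal dataset before reporting a single mode.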

  2. Measures of Dispersion

Range

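The range is the difference between the maximum and minimum values in a dataset. In many MATLAB releases the range function belongs to the Statistics and Machine Learning Toolbox; max(data) - min(data) gives the same result without it.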

data = [1, 2, 3, 4, 5];
range_value = range(data);
disp(range_value); % Output: 4
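
The definition translates directly into max and min, which can be useful if range is not available in a given installation (an assumption about your setup). A minimal sketch:

```matlab
data = [1, 2, 3, 4, 5];

% Range from the definition: largest value minus smallest value
range_manual = max(data) - min(data);   % 5 - 1
disp(range_manual); % Output: 4
```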

Variance

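Variance measures the spread of the data points around the mean. By default, var computes the sample variance, dividing the sum of squared deviations by N - 1 rather than N.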

data = [1, 2, 3, 4, 5];
variance_value = var(data);
disp(variance_value); % Output: 2.5
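
The result 2.5 comes from dividing by N - 1, which is the default in var. The sketch below recomputes the variance from the definition and compares the default with the population form var(data, 1), which divides by N.

```matlab
data = [1, 2, 3, 4, 5];
n = numel(data);
deviations = data - mean(data);              % [-2 -1 0 1 2]

sample_var = sum(deviations.^2) / (n - 1);   % 10 / 4, same as var(data)
population_var = sum(deviations.^2) / n;     % 10 / 5, same as var(data, 1)

disp(sample_var);     % Output: 2.5
disp(population_var); % Output: 2
disp(var(data, 1));   % Output: 2
```

The N - 1 form is the usual choice when the data are a sample from a larger population; the N form applies when the data are the whole population.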

Standard Deviation

The standard deviation is the square root of the variance. It expresses the typical distance of the data points from the mean, in the same units as the data. Like var, std normalizes by N - 1 by default.

data = [1, 2, 3, 4, 5];
std_dev = std(data);
disp(std_dev); % Output: 1.5811
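
The relationship to the variance can be verified directly. This short sketch takes the square root of var(data) and also shows the population form std(data, 1), which normalizes by N instead of N - 1.

```matlab
data = [1, 2, 3, 4, 5];

std_from_var = sqrt(var(data));   % square root of the sample variance 2.5
disp(std_from_var); % Output: 1.5811

std_population = std(data, 1);    % square root of the population variance 2
disp(std_population); % Output: 1.4142
```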

  3. Other Descriptive Statistics

Minimum and Maximum

The minimum and maximum values in a dataset can be found using the min and max functions.

data = [1, 2, 3, 4, 5];
min_value = min(data);
max_value = max(data);
disp([min_value, max_value]); % Output: [1, 5]
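
Both functions can also report where the extreme value sits. The sketch below requests the second output of min and max on an illustrative unsorted dataset (an assumption for this example).

```matlab
data = [7, 3, 9, 1, 5];              % illustrative unsorted dataset

% Second output is the index of the first occurrence of the extreme value
[min_value, min_index] = min(data);  % value 1 at position 4
[max_value, max_index] = max(data);  % value 9 at position 3

disp([min_value, min_index]); % Output: [1, 4]
disp([max_value, max_index]); % Output: [9, 3]
```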

Percentiles

A percentile is the value below which a given percentage of observations fall. prctile interpolates linearly between the sorted data points, so the result need not be a value from the dataset.

data = [1, 2, 3, 4, 5];
percentile_25 = prctile(data, 25);
percentile_75 = prctile(data, 75);
disp([percentile_25, percentile_75]); % Output: [1.75, 4.25]
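
With MATLAB's default method, prctile returns 1.75 and 4.25 for this dataset, whereas other quartile conventions can give 2 and 4. The sketch below reproduces the default by hand, assuming the rule described in the prctile documentation: the k-th sorted value of n is treated as the 100(k - 0.5)/n percentile, with linear interpolation in between. The variable names are illustrative.

```matlab
data = [1, 2, 3, 4, 5];
sorted = sort(data);
n = numel(sorted);

% Percentile assigned to each sorted value: 10, 30, 50, 70, 90 for n = 5
positions = 100 * ((1:n) - 0.5) / n;

% Linear interpolation between those anchor points
p25_manual = interp1(positions, sorted, 25);   % 1 + 0.75 * (2 - 1)
p75_manual = interp1(positions, sorted, 75);   % 4 + 0.25 * (5 - 4)
disp([p25_manual, p75_manual]); % Output: [1.75, 4.25]
```

This sketch only covers percentiles between the first and last anchor points (10 and 90 here); interp1 returns NaN outside that interval, whereas prctile returns the minimum or maximum.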

Interquartile Range (IQR)

The IQR is the difference between the 75th and 25th percentiles and measures the spread of the middle 50% of the data.

data = [1, 2, 3, 4, 5];
iqr_value = iqr(data);
disp(iqr_value); % Output: 2.5
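
Because the IQR is defined from the two percentiles, it can be recomputed from prctile. With MATLAB's default percentile interpolation this dataset gives 4.25 - 1.75 = 2.5; other quartile conventions can give 2. A minimal sketch:

```matlab
data = [1, 2, 3, 4, 5];

quartiles = prctile(data, [25, 75]);        % both percentiles in one call
iqr_manual = quartiles(2) - quartiles(1);   % 4.25 - 1.75
disp(iqr_manual); % Output: 2.5
```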

  4. Practical Examples

Example 1: Descriptive Statistics of a Dataset

data = [1, 2, 2, 3, 4, 5, 5, 5, 6, 7, 8, 9, 10];

mean_value = mean(data);
median_value = median(data);
mode_value = mode(data);
range_value = range(data);
variance_value = var(data);
std_dev = std(data);
min_value = min(data);
max_value = max(data);
percentile_25 = prctile(data, 25);
percentile_75 = prctile(data, 75);
iqr_value = iqr(data);

disp('Descriptive Statistics:');
disp(['Mean: ', num2str(mean_value)]);
disp(['Median: ', num2str(median_value)]);
disp(['Mode: ', num2str(mode_value)]);
disp(['Range: ', num2str(range_value)]);
disp(['Variance: ', num2str(variance_value)]);
disp(['Standard Deviation: ', num2str(std_dev)]);
disp(['Minimum: ', num2str(min_value)]);
disp(['Maximum: ', num2str(max_value)]);
disp(['25th Percentile: ', num2str(percentile_25)]);
disp(['75th Percentile: ', num2str(percentile_75)]);
disp(['IQR: ', num2str(iqr_value)]);

  5. Exercises

Exercise 1: Basic Descriptive Statistics

Given the dataset [4, 8, 6, 5, 3, 7, 9, 2], calculate the mean, median, mode, range, variance, and standard deviation.

Solution:

data = [4, 8, 6, 5, 3, 7, 9, 2];

mean_value = mean(data);
median_value = median(data);
mode_value = mode(data); % every value is unique, so mode returns the smallest (2)
range_value = range(data);
variance_value = var(data);
std_dev = std(data);

disp('Descriptive Statistics:');
disp(['Mean: ', num2str(mean_value)]);
disp(['Median: ', num2str(median_value)]);
disp(['Mode: ', num2str(mode_value)]);
disp(['Range: ', num2str(range_value)]);
disp(['Variance: ', num2str(variance_value)]);
disp(['Standard Deviation: ', num2str(std_dev)]);
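
To check the solution, the expected values can be worked out by hand and compared with the built-in functions. The sketch below does that; the variable names and the tolerance tol are illustrative choices.

```matlab
data = [4, 8, 6, 5, 3, 7, 9, 2];

% Expected values worked out by hand (sorted data: 2 3 4 5 6 7 8 9)
expected_mean = 44 / 8;          % 5.5
expected_median = (5 + 6) / 2;   % 5.5, mean of the two middle values
expected_mode = 2;               % all values unique, smallest is returned
expected_range = 9 - 2;          % 7
expected_variance = 42 / 7;      % 6, sum of squared deviations over N - 1
expected_std = sqrt(6);          % about 2.4495

tol = 1e-12;                     % tolerance for floating-point comparison
assert(abs(mean(data) - expected_mean) < tol);
assert(abs(median(data) - expected_median) < tol);
assert(mode(data) == expected_mode);
assert(max(data) - min(data) == expected_range);
assert(abs(var(data) - expected_variance) < tol);
assert(abs(std(data) - expected_std) < tol);
disp('All Exercise 1 checks passed.');
```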

Exercise 2: Percentiles and IQR

Given the dataset [15, 20, 35, 40, 50], calculate the 25th percentile, 75th percentile, and the interquartile range (IQR).

Solution:

data = [15, 20, 35, 40, 50];

percentile_25 = prctile(data, 25);
percentile_75 = prctile(data, 75);
iqr_value = iqr(data);

disp('Percentiles and IQR:');
disp(['25th Percentile: ', num2str(percentile_25)]);
disp(['75th Percentile: ', num2str(percentile_75)]);
disp(['IQR: ', num2str(iqr_value)]);
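
The expected answers can be derived by hand under MATLAB's default percentile rule, in which the k-th sorted value of n is treated as the 100(k - 0.5)/n percentile. The sketch below applies that rule with interp1; it is a hand check under that assumption rather than a replacement for prctile and iqr.

```matlab
data = [15, 20, 35, 40, 50];
sorted = sort(data);
n = numel(sorted);
positions = 100 * ((1:n) - 0.5) / n;     % 10, 30, 50, 70, 90

p25 = interp1(positions, sorted, 25);    % 15 + 0.75 * (20 - 15) = 18.75
p75 = interp1(positions, sorted, 75);    % 40 + 0.25 * (50 - 40) = 42.5
iqr_manual = p75 - p25;                  % 42.5 - 18.75 = 23.75

disp([p25, p75, iqr_manual]); % Output: [18.75, 42.5, 23.75]
```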

Conclusion

In this section, we covered the fundamental descriptive statistics in MATLAB: measures of central tendency, measures of dispersion, and related summaries such as the minimum, maximum, percentiles, and IQR. The practical example and exercises reinforce these concepts, which are the basis for the more advanced statistical techniques in the upcoming sections.
