Statistics Instruction Manual

Calculate and Display Statistics

THE DATA SET

To calculate and display statistics, you first need a data set: the values that the statistics describe. In this case, the data set is the number of shoes that each student in the first-period Common Core class has.


Data Set: 5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12


Some calculations in statistics, such as the median and the quartiles, require the data set to be in order from least to greatest.


Ordered Data Set: 2, 2, 3, 5, 6, 9, 10, 12, 12, 12, 14
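
If you would rather let a program do the ordering, here is a minimal sketch in Python. Python is our own choice for illustration, and the variable names are made up for this example.

```python
# Number of shoes owned by each student (the data set from this manual).
data = [5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]

# sorted() returns a new list and leaves the original order untouched.
ordered = sorted(data)

print(ordered)  # [2, 2, 3, 5, 6, 9, 10, 12, 12, 12, 14]
```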

Calculating Mean

To calculate mean:


  • Add up all the numbers in the data set
  • Divide the sum by how many numbers there are

1: 2+2+3+5+6+9+10+12+12+12+14 = 87

2: 87/11 ≈ 7.91 (rounded to two decimal places)


Mean: 7.91


Take Note: This calculation can be done by hand or by calculator.
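
The two steps for the mean can also be sketched in Python. This is only an illustration of the same arithmetic; the variable names are our own.

```python
data = [5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]

total = sum(data)     # Step 1: add up all the numbers in the data set
count = len(data)     # how many numbers there are
mean = total / count  # Step 2: divide the sum by the count

print(total, count, round(mean, 2))  # 87 11 7.91
```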



Calculating Median

To calculate median:
  1. Put your data set in order from least to greatest:
  • 2 2 3 5 6 9 10 12 12 12 14

  2. Eliminate one number from each end of the data set, in pairs, until only the center is left:
  • 2 2 3 5 6 [9] 10 12 12 12 14

Median: 9

Take Note:

If the data set has an even number of values, two numbers will be left in the center. In that case, the median is the average of those two numbers (if they are the same, that number is the median). If the data set has an odd number of values, only one number will be left in the center, and that number is the median.
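
Both cases, odd and even, can be sketched as a small Python function. The function name and the four-value example list are our own, made up for illustration, and the sketch assumes the list is not empty.

```python
def median(values):
    """Return the middle value, or the average of the two middle values."""
    ordered = sorted(values)  # Step 1: order from least to greatest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: exactly one number is left in the center.
        return ordered[middle]
    # Even count: average the two numbers left in the center.
    return (ordered[middle - 1] + ordered[middle]) / 2

data = [5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]
print(median(data))          # 9
print(median([2, 3, 5, 6]))  # 4.0
```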


The Five-Number Summary

The five-number summary consists of:
  1. the Lower Extreme value
  2. the First Quartile
  3. the Median
  4. the Third Quartile
  5. the Upper Extreme value
The five-number summary describes a data set with five values: its lowest value, the three points that cut the ordered data into quarters, and its highest value. Together they help you picture the distribution.

Calculating:
  • Lower Extreme value: find the smallest number, which is the first number in the ordered data set
  • First Quartile: find the median of the numbers below the median of the entire data set
  • Median: see the "Calculating Median" section above
  • Third Quartile: find the median of the numbers above the median of the entire data set
  • Upper Extreme value: find the largest number, which is the last number in the ordered data set
Take Note: When the data set has an odd number of values, the median is one of those values. DO NOT COUNT THE MEDIAN WHEN CALCULATING THE QUARTILES

In this case, the:
  1. Lower Extreme value is 2
  2. First Quartile is 3 (the median of 2, 2, 3, 5, 6)
  3. Median is 9
  4. Third Quartile is 12 (the median of 10, 12, 12, 12, 14)
  5. Upper Extreme value is 14
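
The five values can be computed together with a short Python sketch. It follows the rule from this section of leaving the median out of both halves when the count is odd; the function name is our own, and the sketch assumes at least two values. Other tools may use a different quartile rule and give slightly different quartiles.

```python
from statistics import median

def five_number_summary(values):
    """Return (lower extreme, Q1, median, Q3, upper extreme)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]  # numbers below the median
    # With an odd count, skip the median itself; with an even count,
    # the upper half simply starts at the midpoint.
    upper_half = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (
        ordered[0],
        median(lower_half),
        median(ordered),
        median(upper_half),
        ordered[-1],
    )

data = [5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]
print(five_number_summary(data))  # (2, 3, 9, 12, 14)
```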

Range and Interquartile Range

The range and IQR measure the spread of a distribution: the range covers all of the data, and the IQR covers the middle half of the data.

To calculate range:
  • Subtract the lower extreme value from the upper extreme value in the data set

To calculate interquartile range (IQR):
  • Subtract the first quartile from the third quartile

In this case, the range is 14 - 2 = 12 and the IQR is 12 - 3 = 9.
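
The same two subtractions, sketched in Python. The quartile values are typed in from the five-number summary of this data set; the variable names are our own.

```python
data = [5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]

# Range: upper extreme value minus lower extreme value.
data_range = max(data) - min(data)

# IQR: third quartile minus first quartile (values from the five-number summary).
first_quartile = 3
third_quartile = 12
iqr = third_quartile - first_quartile

print(data_range, iqr)  # 12 9
```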

Outliers

Outliers are values that lie far away from the rest of the values in your data set. They can pull the measured center of your distribution toward them if the measure of center is not resistant to outliers.

To calculate outliers:
  • Find the interquartile range
  • Multiply the interquartile range by 1.5
  • Add the product of step 2 to the third quartile; any number greater than this sum is an outlier.
  • Subtract the product of step 2 from the first quartile; any number less than this difference is an outlier.

In this case, 1.5 × 9 = 13.5, so any numbers less than 3 - 13.5 = -10.5 (which can't happen in our situation, because nobody has a negative number of shoes) or greater than 12 + 13.5 = 25.5 are outliers. Our data set has no outliers.
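
Here is a minimal Python sketch of the outlier check, using the usual 1.5 × IQR rule measured from the quartiles: the lower cutoff is the first quartile minus 1.5 × IQR, and the upper cutoff is the third quartile plus 1.5 × IQR. The quartiles are typed in from the five-number summary, and the names "lower_fence" and "upper_fence" are our own labels for the two cutoffs.

```python
data = [5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]
first_quartile = 3
third_quartile = 12

iqr = third_quartile - first_quartile  # step 1: 12 - 3 = 9
step = 1.5 * iqr                       # step 2: 1.5 * 9 = 13.5
upper_fence = third_quartile + step    # step 3: 12 + 13.5 = 25.5
lower_fence = first_quartile - step    # step 4: 3 - 13.5 = -10.5

# Any value outside the two fences counts as an outlier.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(lower_fence, upper_fence, outliers)  # -10.5 25.5 []
```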

Resistance and Sensitivity to outliers:
The median and mean are both measures of center, but one of them is resistant to outliers and the other is not. The median is resistant because it depends only on which value sits in the middle of the ordered data, so adding an outlier moves the middle by at most one position. The mean is sensitive because every value goes into the sum, so a single outlier can pull the mean toward greater or lesser values.
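
To see the difference, the sketch below adds one made-up extreme value (60, which is not part of our real data set) and compares how far the mean and the median move. It uses Python's standard statistics module.

```python
from statistics import mean, median

data = [5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]
with_outlier = data + [60]  # 60 is a hypothetical value, added only for this demo

print(round(mean(data), 2), median(data))                  # 7.91 9
print(round(mean(with_outlier), 2), median(with_outlier))  # 12.25 9.5
```

The mean jumps by more than 4, while the median moves by only 0.5.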

Standard Deviation

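Standard deviation measures how far the values in a distribution typically lie from the mean. In an approximately normal distribution, about two thirds of the values fall within one standard deviation of the mean. Standard deviation can be calculated by hand, but the arithmetic is long, so a calculator is the practical way to get an exact value.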

To calculate standard deviation:
  • Press STAT, then EDIT, then enter the data set in list L1 (the data does not have to be ordered)
  • Press STAT again and arrow right to CALC
  • Select 1-VAR Stats, then press 2ND and the button that has L1 above it in blue
  • Press calculate, and "voila!"
  • The standard deviation is listed as Sx, the sample standard deviation

In this case, the standard deviation is 4.46.
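
For readers without a calculator, here is a sketch of the same calculation in Python, first step by step and then with the standard statistics module as a check. We assume here that Sx is the sample standard deviation, which divides by one less than the number of values; the population version, which divides by the number of values, is shown for comparison.

```python
import math
import statistics

data = [5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]

n = len(data)
mean = sum(data) / n

# Square each value's distance from the mean, then add the squares up.
squared_deviations = [(x - mean) ** 2 for x in data]

# Sample variance divides by n - 1; the standard deviation is its square root.
sample_variance = sum(squared_deviations) / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(round(sample_sd, 2))                # 4.46
print(round(statistics.stdev(data), 2))   # 4.46 (sample, same as above)
print(round(statistics.pstdev(data), 2))  # 4.25 (population, divides by n)
```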

Center, Spread, Shape

The center of a distribution can be measured by the mean and the median, and it shows the typical value that the data gathers around.


The spread of a distribution can be measured by range, interquartile range, and standard deviation, and it shows how far the data stretches out.


The shape of a distribution can be seen when the data is displayed in a histogram, a box plot, or a dot plot, and it is determined by how tall, short, or spread out the bars, dots, or lines are. When a distribution is skewed to the left, the tail of the shape points towards the lower values. When a distribution is skewed to the right, the tail of the shape points towards the higher values. When the distribution is approximately normal, you can see an almost symmetrical mound, which shows that the distribution is balanced.
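
To look at the shape of our own data set, a rough dot plot can be printed as text. This is only a sketch: each row is one possible number of shoes and each star is one student. The layout is our own choice, not a standard format.

```python
from collections import Counter

data = [5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]
counts = Counter(data)  # how many students have each number of shoes

# One row per value from the lower extreme to the upper extreme;
# values that nobody has get an empty row.
lines = [
    f"{value:>2} | {'*' * counts[value]}"
    for value in range(min(data), max(data) + 1)
]

print("\n".join(lines))
```

Reading down the rows, you can see where the stars pile up and which direction the tail stretches.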