Statistics Instruction Manual
Calculate and Display Statistics
THE DATA SET
Data Set: 5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12
Some calculations in statistics, such as the median and the five-number summary, require the data set to be in order from least to greatest.
Ordered Data Set: 2, 2, 3, 5, 6, 9, 10, 12, 12, 12, 14
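If you would rather order the data with a program than by hand, the following is a minimal Python sketch. The variable names `data` and `ordered` are just illustrative choices.

```python
# The data set from this manual, in its original order
data = [5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]

# sorted() returns a new list from least to greatest and leaves `data` unchanged
ordered = sorted(data)

print(ordered)  # [2, 2, 3, 5, 6, 9, 10, 12, 12, 12, 14]
```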
Calculating Mean
- Add up all the numbers in the data set
- Divide the sum by how many numbers there are
Step 1: 2 + 2 + 3 + 5 + 6 + 9 + 10 + 12 + 12 + 12 + 14 = 87
Step 2: 87 / 11 = 7.91 (rounded to two decimal places)
Mean: 7.91
Take Note: This calculation can be done by hand or by calculator.
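The same two steps can also be sketched in Python. This is only one way to do it, and the variable names are illustrative.

```python
data = [5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]

total = sum(data)     # step 1: add up all the numbers
count = len(data)     # how many numbers there are
mean = total / count  # step 2: divide the sum by the count

print(total, count, round(mean, 2))  # 87 11 7.91
```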
Calculating Median
- Put your data set in order from least to greatest:
- 2 2 3 5 6 9 10 12 12 12 14
- Eliminate the numbers on the ends of the data set, one from each end at a time, until you can't evenly eliminate them anymore:
- (2 2 3 5 6 eliminated) 9 (10 12 12 12 14 eliminated)
Median: 9
Take Note:
If the data set has an even amount of numbers, there will be two numbers left in the center. In that case, the median is the average of those two numbers (if the two numbers are the same, that number is the median). If the data set has an odd amount of numbers, there will be only one number left in the center, and that number is the median.
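The odd and even cases described above can be sketched as a small Python function. The function name `median` is an illustrative choice, and the second call uses a made-up four-number list just to show the even case.

```python
def median(values):
    """Return the middle value of the ordered data (average of the two middle values if even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd amount of numbers: exactly one number is left in the center
        return ordered[mid]
    # Even amount of numbers: average the two numbers left in the center
    return (ordered[mid - 1] + ordered[mid]) / 2

data = [5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]
print(median(data))          # 9
print(median([2, 3, 5, 6]))  # 4.0
```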
The Five-Number Summary
The five-number summary is made up of:
- the Lower Extreme value
- the First Quartile
- the Median
- the Third Quartile
- and the Upper Extreme value
Calculating:
- Lower Extreme value: find the number with which the data set begins (when data is ordered)
- First Quartile: find the median of the numbers that lie below the median of the entire data set (the lower half)
- Median: see "Calculating Median" above
- Third Quartile: find the median of the numbers that lie above the median of the entire data set (the upper half)
- Upper Extreme value: find the number with which the data set ends (when data is ordered)
In this case, the:
- Lower Extreme value is 2
- First Quartile is 3
- Median is 9
- Third Quartile is 12
- Upper Extreme value is 14
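As a rough check, the five-number summary can be sketched in Python. This sketch assumes the quartiles are the medians of the lower and upper halves, with the overall median left out of both halves when the data set has an odd amount of numbers, which is what gives 3 and 12 here. Other quartile conventions exist and can give slightly different answers. The function names are illustrative.

```python
def median(values):
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def five_number_summary(values):
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # Assumption: skip the middle number itself when the count is odd
    upper_half = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])

data = [5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]
print(five_number_summary(data))  # (2, 3, 9, 12, 14)
```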
Range and Interquartile Range
To calculate range:
- Subtract the lower extreme value from the upper extreme value in the data set
To calculate interquartile range (IQR):
- Subtract the first quartile from the third quartile
In this case, the range is 12 and the IQR is 9.
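Both subtractions are short enough to check in Python. The sketch below simply plugs in the five-number summary values found earlier; the variable names are illustrative.

```python
# Values taken from the five-number summary of the data set
lower_extreme, first_quartile, third_quartile, upper_extreme = 2, 3, 12, 14

data_range = upper_extreme - lower_extreme  # 14 - 2
iqr = third_quartile - first_quartile       # 12 - 3

print(data_range, iqr)  # 12 9
```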
Outliers
To identify outliers:
- Find the interquartile range
- Multiply the interquartile range by 1.5
- Add the product of step 2 to the third quartile; any number greater than this sum is an outlier.
- Subtract the product of step 2 from the first quartile; any number less than this difference is an outlier.
In this case, 1.5 × 9 = 13.5, so any numbers less than -10.5 (3 - 13.5) or greater than 25.5 (12 + 13.5) are outliers. No number in our data set falls outside these limits, so it has no outliers.
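The outlier check can be sketched in Python using the common 1.5 × IQR rule, in which the cutoffs are measured from the quartiles (first quartile minus 1.5 × IQR, and third quartile plus 1.5 × IQR). The names `lower_fence` and `upper_fence` are illustrative, and the quartiles are plugged in from the five-number summary.

```python
data = [5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]
first_quartile, third_quartile = 3, 12

iqr = third_quartile - first_quartile  # 9
step = 1.5 * iqr                       # 13.5

lower_fence = first_quartile - step    # anything below this is an outlier
upper_fence = third_quartile + step    # anything above this is an outlier

outliers = [x for x in data if x < lower_fence or x > upper_fence]
print(lower_fence, upper_fence, outliers)  # -10.5 25.5 []
```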
Resistance and Sensitivity to Outliers
Outliers can affect both the mean and the median, but the median is resistant to outliers and the mean is not. The median is resistant because it depends only on which number sits in the middle of the ordered data, so an outlier usually moves it by no more than a position or so. The mean is sensitive because every number goes into the sum, so a single outlier can pull the center of the data toward greater or lesser values.
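To see the difference, here is a small Python sketch that adds one made-up outlier (100, a hypothetical value that is not part of the original data set) and compares the mean and median before and after. It uses the standard library `statistics` module.

```python
import statistics

data = [5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]
with_outlier = data + [100]  # 100 is a hypothetical outlier for illustration

mean_before = round(statistics.mean(data), 2)
mean_after = round(statistics.mean(with_outlier), 2)
median_before = statistics.median(data)
median_after = statistics.median(with_outlier)

print(mean_before, mean_after)      # 7.91 15.58
print(median_before, median_after)  # 9 9.5
```

The mean nearly doubles, while the median moves only from 9 to 9.5.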
Standard Deviation
To calculate standard deviation:
- Press STAT, then EDIT, then enter the ordered data set in list L1
- Go back to STAT and move right to CALC
- Select 1-VAR Stats, then press 2ND and the button labeled L1 in blue
- Press calculate, and "voila!"
- The standard deviation is the value listed as Sx
In this case, the standard deviation is 4.46.
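If no calculator is at hand, the result can be checked with Python's standard library `statistics` module. This sketch assumes that Sx means the sample standard deviation (dividing by one less than the amount of numbers), which is what `statistics.stdev` computes. The population version, `statistics.pstdev`, is shown only for comparison.

```python
import statistics

data = [5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]

sample_sd = statistics.stdev(data)       # divides by n - 1 (assumed to match Sx)
population_sd = statistics.pstdev(data)  # divides by n, shown for comparison

print(round(sample_sd, 2))      # 4.46
print(round(population_sd, 2))  # 4.25
```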
The Dot Plot
The dot plot displays the shape of a distribution and the frequency of each value. It uses a number line with marks, typically x's, that lie above the number they represent. The more marks above a number, the greater the frequency.
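A rough text version of a dot plot can be sketched in Python. As a simplification, this sketch turns the plot on its side: each number gets its own row, and the x's run to the right of the number instead of stacking above it. The names `counts` and `lines` are illustrative.

```python
from collections import Counter

data = [5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]
counts = Counter(data)  # how many times each number appears

lines = []
for value in range(min(data), max(data) + 1):
    marks = " ".join("x" * counts[value])  # one x per occurrence
    lines.append(f"{value:>2} | {marks}")

print("\n".join(lines))
```

The row for 12 shows three x's and the row for 2 shows two, matching their frequencies in the data set.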
The Histogram
The histogram displays the shape and center of a distribution, along with the frequency, using bars over intervals. The frequency of numbers within an interval can be seen by how high its bar is.
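The counting behind a histogram can be sketched in Python. The interval width of 5 is an assumption made for this example (the manual does not specify one), and the bars are drawn sideways with # characters instead of upward.

```python
data = [5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]
bin_width = 5  # assumed interval width, chosen only for illustration

# Count how many numbers fall in each interval, keyed by the interval's start
counts = {}
for x in data:
    start = (x // bin_width) * bin_width
    counts[start] = counts.get(start, 0) + 1

# Print one bar per interval, including any empty intervals
for start in range(0, max(data) + 1, bin_width):
    end = start + bin_width - 1
    print(f"{start:>2}-{end:<2} | " + "#" * counts.get(start, 0))
```

With this width, the intervals 0-4, 5-9, and 10-14 hold 3, 3, and 5 numbers.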
The Box Plot
The box plot displays the five-number summary, so the range and the interquartile range can easily be read from it.
Center, Spread, Shape
The center of a distribution can be measured by the mean or the median.
The spread of a distribution can be measured by range and interquartile range, and it shows how far the values in the distribution spread out.
The shape of a distribution can be seen when the data is displayed in a histogram, a box plot, or a dot plot, and can be determined by how high, low, or spread out the bars, dots, or lines are. When a distribution is skewed to the left, the tail of the shape points towards the lower values. When a distribution is skewed to the right, the tail points towards the higher values. When the distribution is approximately normal, you can see an almost symmetrical mound, which shows that the distribution is balanced.