# Statistics Instruction Manual

### Calculate and Display Statistics

## THE DATA SET

__Data Set__: 5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12

Some calculations in statistics require the data set to be in order from least to greatest.

__Ordered Data Set__: 2, 2, 3, 5, 6, 9, 10, 12, 12, 12, 14
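As a small illustration, the ordering step can also be done in code. This is only a sketch; the variable names `data` and `ordered` are chosen here for the example and are not part of the manual.

```python
# Data set from this manual, in its original order.
data = [5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]

# sorted() returns a new list ordered from least to greatest
# and leaves the original list unchanged.
ordered = sorted(data)

print(ordered)
```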

## Calculating Mean

__To calculate mean:__

- Add up all the numbers in the data set
- Divide the sum by how many numbers there are

**1:** 2 + 2 + 3 + 5 + 6 + 9 + 10 + 12 + 12 + 12 + 14 = 87

**2:** 87 / 11 ≈ 7.91

__Mean:__ **7.91**

**Take Note:** This calculation can be done by hand or with a calculator.
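The two steps for the mean can be sketched in Python as follows. The variable names (`total`, `count`, `mean`) are just illustrative choices.

```python
data = [2, 2, 3, 5, 6, 9, 10, 12, 12, 12, 14]

total = sum(data)     # step 1: add up all the numbers
count = len(data)     # how many numbers there are
mean = total / count  # step 2: divide the sum by the count

print(total, count, round(mean, 2))
```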

## Calculating Median

__To calculate median:__

1. Put your data set in order from least to greatest:

   2 2 3 5 6 9 10 12 12 12 14

2. Eliminate the numbers on the ends of the data set, one from each end at a time, until you can't evenly eliminate them anymore:

   __2 2 3 5 6__ **9** __10 12 12 12 14__

**Take Note**:

If the data set has an even number of values, two numbers will be left in the center. In that case, the median is the average of those two numbers (if the two numbers are the same, that number is the median). If the data set has an odd number of values, only one number will be left in the center, and that number is the median.
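The elimination procedure described above can be sketched as a short function. The name `median` and the trimming loop are one possible way to write it, and the sketch assumes the list is not empty.

```python
def median(values):
    """Median found by repeatedly dropping one value from each end."""
    remaining = sorted(values)       # step 1: order the data
    while len(remaining) > 2:        # step 2: trim both ends evenly
        remaining = remaining[1:-1]
    # One value is left for an odd count, two for an even count.
    # Averaging covers both cases.
    return sum(remaining) / len(remaining)


print(median([5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]))  # odd count
print(median([1, 2, 3, 4]))                            # even count
```

The first call leaves only 9 in the center; the second leaves 2 and 3, whose average is 2.5.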

## The Five-Number Summary

__The Five-number summary consists of:__

- the Lower Extreme value
- the First Quartile
- the Median
- the Third Quartile
- and the Upper Extreme value

__Calculating:__

- **Lower Extreme value**: find the number with which the data set begins (when the data is ordered)
- **First Quartile**: find the median of the numbers from the lower extreme value up to, but not including, the median of the entire data set
- **Median**: see the "Calculating Median" section above
- **Third Quartile**: find the median of the numbers after the median of the entire data set, up to and including the upper extreme value
- **Upper Extreme value**: find the number with which the data set ends (when the data is ordered)

**Take Note:** DO NOT COUNT THE MEDIAN WHEN CALCULATING THE QUARTILES

In this case, the:

- Lower Extreme value is 2
- First Quartile is 3
- Median is 9
- Third Quartile is 12
- Upper Extreme value is 14
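A minimal sketch of the five-number summary, using the rule above that the median is left out when finding the quartiles. The function names are hypothetical, and the sketch assumes the data set has at least two values.

```python
def median(ordered):
    """Median of a list that is already sorted."""
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(values):
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]                 # values before the median position
    upper = ordered[len(ordered) - half:]  # values after the median position
    return (ordered[0], median(lower), median(ordered),
            median(upper), ordered[-1])


data = [5, 12, 2, 2, 3, 12, 10, 14, 9, 6, 12]
print(five_number_summary(data))  # (2, 3, 9, 12, 14)
```

Other conventions for quartiles exist (some include the median in both halves), so software may report slightly different quartiles for the same data.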

## Range and Interquartile Range

__To calculate range__:

- Subtract the lower extreme value from the upper extreme value in the data set

__To calculate interquartile range (IQR)__:

- Subtract the first quartile from the third quartile

In this case, the range is 14 - 2 = 12 and the IQR is 12 - 3 = 9.
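Both subtractions can be checked with a few lines of code. The values are taken from the five-number summary of the manual's data set; the variable names are illustrative.

```python
# Values from the five-number summary of the data set.
lower_extreme, q1, q3, upper_extreme = 2, 3, 12, 14

data_range = upper_extreme - lower_extreme  # range
iqr = q3 - q1                               # interquartile range

print(data_range, iqr)  # 12 9
```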

## Outliers

__To calculate outliers__:

1. Find the interquartile range
2. Multiply the interquartile range by 1.5
3. Add the product of step 2 to the third quartile; any number greater than this sum is an outlier.
4. Subtract the product of step 2 from the first quartile; any number less than this difference is an outlier.

In this case, the product is 1.5 × 9 = 13.5, so any numbers less than 3 - 13.5 = -10.5 (a negative value, which doesn't make sense in our situation) or greater than 12 + 13.5 = 25.5 are outliers. No number in our data set falls outside these limits, so it has no outliers.
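Here is a sketch of the outlier check, assuming the common 1.5 × IQR rule in which the cutoffs are measured from the first and third quartiles. The names `lower_fence` and `upper_fence` are terms introduced for this example.

```python
data = [2, 2, 3, 5, 6, 9, 10, 12, 12, 12, 14]
q1, q3 = 3, 12               # quartiles from the five-number summary

iqr = q3 - q1                # interquartile range
step = 1.5 * iqr             # 1.5 times the IQR

lower_fence = q1 - step      # anything below this is an outlier
upper_fence = q3 + step      # anything above this is an outlier

outliers = [x for x in data if x < lower_fence or x > upper_fence]
print(lower_fence, upper_fence, outliers)  # -10.5 25.5 []
```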

**Resistance and Sensitivity to outliers**:

The mean and the median are both measures of center, but they respond to outliers differently: the median is resistant to outliers and the mean is not. The median is resistant because an outlier only adds one more value to an end of the ordered data set, so the middle position barely shifts. The mean is sensitive to outliers because every value is part of the sum, so a single extreme value can pull the mean toward greater or lesser values.
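To see the difference, the sketch below adds one extreme value to the data set and compares both statistics before and after. The value 100 is made up purely for illustration and is not part of the manual's data.

```python
from statistics import mean, median

data = [2, 2, 3, 5, 6, 9, 10, 12, 12, 12, 14]
with_outlier = data + [100]  # 100 is a made-up extreme value

print(round(mean(data), 2), median(data))                  # 7.91 9
print(round(mean(with_outlier), 2), median(with_outlier))  # 15.58 9.5
```

The mean nearly doubles, while the median moves only from 9 to 9.5.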

## Standard Deviation

__To calculate standard deviation__:

- Press STAT, then EDIT, then enter the ordered data set in L1
- Go back to STAT and move right to CALC
- Select 1-VAR Stats, then press 2ND followed by the button that has L1 written in blue
- Press calculate, and "voila!"
- The standard deviation is the value labeled Sx

In this case, the standard deviation is 4.46.
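The calculator result can be cross-checked in code. This sketch assumes that Sx is the sample standard deviation (dividing by n - 1), which is what Python's `statistics.stdev` computes; `statistics.pstdev` divides by n instead and gives a slightly smaller number.

```python
from statistics import pstdev, stdev

data = [2, 2, 3, 5, 6, 9, 10, 12, 12, 12, 14]

sample_sd = stdev(data)       # divides by n - 1
population_sd = pstdev(data)  # divides by n

print(round(sample_sd, 2), round(population_sd, 2))  # 4.46 4.25
```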

## The Box Plot

The box plot is mainly a way of displaying the five-number summary, the range, and the interquartile range, so that you can easily find what they are.

## The Dot Plot

The dot plot is mainly a way of displaying the shape of a distribution and the frequency of its values. It uses a number line and marks, typically x's, that lie above the number they represent. The more marks above a number, the greater its frequency.
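A rough text-only dot plot can be sketched as follows. This is just one way to draw it, using x's stacked above a number line; the layout choices (three characters per number) are arbitrary.

```python
from collections import Counter

data = [2, 2, 3, 5, 6, 9, 10, 12, 12, 12, 14]
counts = Counter(data)                     # frequency of each number
values = range(min(data), max(data) + 1)   # every number on the line

rows = []
for level in range(max(counts.values()), 0, -1):
    # Put an x above each number that occurs at least `level` times.
    row = "".join(" x " if counts[v] >= level else "   " for v in values)
    rows.append(row.rstrip())

axis = "".join(f"{v:^3}" for v in values)  # the number line itself
print("\n".join(rows))
print(axis)
```

The tallest stack appears above 12, which occurs three times in the data set.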

## The Histogram

The histogram is mainly a way of displaying the shape and center of a distribution, along with its frequencies, using bars drawn over intervals. The frequency of the numbers within an interval can be seen from how high its bar is.
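As a sketch, the counting behind a histogram can be done like this. The interval width of 5 is an arbitrary choice made for this example, and the bars are drawn sideways with `#` characters to keep the code short. Intervals that contain no values are not shown.

```python
data = [2, 2, 3, 5, 6, 9, 10, 12, 12, 12, 14]
width = 5  # interval width; an arbitrary choice for this illustration

bins = {}
for x in data:
    start = (x // width) * width         # lower end of x's interval
    bins[start] = bins.get(start, 0) + 1

lines = []
for start in sorted(bins):
    label = f"{start}-{start + width - 1}"
    lines.append(f"{label:>5} | " + "#" * bins[start])

print("\n".join(lines))
```

With these intervals, 0-4 and 5-9 each hold three values and 10-14 holds five, so the last bar is the longest.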

## Center, Spread, Shape

The **center** of a distribution can be measured by the __mean and median__.

The **spread** of a distribution can be measured by the __range and interquartile range__, which show how far the distribution spreads, and also by the standard deviation, which measures the variability of the values around the mean. In an approximately normal distribution, roughly two thirds of the values fall within one standard deviation of the mean.

The **shape** of a distribution can be seen when the data is displayed in a histogram, a box plot, or a dot plot, and can be determined by how high, low, or spread out the bars, dots, or lines are. When a distribution is skewed to the left, the tail of the distribution points toward the lower values. When a distribution is skewed to the right, the tail points toward the higher values. When the distribution is approximately normal, you can see an almost symmetrical mound, which shows that the distribution is balanced.