Review of Top 10 Concepts in Statistics (reordered slightly for review in the interactive session)
![Page 1: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/1.jpg)
Review of Top 10 Concepts in Statistics
(reordered slightly for review in the interactive session)
NOTE: This PowerPoint file is not an introduction, but rather a checklist of topics to review.
![Page 2: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/2.jpg)
Top Ten #10
Qualitative vs. Quantitative
![Page 3: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/3.jpg)
Qualitative
Categorical data:
success vs. failure
ethnicity
marital status
color
zip code
4-star hotel rating in a tour guide
![Page 4: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/4.jpg)
Qualitative
If you need an “average”, do not calculate the mean
However, you can compute the mode (“average” person is married, buys a blue car made in America)
![Page 5: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/5.jpg)
Quantitative
Two cases
Case 1: discrete
Case 2: continuous
![Page 6: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/6.jpg)
Discrete
(1) integer values (0,1,2,…)
(2) example: binomial
(3) finite or countable number of possible values
(4) counting
(5) number of brothers
(6) number of cars arriving at gas station
![Page 7: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/7.jpg)
Continuous
Real numbers, such as decimal values ($22.22)
Examples: Z, t
Infinite number of possible values
Measurement
Miles per gallon, distance, duration of time
![Page 8: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/8.jpg)
Graphical Tools
Pie chart or bar chart: qualitative
Joint frequency table: qualitative (relate marital status vs. zip code)
Scatter diagram: quantitative (distance from CSUN vs. duration of time to reach CSUN)
![Page 9: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/9.jpg)
Hypothesis Testing and Confidence Intervals
Quantitative: Mean
Qualitative: Proportion
![Page 10: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/10.jpg)
Top Ten #9
Population vs. Sample
![Page 11: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/11.jpg)
Population
Collection of all items (all light bulbs made at factory)
Parameter: measure of population
(1) population mean (average number of hours in life of all bulbs)
(2) population proportion (% of all bulbs that are defective)
![Page 12: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/12.jpg)
Sample
Part of population (bulbs tested by inspector)
Statistic: measure of sample = estimate of parameter
(1) sample mean (average number of hours in life of bulbs tested by inspector)
(2) sample proportion (% of bulbs in sample that are defective)
![Page 13: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/13.jpg)
Top Ten #1
Descriptive Statistics
![Page 14: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/14.jpg)
Measures of Central Location
Mean
Median
Mode
![Page 15: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/15.jpg)
Mean
Population mean = µ = Σx/N = (5+1+6)/3 = 12/3 = 4
Algebra: Σx = N*µ = 3*4 = 12
Sample mean = x-bar = Σx/n
Example: the number of hours spent on the Internet: 4, 8, and 9
x-bar = (4+8+9)/3 = 7 hours
Do NOT use if the number of observations is small or if there are extreme values
Ex: Do NOT use if 3 houses were sold this week, and one was a mansion
![Page 16: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/16.jpg)
Median
Median = middle value
Example: 5, 1, 6
Step 1: Sort data: 1, 5, 6
Step 2: Middle value = 5
When there is an even number of observations, the median is computed by averaging the two observations in the middle.
OK even if there are extreme values
Home sales: 100K, 200K, 900K, so mean = 400K, but median = 200K
![Page 17: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/17.jpg)
Mode
Mode: most frequent value
Ex: female, male, female
Mode = female
Ex: 1, 1, 2, 3, 5, 8
Mode = 1
It may not be a very good measure; see the following example
![Page 18: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/18.jpg)
Measures of Central Location - Example
Sample: 0, 0, 5, 7, 8, 9, 12, 14, 22, 23
Sample Mean = x-bar = Σx/n = 100/10 = 10
Median = (8+9)/2 = 8.5
Mode = 0
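The three measures for this sample can be checked with a short script. This is a minimal sketch that assumes Python's standard `statistics` module; the variable names are illustrative and not part of the slides.

```python
from statistics import mean, median, mode

# Sample from the slide above
sample = [0, 0, 5, 7, 8, 9, 12, 14, 22, 23]

sample_mean = mean(sample)      # sum / n = 100 / 10
sample_median = median(sample)  # average of the two middle values, 8 and 9
sample_mode = mode(sample)      # most frequent value

print(sample_mean, sample_median, sample_mode)
```

Here the mode (0) sits far from the mean and the median, which is why the mode may not be a good measure of central location for this sample.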
![Page 19: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/19.jpg)
Relationship
Case 1: if the distribution is symmetric (ex. bell-shaped, normal distribution), Mean = Median = Mode
Case 2: if the distribution is positively skewed (skewed to the right) (ex. incomes of employees in a large firm: a large number of relatively low-paid workers and a small number of high-paid executives), Mode < Median < Mean
![Page 20: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/20.jpg)
Relationship – cont’d
Case 3: if the distribution is negatively skewed (skewed to the left) (ex. the time taken by students to write exams: few students hand in their exams early and the majority of students turn in their exams at the end of the exam period), Mean < Median < Mode
![Page 21: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/21.jpg)
Dispersion – Measures of Variability
How much spread of data
How much uncertainty
Measures
Range
Variance
Standard deviation
![Page 22: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/22.jpg)
Range
Range = Max - Min ≥ 0
But range is affected by unusual values
Ex: Santa Monica has a high of 105 degrees and a low of 30 once a century, but range would be 105 - 30 = 75
![Page 23: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/23.jpg)
Standard Deviation (SD)
Better than range because all data used
Population SD = square root of variance = sigma = σ
SD ≥ 0 (SD = 0 only if all data are exactly the same)
![Page 24: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/24.jpg)
Empirical Rule
Applies to mound or bell-shaped curves
Ex: normal distribution
About 68% of data within ± one SD of mean
About 95% of data within ± two SD of mean
About 99.7% of data within ± three SD of mean
![Page 25: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/25.jpg)
Standard Deviation = Square Root of Variance
$$ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} $$
![Page 26: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/26.jpg)
Sample Standard Deviation

| x | x − x̄ | (x − x̄)² |
|---|---|---|
| 6 | 6-8 = -2 | (-2)(-2) = 4 |
| 6 | 6-8 = -2 | 4 |
| 7 | 7-8 = -1 | (-1)(-1) = 1 |
| 8 | 8-8 = 0 | 0 |
| 13 | 13-8 = 5 | (5)(5) = 25 |
| Sum = 40 | Sum = 0 | Sum = 34 |

Mean = 40/5 = 8
![Page 27: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/27.jpg)
Standard Deviation
Total variation = 34
Sample variance = 34/4 = 8.5
Sample standard deviation = square root of 8.5 = 2.9
![Page 28: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/28.jpg)
Measures of Variability - Example
The hourly wages earned by a sample of five students are:
$7, $5, $11, $8, and $6
Range: 11 – 5 = 6
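Variance:
$$ s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{(7 - 7.4)^2 + \dots + (6 - 7.4)^2}{5 - 1} = \frac{21.2}{5 - 1} = 5.30 $$
Standard deviation:
$$ s = \sqrt{s^2} = \sqrt{5.30} = 2.30 $$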
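The wage example can be reproduced in a few lines. This is a sketch that assumes Python's standard `statistics` module, whose `variance` and `stdev` functions divide by n - 1, as the sample formulas in this review do.

```python
from statistics import mean, stdev, variance

wages = [7, 5, 11, 8, 6]  # hourly wages in dollars, from the slide

wage_range = max(wages) - min(wages)  # 11 - 5
x_bar = mean(wages)                   # 37 / 5 = 7.4
s_squared = variance(wages)           # sample variance, divides by n - 1
s = stdev(wages)                      # square root of the sample variance

print(wage_range, x_bar, round(s_squared, 2), round(s, 2))
```

For population formulas (dividing by N), the same module provides `pvariance` and `pstdev`.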
![Page 29: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/29.jpg)
Graphical Tools
Line chart: trend over time
Scatter diagram: relationship between two variables
Bar chart: frequency for each category
Histogram: frequency for each class of measured data (graph of frequency distribution)
Box plot: graphical display based on quartiles, which divide data into 4 parts
![Page 30: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/30.jpg)
Top Ten #8
Variation Creates Uncertainty
![Page 31: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/31.jpg)
No Variation
Certainty, exact prediction
Standard deviation = 0
Variance = 0
All data exactly same
Example: all workers in minimum wage job
![Page 32: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/32.jpg)
High Variation
Uncertainty, unpredictable
High standard deviation
Ex #1: Workers in downtown L.A. have variation between CEOs and garment workers
Ex #2: New York temperatures in spring range from below freezing to very hot
![Page 33: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/33.jpg)
Comparing Standard Deviations
Temperature Example
Beach city: small standard deviation (single temperature reading close to mean)
High Desert city: high standard deviation (hot days, cool nights in spring)
![Page 34: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/34.jpg)
Standard Error of the Mean
Standard deviation of sample mean =
standard deviation/square root of n
Ex: standard deviation = 10, n =4, so standard error of the mean = 10/2= 5
Note that 5<10, so standard error < standard deviation.
As n increases, standard error decreases.
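The effect of sample size on the standard error can be sketched as follows. The helper name `standard_error` is illustrative, and the second call uses a hypothetical n = 100 that is not in the slides.

```python
import math

def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean: standard deviation divided by sqrt(n)."""
    return sd / math.sqrt(n)

se_4 = standard_error(10, 4)      # the slide's example: 10 / 2
se_100 = standard_error(10, 100)  # hypothetical larger sample: smaller standard error

print(se_4, se_100)
```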
![Page 35: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/35.jpg)
Sampling Distribution
Expected value of sample mean = population mean, but an individual sample mean could be smaller or larger than the population mean
Population mean is a constant parameter, but sample mean is a random variable
Sampling distribution is distribution of sample means
![Page 36: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/36.jpg)
Example
Mean age of all students in the building is population mean
Each classroom has a sample mean
Distribution of sample means from all classrooms is sampling distribution
![Page 37: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/37.jpg)
Central Limit Theorem (CLT)
Sampling distribution of sample means is approximately normal if n > 30 (rule of thumb); if the population standard deviation is also known, the Z table can be used
CLT applies even if original population is skewed
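A small simulation can illustrate the theorem. This is only a sketch under stated assumptions: the exponential population, the seed, the sample size of 40, and the 2000 repetitions are all illustrative choices, not values from the slides.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

N_SAMPLES = 2000  # number of samples drawn (an arbitrary choice)
SAMPLE_SIZE = 40  # n > 30

# Exponential population with mean 1 and SD 1: strongly skewed to the right.
sample_means = [
    statistics.mean(random.expovariate(1.0) for _ in range(SAMPLE_SIZE))
    for _ in range(N_SAMPLES)
]

center = statistics.mean(sample_means)   # expected to be near the population mean, 1
spread = statistics.stdev(sample_means)  # expected to be near 1 / sqrt(40)

# Share of sample means within one SD of their center (roughly 68% if near normal)
within_one_sd = sum(abs(m - center) <= spread for m in sample_means) / N_SAMPLES

print(round(center, 3), round(spread, 3), round(within_one_sd, 3))
```

Although the population is skewed, the sample means cluster symmetrically around the population mean, with a spread close to the standard error of the mean.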
![Page 38: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/38.jpg)
Top Ten #5
Expected Value
![Page 39: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/39.jpg)
Expected Value
Expected Value = E(x) = ΣxP(x)
= x1P(x1) + x2P(x2) +…
Expected value is a weighted average, also a long-run average
![Page 40: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/40.jpg)
Example
Find the expected age at high school graduation if 11 were 17 years old, 80 were 18 years old, and 5 were 19 years old
Step 1: 11+80+5=96
![Page 41: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/41.jpg)
Step 2
| x | P(x) | x P(x) |
|---|---|---|
| 17 | 11/96 = .115 | 17(.115) = 1.955 |
| 18 | 80/96 = .833 | 18(.833) = 14.994 |
| 19 | 5/96 = .052 | 19(.052) = .988 |
| | | E(x) = 17.937 |
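The two steps can be sketched in code. This version assumes exact fractions instead of the three-decimal probabilities in the table, so it gives 17.9375 where the slide's rounded arithmetic gives 17.937.

```python
from fractions import Fraction

counts = {17: 11, 18: 80, 19: 5}  # age -> number of graduates
total = sum(counts.values())      # Step 1: 11 + 80 + 5

# Step 2: P(x) = count / total, then E(x) = sum of x * P(x)
probabilities = {age: Fraction(n, total) for age, n in counts.items()}
expected_age = sum(age * p for age, p in probabilities.items())

print(float(expected_age))
```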
![Page 42: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/42.jpg)
Top Ten #4
Linear Regression
![Page 43: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/43.jpg)
Linear Regression
Regression equation:
$$ \hat{y} = b_0 + b_1 x $$
ŷ = dependent variable = predicted value
x = independent variable
b0 = y-intercept = predicted value of y if x = 0
b1 = slope = regression coefficient = change in y per unit change in x
![Page 44: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/44.jpg)
Slope vs Correlation
Positive slope (b1>0): positive correlation between x and y (y increases if x increases)
Negative slope (b1<0): negative correlation (y decreases if x increases)
Zero slope (b1=0): no correlation (predicted value for y is mean of y), no linear relationship between x and y
![Page 45: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/45.jpg)
Simple Linear Regression
Simple: one independent variable, one dependent variable
Linear: graph of regression equation is straight line
![Page 46: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/46.jpg)
Example
y = salary (female manager, in thousands of dollars)
x = number of children
n = number of observations
![Page 47: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/47.jpg)
Given Data
| x | y |
|---|---|
| 2 | 48 |
| 1 | 52 |
| 4 | 33 |
![Page 48: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/48.jpg)
Totals
| x | y |
|---|---|
| 2 | 48 |
| 1 | 52 |
| 4 | 33 |
| Sum = 7 | Sum = 133 |

n = 3
![Page 49: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/49.jpg)
Slope (b1) = -6.5
Method of Least Squares formulas not on BUS 302 exam
b1= -6.5 given
Interpretation: If one female manager has 1 more child than another, her expected salary is $6,500 lower; that is, the salary of female managers is expected to decrease by 6.5 (in thousands of dollars) per child
![Page 50: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/50.jpg)
Intercept (b0)
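$$ \bar{x} = \frac{\sum x}{n} = \frac{7}{3} = 2.33 \qquad \bar{y} = \frac{\sum y}{n} = \frac{133}{3} = 44.33 $$
$$ b_0 = \bar{y} - b_1 \bar{x} $$
b0 = 44.33 – (-6.5)(2.33) = 59.5
If number of children is zero, expected salary is $59,500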
![Page 51: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/51.jpg)
Regression Equation
$$ \hat{y} = 59.5 - 6.5x $$
![Page 52: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/52.jpg)
Forecast Salary If 3 Children
59.5 –6.5(3) = 40
$40,000 = expected salary
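The slope, intercept, and forecast can be reproduced from the given data. Note that the slides take b1 = -6.5 as given; the slope formula below is the standard least-squares formula and is an addition here, not something shown in the slides. The function name `forecast` is illustrative.

```python
# Data from the "Given Data" slide
x = [2, 1, 4]     # number of children
y = [48, 52, 33]  # salary in thousands of dollars
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Standard least-squares slope (not derived in the slides):
# b1 = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar  # intercept formula from the slides

def forecast(children: float) -> float:
    """Predicted salary (in thousands of dollars) for a given number of children."""
    return b0 + b1 * children

print(round(b1, 2), round(b0, 2), round(forecast(3), 2))
```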
![Page 53: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/53.jpg)
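Standard Error of Estimate
$$ \text{forecast} = \hat{y} = b_0 + b_1 x $$
$$ \text{error} = y - \hat{y} $$
$$ S = \sqrt{\frac{SSE}{n - 2}} = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}} $$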
![Page 54: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/54.jpg)
Standard Error of Estimate
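| (1) = x | (2) = y | (3) = ŷ = 59.5 - 6.5x | (4) = (2) - (3) = y − ŷ | (y − ŷ)² |
|---|---|---|---|---|
| 2 | 48 | 46.5 | 1.5 | 2.25 |
| 1 | 52 | 53 | -1 | 1 |
| 4 | 33 | 33.5 | -.5 | .25 |
| | | | | SSE = 3.5 |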
![Page 55: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/55.jpg)
Standard Error of Estimate
$$ S = \sqrt{\frac{3.5}{3 - 2}} = \sqrt{3.5} = 1.9 $$
Actual salary typically $1,900 away from expected salary
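The error table and the standard error of estimate can be recomputed as a check. This sketch takes b0 and b1 from the slides; the variable names are illustrative.

```python
import math

x = [2, 1, 4]        # number of children
y = [48, 52, 33]     # salary in thousands of dollars
b0, b1 = 59.5, -6.5  # regression coefficients from the slides

predicted = [b0 + b1 * xi for xi in x]                   # column (3)
errors = [yi - yhat for yi, yhat in zip(y, predicted)]   # column (4)
sse = sum(e ** 2 for e in errors)                        # sum of squared errors
s = math.sqrt(sse / (len(x) - 2))                        # divide by n - 2

print(predicted, errors, sse, round(s, 1))
```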
![Page 56: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/56.jpg)
Coefficient of Determination
R² = % of total variation in y that can be explained by variation in x
Measure of how closely the linear regression line fits the points in a scatter diagram
R² = 1: max. possible value: perfect linear relationship between y and x (straight line)
R² = 0: min. value: no linear relationship
![Page 57: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/57.jpg)
Sources of Variation (V)
Total V = Explained V + Unexplained V
SS = Sum of Squares = V
Total SS = Regression SS + Error SS
SST = SSR + SSE
SSR = Explained V, SSE = Unexplained V
![Page 58: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/58.jpg)
Coefficient of Determination
$$ R^2 = \frac{SSR}{SST} = \frac{197}{200.5} = .98 $$
Interpretation: 98% of total variation in salary can be explained by variation in number of children
![Page 59: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/59.jpg)
0 ≤ R² ≤ 1
0: No linear relationship since SSR = 0 (explained variation = 0)
1: Perfect linear relationship since SSR = SST (unexplained variation = SSE = 0), but does not prove cause and effect
![Page 60: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/60.jpg)
R=Correlation Coefficient
Case 1: slope (b1) < 0
R < 0
R is negative square root of coefficient of determination
$$ R = -\sqrt{R^2} $$
![Page 61: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/61.jpg)
Our Example
Slope = b1 = -6.5
R² = .98
R = -.99
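The coefficient of determination and the sign rule for R can be checked against the salary data. This is a sketch that takes b0 and b1 from the slides; computed without rounding, SSR and SST come out to about 197.17 and 200.67, which the slides appear to show in rounded form as 197 and 200.5.

```python
import math

x = [2, 1, 4]        # number of children
y = [48, 52, 33]     # salary in thousands of dollars
b0, b1 = 59.5, -6.5  # regression coefficients from the slides

y_bar = sum(y) / len(y)
predicted = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                    # total variation
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, predicted))   # unexplained variation
ssr = sst - sse                                             # explained variation

r_squared = ssr / sst
r = math.copysign(math.sqrt(r_squared), b1)  # R takes the sign of the slope

print(round(r_squared, 2), round(r, 4))
```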
![Page 62: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/62.jpg)
Case 2: Slope > 0
R is positive square root of coefficient of determination
Ex: R² = .49
R = .70
R has no interpretation
R overstates relationship
![Page 63: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/63.jpg)
Caution
Nonlinear relationship (parabola, hyperbola, etc.) can NOT be measured by R²
In fact, you could get R² = 0 with a nonlinear graph on a scatter diagram
![Page 64: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/64.jpg)
Summary: Correlation Coefficient
Case 1: If b1 > 0, R is the positive square root of the coefficient of determination
Ex#1: y = 4+3x, R² = .36: R = +.60
Case 2: If b1 < 0, R is the negative square root of the coefficient of determination
Ex#2: y = 80-10x, R² = .49: R = -.70
NOTE! Ex#2 has stronger relationship, as measured by coefficient of determination
![Page 65: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/65.jpg)
Extreme Values
R=+1: perfect positive correlation
R= -1: perfect negative correlation
R=0: zero correlation
![Page 66: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/66.jpg)
MS Excel Output
Correlation Coefficient (-0.9912): Note that you need to change the sign because the sign of slope (b1) is negative (-6.5)
Coefficient of Determination
Standard Error of Estimate
Regression Coefficient
![Page 67: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/67.jpg)
Top Ten #6
What Distribution to Use?
![Page 68: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/68.jpg)
Use Binomial Distribution If:
Random variable (x) is number of successes in n trials
Each trial is success or failure
Independent trials
Constant probability of success (π) on each trial
Sampling with replacement (in practice, people may use binomial w/o replacement, but theory is with replacement)
![Page 69: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/69.jpg)
Success vs. Failure
The binomial experiment can result in only one of two possible outcomes:
Male vs. Female
Defective vs. Non-defective
Yes or No
Pass (8 or more right answers) vs. Fail (fewer than 8)
Buy drink (21 or over) vs. Cannot buy drink
![Page 70: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/70.jpg)
Binomial Is Discrete
Integer values 0,1,2,…n
Binomial is often skewed, but may be symmetric
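The slides do not state the binomial probability formula, so the following is only an illustrative sketch using the standard formula P(x) = C(n, x) π^x (1 − π)^(n − x). The values n = 10, π = 0.5, and π = 0.1 are hypothetical, chosen to show one symmetric and one skewed case.

```python
from math import comb

def binomial_pmf(x: int, n: int, pi: float) -> float:
    """P(exactly x successes in n independent trials with success probability pi)."""
    return comb(n, x) * pi ** x * (1 - pi) ** (n - x)

n = 10  # hypothetical number of trials

# pi = 0.5 gives a symmetric distribution; pi = 0.1 gives one skewed to the right
symmetric = [binomial_pmf(x, n, 0.5) for x in range(n + 1)]
skewed = [binomial_pmf(x, n, 0.1) for x in range(n + 1)]

print([round(p, 3) for p in symmetric])
print([round(p, 3) for p in skewed])
```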
![Page 71: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/71.jpg)
Normal Distribution
Continuous, bell-shaped, symmetric
Mean = median = mode
Measurement (dollars, inches, years)
Cumulative probability under normal curve: use Z table if you know population mean and population standard deviation
Sample mean: use Z table if you know population standard deviation and either normal population or n > 30
![Page 72: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/72.jpg)
t Distribution
Continuous, mound-shaped, symmetric
Applications similar to normal
More spread out than normal
Use t if normal population but population standard deviation not known
Degrees of freedom = df = n-1 if estimating the mean of one population
t approaches z as df increases
![Page 73: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/73.jpg)
Normal or t Distribution?
Use t table if normal population but population standard deviation (σ) is not known
If you are given the sample standard deviation (s), use t table, assuming normal population
![Page 74: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/74.jpg)
Top Ten #3
Confidence Intervals: Mean and Proportion
![Page 75: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/75.jpg)
Confidence Interval
A confidence interval is a range of values within which the population parameter is expected to occur.
![Page 76: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/76.jpg)
Factors for Confidence Interval
The factors that determine the width of a confidence interval are:
1. The sample size, n
2. The variability in the population, usually estimated by standard deviation.
3. The desired level of confidence.
![Page 77: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/77.jpg)
Confidence Interval: Mean
Use normal distribution (Z table) if:
population standard deviation (sigma) known and either (1) or (2):
(1) Normal population
(2) Sample size > 30
![Page 78: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/78.jpg)
Confidence Interval: Mean
If normal table, then
$$ \bar{x} \pm z \frac{\sigma}{\sqrt{n}} $$
![Page 79: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/79.jpg)
Normal Table
Tail = .5(1 – confidence level)
NOTE! Different statistics texts have different normal tables
This review uses the tail of the bell curve
Ex: 95% confidence: tail = .5(1-.95) = .025
Z.025 = 1.96
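The table lookup can be mimicked in code. This is a sketch that assumes Python's `statistics.NormalDist` (available from Python 3.8); the helper name `z_for_confidence` is illustrative.

```python
from statistics import NormalDist

def z_for_confidence(confidence: float) -> float:
    """z value that leaves tail = .5(1 - confidence) in the upper tail."""
    tail = 0.5 * (1 - confidence)
    return NormalDist().inv_cdf(1 - tail)

z_95 = z_for_confidence(0.95)  # tail = .025
z_98 = z_for_confidence(0.98)  # tail = .01

print(round(z_95, 2), round(z_98, 2))
```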
![Page 80: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/80.jpg)
Example
n=49, Σx=490, σ=2, 95% confidence
$$ \bar{x} \pm z \frac{\sigma}{\sqrt{n}} = \frac{490}{49} \pm 1.96 \frac{2}{\sqrt{49}} = 10 \pm .56 $$
9.44 < µ < 10.56
![Page 81: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/81.jpg)
Another Example
One of the SOM professors wants to estimate the mean number of hours worked per week by students. A sample of 49 students showed a mean of 24 hours. It is assumed that the population standard deviation is 4 hours. What is the population mean?
![Page 82: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/82.jpg)
Another Example – cont’d
95 percent confidence interval for the population mean:
$$ \bar{X} \pm 1.96 \frac{\sigma}{\sqrt{n}} = 24.00 \pm 1.96 \frac{4}{\sqrt{49}} = 24.00 \pm 1.12 $$
The confidence limits range from 22.88 to 25.12. We estimate with 95 percent confidence that the average number of hours worked per week by students lies between these two values.
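Both confidence intervals for the mean can be recomputed with one small helper. This is a sketch; the function name `mean_confidence_interval` is illustrative, and z = 1.96 is the 95% table value used in this review.

```python
import math

def mean_confidence_interval(x_bar, sigma, n, z):
    """Return (lower, upper) for x_bar +/- z * sigma / sqrt(n)."""
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# First example: n = 49, sum of x = 490, sigma = 2
first = mean_confidence_interval(x_bar=490 / 49, sigma=2, n=49, z=1.96)

# Second example: n = 49, sample mean = 24 hours, sigma = 4
second = mean_confidence_interval(x_bar=24, sigma=4, n=49, z=1.96)

print([round(v, 2) for v in first], [round(v, 2) for v in second])
```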
![Page 83: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/83.jpg)
Confidence Interval: Mean t distribution
Use if normal population but population standard deviation (σ) not known
If you are given the sample standard deviation (s), use t table, assuming normal population
If one population, n-1 degrees of freedom
![Page 84: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/84.jpg)
$\bar{x} \pm t_{n-1} \dfrac{s}{\sqrt{n}}$
Confidence Interval: Mean t distribution
![Page 85: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/85.jpg)
Confidence Interval: Proportion
Use if success or failure (ex: defective or not-defective, satisfactory or unsatisfactory)
Normal approximation to binomial ok if (n)(π) > 5 and (n)(1-π) > 5, where
n = sample size
π = population proportion
NOTE: NEVER use the t table if proportion!!
![Page 86: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/86.jpg)
Confidence Interval: Proportion
Ex: 8 defectives out of 100, so p = .08 and n = 100, 95% confidence
$p \pm z \sqrt{\dfrac{p(1-p)}{n}}$
$.08 \pm 1.96 \sqrt{\dfrac{(.08)(.92)}{100}} = .08 \pm .05$
![Page 87: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/87.jpg)
Confidence Interval: Proportion
A sample of 500 people who own their house revealed that 175 planned to sell their homes within five years. Develop a 98% confidence interval for the proportion of people who plan to sell their house within five years.
$p = \dfrac{175}{500} = .35$
$.35 \pm 2.33 \sqrt{\dfrac{(.35)(.65)}{500}} = .35 \pm .0497$
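The two proportion examples (8 of 100 at 95%, 175 of 500 at 98%) can be sketched as follows. The function name is hypothetical. As an assumption, the n·π > 5 rule of thumb is checked with the sample proportion p standing in for the unknown π. The code uses the exact normal quantile, so the 98% margin comes out slightly below the table-based .0497 (which uses z = 2.33).

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence_level=0.95):
    """Return (p, margin) for a z interval on a population proportion."""
    p = successes / n
    # Rule of thumb from the slides, applied to p as a stand-in for pi (assumption)
    if n * p <= 5 or n * (1 - p) <= 5:
        raise ValueError("normal approximation to the binomial is not appropriate")
    tail = 0.5 * (1 - confidence_level)
    z = NormalDist().inv_cdf(1 - tail)  # never a t value for proportions
    margin = z * sqrt(p * (1 - p) / n)
    return p, margin

p_defective, margin_defective = proportion_confidence_interval(8, 100, 0.95)
p_sell, margin_sell = proportion_confidence_interval(175, 500, 0.98)
print(round(p_defective, 2), round(margin_defective, 2))
print(round(p_sell, 2), round(margin_sell, 3))
```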
![Page 88: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/88.jpg)
Interpretation
If 95% confidence, then about 95% of all confidence intervals constructed this way from repeated samples will include the true population parameter
NOTE! Never use the term “probability” when estimating a parameter!! (ex: Do NOT say “Probability that population mean is between 23 and 32 is .95” because the parameter is not a random variable. In fact, the population mean is a fixed but unknown quantity.)
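A small simulation can illustrate this interpretation: the population mean stays fixed, the interval changes from sample to sample, and roughly 95% of the intervals cover the fixed mean. This is only a sketch. The population values (µ = 24, σ = 4, n = 49) are borrowed from the earlier example, while the function name, seed, and number of trials are arbitrary assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage_rate(mu, sigma, n, confidence_level, trials, seed=0):
    """Fraction of simulated z intervals that contain the fixed population mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - 0.5 * (1 - confidence_level))
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw a fresh sample; only the sample mean (and so the interval) is random
        x_bar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - margin < mu < x_bar + margin:
            hits += 1
    return hits / trials

rate = coverage_rate(mu=24, sigma=4, n=49, confidence_level=0.95, trials=4000)
print(rate)  # expected to be close to 0.95
```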
![Page 89: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/89.jpg)
Point vs Interval Estimate
Point estimate: statistic (single number)
Ex: sample mean, sample proportion
Each sample gives a different point estimate
Interval estimate: range of values
Ex: Population mean = sample mean ± error
Parameter = statistic ± error
![Page 90: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/90.jpg)
Width of Interval
Ex: sample mean = 23, error = 3
Point estimate = 23
Interval estimate = 23 ± 3, or (20, 26)
Width of interval = 26 - 20 = 6
Wide interval: point estimate unreliable
![Page 91: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/91.jpg)
Wide Confidence Interval If
(1) small sample size(n)
(2) large standard deviation
(3) high confidence level (ex: 99% confidence interval wider than 95% confidence interval)
If you want a narrow interval, you need a large sample size or small standard deviation or low confidence level.
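The three effects on width can be seen by computing the width 2·z·σ/√n for a few settings. A minimal sketch, assuming a z interval for the mean. The function name and the specific numbers are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def interval_width(sigma, n, confidence_level):
    """Width (upper limit minus lower limit) of a z interval for the mean."""
    z = NormalDist().inv_cdf(1 - 0.5 * (1 - confidence_level))
    return 2 * z * sigma / sqrt(n)

base = interval_width(sigma=4, n=49, confidence_level=0.95)
smaller_n = interval_width(sigma=4, n=25, confidence_level=0.95)       # wider
larger_sigma = interval_width(sigma=8, n=49, confidence_level=0.95)    # wider
higher_conf = interval_width(sigma=4, n=49, confidence_level=0.99)     # wider
print(round(base, 2), round(smaller_n, 2), round(larger_sigma, 2), round(higher_conf, 2))
```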
![Page 92: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/92.jpg)
Top Ten #7
P-value
![Page 93: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/93.jpg)
P-value
P-value = probability of getting a sample statistic as extreme as (or more extreme than) the sample statistic you got from your sample, given that the null hypothesis is true
![Page 94: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/94.jpg)
P-value Example: one tail test
H0: µ = 40
HA: µ > 40
Sample mean = 43
P-value = P(sample mean ≥ 43, given H0 true)
Meaning: probability of observing a sample mean at least as large as 43 when the population mean is 40
How to use it: Reject H0 if p-value < α (significance level)
![Page 95: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/95.jpg)
Two Cases
Suppose α = .05
Case 1: suppose p-value = .02, then reject H0 (unlikely H0 is true; you believe population mean > 40)
Case 2: suppose p-value = .08, then do not reject H0 (H0 may be true; the evidence is not strong enough to rule out a population mean of 40)
![Page 96: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/96.jpg)
P-value Example: two tail test
H0: µ = 70
HA: µ ≠ 70
Sample mean = 72
If two-tails, then P-value = 2 P(sample mean > 72) = 2(.04) = .08
If α = .05, p-value > α, so do not reject H0
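The slides give the tail area (.04) directly. As a sketch, the p-value can also be computed from a z test statistic. The value z = 1.75 below is a hypothetical statistic chosen only because its upper tail area is about .04. The slides do not state σ or n for this example, and the function name is an assumption.

```python
from statistics import NormalDist

def p_value(z, alternative):
    """p-value for a z test statistic under a standard normal null distribution."""
    std = NormalDist()
    if alternative == "greater":    # HA: mu > number (right tail)
        return 1 - std.cdf(z)
    if alternative == "less":       # HA: mu < number (left tail)
        return std.cdf(z)
    if alternative == "two-sided":  # HA: mu != number (double the tail area)
        return 2 * (1 - std.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

one_tail = p_value(1.75, "greater")
two_tail = p_value(1.75, "two-sided")
print(round(one_tail, 2), round(two_tail, 2))  # 0.04 0.08
```

With α = .05, the same statistic leads to rejection in the one-tail test but not in the two-tail test.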
![Page 97: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/97.jpg)
Top Ten #2
Hypothesis Testing
![Page 98: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/98.jpg)
Population mean = µ
Population proportion = π
A statement about the value of a population parameter
Never include sample statistic (such as x-bar) in hypothesis
H0: Null Hypothesis
![Page 99: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/99.jpg)
HA or H1: Alternative Hypothesis
ONE TAIL ALTERNATIVE
– Right tail: µ > number (smog check)
π > fraction (% defectives)
– Left tail: µ < number (weight in box of crackers)
π < fraction (unpopular President’s % approval low)
![Page 100: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/100.jpg)
One-Tailed Tests
A test is one-tailed when the alternate hypothesis, H1 or HA, states a direction, such as:
• H1: The mean yearly salary earned by full-time employees is more than $45,000. (µ > $45,000)
• H1: The average speed of cars traveling on the freeway is less than 75 miles per hour. (µ < 75)
• H1: Less than 20 percent of the customers pay cash for their gasoline purchases. (π < 0.2)
![Page 101: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/101.jpg)
Two-Tail Alternative
Population mean not equal to number (too hot or too cold)
Population proportion not equal to fraction (% alcohol too weak or too strong)
![Page 102: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/102.jpg)
Two-Tailed Tests
A test is two-tailed when no direction is specified in the alternate hypothesis
• H1: The mean amount of time spent on the Internet is not equal to 5 hours. (µ ≠ 5).
• H1: The mean price for a gallon of gasoline is not equal to $2.54. (µ ≠ $2.54).
![Page 103: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/103.jpg)
Reject Null Hypothesis (H0) If
Absolute value of test statistic* > critical value*
Reject H0 if |Z value| > critical Z
Reject H0 if |t value| > critical t
Reject H0 if p-value < significance level (alpha)
Note that direction of inequality is reversed!
Reject H0 if very large difference between sample statistic and population parameter in H0
* Test statistic: A value, determined from sample information, used to determine whether or not to reject the null hypothesis.
* Critical value: The dividing point between the region where the null hypothesis is rejected and the region where it is not rejected.
![Page 104: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/104.jpg)
Example: Smog Check
H0: µ = 80
HA: µ > 80
If test statistic = 2.2 and critical value = 1.96, reject H0, and conclude that the population mean is likely > 80
If test statistic = 1.6 and critical value = 1.96, do not reject H0, and reserve judgment about H0
![Page 105: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/105.jpg)
Type I vs Type II Error
Alpha = α = P(type I error) = significance level = probability that you reject a true null hypothesis
Beta = β = P(type II error) = probability that you do not reject the null hypothesis, given H0 false
Ex: H0: Defendant innocent
α = P(jury convicts innocent person)
β = P(jury acquits guilty person)
![Page 106: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/106.jpg)
Type I vs Type II Error
| | H0 true | H0 false |
|---|---|---|
| Reject H0 | Alpha = α = P(type I error) | 1 – β (Correct Decision) |
| Do not reject H0 | 1 – α (Correct Decision) | Beta = β = P(type II error) |
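To make α and β concrete, the sketch below computes β for a right-tailed z test. All numbers are hypothetical assumptions chosen for illustration (H0: µ = 80 as in the smog check, with an assumed σ = 10, n = 25, and an assumed true mean of 85). None of these appear in the slides, and the function name is also an assumption.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """beta = P(do not reject H0 | true mean is mu_true) for a right-tailed z test."""
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha)                # reject H0 when z > z_crit
    shift = (mu_true - mu0) / (sigma / sqrt(n))    # mean of z when mu_true holds
    return std.cdf(z_crit - shift)

beta = type_ii_error(mu0=80, mu_true=85, sigma=10, n=25, alpha=0.05)
print(round(beta, 3))      # P(type II error)
print(round(1 - beta, 3))  # 1 - beta, the correct-decision cell when H0 is false
```

When the true mean equals the H0 value, the same function returns 1 – α, the correct-decision cell in the H0-true column.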
![Page 107: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/107.jpg)
Example: Smog Check
H0 : µ = 80
HA: µ > 80
If p-value = 0.01 and alpha = 0.05, reject H0, and conclude that the population mean is likely > 80
If p-value = 0.07 and alpha = 0.05, do not reject H0, and reserve judgment about H0
![Page 108: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/108.jpg)
Test Statistic
When testing for the population mean from a large sample and the population standard deviation is known, the test statistic is given by:
$z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
![Page 109: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/109.jpg)
The processors of Best Mayo indicate on the label that the bottle contains 16 ounces of mayo. The standard deviation of the process is 0.5 ounces. A sample of 36 bottles from last hour’s production showed a mean weight of 16.12 ounces per bottle. At the .05 significance level, can we conclude that the mean amount per bottle is greater than 16 ounces?
Example
![Page 110: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/110.jpg)
1. State the null and the alternative hypotheses:
H0: μ = 16, H1: μ > 16
2. Select the level of significance. In this case, we selected the .05 significance level.
3. Identify the test statistic. Because we know the population standard deviation, the test statistic is z.
4. State the decision rule.
Reject H0 if z > 1.645 (= z0.05)
Example – cont’d
![Page 111: Review of Top 10 Concepts in Statistics (reordered slightly for review the interactive session)](https://reader036.vdocuments.pub/reader036/viewer/2022081519/56813f47550346895da9fc6e/html5/thumbnails/111.jpg)
5. Compute the value of the test statistic
$z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \dfrac{16.12 - 16.00}{0.5 / \sqrt{36}} = 1.44$
6. Conclusion: Do not reject the null hypothesis, because 1.44 < 1.645. We cannot conclude the mean is greater than 16 ounces.
Example – cont’d
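The six steps of the Best Mayo example can be sketched in code. The function name is hypothetical. The p-value line is an addition for comparison with the critical-value rule and is not computed in the slides.

```python
from math import sqrt
from statistics import NormalDist

def right_tailed_z_test(x_bar, mu0, sigma, n, alpha):
    """Test H0: mu = mu0 against H1: mu > mu0 when sigma is known."""
    std = NormalDist()
    z = (x_bar - mu0) / (sigma / sqrt(n))  # step 5: test statistic
    critical = std.inv_cdf(1 - alpha)      # step 4: decision rule cutoff
    p = 1 - std.cdf(z)                     # right-tail p-value
    return z, critical, p, z > critical

z, critical, p, reject = right_tailed_z_test(16.12, 16.00, 0.5, 36, 0.05)
print(round(z, 2), round(critical, 3), reject)  # 1.44 1.645 False
print(p > 0.05)  # True: the p-value rule agrees with the critical-value rule
```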