kevin cummins statistical phenomenon 02-09-16
TRANSCRIPT
1
Interesting Statistical Phenomena
San Diego State University DSW/IRSU Brown Bag
3/16
Kevin Cummins
2
Definitions
• Principle: a comprehensive and fundamental law, doctrine, or assumption
• Fallacy: a false or mistaken idea
• Paradox: a statement that is seemingly contradictory or opposed to common sense and yet is perhaps true
3
Outline
• Objective
• Simpson’s Paradox
• Will Rogers Paradox
• Lord’s Paradox
• Berkson’s Paradox
• Monty Hall Paradox
• Others
4
Objectives
• Create awareness of several statistical issues that might arise during observational research
• Sneak in an introduction to mosaic plots
• Learn how to win prizes on game shows
5
Outline
• Objective
• Simpson’s Paradox
• Will Rogers Paradox
• Lord’s Paradox
• Berkson’s Paradox
• Monty Hall Paradox
• Others
6
Which Airline Should You Fly?
                  Delayed      On Time
Alaska Airline    178 (13%)    1,338 (88%)
America West      661 (10%)    5,804 (90%)
Cells contain counts and row %
7
8
[Figure: Alaska Airlines vs. America West; plotted proportions: .11, .05, .17, .14, .08, .29]
9
Simpson’s Paradox
Occurs when the relationship between two (categorical) variables is reversed after a third variable is considered.
The relationship between two variables differs within subgroups compared to that observed for the aggregated data.
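The reversal described above can be sketched in a few lines. The airline and airport names and all counts below are hypothetical, chosen only to make the effect visible; they are not the figures from the slides.

```python
# Hypothetical counts (not from the slides): (delayed flights, total flights)
flights = {
    "Airline A": {"Airport P": (10, 100), "Airport Q": (200, 1000)},
    "Airline B": {"Airport P": (150, 1000), "Airport Q": (25, 100)},
}


def delay_rate(delayed, total):
    """Proportion of flights that were delayed."""
    return delayed / total


def aggregate_rate(by_airport):
    """Delay rate after pooling all airports for one airline."""
    delayed = sum(d for d, _ in by_airport.values())
    total = sum(t for _, t in by_airport.values())
    return delay_rate(delayed, total)


# Delay rate within each subgroup (airport)
within = {
    airline: {airport: delay_rate(*cell) for airport, cell in by_airport.items()}
    for airline, by_airport in flights.items()
}
# Delay rate for the aggregated data
overall = {airline: aggregate_rate(cells) for airline, cells in flights.items()}

for airline in flights:
    print(airline, within[airline], round(overall[airline], 3))
```

In this made-up data Airline A has the lower delay rate at each airport but the higher delay rate overall, because most of its flights use the airport where delays are more common.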
10
Simpson’s Paradox: Remedies/Responses
• Study Design
– Use experiments
– Collect appropriate covariate data
• Know the Research System
– Collect appropriate covariate data
– Analytically introduce conditionals (i.e. moderators/covariates)
– Use appropriate interpretations
11
Outline
• Objective
• Simpson’s Fallacy
• Will Rogers Paradox
• Lord’s Paradox
• Berkson’s Paradox
• Monty Hall Paradox
• Others
WRP: Health Insurance Example
       1996               1997
HMO    $98/Subscriber     $119/Subscriber
PPO    $126/Subscriber    $142/Subscriber
Cells are cost to employer (a hospital system)
PPO No Longer Free
Expected Lower Expenditures
Young et al. 1999
13
Will Rogers Paradox
“When the Okies left Oklahoma and moved to California, they raised the average intelligence level in both states.”
IC: uspsstamps.com
14
The Will Rogers Paradox (WRP) is observed when moving an element from one set to another changes the mean values of both sets in the same direction.
The effect will occur when both of these conditions are met:
1. The element being moved is below average for its current set.
2. The element being moved is above the current average of the set it is entering.
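A minimal sketch of the two conditions, using hypothetical values that are not from the slides: the moved element (5) is below the mean of the set it leaves and above the mean of the set it joins, so both means rise.

```python
from statistics import mean

# Hypothetical values (not from the slides)
set_a = [5, 6, 7, 8, 9]  # mean is 7
set_b = [1, 2, 3, 4]     # mean is 2.5
moved = 5                # below the mean of set_a, above the mean of set_b

before = (mean(set_a), mean(set_b))

# Move the element out of set_a and into set_b
new_a = [x for x in set_a if x != moved]
new_b = set_b + [moved]

after = (mean(new_a), mean(new_b))
print("before:", before)
print("after: ", after)
```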
15
WRP: Effect of Shifting One Observation
WRP: Health Insurance Example
       1996               1997
HMO    $98/Subscriber     $119/Subscriber
PPO    $126/Subscriber    $142/Subscriber
The 1997 migration moved lower utilization PPO subscribers into the HMO
Young et al. 1999
[Figure labels: Low use, High use]
17
Will Rogers: Remedies/Responses
Know Your System
In This Case:
Statistically adjust/stratify for baseline costs
18
Outline
• Objective
• Simpson’s Fallacy
• Will Rogers Paradox
• Lord’s Paradox
• Berkson’s Paradox
• Monty Hall Paradox
• Others
19
Lord’s Paradox
• Occurs in situations where change score analysis and ANCOVA yield apparently conflicting results
20
An Extreme Example
• Assessment of a supplemental educational program
• 10 schools, 5 schools opted into the program (free-choice)
• 1 student from each school assessed
• Pre and post assessments given
• No random/sampling/measurement error (simplified)
22
Two Statisticians
Statistician One
• Calculates difference scores for each group
• Change scores are the same for both groups
Statistician Two
• Adjusts for initial score
• Finds group differences
23
Two Statisticians
Statistician One: Paired t-Test
Data: group 1 vs. group 2
t = -0.002, df = 299, p-value = 0.99
Statistician Two: ANCOVA
Coefficients:
               Value    Pr(>|t|)
(Intercept)     15.0        0.00
Pre              0.5        0.00
Group           20.0        0.00
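The disagreement between the two statisticians can be reproduced with a small error-free data set. The pre-test scores below are hypothetical (10 students, 5 per group); the post scores are generated from the coefficients reported on the slide (intercept 15, slope 0.5, group effect 20). The ANCOVA fit is computed by hand from pooled within-group sums, which is an assumption about how to sketch it without a statistics package.

```python
from statistics import mean

# Hypothetical pre-test scores; group 1 opted into the program
pre = {0: [20, 25, 30, 35, 40], 1: [60, 65, 70, 75, 80]}
# Error-free post scores built from the slide's coefficients
post = {g: [15 + 0.5 * x + 20 * g for x in xs] for g, xs in pre.items()}

# Statistician One: mean change score (post - pre) in each group
change = {g: mean([y - x for x, y in zip(pre[g], post[g])]) for g in pre}


def centered(values):
    """Subtract the mean from every value."""
    m = mean(values)
    return [v - m for v in values]


# Statistician Two: ANCOVA with a common slope for the pre score.
# The slope is the pooled within-group regression of post on pre.
sxy = sum(cx * cy for g in pre
          for cx, cy in zip(centered(pre[g]), centered(post[g])))
sxx = sum(cx * cx for g in pre for cx in centered(pre[g]))
slope = sxy / sxx
# Group effect: difference in post means adjusted for the pre difference
group_effect = (mean(post[1]) - mean(post[0])) - slope * (mean(pre[1]) - mean(pre[0]))
intercept = mean(post[0]) - slope * mean(pre[0])

print("mean change by group:", change)
print("ANCOVA intercept, slope, group:", intercept, slope, group_effect)
```

Both groups show a mean change of zero, yet the adjusted group effect is 20: the two analyses answer different questions about the same data.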
25
Lord’s Paradox: Remedies/Responses
“With the data available…there is no logical or statistical procedure that can be counted on to make allowances for pre-existing conditions between groups.” Frederic Lord
• Know your system
– Match your samples
• Use the best descriptive statement(s) that match your questions
• Use and report multiple approaches (Wright 2006)
• Graph your data
26
Outline
• Objective
• Simpson’s Fallacy
• Will Rogers Paradox
• Lord’s Principle
• Berkson’s Paradox
• Monty Hall Paradox
• Others
27
Berkson’s Paradox
An association reported from a hospital case-control study can be distorted if cases and controls experience differential hospital admission rates with respect to the suspected causal factor.
28
Typical Berkson Scenario
Example from Roberts et al. 1978
Investigated the relationship between circulatory and respiratory disease.
Sampled the general population and hospital populations.
29
OR = 3.9 [95% CI: 1.4-10.9]
[Figure; axis label: Circulatory Disease]
30
[Figure; axis label: Circulatory Disease]
OR = 1.3 [95% CI: 0.9-2.3]
31
Berkson Example
Example from Lilienfeld and Stolley (1994)
• No greater admission rate for subjects with multiple conditions
• Different rates of admission for cases and controls
• Results in an apparent association between two conditions
Hospital admission rates: P(H|A)=.50, P(H|B)=.10, P(H|!B)=.70
Community
               Disease B
Disease A    Case    Control
Case          200      200
Control       800      800
Total        1000     1000
% with A       20       20
OR=1
Hospital
               Disease B
Disease A    Case    Control
Case          110      170
Control        80      560
Total         190      730
% with A       58       23
OR=4.5
Hospital cell calculations:
A case, B case: 200 X .50 = 100, plus 100 X .10 = 10, total 110
A case, B control: 200 X .50 = 100, plus 100 X .70 = 70, total 170
A control, B case: 800 X .10 = 80
A control, B control: 800 X .70 = 560
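The hospital table can be rebuilt from the community counts and the three admission rates on the slide. The admission rule coded below (subjects with A are admitted for A at rate .50, and those not admitted for A are then admitted at the rate for their B status) is an assumption inferred from the slide's arithmetic, not something the slide states outright.

```python
from fractions import Fraction

# Community counts from the slide, keyed by (Disease A status, Disease B status)
community = {
    ("A", "B"): 200, ("A", "notB"): 200,
    ("notA", "B"): 800, ("notA", "notB"): 800,
}
# Admission rates from the slide
p_admit_a = Fraction(1, 2)       # P(H|A)
p_admit_b = Fraction(1, 10)      # P(H|B)
p_admit_not_b = Fraction(7, 10)  # P(H|!B)


def hospital_count(a_status, b_status, n):
    """Expected number admitted from a community cell of size n (assumed rule)."""
    p_b = p_admit_b if b_status == "B" else p_admit_not_b
    if a_status == "A":
        admitted_for_a = n * p_admit_a
        return admitted_for_a + (n - admitted_for_a) * p_b
    return n * p_b


def odds_ratio(table):
    """Odds ratio for the association between Disease A and Disease B."""
    return Fraction(
        table[("A", "B")] * table[("notA", "notB")],
        table[("A", "notB")] * table[("notA", "B")],
    )


hospital = {cell: hospital_count(*cell, n) for cell, n in community.items()}

print("hospital counts:", {cell: int(n) for cell, n in hospital.items()})
print("community OR:", float(odds_ratio(community)))
print("hospital OR:", round(float(odds_ratio(hospital)), 1))
```

The community odds ratio is 1 (no association), while the same people filtered through hospital admission give an odds ratio of about 4.5.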
33
Berkson’s: Remedies/Responses
– There is no safe analytical mitigation
– Analysis of potential bias
• Know your system
• Sensitivity analysis
– Limit conclusions
– Utilize multiple control pools
– Consider alternative study design
34
35
Monty Hall Paradox
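A short enumeration of the standard three-door game, offered as a sketch under the usual assumptions (which the slide does not spell out): the car is equally likely to be behind any door, the contestant's first pick is independent of it, and the host always opens a door that hides a goat and was not picked.

```python
from fractions import Fraction
from itertools import product

doors = (0, 1, 2)
stay_wins = 0
switch_wins = 0

# Enumerate every equally likely (car location, first pick) combination
for car, pick in product(doors, repeat=2):
    # Host opens a door that is neither the pick nor the car
    opened = next(d for d in doors if d != pick and d != car)
    # Switching means taking the one remaining closed door
    switched = next(d for d in doors if d != pick and d != opened)
    stay_wins += (pick == car)
    switch_wins += (switched == car)

total = len(doors) ** 2
p_stay = Fraction(stay_wins, total)
p_switch = Fraction(switch_wins, total)
print("P(win | stay)   =", p_stay)
print("P(win | switch) =", p_switch)
```

Under these assumptions staying wins one time in three and switching wins two times in three.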
36
Miller et al. 1989
37
Review
Big Picture
• Use care to interpret observational studies
• Know your system
Conditional Responses
• Simpson’s
• Lord’s
• Will Rogers
Perspective Problems
• Berkson’s
• Monty Hall
Doctor Tyrano, Look for a Covariate!
Cartoon used with permission
39
Benford’s Law
Ones are the most common leading digit in most data.
Notice that if a data entry (base 10) begins with a 1, the entry may have to be as much as doubled to have a first significant digit of 2. However, if an entry begins with a 9, it has to be increased by at most about 11% to change the first significant digit into a 1.
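The slide gives the intuition only. The usual statement of Benford's Law, which is not on the slide, is that leading digit d occurs with probability log10(1 + 1/d); a minimal sketch of those probabilities:

```python
import math

# Benford's Law: P(leading digit = d) = log10(1 + 1/d), for d = 1..9
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for d, p in benford.items():
    print(d, round(p, 3))
```

A leading 1 is expected in roughly 30% of entries and a leading 9 in under 5%.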
40
Lindley’s Paradox
• Standard Sampling Theory VS. Bayesian Theory
Under some circumstances strong evidence against the null hypothesis doesn’t result in the null being rejected