Validity and Reliability


Page 1: Validity

Validity and Reliability

Page 2: Validity

Validity

Page 3: Validity

How well a survey measures what it sets out to measure.

Validity can be determined only if there is a reference procedure or “gold standard.”

Examples of measure and gold standard: food-frequency questionnaires vs. food diaries; reported birth weight vs. hospital records.

Definition

Page 4: Validity

Validity

Three methods:

1. Content validity

2. Criterion-related validity

3. Construct validity

Page 5: Validity

Screening test

Validity – get the correct result

Sensitivity – correctly classify cases

Specificity – correctly classify non-cases

[screening and diagnosis are not identical]

Page 6: Validity

Validity: 1) Sensitivity

Probability (proportion) of correct classification of cases

Cases found / all cases

Page 7: Validity

Validity: 2) Specificity

Probability (proportion) of correct classification of noncases

Noncases identified / all noncases

Page 8: Validity

[Population dot diagram: 2 cases / month]

Page 9: Validity

[Population dot diagram: disease stages — pre-detectable, pre-clinical, clinical, old]

Page 10: Validity

[Population dot diagram: disease stages — pre-detectable, pre-clinical, clinical, old]

Page 11: Validity

[Population dot diagram: disease stages as above]

What is the prevalence of “the condition”?

Page 12: Validity

Sensitivity of a screening test

Probability (proportion) of correct classification of detectable, pre-clinical cases

Page 13: Validity

[Population dot diagram: pre-detectable (8), pre-clinical (10), clinical (6), old (14)]

Page 14: Validity

[Population dot diagram as above]

Sensitivity = correctly classified / total detectable pre-clinical cases (10)

Page 15: Validity

Specificity of a screening test

Probability (proportion) of correct classification of noncases

Noncases identified / all noncases

Page 16: Validity

[Population dot diagram: pre-detectable (8), pre-clinical (10), clinical (6), old (14)]

Page 17: Validity

[Population dot diagram as above]

Specificity = correctly classified / total non-cases (and pre-detectable) (162 or 170)

Page 18: Validity

Cells of the 2 x 2 table: true positive (a), false positive (b), false negative (c), true negative (d).

                         True Disease Status
                         Cases          Non-cases       Total
Screening    Positive    a (true +)     b (false +)     a + b
Test         Negative    c (false −)    d (true −)      c + d
Results      Total       a + c          b + d

Sensitivity = true positives / all cases = a / (a + c)

Specificity = true negatives / all non-cases = d / (b + d)

Page 19: Validity

                         True Disease Status
                         Cases          Non-cases       Total
Screening    Positive    a = 140        b = 1,000       1,140
Test         Negative    c = 60         d = 19,000      19,060
Results      Total       200            20,000

Sensitivity = true positives / all cases = 140 / 200 = 70%

Specificity = true negatives / all non-cases = 19,000 / 20,000 = 95%
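As a quick sketch of how these two figures are computed (Python is used here purely for illustration and the helper names are made up; the counts a = 140, b = 1,000, c = 60, d = 19,000 come from the slide):

```python
# Sensitivity and specificity from the 2x2 screening table.
# a = true positives, b = false positives, c = false negatives, d = true negatives.

def sensitivity(a: int, c: int) -> float:
    """True positives / all cases = a / (a + c)."""
    return a / (a + c)

def specificity(b: int, d: int) -> float:
    """True negatives / all non-cases = d / (b + d)."""
    return d / (b + d)

a, b, c, d = 140, 1_000, 60, 19_000

print(f"Sensitivity = {a}/{a + c} = {sensitivity(a, c):.0%}")   # 140/200 = 70%
print(f"Specificity = {d}/{b + d} = {specificity(b, d):.0%}")   # 19,000/20,000 = 95%
```

Running it reproduces the slide's 70% and 95%.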

Page 20: Validity

Interpreting test results: predictive value

Probability (proportion) of those tested who are correctly classified

Cases identified / all positive tests

Noncases identified / all negative tests

Page 21: Validity

Cells of the 2 x 2 table: true positive (a), false positive (b), false negative (c), true negative (d).

                         True Disease Status
                         Cases          Non-cases       Total
Screening    Positive    a (true +)     b (false +)     a + b
Test         Negative    c (false −)    d (true −)      c + d
Results      Total       a + c          b + d

PPV = true positives / all positives = a / (a + b)

NPV = true negatives / all negatives = d / (c + d)

Page 22: Validity

                         True Disease Status
                         Cases          Non-cases       Total
Screening    Positive    a = 140        b = 1,000       1,140
Test         Negative    c = 60         d = 19,000      19,060
Results      Total       200            20,000

PPV = true positives / all positives = 140 / 1,140 = 12.3%

NPV = true negatives / all negatives = 19,000 / 19,060 = 99.7%
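The same table gives the predictive values; a minimal sketch, again in Python with made-up helper names and the slide's counts:

```python
# Predictive values from the same 2x2 screening table.
# PPV = a / (a + b): proportion of positive tests that are true cases.
# NPV = d / (c + d): proportion of negative tests that are true non-cases.

def ppv(a: int, b: int) -> float:
    return a / (a + b)

def npv(c: int, d: int) -> float:
    return d / (c + d)

a, b, c, d = 140, 1_000, 60, 19_000

print(f"PPV = {a}/{a + b} = {ppv(a, b):.1%}")   # 140/1,140 = 12.3%
print(f"NPV = {d}/{c + d} = {npv(c, d):.1%}")   # 19,000/19,060 = 99.7%
```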

Page 23: Validity

Confidence interval

Point estimate ± [1.96 × SE(estimate)]

SE(sensitivity) = √(P(1 − P) / N)

SE = 0.013

95% CI for a sensitivity of 0.70: 0.70 − 1.96 × 0.013 = 0.67 to 0.70 + 1.96 × 0.013 = 0.73
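A small sketch of this interval in Python. The slide does not state the N behind its SE of 0.013, so the SE is passed in directly; the se_proportion helper (a made-up name) simply encodes the √(P(1 − P)/N) formula above.

```python
import math

def se_proportion(p: float, n: int) -> float:
    """Standard error of an estimated proportion: sqrt(P * (1 - P) / N)."""
    # For example, a hypothetical N of about 1,200 with p = 0.70 gives an SE near 0.013.
    return math.sqrt(p * (1 - p) / n)

def confidence_interval(estimate: float, se: float, z: float = 1.96):
    """95% CI: point estimate +/- 1.96 * SE(estimate)."""
    return estimate - z * se, estimate + z * se

# Using the slide's values: sensitivity 0.70 with SE = 0.013.
low, high = confidence_interval(0.70, 0.013)
print(f"95% CI: {low:.2f} to {high:.2f}")   # about 0.67 to 0.73
```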

Page 24: Validity

Receiver operating characteristic (ROC) curve

Not all tests give a simple yes/no result. Some yield results that are numerical values along a continuous scale of measurement. In these situations, high sensitivity is obtained at the cost of low specificity, and vice versa.
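A rough sketch of that trade-off in Python with NumPy. The test scores and disease labels below are invented illustration data, not from the slides; the point is only that moving the cut-off raises one of sensitivity/specificity while lowering the other, and plotting the pairs over all cut-offs traces the ROC curve.

```python
import numpy as np

# Hypothetical continuous test results (e.g., a blood marker) and true disease status.
scores = np.array([0.2, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.1, 1.3, 1.5])
disease = np.array([0,   0,   0,   1,   0,   1,   0,   1,   1,   1])  # 1 = case

for cutoff in [0.5, 0.8, 1.2]:
    test_positive = scores >= cutoff
    tp = np.sum(test_positive & (disease == 1))    # true positives
    fn = np.sum(~test_positive & (disease == 1))   # false negatives
    tn = np.sum(~test_positive & (disease == 0))   # true negatives
    fp = np.sum(test_positive & (disease == 0))    # false positives
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    print(f"cutoff {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")

# Lower cut-offs raise sensitivity but lower specificity; sensitivity plotted
# against (1 - specificity) across all cut-offs is the ROC curve.
```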

Page 25: Validity

Reliability

Page 26: Validity

Reliability

Repeatability – get the same result each time, from each instrument, from each rater.

If the correct result is not known, then only reliability can be examined.

Page 27: Validity

Definition

The degree of stability exhibited when a measurement is repeated under identical conditions

Lack of reliability may arise from divergences between observers or instruments of measurement, or from instability of the attribute being measured (from Last, A Dictionary of Epidemiology).

Page 28: Validity

Page 29: Validity

Assessment of reliability

Test-retest reliability

Equivalence

Internal consistency (e.g., in SPSS)

Reliability: Kappa

Page 30: Validity

EXAMPLE OF PERCENT AGREEMENT

Two physicians are each given a set of 100 X-rays to look at independently and asked to judge whether pneumonia is present or absent. When both sets of diagnoses are tallied, it is found that 95% of the diagnoses are the same.

Page 31: Validity

IS PERCENT AGREEMENT GOOD ENOUGH?

Do these two physicians exhibit high diagnostic reliability?

Can there be 95% agreement between two observers without really having good reliability?

Page 32: Validity

Compare the two tables below:

Table 1                            Table 2

             MD #1                              MD #1
             Yes    No                          Yes    No
MD #2  Yes     1     3             MD #2  Yes    43     3
       No      2    94                    No      2    52

In both instances, the physicians agree 95% of the time. Are the two physicians equally reliable in the two tables?
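As a quick check of that 95% figure, a sketch in Python (cell counts taken from Tables 1 and 2 above; the function name is made up):

```python
# Percent agreement = (cells where both physicians agree) / total X-rays read.
def percent_agreement(yes_yes: int, yes_no: int, no_yes: int, no_no: int) -> float:
    total = yes_yes + yes_no + no_yes + no_no
    return (yes_yes + no_no) / total

table1 = (1, 3, 2, 94)    # MD#2 x MD#1 cells: Yes/Yes, Yes/No, No/Yes, No/No
table2 = (43, 3, 2, 52)

print(f"Table 1: {percent_agreement(*table1):.0%}")   # 95%
print(f"Table 2: {percent_agreement(*table2):.0%}")   # 95%
```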


Page 33: Validity

USE OF THE KAPPA STATISTIC TO ASSESS RELIABILITY

Kappa is a widely used test of inter- or intra-observer agreement (or reliability) which corrects for chance agreement.

Page 34: Validity

KAPPA VARIES FROM +1 TO −1

+1 means that the two observers are perfectly reliable: they classify everyone exactly the same way.

0 means there is no relationship at all between the two observers’ classifications, above the agreement that would be expected by chance.

−1 means the two observers classify exactly the opposite of each other: if one observer says yes, the other always says no.

Page 35: Validity

GUIDE TO USE OF KAPPAS IN EPIDEMIOLOGY AND MEDICINE

Kappa > .80 is considered excellent

Kappa .60 - .80 is considered good

Kappa .40 - .60 is considered fair

Kappa < .40 is considered poor

Page 36: Validity

WAY TO CALCULATE KAPPA

1. Calculate observed agreement (cells in which the observers agree / total N). In both Table 1 and Table 2 it is 95%.

2. Calculate expected agreement (chance agreement) based on the marginal totals.

Page 37: Validity

Table 1’s marginal totals are:

OBSERVED            MD #1
                    Yes     No     Total
MD #2   Yes           1      3         4
        No            2     94        96
        Total         3     97       100

Page 38: Validity

• How do we calculate the N expected by chance in each cell?

• We assume that each cell should reflect the marginal distributions, i.e. the proportion of yes and no answers should be the same within the four-fold table as in the marginal totals.

OBSERVED            MD #1
                    Yes     No     Total
MD #2   Yes           1      3         4
        No            2     94        96
        Total         3     97       100

EXPECTED            MD #1
                    Yes     No     Total
MD #2   Yes           ?      ?         4
        No            ?      ?        96
        Total         3     97       100

Page 39: Validity

To do this, we find the proportion of answers in either the column (3% and 97%, yes and no respectively, for MD #1) or row (4% and 96%, yes and no respectively, for MD #2) marginal totals, and apply one of the two proportions to the other marginal total. For example, 96% of the row totals are in the “No” category. Therefore, by chance, 96% of MD #1’s “No’s” should also be “No” for MD #2. 96% of 97 is 93.12.

EXPECTED            MD #1
                    Yes     No     Total
MD #2   Yes           ?      ?         4
        No            ?  93.12        96
        Total         3     97       100

Page 40: Validity

By subtraction, all other cells fill in automatically, and each yes/no distribution reflects the marginal distribution. Any cell could have been used to make the calculation, because once one cell is specified in a 2x2 table with fixed marginal distributions, all other cells are also specified.

EXPECTED            MD #1
                    Yes     No     Total
MD #2   Yes        0.12   3.88         4
        No         2.88  93.12        96
        Total         3     97       100

Page 41: Validity

Now you can see that just by the operation of chance, 93.24 of the 100 observations should have been agreed to by the two observers. (93.12 + 0.12)

EXPECTED            MD #1
                    Yes     No     Total
MD #2   Yes        0.12   3.88         4
        No         2.88  93.12        96
        Total         3     97       100
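A sketch of that chance-expected table in Python, using Table 1's observed counts. Each expected cell is row total × column total / grand total, which is just the marginal-proportion argument above written as arithmetic:

```python
# Expected-by-chance cell counts for Table 1, computed from the marginal totals.
observed = [[1, 3],     # MD#2 Yes: MD#1 Yes, MD#1 No
            [2, 94]]    # MD#2 No : MD#1 Yes, MD#1 No

row_totals = [sum(row) for row in observed]            # [4, 96]
col_totals = [sum(col) for col in zip(*observed)]      # [3, 97]
n = sum(row_totals)                                    # 100

expected = [[r * c / n for c in col_totals] for r in row_totals]
print(expected)                                        # [[0.12, 3.88], [2.88, 93.12]]

expected_agreement = (expected[0][0] + expected[1][1]) / n
print(f"{expected_agreement:.4f}")                     # 0.9324
```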

Page 42: Validity

Below is the formula for calculating kappa from expected agreement:

Kappa = (Observed agreement − Expected agreement) / (1 − Expected agreement)

Kappa = (95% − 93.24%) / (1 − 93.24%) = 1.76% / 6.76% = 0.26
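A direct, minimal translation of that formula into Python, using Table 1's observed agreement (0.95) and expected agreement (0.9324):

```python
def kappa(observed_agreement: float, expected_agreement: float) -> float:
    """Kappa = (observed - expected) / (1 - expected), agreements as proportions."""
    return (observed_agreement - expected_agreement) / (1 - expected_agreement)

# Table 1: observed agreement 0.95, chance-expected agreement 0.9324.
print(round(kappa(0.95, 0.9324), 2))   # 0.26
```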

Page 43: Validity

How good is a Kappa of 0.26?

Kappa > .80 is considered excellent

Kappa .60 - .80 is considered good

Kappa .40 - .60 is considered fair

Kappa < .40 is considered poor

Page 44: Validity

In the second example, the observed agreement was also 95%, but the marginal totals were very different.

ACTUAL (marginal totals)   MD #1
                           Yes     No     Total
MD #2   Yes                  ?      ?        46
        No                   ?      ?        54
        Total               45     55       100

Page 45: Validity

Using the same procedure as before, we calculate the expected N in any one cell, based on the marginal totals. For example, the lower right cell is 54% of 55, which is 29.7.

EXPECTED            MD #1
                    Yes     No     Total
MD #2   Yes           ?      ?        46
        No            ?   29.7        54
        Total        45     55       100

Page 46: Validity

And, by subtraction, the other cells are as below. The cells which indicate agreement (Yes/Yes and No/No) add up to 50.4% of the total (20.7 + 29.7).

EXPECTED            MD #1
                    Yes     No     Total
MD #2   Yes        20.7   25.3        46
        No         24.3   29.7        54
        Total        45     55       100

Page 47: Validity

Enter the two agreements into the formula:

Kappa = (Observed agreement − Expected agreement) / (1 − Expected agreement)

Kappa = (95% − 50.4%) / (1 − 50.4%) = 44.6% / 49.6% = 0.90

In this example, the observers have the same percent agreement, but now they are much further from chance agreement. A kappa of 0.90 is considered excellent.
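Putting the whole Table 2 calculation together as a sketch in Python (counts from the slides; variable names are made up):

```python
# End-to-end kappa for Table 2: observed agreement, chance-expected agreement
# from the marginal totals, then kappa itself.
observed = [[43, 3],    # MD#2 Yes: MD#1 Yes, MD#1 No
            [2, 52]]    # MD#2 No : MD#1 Yes, MD#1 No

n = sum(sum(row) for row in observed)                   # 100
row_totals = [sum(row) for row in observed]             # [46, 54]
col_totals = [sum(col) for col in zip(*observed)]       # [45, 55]

observed_agreement = (observed[0][0] + observed[1][1]) / n          # 0.95
expected_agreement = (row_totals[0] * col_totals[0] / n
                      + row_totals[1] * col_totals[1] / n) / n      # 0.504

kappa = (observed_agreement - expected_agreement) / (1 - expected_agreement)
print(f"{kappa:.2f}")   # 0.90
```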