

Visit: www.cdc.gov | Contact CDC at: 1-800-CDC-INFO or www.cdc.gov/info
The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention. CS248056

National Center for Chronic Disease Prevention and Health Promotion
Division of Cancer Prevention and Control

BACKGROUND

• Research and policies meant to reduce the cancer burden depend on cancer surveillance data.
• Data must meet high standards of quality and reliability.
• Completeness of incident case ascertainment is an essential component of quality and reliability.
• Incomplete data result when registries are unable to collect accurate information on all incident cases in a defined geographic area within the given timeframe.
– Some cases are missed initially but collected later (delay).
– Some cases are missed completely.
• Index of Completeness: quantifies the percentage of actual incident cancer cases reported over a specific geographic area and time period.
• Estimating the truth (observed vs. expected)
– The actual (expected) number of incident cases is unobservable.
– The expected cases must be estimated from available data.

FUTURE DIRECTIONS

• Find the best model!
– Accurate prediction
– Easy to implement at registry level
• Collaborate with partners.

CONTACT INFORMATION
A. Blythe Ryerson, PhD, MPH
[email protected]
770.488.2426

I/M Ratio (SEER)

• Assumes the ratio of age-adjusted incidence to mortality rates is constant across geographic areas for a given cancer site (19 sites), race (white and black only), and gender group
• Completeness indices weighted by race (white and black only) and gender, and combined

Fulton JP, Howe HL (1995) Evaluating the use of incidence-mortality ratios in estimating the completeness of cancer registration. In: Howe HL (ed) Cancer incidence in North America, 1988-1990. North American Association of Central Cancer Registries. Springfield, IL, pp V1 – V9

Roffers SDJ (1994) Case completeness and data quality assessments in central cancer registries and their relevance to cancer control. In: Howe HL (ed) Cancer incidence in North America, 1988-1990. North American Association of Central Cancer Registries. Springfield, IL, pp V1 – V9

I/M Ratio (NPCR)

• Can we improve estimates of expected incidence for NPCR registries by utilizing national NPCR incidence rates?

Method of Least Squares (Simple Linear Regression)

A comparison of methods for assessing completeness of case ascertainment in data from the National Program of Cancer Registries
A. Blythe Ryerson, PhD, MPH
Cancer Surveillance Branch
2014 NAACCR Annual Conference, Ottawa, Ontario, Canada, June 21-26, 2014

[Figure: I/M Ratio (SEER) Completeness, 2012 Submission, by Registry. Y-axis: % Complete (0 to 120); x-axis: Registry.]

[Figure: I/M Ratio (NPCR) Completeness, 2012 Submission, by Registry. Y-axis: % Complete (0 to 120); x-axis: Registry.]

[Figure: Method of Least Squares (Simple Linear Regression). Y-axis: Case Counts (1,150,000 to 1,450,000); x-axis: Diagnosis Year. Annotations: "Estimate expected case count for 2012 by extrapolating the fitted line"; "Observed case count for 2012".]

[Figure: SLR (Incidence) Completeness, 2012 Submission, by Registry. Y-axis: % Complete (0 to 120); x-axis: Registry.]

[Figure: Method Comparison for NPCR November 2013 Submission, All Registries. Y-axis: % Completeness (0 to 120); x-axis: Submission Year. Series: I/M (SEER), I/M (NPCR), SLR (Incidence).]

[Figure: Other Methods, Puerto Rico Completeness, Variations on I/M Method. Y-axis: % Completeness (0 to 100); x-axis: Submission Year (2006 to 2012). Series: SEER, NPCR, SEER-Hispanic Only, NPCR-Hispanic Only.]

Where Y = case counts or incidence rates, X = diagnosis year, and β1 = slope of the fitted linear line
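The extrapolation step can be illustrated with a short Python sketch. It assumes the usual simple linear regression form Y = β0 + β1·X fitted by ordinary least squares; the intercept β0, the function name `fit_line`, and all case counts are hypothetical placeholders introduced here for illustration, not values from the poster.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of Y = b0 + b1 * X; returns (b0, b1)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all X values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx               # slope of the fitted line
    b0 = mean_y - b1 * mean_x    # intercept
    return b0, b1


# Hypothetical case counts for earlier diagnosis years (not real registry data).
years = [2007, 2008, 2009, 2010, 2011]
counts = [1000, 1020, 1040, 1060, 1080]

b0, b1 = fit_line(years, counts)

# Extrapolate the fitted line to the target diagnosis year.
slr_expected_2012 = b0 + b1 * 2012

# Compare with a hypothetical observed count, as a percentage.
slr_observed_2012 = 1045
slr_completeness = 100.0 * slr_observed_2012 / slr_expected_2012
print(f"expected={slr_expected_2012:.1f}, completeness={slr_completeness:.1f}%")
```

Because the expected count comes only from the registry's own history, a registry that under-reports consistently every year would still look complete under this sketch.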


Simple Linear Regression
Pros:
• Simple method
• Easy to implement
• Logical
Cons:
• Assumes a linear relationship
• Does not take into account correlation of data
• Would not identify "chronic under-reporters"
• No adjustments for confounding

OTHER METHODS

Modifications of I/M ratio
• Stratification or restriction by other races and/or ethnicities
• Example: Restriction of I/M ratio calculations to Hispanics only in the Puerto Rico data

Multiple Linear Regression and/or Linear Transformation
• Allows adjustment for confounding
• Allows non-linear relationships (via transformation)
• Can give more weight to data from recent years
• Limitations:
– Would not identify "chronic under-reporters"
– Confounders must be identified, and residual confounding may remain
– Does not take into account correlated data
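Two of the ideas listed here, transformation and weighting toward recent years, can be sketched in Python. This is a weighted least squares fit of log case counts on diagnosis year with a single predictor, not a full multiple regression with confounders. The geometric weighting scheme, the function name `weighted_line_fit`, and the counts are all assumptions made for illustration.

```python
import math


def weighted_line_fit(xs, ys, ws):
    """Weighted least squares fit of y = intercept + slope * x."""
    total_w = sum(ws)
    mean_x = sum(w * x for w, x in zip(ws, xs)) / total_w
    mean_y = sum(w * y for w, y in zip(ws, ys)) / total_w
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(ws, xs))
    if sxx == 0:
        raise ValueError("all X values are identical")
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical counts growing 2% per year (not real registry data).
years = [2007, 2008, 2009, 2010, 2011]
counts = [1000.0 * 1.02 ** (y - 2007) for y in years]

# Assumed weighting: each year further back counts 20% less.
weights = [0.8 ** (years[-1] - y) for y in years]

# Log transform turns multiplicative growth into a straight line.
offsets = [y - years[0] for y in years]   # years since the first year
log_counts = [math.log(c) for c in counts]
intercept, slope = weighted_line_fit(offsets, log_counts, weights)

# Back-transform the extrapolated value for the target year.
wls_expected_2012 = math.exp(intercept + slope * (2012 - years[0]))
print(f"expected 2012 count: {wls_expected_2012:.1f}")
```

Adjusting for confounders would add further predictors to the model, which in practice calls for a matrix solver or a statistics package rather than the closed-form fit used here.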

Time Series/Dynamic Panel Models
• Models fit to time-series data
• Can be used to make predictions
• Often applied when data show evidence of non-stationarity (for example, trend-like behavior)
• Example: Autoregressive Integrated Moving Average (ARIMA)
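As a rough illustration of the time-series idea, the Python sketch below differences the series once and fits a first-order autoregression with a constant to the differences by least squares, roughly an ARIMA(1,1,0) model with drift. It is a hand-rolled approximation with no moving-average terms and no likelihood-based estimation, and the function name `forecast_next` and the input counts are hypothetical.

```python
def forecast_next(series):
    """One-step-ahead forecast from an AR(1) model on first differences."""
    if len(series) < 4:
        raise ValueError("need at least four observations")
    # First differences remove a linear trend (the "integrated" step).
    diffs = [b - a for a, b in zip(series, series[1:])]
    prev, curr = diffs[:-1], diffs[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    if sxx == 0:
        phi = 0.0  # constant differences: pure drift, no autoregression
    else:
        sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
        phi = sxy / sxx
    const = mean_curr - phi * mean_prev
    # Predict the next difference, then add it back to the last level.
    next_diff = const + phi * diffs[-1]
    return series[-1] + next_diff


# Hypothetical annual case counts (not real registry data).
annual_counts = [1000, 1020, 1040, 1060, 1080]
arima_expected = forecast_next(annual_counts)
print(f"forecast for the next diagnosis year: {arima_expected:.1f}")
```

The forecast would play the role of the expected count, against which the observed count for the same year is compared.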

Spatial Prediction Models
• Includes mortality rates; can be seen as an extension of the I/M ratio method
• Incorporates information on:
– Geography
– Socio-demographics
– Health
– Lifestyle factors
• Also includes spatial random effects to enable better predictions in sparse data areas

Das B, Clegg LX, Feuer EJ, Pickle LW. A new method to evaluate the completeness of case ascertainment by a cancer registry. Cancer Causes and Control, Vol. 19, No. 5 (Jun. 2008), pp. 515-525.

I/M Ratio (SEER)
Pros:
• Simple method
• Easy to implement
• What is in place currently (NAACCR)
Cons:
• Relies solely on mortality data (does not take into account screening rates or other factors known to influence cancer incidence rates)
• Uses national incidence rates from SEER only
• Makes use of only white and black race groups
• No variance estimates

I/M Ratio (NPCR)
Pros:
• Simple method
• Easy to implement
Cons:
• Relies solely on mortality data (does not take into account screening rates or other factors known to influence cancer incidence rates)
• Uses national incidence rates from NPCR only
• Makes use of only white and black race groups
• No variance estimates

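Completeness:

$$\text{Completeness} = \frac{\text{Observed}}{\text{Expected}}$$

I/M Ratio (SEER):

$$\text{Expected Incidence Rate} = \frac{\text{`National' Incidence Rate (SEER)}}{\text{National Mortality Rate}} \times \text{Registry Mortality Rate}$$

I/M Ratio (NPCR):

$$\text{Expected Incidence Rate} = \frac{\text{`National' Incidence Rate (NPCR)}}{\text{National Mortality Rate}} \times \text{Registry Mortality Rate}$$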
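The SEER and NPCR versions of the expected-rate calculation differ only in which 'national' incidence rate is used. A minimal Python sketch for a single cancer site, race, and gender stratum follows. The function names and every rate are hypothetical placeholders rather than SEER or NPCR figures, and reporting completeness as a percentage is an assumption based on the "% Complete" axes of the poster's charts.

```python
def expected_incidence_rate(national_incidence, national_mortality, registry_mortality):
    """Expected registry incidence rate under the I/M ratio assumption.

    Assumes the national incidence-to-mortality ratio also holds in the
    registry's area for this stratum.
    """
    if national_mortality <= 0:
        raise ValueError("national mortality rate must be positive")
    return (national_incidence / national_mortality) * registry_mortality


def completeness_percent(observed, expected):
    """Observed divided by expected, expressed as a percentage."""
    if expected <= 0:
        raise ValueError("expected value must be positive")
    return 100.0 * observed / expected


# Hypothetical age-adjusted rates per 100,000 for one stratum.
national_incidence = 60.0   # from SEER or NPCR, depending on the variant
national_mortality = 20.0
registry_mortality = 22.0
observed_incidence = 62.7   # the registry's reported incidence rate

im_expected = expected_incidence_rate(
    national_incidence, national_mortality, registry_mortality
)
im_completeness = completeness_percent(observed_incidence, im_expected)
print(f"expected rate={im_expected:.1f}, completeness={im_completeness:.1f}%")
```

Switching between the two variants means changing only the `national_incidence` input; combining strata into one index would need a weighting scheme that the poster does not spell out.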

METHODS

What’s the best method for estimating the truth?
