Nested case-control study
Source: hlm.tzuchi.com.tw/epi-stat/images/class/2016/2016_class7.pdf

Post on 01-Mar-2020

  • Nested case-control study

    Institute of Public Health, Tzu Chi University

    謝佳容, Assistant Professor

  • Epidemiological study designs

    (Grimes & Schulz, 2002)

    [Figure: classification of study designs; recovered labels: Survey, Case report]

  • 1. Case-control study

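  • Case-control study

    Starts from people who have the disease, selects a group without the disease as controls, and compares the exposure histories of the two groups.

    Used to explore factors that may be associated with a disease and to build preliminary epidemiological hypotheses; this requires broad comparison across many suspected causes.

    Suitable for

    rare diseases

    common exposures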
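
As a minimal sketch of the comparison described above, exposure in the two groups is commonly summarized in a 2×2 table and an odds ratio. The slide does not name this measure, and the counts and the function name `odds_ratio` are hypothetical, chosen only for illustration.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Exposure odds ratio from a 2x2 case-control table."""
    odds_in_cases = exposed_cases / unexposed_cases
    odds_in_controls = exposed_controls / unexposed_controls
    return odds_in_cases / odds_in_controls


# Hypothetical counts: 40 of 100 cases exposed, 20 of 100 controls exposed.
example_or = odds_ratio(40, 60, 20, 80)
print(round(example_or, 2))
```

An odds ratio above 1 means exposure is more common among cases than among controls; it says nothing yet about bias or confounding.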

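  • Selection of cases

    Cases in a case-control study can come from many sources, such as hospital or clinic patients. Many communities keep records of patients with specific diseases such as cancer, and these records are a valuable source of cases. If cases are drawn from a single hospital, however, the findings may not generalize to all patients with the disease, so selecting cases from several hospitals in the community is preferable.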

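  • Selection of cases

    Incident or prevalent cases

    A case-control study must decide whether to use newly diagnosed patients or patients who have had the disease for some time as cases. Prevalent cases are those diagnosed some time ago, so large numbers of subjects are easier to obtain. In etiological research, incident cases are generally preferred.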

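  • Possible problems with using prevalent cases

    Using prevalent cases raises the issue of survival. For example, if patients with the disease die soon after diagnosis, there may be too few prevalent cases. More importantly, the prevalent cases that can be found are likely to be longer-term survivors, who may be clearly unrepresentative of all cases.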

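  • Selection of controls

    The greatest challenge and difficulty in a case-control study

    Basic principle for selecting controls

    Controls should differ from cases only in the disease under study; put another way, controls should represent everyone in the study population who does not have the disease. In practice, however, we usually do not know the characteristics of the non-diseased members of the study population.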

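  • Selection of controls

    Besides cost and feasibility (people without the disease are comparatively less willing to take part), the most important point is to choose controls whose demographic characteristics and confounder distributions resemble those of the cases as closely as possible. There are four main sources:

    1. The general population of the community

    2. Other patients seen at the same medical institution as the cases

    3. Relatives of the cases

    4. Friends or neighbors of the cases

    Matching: when a case is recruited, one of that case's relatives, friends, or neighbors is chosen as the control. All of these measures aim to increase the comparability of controls and cases.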

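  • Advantages and disadvantages of case-control studies

    Advantages

    Cases are easy to obtain; well suited to rare diseases

    Quick and inexpensive

    Smaller sample (than cross-sectional or long-term follow-up studies)

    Suitable for diseases with long induction and latent periods

    Disadvantages

    Prone to recall bias

    Cases from different hospitals or physicians may be diagnosed by different criteria

    Temporal sequence is harder to establish

    Prone to selection bias

    Limited to a single outcome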

  • 2. Cohort study

    Exposure → Disease

    Best observational design to establish a causal relation

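  • Cohort study

    Starts from healthy people (that is, people free of the disease under study), classifies them by exposure into exposed and unexposed groups, follows them over a long period, and compares disease incidence between the two groups.

    Suitable for

    rare exposures

    common diseases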

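  • Cohort study

    In a cohort study, the investigator selects the cohort of interest, divides its members into exposed and unexposed groups according to their exposure to the risk factor, then follows them over time to observe and compare disease occurrence in the two groups.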

  • Cohort studies

    [Figure: Population → People without the outcome → Exposed (Diseased / Not diseased) or Not exposed (Diseased / Not diseased), followed over time]

  • Cohort study

    An epidemiological investigation that tracks people forward in time from exposure to outcome.

    Subjects initially without disease are classified by their exposure status and followed over time to determine the incidence.

    Usually the incidences in two cohorts, an “exposed” cohort and an “unexposed” cohort, are compared.

    (Grimes DA, 2002)

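  • Selecting the cohort

    There are two ways to select study members

    1. Select a healthy cohort (such as community residents or junior high school students) and classify its members into exposed and unexposed groups according to their exposure.

    The best-known example is the Framingham Heart Study in the United States.

    2. For special exposures, directly select an exposed cohort as the exposed group and an unexposed cohort as the unexposed group.

    For example, in a cohort study of asbestos and lung cancer, the exposed population is asbestos factory workers; in a cohort study of radiation and leukemia, the exposed population may be Japanese atomic bomb survivors or nuclear power plant workers.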

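  • Selecting the cohort

    Choosing a cohort takes care: its members should understand the importance of the study and be highly cooperative, and the disease should occur with reasonable frequency in the cohort. Investigators also often restrict on age, sex, and similar criteria to improve study efficiency.

    Exposure data can come from many sources, such as questionnaires about each subject's exposure or workplace and medical records. Physiological data can be obtained through physical examinations or other tests.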

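  • Selecting the cohort

    Members of the unexposed group may be drawn from the general population or from a special cohort, depending on the study aim. To improve comparability between the exposed and unexposed groups, the unexposed group should, apart from being unexposed, resemble the exposed group in demographic characteristics unrelated to the exposure, such as ethnicity and population structure (as in an experimental study, other potential influencing factors should be the same in both groups).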

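  • Advantages and disadvantages of cohort studies

    Advantages

    Can support causal inference (clear temporal sequence)

    Can assess disease risk: incidence and relative risk can be calculated

    Reduces information bias and avoids recall bias

    Can examine multiple diseases

    Disadvantages

    Time-consuming and expensive

    Prone to loss of subjects

    A cohort study may involve very long follow-up, during which members may become unreachable or refuse to continue and so drop out. The investigator must assess the reasons for dropout; if they are related to disease occurrence, bias may result.

    Large follow-up denominator

    Not suitable for rare diseases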
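
The incidence and relative risk mentioned among the advantages can be sketched as follows. The counts and the helper name `relative_risk` are hypothetical, used only to show the arithmetic.

```python
def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Cumulative incidence in each group and their ratio (the relative risk)."""
    incidence_exposed = cases_exposed / n_exposed
    incidence_unexposed = cases_unexposed / n_unexposed
    return incidence_exposed, incidence_unexposed, incidence_exposed / incidence_unexposed


# Hypothetical cohort: 30 cases among 1000 exposed, 10 cases among 1000 unexposed.
inc_exposed, inc_unexposed, rr = relative_risk(30, 1000, 10, 1000)
print(f"incidence exposed={inc_exposed:.3f}, unexposed={inc_unexposed:.3f}, RR={rr:.1f}")
```

Unlike a case-control study, the cohort design supplies the denominators (`n_exposed`, `n_unexposed`), which is what makes incidence directly computable.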

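  • Difficulties of cohort studies

    1. Loss to follow-up

    Because a cohort study involves very long follow-up, members may become unreachable or refuse to continue and so drop out. The investigator must assess the reasons for dropout; if they are related to disease occurrence, bias may result. If losses are concentrated in either the exposed or the unexposed group, estimates can be biased even when the follow-up rate is as high as 70-80%.

    Possible remedies:

    a) Compare the exposure status of those lost to follow-up with that of those retained

    b) Compare other characteristics, such as age, sex, and ethnicity, between those lost to follow-up and those retained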

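  • Difficulties of cohort studies

    2. Changes in diagnostic criteria

    a) Keep the diagnostician unaware of whether a subject is exposed or unexposed, to reduce information bias

    b) Fix the diagnostic methods and criteria in advance

    3. Large demands on personnel, time, and funding

    a) Use person-years: enroll more subjects to shorten the observation period

    b) Use a retrospective cohort design to shorten the time needed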

  • Nested case-control study

    Variants of study design

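  • Nested case-control study

    The strategy of the nested case-control design was introduced by Mantel in 1973.

    A case-control approach is employed within an established cohort.

    A cohort is selected and baseline exposure data are collected as in a cohort study, but unlike a traditional cohort study, the exposure data are not processed or analyzed immediately.

    After a period of follow-up, once enough cases have occurred, a random sample of cohort members who have not developed the disease is selected as the control group.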

  • Nested case-control study

    Cohort-based case-control study

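  • Nested case-control study

    Combines the cohort and case-control designs and has the advantages of both.

    Select a cohort → collect baseline exposure data → follow up and accumulate cases → randomly select controls from cohort members who have not developed the disease.

    Cases

    Cohort members who newly develop the disease during follow-up

    Controls

    Members who are disease-free at the time the case occurs but remain at risk of developing the disease later.

    Time-matched by design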

  • Nested case-control study

    Controls are selected for each case from the individuals at risk at the time at which the case occurs.

    To adjust for possible confounding, it is common to match on other variables as well. This is achieved by selecting controls with the same values of the confounding variables as the case.

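  • Advantages of the nested case-control study

    The nested case-control study is a design that combines the advantages of the case-control study and the cohort study:

    1. It keeps the case-control study's savings in personnel and resources, and can be used to study rare diseases.

    2. It shares the cohort study's strength in establishing the causal relation between exposure and disease, and avoids relying on recall to determine exposure status. Because exposure can be assessed from specimens collected from subjects at the start of the study, recall bias is avoided.

    3. Selecting controls by matching also improves statistical efficiency.

    4. Only a subset of disease-free members needs to be assessed as controls, so costs are lower.

    5. Cases and non-cases come from the same population, which helps avoid selection bias.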

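  • Advantages and disadvantages of the nested case-control study

    Advantages

    Exposure measurement is not influenced by disease status

    Reduced selection bias

    Saves personnel and funding compared with a cohort study

    Better temporal sequence for causal inference than a case-control study

    Disadvantages

    Follow-up is still needed while waiting for cases to occur

    When used for biomarker studies, requires substantial specimen storage space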

  • Statistical analysis

    Conditional logistic regression is used to analyze nested case-control studies.
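
As a sketch of what this model fits (the notation is an assumption introduced here, not taken from the slides): let $j = 1, \dots, J$ index the cases, let $R_j$ be the sampled risk set for case $j$ (the case plus its selected controls), and let $x_{ji}$ be the covariate vector of member $i$ of $R_j$, with $i = 0$ denoting the case. Conditional logistic regression estimates $\beta$ by maximizing

```latex
L(\beta) = \prod_{j=1}^{J}
  \frac{\exp\left(\beta^{\top} x_{j0}\right)}
       {\sum_{i \in R_j} \exp\left(\beta^{\top} x_{ji}\right)}
```

Each factor compares a case only with the controls in its own risk set, which is how the time matching built into the design carries through to the analysis.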

  • Four steps in conducting a nested case-control study

    (1) Cohort time axis definition

    (2) Case definition and selection

    (3) Risk set definition and formation for all cases

    (4) Random selection of controls from each risk set

    (Vidal Essebag, 2003)

  • (1) Cohort time axis definition

    Generally, every subject in the cohort enters at the same predefined zero-time. The common zero-time is defined according to the cohort time axis, which may be calendar time, time from a particular event or onset of a disease, or time from attainment of a certain age.

    Subjects are followed up until their particular exit time (which may be defined by occurrence of a disease or event of interest, calendar time, age, death, or loss to follow-up).

  • (2) Case definition and selection

    A case is defined by the occurrence of a particular event or outcome. The case only becomes a case at the time (with respect to the cohort time axis) of such occurrence. A person may be selected as a control (for another case) before becoming a case.

  • (3) Risk set definition and formation for all cases

    The process of definition and selection of controls is more complicated than for the cases. This process begins with the definition of a risk set for each case, from which the controls will be selected. The risk set consists of all noncases (which are therefore considered “at risk” of becoming a case) present in the cohort at the time the case becomes a case.

  • (3) Risk set definition and formation for all cases

    A subject can serve as a control for a case before developing the disease

    A subject can serve as a control for two or more cases

    (Vidal Essebag, 2003)
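
Risk set formation can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure from the cited paper: the `Subject` class, its field names, and the four-person cohort are all hypothetical, and subjects who exit at exactly the case's event time are left out for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    sid: str           # subject identifier (hypothetical)
    entry_time: float  # cohort entry on the cohort time axis
    exit_time: float   # exit: event, death, loss to follow-up, or end of study
    is_case: bool      # True if the subject exits because the outcome occurred


def risk_set(cohort, case):
    """Subjects under follow-up and still event-free when `case` becomes a case."""
    t = case.exit_time
    # Ties (subjects exiting exactly at t) are excluded here for simplicity.
    return [s for s in cohort
            if s.sid != case.sid and s.entry_time <= t < s.exit_time]


cohort = [
    Subject("A", 0.0, 2.0, True),
    Subject("B", 0.0, 5.0, True),   # becomes a case later, so is at risk at t=2
    Subject("C", 0.0, 6.0, False),
    Subject("D", 3.0, 6.0, False),  # enters the cohort after A's event
]

risk_sets = {c.sid: [s.sid for s in risk_set(cohort, c)]
             for c in cohort if c.is_case}
print(risk_sets)  # {'A': ['B', 'C'], 'B': ['C', 'D']}
```

Here B is eligible as a control for A before becoming a case, and C appears in both risk sets, matching the two points on the slide.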

  • (Bendix Carstensen, 2011)

  • (3) Risk set definition and formation for all cases

    One can add criteria to the risk set definition so as to match the risk set (and therefore the controls) to the case on other potential confounding factors (eg, calendar time, age, sex, or certain exposures).

    One must realize that matching for a confounding factor means that it will not be possible to evaluate its effect in the analysis.

    In some cases it is preferable to adjust for a confounding factor in the analysis rather than to match for it in the design.

    (Vidal Essebag, 2003)

  • (3) Risk set definition and formation for all cases

    There is a risk of “over-matching” (ie, losing the power to detect a difference in odds of exposure because of correlation of the exposure with the variable matched for), particularly when inappropriately matching for a variable that is in the causal pathway between the exposure and outcome.

    (Vidal Essebag, 2003)

  • (4) Random selection of controls from each risk set

    Once the risk set is defined, controls for each case are randomly selected from the risk set corresponding to the case. Usually, one decides on a number (eg, 4) of controls per case, and randomly selects the same number of controls for each case.

    Exclusion of future cases (ie, a noncase member of the risk set that later becomes a case) as controls also leads to biased estimates and should be avoided. (eg, open cohort)

    Exposure and covariate (ie, potential confounding variables) information for each control selected from the case’s risk set must reflect values at the time of selection. This is particularly important for exposures and covariates that vary with time.
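
The random selection step might look like the sketch below. The function name `sample_controls`, the subject IDs, and the risk sets are hypothetical; the fixed seed is there only to make the illustration reproducible.

```python
import random


def sample_controls(risk_sets, m=4, seed=0):
    """Randomly pick up to m controls, without replacement, from each case's risk set."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    return {
        case_id: rng.sample(members, min(m, len(members)))
        for case_id, members in risk_sets.items()
    }


# Hypothetical risk sets keyed by case ID. Subject "s07" later becomes a case
# but stays eligible as a control for the earlier case "s03".
risk_sets = {
    "s03": ["s01", "s02", "s04", "s05", "s06", "s07", "s08"],
    "s07": ["s01", "s04", "s05", "s06", "s08"],
}

controls = sample_controls(risk_sets, m=4)
for case_id, picked in controls.items():
    print(case_id, sorted(picked))
```

Sampling is done independently within each risk set, so the same subject may be drawn for more than one case. In a real analysis each drawn control would also carry its exposure and covariate values as of the case's event time.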

  • When to use the nested case-control design

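  • 1. Savings in cost and time

    This design has become popular because it allows for statistically efficient analysis of data from a cohort with substantial savings in cost and time.

    It is suited to situations where exposure data on the risk factor are very expensive or complex to obtain, for example when analyzing exposure data for every member of a large cohort would be costly and hard to achieve.

    (Vidal Essebag, 2005)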

  • 2. Rare outcomes

    The superior computational efficiency of the nested case-control design may be particularly useful when rare outcomes are encountered.

  • 3. When studying exposures that vary with time

    When studying exposures that vary with time, an additional level of complexity is introduced by the need to account for time-dependent exposure in both the design and analysis. This can be accomplished by including time-dependent covariates in a Cox proportional-hazards regression model. Alternatively, a nested case-control approach can be used provided that the exposure and covariate information for controls reflects values corresponding to the time of selection of their respective case.

    (Essebag V, 2005)

  • 4. Computational efficiencies

    The nested case-control approach obviates the computationally intensive calculations involved in Cox regression when time-dependent covariates are used.

    (Essebag V, 2005)

  • Case-control study vs. nested case-control study

  • Problems with case-control studies

    The underlying assumption of a case-control study is that cases and controls are random samples selected from the same source population.

    The challenge is to select a representative sample of controls. If the study base were well defined, a random sample of controls would be selected. However, if the study base is not well defined, controls are often selected from hospitals, clinics, workplaces, or neighborhoods of cases. Selection bias is introduced when the controls are not representative of the study base.

    This weakness is not an issue in cohort studies or in nested case-control studies.

  • Advantages of the nested case-control study

    A major flaw inherent to case-control studies is the difficulty of ensuring that cases and controls are representative samples of the same source population.

    The study base of a traditional case-control study can be difficult to define precisely, and information about the entire population is not available.

    (Cornelis J Biesheuvel, 2008)

  • Advantages of the nested case-control study

    In a nested case-control study the cases emerge from a well-defined source population and the controls are sampled from that same population.

    The nested case-control design differs from the traditional case-control design in that it is “nested” in a well-defined cohort, for which information on all members can be obtained.

    This design makes it easier to satisfy the assumption (of the case-control study) that cases and controls represent random samples of the same study base.

  • Cohort study vs. nested case-control study

  • Problems with cohort studies

    A cohort study requires a much larger sample size if the outcome is rare.

    (Vidal Essebag, 2003)

  • Using the nested case-control study in pharmacoepidemiology

    Understanding and Avoiding Immortal-Time Bias in Gastrointestinal Observational Research

    Laura E. Targownik and Samy Suissa. Am J Gastroenterol 2015; 110:1647–1650

  • Pharmacoepidemiology

    Pharmacoepidemiology is the field of research that uses observational research methods to detect associations between use of medications and the occurrence of outcomes of interest.

  • Pharmacoepidemiology

    Pharmacoepidemiologic analyses are used to evaluate relationships between patterns of drug use and both beneficial and adverse outcomes.

    However, the results of many of these analyses may be corrupted by immortal person-time bias, an analytic error that can result in an overestimation of the benefits of medical therapy.

    (Laura E. Targownik, 2015)

  • Example

    An observational cohort study with the aim of determining whether drug D reduces the risk of requiring surgery among persons with inflammatory bowel disease (IBD)

    (Laura E. Targownik, 2015)

  • What is immortal time?

    For the survival analysis, data are available for all persons in the study from their date of IBD diagnosis until either they undergo IBD-related surgery, or they are censored due to their being lost to follow-up or because they survived outcome-free until the end of data availability.

    The person-time that is accumulated between the date of diagnosis and the initial exposure to drug D is considered to be immortal, because in order to receive the drug during the follow-up period, IBD surgery could not have previously occurred.

    Immortal-time bias is introduced when this follow-up time between the time of diagnosis and their first receipt of drug D is either misattributed as exposure time or ignored completely.

  • Why is immortal person-time important?

    Failure to recognize immortal person-time generally will cause an illusory strengthening of the protective effect of a medication class.

  • Figure. (a) Illustration of biased incidence rate ratio due to inappropriate exclusion of immortal person-time. (b) Illustration of biased incidence rate ratio due to inappropriate assignment of immortal person-time to the exposed group.

    (a) Increases the incidence rate in the unexposed group

    (b) Lowers the incidence rate among the exposed

  • Figure. (c) Illustration of correct incidence rate ratio due to proper attribution of immortal person-time.
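
The three scenarios in the figure can be reproduced with simple arithmetic. All numbers below are hypothetical and are chosen so that drug D truly has no effect (0.1 events per person-year in both exposure states); they are not taken from Targownik and Suissa.

```python
def rate_ratio(events_exposed, py_exposed, events_unexposed, py_unexposed):
    """Incidence rate ratio: exposed rate divided by unexposed rate."""
    return (events_exposed / py_exposed) / (events_unexposed / py_unexposed)


# Hypothetical person-years (py) and surgery counts.
events_on_drug, py_on_drug = 80, 800   # after first receipt of drug D
py_immortal = 200                      # diagnosis to first receipt, ever-treated patients
events_unexposed, py_never = 120, 1000  # all events in unexposed time; py of never-treated

# (a) Immortal person-time dropped: unexposed rate is inflated.
rr_excluded = rate_ratio(events_on_drug, py_on_drug, events_unexposed, py_never)

# (b) Immortal person-time counted as exposed: exposed rate is diluted.
rr_misassigned = rate_ratio(events_on_drug, py_on_drug + py_immortal,
                            events_unexposed, py_never)

# (c) Immortal person-time correctly counted as unexposed.
rr_correct = rate_ratio(events_on_drug, py_on_drug,
                        events_unexposed, py_never + py_immortal)

print(round(rr_excluded, 2), round(rr_misassigned, 2), round(rr_correct, 2))
```

With these inputs both mishandled analyses make an ineffective drug look protective (rate ratio below 1), while the correct attribution returns a rate ratio of 1.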

  • When should we be suspicious of immortal-time bias?

    First, this bias occurs exclusively with survival analyses within cohort studies. Therefore, immortal-time biases are much less likely to occur within nested case-control studies.

    Second, the bias can arise when exposure status is determined at a point after the initial start of follow-up.

  • When should we be suspicious of immortal-time bias?

    Third, the paper should explicitly state that the exposure was interpreted as a time-dependent covariate in any survival analysis modeling, such as a Cox proportional hazards model.

    Finally, if the result is “too good to be true”, then it is probably not true and immortal-time bias should be suspected.

  • Thank you!

  • How many controls should be selected?

  • How many controls should be selected?

    The statistical efficiency of the nested case-control approach for cohort analysis depends on the number of controls per case selected.

    The SD (and hence the variance) of the parameter estimates is expected to decrease as the number of controls per case increases.

    (Essebag V, 2005)

  • How many controls should be selected?

    Cohort study: HR = 2.03

    (Essebag V, 2005)

  • How many controls should be selected?

    The use of 4 controls per case provides a relative statistical efficiency of 0.8 compared to the use of an infinite number of controls.
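
The 0.8 figure is consistent with the commonly cited approximation m / (m + 1) for m controls per case, which holds when the exposure effect is near the null. The slides do not state this formula, so treat it as an assumption; the function name `relative_efficiency` is hypothetical.

```python
def relative_efficiency(m):
    """Approximate efficiency of m controls per case relative to unlimited controls.

    Uses the m / (m + 1) approximation, which assumes an effect near the null.
    """
    return m / (m + 1)


efficiencies = {m: round(relative_efficiency(m), 2) for m in (1, 2, 4, 10)}
print(efficiencies)  # {1: 0.5, 2: 0.67, 4: 0.8, 10: 0.91}
```

Going from 1 to 4 controls per case raises the approximate efficiency from 0.5 to 0.8, while going from 4 to 10 adds only about 0.11 more.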

    However, the relative efficiency also depends on the probability of exposure among the controls and on the magnitude of the estimated relative risk.

    Gains in statistical efficiency are possible by using greater than 4 controls per case particularly when the probability of exposure among the controls is

  • How many controls should be selected?

    (D Pang, 1999)

  • Selection of controls

    1. Random selection

    2. Matching

    3. Counter-matching

  • (John B Cologne, 2004)

  • Counter-matching

    A stratified nested case-control sampling method