
How to optimize research in intensive care

How do I interpret statistical results?

Jean-François TIMSIT, MD, PhD
Medical ICU
Outcome of cancers and critical illnesses
University hospital A Michallon
INSERM U 823, Grenoble, FRANCE

Biases common to all studies

Selection bias: the sample differs too much from the target population, or the way patients are selected for inclusion cannot be expected to yield a clinically representative population.

Information bias: risk factors and end-points are not recorded correctly (no blood culture = no septicemia...).

Confounding bias: a variable (event) that contributes both to the end-point and to the risk factors.

Look closely at your (the) data +++

• 90% of the energy needed to draw conclusions…
– Distribution of the variables
– Outliers
– Reproducibility
– Missing values
– Correlation between variables, data reduction

Stroke 1999;30:1402-1408

- 38.7 to 77.7

Data structure

N Engl J Med 2002;549:556

Analyze the data structure

Lancet 2001;357:9-14

External validity

Demonstrate that the patients you enrolled are the ones of interest

• Mortality of the control group

[Figure: control-group mortality (%), axis 0 to 70, by trial (EGDT, CS2000, Corticus, CATS, Kybersept, Anti E5, TFPI, Lenercept, Prowess), shown for septic shock and severe sepsis]

Prowess

• 1690 pts / 11 countries / 164 sites!!!!
• Only a very small % of the severe sepsis patients admitted
• The overall treatment is not standardized…
• External validity..?
– More pragmatic studies enrolling all the patients with severe sepsis….
– But… there was a learning curve!!

« CONCLUSIONS: A learning curve appeared to be present within the PROWESS trial … efficacy improved with increasing site experience... Investigational sites may need to require a minimum level of protocol-specific experience to appropriately implement a given trial. …This experience should be an important consideration in designing trials and analysis plans. … »

Macias et al – Crit Care Med 2004;32:2385

The control group… « is an exaggerated real life »

Finney – JAMA 2003

Control group in the Van den Berghe study (2006)

Why should we be wary of single-center trials?

Bellomo et al – Crit Care Med 2009; 37:3114-19
• Many have been contradicted by multicenter studies
– Prone position (Drakulovic 1999) vs van Nieuwenhoven 2006
– Van den Berghe vs NICE-SUGAR…
– EGDT

• Size of the effect
– EGDT: deaths 46.9% vs 30.5% (RRR = 35%!!!)
• External validity
– Particular population
– « Home-made » end-point
– Overall mode of patient management
• Stable (variability) but poorly described or non-standard?
• Requires a particular workload (dedication to the study)
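The 35% relative risk reduction quoted for EGDT follows directly from the two mortality figures on the slide. A minimal sketch of that arithmetic, where the function name `effect_sizes` is a hypothetical helper and not something from the original talk:

```python
def effect_sizes(control_risk, treated_risk):
    """Return absolute risk reduction, relative risk reduction and NNT."""
    arr = control_risk - treated_risk   # absolute risk reduction
    rrr = arr / control_risk            # relative risk reduction
    nnt = 1 / arr                       # number needed to treat
    return arr, rrr, nnt


# Mortality figures quoted on the slide: 46.9% (control) vs 30.5% (EGDT)
arr, rrr, nnt = effect_sizes(0.469, 0.305)
print(f"ARR = {arr:.1%}, RRR = {rrr:.0%}, NNT = {nnt:.1f}")
```

An effect of this size, seen in a single centre, is exactly the kind of result the slide suggests treating with caution until a multicenter trial confirms it.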

Registries for robust evidence +++
Dreyer et al – JAMA 2009; 302:790-1
• They allow the results of RCTs to be validated in « real life »
• They generate hypotheses for further studies

The end-point

Precise, reproducible, a true reflection of what you want to measure ++

..be careful with this choice +++

« Surrogate end-points »
• Closely linked to the clinical end-point?

Surrogate <-> clinical end-point
• Good calibration of the surrogate end-point, and more sensitive to change
– Caution!!!

Bucher HG – JAMA 1999; 282:771

Surrogate end-points…example of failure

• Blood pressure DC

• LNMA BP

Lopez A et al – Crit Care Med 2004;32:21-30

Estimated rate of nosocomial pneumonia?

• The real rate of NP is 20%

• The rate of misclassification varies according to the accuracy of the diagnosis

                     True VAP   True non-VAP   Total
Diagnosed VAP           a            c           x
Diagnosed non-VAP       b            d           y
Total                  a+b          c+d        Total

Se = a/(a+b)    Sp = d/(c+d)    a+c = x    b+d = y

True rate vs estimated rate of an event

Perfect test: Se = P(T+ | D+) = 1, Sp = P(T- | D-) = 1

         No VAP   VAP
T-         80      0
T+          0     20
Total      80     20

Rate of VAP: 20%

Imperfect test: Se = P(T+ | D+) = 90%, Sp = P(T- | D-) = 90%

         No VAP   VAP
T-         72      2      (72 = 0.9 x 80)
T+          8     18      (18 = 0.9 x 20)
Total      80     20

Rate of VAP: 26% (8 + 18 positive tests out of 100)
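The shift from a true rate of 20% to an estimated rate of 26% can be reproduced from sensitivity and specificity alone. A minimal sketch, in which `apparent_rate` is a hypothetical helper name introduced for illustration:

```python
def apparent_rate(true_rate, sensitivity, specificity):
    """Expected proportion of positive tests for a given true event rate."""
    true_positives = true_rate * sensitivity
    false_positives = (1 - true_rate) * (1 - specificity)
    return true_positives + false_positives


# True VAP rate of 20%, as in the slide
perfect = apparent_rate(0.20, 1.0, 1.0)
imperfect = apparent_rate(0.20, 0.9, 0.9)
print(f"perfect test: {perfect:.0%}, Se = Sp = 90%: {imperfect:.0%}")
```

With a true rate below 50%, the false positives drawn from the large non-VAP group outnumber the missed true cases, so the estimated rate is inflated.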

Estimated effect of a new treatment

            Placebo   Treatment
No CRI        950        975
CRI            50         25
Total        1000       1000

Se = P(T+ | D+) = 100%    Sp = P(T- | D-) = 100%

True rate of CRI: 5%    RR = 2

« True » OR (placebo vs treatment) = 2.05, p = 0.000045

What happens if the diagnostic test is not perfectly accurate?

            Placebo   Treatment
No CRI         ?          ?
CRI            ?          ?
Total        1000       1000

Se = P(T+ | D+) = 100%    Sp = P(T- | D-) = 90%

Observed CRI (placebo) = True CRI * Se + True no CRI * (1 - Sp) = 50 * 1 + 950 * 0.1 = 145!!!!

« True » OR (placebo vs treatment) = 2.05, p = 0.000045


            Placebo   Treatment
No CRI        855        877
CRI           145        123
Total        1000       1000

Se = P(T+ | D+) = 100%    Sp = P(T- | D-) = 90%

« True » OR (treatment vs placebo) = 0.49, p = 0.000045

Estimated OR (treatment vs placebo) = 0.82, p = 0.051


            Placebo   Treatment
No CRI        965        982
CRI            35         18
Total        1000       1000

Se = P(T+ | D+) = 70%    Sp = P(T- | D-) = 100%

Observed CRI (placebo) = True CRI * Se + True no CRI * (1 - Sp) = 50 * 0.7 + 950 * 0 = 35

« True » OR (placebo vs treatment) = 2.05, p = 0.000045

Estimated OR (placebo vs treatment) = 1.98, p = 0.0006

Measurement errors

• If the prevalence of the event is low, you need a very specific test to avoid measurement error of the treatment effect

• If the prevalence is high, you need a very sensitive one….
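The CRI example can be replayed numerically to see why specificity matters most when the event is rare. The sketch below uses the slide's true counts (50 vs 25 events per 1000 patients); the helper names `observed_events` and `odds_ratio` are assumptions made for illustration, and expected counts are kept fractional, so the ratios differ slightly from those computed on rounded tables.

```python
def observed_events(n, true_events, sensitivity, specificity):
    """Expected number of patients classified as having the event."""
    false_positives = (n - true_events) * (1 - specificity)
    return true_events * sensitivity + false_positives


def odds_ratio(events_a, events_b, n=1000):
    """Odds of the event in group A divided by the odds in group B."""
    return (events_a / (n - events_a)) / (events_b / (n - events_b))


# (sensitivity, specificity) for the three situations shown in the slides
scenarios = {
    "perfect test": (1.0, 1.0),
    "Se 100%, Sp 90%": (1.0, 0.9),
    "Se 70%, Sp 100%": (0.7, 1.0),
}
results = {}
for label, (se, sp) in scenarios.items():
    placebo = observed_events(1000, 50, se, sp)
    treated = observed_events(1000, 25, se, sp)
    results[label] = odds_ratio(placebo, treated)
    print(f"{label}: placebo {placebo:.1f}, treatment {treated:.1f}, "
          f"OR (placebo vs treatment) {results[label]:.2f}")
```

Losing 10 points of specificity pulls the odds ratio most of the way towards 1, whereas losing 30 points of sensitivity barely changes it: with a 5% event rate, false positives come from 95% of the patients.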

What is the optimal clinical end-point?

[Figure: acute disease and underlying illnesses along a time axis (Day 14, Day 28, Day 90, 1 year)]

Which is best???

• Day 14: more closely related to the disease itself… low noise (deaths due to other causes)
• Day 28: a compromise?
• Day 90: competing events? Probably more important from the patient's point of view
• 1 year: competing events; more important for the patient and from the societal point of view
• All of these end-points? YES!! BUT

Multiple comparisons (NNT, power)

« Survival analyses? »

[Figure: type I error (%) and 1 - power (%) as a function of the number of tests]
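How quickly the type I error grows with the number of tests can be sketched with the usual family-wise error formula. This assumes independent tests, each run at the same nominal level, which is a simplification; the function names are hypothetical.

```python
def familywise_error(alpha, n_tests):
    """Probability of at least one false positive among independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(alpha, n_tests):
    """Per-test threshold keeping the family-wise error at or below alpha."""
    return alpha / n_tests


for k in (1, 4, 10, 20):
    print(f"{k:2d} tests: family-wise error {familywise_error(0.05, k):.1%}, "
          f"Bonferroni threshold {bonferroni_alpha(0.05, k):.4f}")
```

Testing four end-points (Day 14, Day 28, Day 90, 1 year) at 5% each already gives close to a one-in-five chance of a spurious finding, and tightening the per-test threshold to compensate is what costs power.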

Genetic profiles

• > 1000 signals for bacteria
• > 100 000 signals for humans

Decrease of power and increase in the type I error

[Figure: two data layouts. Many patients (Pat 1, Pat 2 … Pat 12 …) with few signals (Signal 1, Signal 2 …), versus few patients (Pat 1 to Pat 5) with many signals (Signal 1, Signal 2 … Signal 7 …)]

Worldwide consortium, external validation

Time pitfalls

• Time to measurement of exposure

• Competing events

NIV failure was not measured at the beginning of follow-up (time-dependent event)

JAMA 2000

[Figure, two panels: cumulative proportion of patients without pneumonia (0.2 to 1.0) against days (0 to 24), for NIV success, NIV failure and invasive ventilation]

Competing risk = informative censoring

Time to death (survival analysis)

All models for censored data assume that censoring is non-informative

« an individual i who is censored at time t is exposed to the same risk of death at time t+1 as another patient still at risk »

This strong assumption is frequently false, especially in the ICU, where time to discharge alive and time to death are completely linked

ICU discharge is a competing risk

Mortality at a fixed date rather than ICU mortality ++++
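The effect of treating ICU discharge as ordinary censoring can be seen on a toy cohort. The sketch below compares the naive estimate (1 minus Kaplan-Meier, discharge counted as censoring) with the standard non-parametric cumulative incidence that treats discharge as a competing event. The ten patients are hypothetical data invented for illustration, and `death_curves` is an assumed helper name, not the author's method.

```python
from collections import Counter


def death_curves(records):
    """Return (naive 1 - KM, cumulative incidence) of ICU death at the end.

    records: (day, status) pairs; status 1 = death, 2 = discharged alive.
    """
    at_risk = len(records)
    deaths = Counter(day for day, status in records if status == 1)
    discharges = Counter(day for day, status in records if status == 2)
    km_survival = 1.0   # Kaplan-Meier, discharge treated as censoring
    still_in_icu = 1.0  # probability of being alive and still in the ICU
    incidence = 0.0     # cumulative incidence of death
    for day in sorted(set(deaths) | set(discharges)):
        d, c = deaths[day], discharges[day]
        km_survival *= 1 - d / at_risk
        incidence += still_in_icu * d / at_risk
        still_in_icu *= 1 - (d + c) / at_risk
        at_risk -= d + c
    return 1 - km_survival, incidence


# Hypothetical cohort: 3 ICU deaths, 7 patients discharged alive
cohort = [(2, 1), (3, 2), (4, 2), (5, 1), (6, 2),
          (7, 2), (8, 2), (9, 1), (10, 2), (11, 2)]
naive, incidence = death_curves(cohort)
print(f"naive 1 - KM: {naive:.1%}, cumulative incidence: {incidence:.1%}")
```

In this cohort 3 of 10 patients die in the ICU, which the cumulative incidence recovers, while the naive estimate is much higher because it assumes discharged patients would have gone on dying at the same rate as those still in the unit.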

Randomization…what for?

A well-conducted multivariate analysis can adjust for known confounders

Random allocation is the only way to balance the groups on confounding factors… known AND UNKNOWN +++

                    Treatment A   Treatment B
Deaths                  5%            40%
SAPS II                 32            40
Genetic factor X        90%           10%

RCT: the dogma

• Basic principles
1. Have you met your targets for the statistical power of your study?
2. Have you analyzed all the patients included?
3. Have you restricted the analysis to the primary end-point alone?

In a randomized controlled trial, if all these objectives are met, a single statistical test is enough and no comparison between the populations is needed

But…
• In practice not really applicable
– Interim analyses should lead to earlier stopping and more ethical studies (LnMMA, HCG)
– It may be more appropriate to analyze data from patients who were actually treated, or in whom the disease hypothesized at inclusion was confirmed
Ex:
• The definition of severe sepsis requires an infection, proven or suspected…
• Gram-negative septicemia needs to be treated immediately, before the results of the blood culture (BC) are known
– At least 2 end-points: efficacy and side effects…
• But inflation of type I and II errors (acceptable if designed a priori)

In practice

• Exclusion is possible, at random with respect to allocation, if the exclusion criteria were collected before randomization (even if the results were not yet available) and if it was planned in the original protocol

• Exclusion criteria should not depend on the attending physician's expertise

• One primary end-point and pre-specified secondary end-points

• As the final groups are not fully decided at random, group comparability must be checked.

A CONFOUNDER…

• A confounder is associated with the risk factor and causally related to the outcome

[Diagram: Carrying matches, Lung cancer, Smoking (the confounder, linked to both)]
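The matches and lung cancer example can be made concrete with a stratified 2x2 analysis. All counts below are hypothetical, chosen so that carrying matches has no effect within smokers or within non-smokers; `odds_ratio_2x2` is an assumed helper name.

```python
def odds_ratio_2x2(exposed_cases, exposed_controls,
                   unexposed_cases, unexposed_controls):
    """Odds ratio of a 2x2 table (exposure = carrying matches)."""
    return (exposed_cases * unexposed_controls) / (exposed_controls * unexposed_cases)


# Hypothetical counts: (lung cancer, no lung cancer)
strata = {
    "smokers":     {"carriers": (80, 720), "non_carriers": (20, 180)},
    "non_smokers": {"carriers": (1, 99),   "non_carriers": (9, 891)},
}

# Odds ratio within each stratum of the confounder
stratum_or = {
    name: odds_ratio_2x2(*table["carriers"], *table["non_carriers"])
    for name, table in strata.items()
}

# Crude odds ratio: pool both strata, ignoring smoking status
pooled = [sum(table[group][i] for table in strata.values())
          for group in ("carriers", "non_carriers") for i in (0, 1)]
crude_or = odds_ratio_2x2(*pooled)

print(f"crude OR: {crude_or:.2f}")
for name, value in stratum_or.items():
    print(f"OR among {name}: {value:.2f}")
```

Ignoring smoking, match carriers appear to have a much higher risk of lung cancer; within each smoking stratum the association disappears, which is what adjustment for a known confounder is meant to reveal.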

In ICU

• Many intercurrent events

• Many interactions between events

• DNR orders++

Crit Care Med 2008

3611 patients included
1415 (39.2%) experienced one or more AEs
821 (22.7%) had two or more AEs
Mean number of AEs per patient was 2.8 (range, 1–26).

Six AEs were associated with death:
• primary or catheter-related BSI: OR 2.9; 95% CI, 1.6–5.32
• BSI from other sources: OR 5.7; 95% CI, 2.66–12.05
• nonbacteremic pneumonia: OR 1.7; 95% CI, 1.17–2.44
• deep and organ/space SSI without BSI: OR 3.0; 95% CI, 1.3–6.8
• pneumothorax: OR 3.1; 95% CI, 1.5–6.3
• gastrointestinal bleeding: OR 2.6; 95% CI, 1.4–4.9

Adjustment using a magic « multivariate model »

[Diagram, repeated in four panels: variables x, y and z]
– True universe in your sample
– Model using interactions and polynomials…
– Validation using external samples
– Other representative sample of the true universe

Messages

• As many possible models as individuals (even more!!)

• Parsimony decreases model discrimination but improves external validity

The statistical analyses should be precisely designed a priori

Primary and secondary analyses should be precisely planned

Rules for multivariate models

• Select the model according to the end-point
• Check its assumptions
• The explanatory variables should be
– Precisely defined
– Not related to one another
– Sufficiently frequent in both groups (problem with perfect or quasi-perfect discrimination)
• Ex: Multiple logistic regression in CCM (2006-2007) (Poster 0524 – P Lambrecht and D Benoit – Ghent, Belgium)
– Median of 6 shortcomings per multiple logistic regression
– (significantly decreased when a statistician is a co-author)

How do I interpret the results?

Discussion with a statistician if you are not familiar with statistics

What is the title of the paper you want to write?

Subgroup analyses lead to an important increase in the type I error and also to a decrease in the power of your study
- exploratory analyses that should be confirmed

Interpret the results with a certain detachment…
