digital pcr for copy number analysis - qbase+ · 2016-09-13 · digital pcr for copy number...

Digital PCR for copy number analysis

Jo Vandesompele, PhD

Biogazelle CSO, UGent professor

EMBL Advanced Course Digital PCR, Heidelberg, Germany

October 22, 2015

Acknowledgements (A-Z)

Lieven Clement, Els Goetghebeur, Bart Jacobs, Peter Pipelers, Olivier Thas, Matthijs Vynck

Steve Lefever, Björn Menten, Katrien Vanderheyden, Kimberly Verniers, Nurten Yigit

Ariane De Ganck, Nele Nijs

Xavier Alba, Jen Berman, Frank Bizouarn, Viresh Pattel, Svilen Tzonev

Agenda

• introduction

• experiment design

• power analysis

• sensitivity vs. inhibition vs. availability of input

• CNV use cases

• advanced data-analysis

• droplet classification

• combining replicates & multigene normalization

• tips & tricks

Full text papers available on Biogazelle website

http://www.biogazelle.com > Knowledge center > publications

Biogazelle blog on dPCR vs. qPCR

http://www.biogazelle.com/knowledge-center/blog

Digital PCR is emerging as gold standard method for CNV

• Biogazelle is reference lab for Bio-Rad’s QX100/200 droplet digital PCR technology

• Scalable precision and relative sensitivity (needle in the haystack) (“more is better”)

• High accuracy (without calibration)

• Excels in quantification of small differences and rare events

Application domains

• in principle any nucleic acid quantification study (cost/throughput)

• focus on those areas where dPCR excels

• small differences

• CNV analysis (high copy number range, transgene stability testing, cell-free DNA (NIPT, oncogene amplification))

• gene expression (microRNA, splice variants)

• rare events

• pathogens (e.g. viral load in body fluid such as urine)

• mutant cancer cells (tissue, circulating cells or cell-free DNA)

• circulating RNA biomarker (cell-free RNA)

dMIQE guidelines for digital PCR

• Clinical Chemistry, 2013

• co-authored by Biogazelle founders

dMIQE guidelines have 3 goals

1. Design, perform, and report dPCR experiments that have greater scientific integrity

2. Facilitate replication of published experiments adhering to the guidelines

3. Provide critical information that allows reviewers and editors to assess the technical quality of manuscripts

Power analysis is a crucial aspect of experiment design

• Ensure proper setup to find a true difference with statistical significance

• Often ignored

• Limitations of dPCR power analysis in literature

• no or few details on the methods

• no incorporation of replicate variability (instead, reactions are (naively) pooled over replicates)

• not taking all variables into account (e.g. replicates, fraction of negative droplets, …)

• use of meta-analysis methods (instead of ad hoc statistical method)

Digital PCR power analysis is a function of

• true difference you want to see

• number of partitions

• fraction of negative partitions

• number of replicates

• alpha value (type I error, false positive rate, 5%)

• example: 97% power to detect a 10% difference in copy number using 3 replicate reactions of 14,000 partitions each, with 30% negative partitions

• 53% for a 5% difference
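How the listed factors interact can be illustrated with a minimal sketch. It is not the method behind the figures quoted above: it assumes replicates are simply pooled and that only binomial partitioning noise matters, which is exactly the simplification criticized on the previous slide. The function names and the delta-method variance are choices made for this illustration.

```python
import math
from statistics import NormalDist


def log_conc_variance(neg_fraction, partitions):
    """Delta-method variance of ln(lambda), with lambda = -ln(negative fraction)."""
    lam = -math.log(neg_fraction)
    return (1.0 - neg_fraction) / (partitions * neg_fraction * lam ** 2)


def approx_power(rel_diff, partitions, neg_fraction, replicates, alpha=0.05):
    """Approximate two-sided power to detect a relative concentration difference.

    Assumption: replicates are pooled and between-replicate variability is ignored,
    so this is an optimistic upper bound rather than a realistic estimate.
    """
    n = partitions * replicates
    lam1 = -math.log(neg_fraction)
    lam2 = lam1 * (1.0 + rel_diff)
    var = log_conc_variance(neg_fraction, n) + log_conc_variance(math.exp(-lam2), n)
    effect = math.log(1.0 + rel_diff) / math.sqrt(var)
    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    return nd.cdf(effect - z_crit) + nd.cdf(-effect - z_crit)


# Power grows with the number of partitions and with the size of the difference.
for partitions in (1_000, 5_000, 14_000):
    print(partitions, round(approx_power(0.01, partitions, 0.3, 1), 3))
```

Under these simplifications the 10% scenario above comes out at essentially 100% power, higher than the 97% quoted, presumably because replicate variability is left out of the sketch.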

Interactive tool to determine power in digital PCR experiments

• power for a given condition

• power ~ number of replicates

~ fraction of negative partitions

~ number of partitions

~ copy number difference

• optimal negative fraction (for max power) ~ copy number difference

• Vynck et al., in preparation

http://vandesompelelab.ugent.be/power/

Power as a function of fraction of negative partitions

• difference of 10%

• 14,000 partitions

• 3 replicates

Power as a function of number of replicates

• difference of 10%

• 14,000 partitions

• 95% negatives

Power as a function of number of partitions

• difference of 15%

• 1 replicate

• 30% negatives

What is determining the sensitivity of dPCR?

• Both qPCR and dPCR can detect 1 molecule (precision is higher for dPCR at low concentrations)

• Input amount of nucleic acids

• more cDNA to detect a low abundant transcript (e.g. long non-coding RNA)

• more circulating cell-free DNA to detect a low frequent mutation

intended sensitivity    ng of DNA needed
10.000%                 0.229
1.000%                  2.286
0.100%                  22.857
0.010%                  228.571
0.001%                  2285.714

assuming at least 5 positive droplets are needed for confident calling, a perfectly discriminating assay between wild type and mutant, 14,000 recovered droplets from 20,000 formed
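The table values can be reproduced with a short calculation. The sketch below assumes that "5 positive droplets" is treated as 5 mutant copies expected in the recovered droplets, and that one genome copy weighs 3.2 pg; the latter is not stated on this slide but is inferred from the gDNA dilution series in this deck (320 ng for 100 000 copies). The function name is illustrative.

```python
PG_PER_GENOME_COPY = 3.2  # assumption: inferred from 320 ng ~ 100,000 copies


def ng_dna_needed(sensitivity, min_positive=5, recovered=14_000, formed=20_000,
                  pg_per_copy=PG_PER_GENOME_COPY):
    """ng of DNA so that `min_positive` mutant copies are expected in recovered droplets."""
    recovery = recovered / formed                      # fraction of the reaction that is read
    mutant_copies_in_reaction = min_positive / recovery
    total_copies = mutant_copies_in_reaction / sensitivity
    return total_copies * pg_per_copy / 1000.0         # pg -> ng


for s in (0.1, 0.01, 0.001, 0.0001, 0.00001):
    print(f"{s:.3%}  {ng_dna_needed(s):.3f} ng")
```

Each tenfold gain in intended sensitivity costs a tenfold increase in input DNA, which is why availability of input material limits rare-event detection.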

Large dynamic range, high precision and accuracy

• Correlation between expected and measured concentrations on a gDNA dilution series (ranging from 100 000 copies/reaction to 5 copies/reaction) (320 ng – 16 pg DNA)

[Figure: log10(measured concentration, copies/ddPCR reaction) vs. log10(expected concentration, copies/ddPCR reaction), both axes 0 to 6; linear fit y = 0.9781x + 0.0695, R² = 0.99877]

Unpurified digested genomic DNA inhibits ddPCR if > 30 v/v%

[Figure: log10(measured concentration, copies/reaction) vs. log10(gDNA concentration, v/v%); linear fit y = 1.143x + 3.224, R² = 0.990; data points labeled with v/v% values up to 30]

cDNA inhibits ddPCR if > 25 v/v%

• Influence of cDNA input amounts (ranging from 5 to 45 v/v%) on measured concentration

[Figure: log10(measured concentration, copies/reaction) vs. log10(cDNA concentration, v/v%); linear fit y = 0.921x + 3.306, R² = 0.999; data points labeled 5, 10, 15, 20, 25 v/v%]

Case 1 – genetic characterization of cell banks

• Therapeutic protein production in biopharmaceutical industry

• Transgene copy number has influence on expression level

• Need for a cell line that is genetically stable throughout the biopharmaceutical manufacturing process

• Genetic characterization of Master Cell Bank (MCB) and Working Cell Bank (WCB)

• Traditionally by Southern blot analysis: laborious and time-consuming

• → qPCR method for transgene copy number determination

Case 1 – struggling with qPCR

• Transgene copy number analysis

• Limited accuracy at higher copy numbers

• Compensated by including more PCR replicates and calibrators

(D’haene et al., Methods, 2010)

• Pilot study: synthetic CN series (1-10 copies) measured with 16 qPCR replicates

• Resampling to investigate impact of increased number of replicates & calibrator samples

• Conclusion

• 8 qPCR replicates and 3 calibrator samples are required for CN analysis at increased copy numbers

• Still relatively large deviation from expected copy number in proof of concept study

Case 1 – proof of concept 1

• Copy numbers from duplex assay – gene 1 (performed in triplicate)

• observed normalized copy numbers tightly agree with expected integer copies

[Figure: copy number per sample]

sample:       S1 S2 S3 S4 S5 S6 S7 S8
expected CN:   0  0  1  2  3  4  5  5

Case 1 – proof of concept 2

• Copy numbers from duplex assay – gene 2 (performed in triplicate)

• deviation from expected integer copies for samples 3 and 4

[Figure: copy number per sample]

sample:       S1 S2 S3 S4 S5 S6 S7
expected CN:   1  1  4  4  3  0  1

Case 1 – getting integer copy numbers with ddPCR

• Copy numbers from duplex assay – gene 2 (XbaI restriction digest)

• Restriction digest is required to properly count linked loci (here: tandem repeats)

[Figure: copy number per sample after restriction digest]

sample:       S1 S2 S3 S4 S5 S6 S7
expected CN:   1  1  4  4  3  0  1

Case 1 - ddPCR versus qPCR

• ddPCR has higher accuracy than qPCR

• 3.1 x lower standard deviation on log2 copy numbers

• 2.3 x smaller fold changes between max and min copy number

• Fewer reactions required for ddPCR than for qPCR

• ddPCR requires no external standard or calibrator sample with known copy number

[Figure: M4 gene copy number measured by ddPCR vs. qPCR]

Case 1 – ddPCR based genetic characterization of cell banks

• Copy number

• 24 samples – WCB

• Duplex assay – gene 1

• Expected CN: 5

• Deviation from expected CN

• Average: 0.11

• Standard deviation: 0.078

[Figures: copy number per sample (01_WCB to 24_WCB); deviation from expected CN per sample (01_WCB to 24_WCB, y-axis 0 to 0.3)]

Case 1 – ddPCR based genetic characterization of cell banks

• ddPCR is very well suited for transgene copy number determination

• Genetic characterization of cell banks for therapeutic protein production

• Transgene copy number analysis in genetically modified (GM) crop research

• Transgenic animal models

• Remark: qPCR is the standard approach in biopharmaceutical industry – will take some time to adopt ddPCR

Case 2 – clinical genetics application

• Detection of chromosomal aneuploidies

• Proof of concept on post-natal samples

• Future: non-invasive prenatal testing (NIPT)

• Challenge to achieve accuracy and precision required to quantify fetal copy numbers in prenatal samples based on low level fetal cfDNA in maternal blood (median amount of 10%)
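The size of the challenge follows from simple mixture arithmetic. The sketch below assumes a fetal trisomy (3 copies), a diploid mother and a diploid reference locus; the function name is illustrative.

```python
def expected_ratio(fetal_fraction, fetal_copies=3, maternal_copies=2, reference_copies=2):
    """Expected target/reference ratio in a mixture of maternal and fetal cfDNA."""
    target = (1.0 - fetal_fraction) * maternal_copies + fetal_fraction * fetal_copies
    return target / reference_copies


ratio = expected_ratio(0.10)  # median fetal fraction of 10%
print(f"expected ratio: {ratio:.3f}, apparent copy number: {2 * ratio:.2f}")
```

At a 10% fetal fraction a trisomy shifts the measured ratio by only 5%, the same size of difference for which the power example earlier in this deck reports 53% power.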

Case 2 – assay design and validation

• Design of assays for a number of loci on chromosomes for which copy number variations are most often found

• Chromosome 21 (e.g. trisomy 21 or Down syndrome)

• Chromosome 13 (e.g. trisomy 13 or Patau syndrome)

• Chromosome 18 (e.g. trisomy 18 or Edwards syndrome)

• Chromosome X & Y (e.g. Turner syndrome)

• Empirical validation using qPCR

• Standard curve (dilution series) → efficiency QC

• Gel electrophoresis → specificity QC

• ddPCR

• Chromosome specific assays (hydrolysis probe - FAM)

• Reference assay (RPP30 – VIC)

• Gradient PCR → standard protocol is suitable

• gDNA dilution series

• CNV duplex – 3 replicates

Case 2 – copy numbers of control samples

[Figure: four panels (Control 1, Control 2, Control 3, Control 4; panel labels female, male, female, male); copy number (y-axis 0 to 2.5) per assay: A-13q, B-13q, A-18p, A-18q, B-18q, A-21q, B-21q, A-Xp, A-Xq, B-Xq, A-Yp, B-Yp]

Case 2 – copy numbers of cases

[Figure: three panels (Case 5, Case 9, Case 18; panel labels female Turner, trisomy 21 male, trisomy 18 male); copy number (y-axis 0 to 3.5) per assay: A-13q, B-13q, A-18p, A-18q, B-18q, A-21q, B-21q, C-21q, A-Xp, A-Xq, B-Xq, A-Yp, B-Yp]

Case 2 – proof of concept on post-natal samples

• ddPCR is great for copy number analysis in majority of samples

• Non-integer copy numbers may be observed in difficult samples

• Accuracy and precision need improvements to allow for NIPT

• ultrashort amplicons

• improved cell-free DNA isolation method (300-1000 alleles from 2 ml of plasma)

• multigene normalization (also for gene expression!)

Case 2 – optimization experiment design

• Standard CNV protocol – duplex normalization

• Triplicate ddPCR reactions

• 14 duplex reactions

• Each reaction contains one locus of interest (FAM) to be normalized with reference locus (VIC)

• Normalization against reference locus copy number in the same reaction


• Improved CNV protocol – multigene normalization

• Triplicate ddPCR reactions

• 7 duplex reactions

• Each reaction contains a FAM labeled assay and a HEX labeled assay (→ HEX as alternative to VIC; Zen / Iowa Black double quencher probes from IDT)

• No a priori selection of reference gene locus

• Normalization against all other autosomal chromosomes with normal diploid copy number
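The two normalization strategies can be sketched as follows. This is an illustration under stated assumptions, not the exact procedure used in the study: concentrations are Poisson-corrected copies per droplet, the reference loci are diploid, all reactions receive the same template input, and the multigene reference is taken as the geometric mean of the reference loci (the averaging used by geNorm). Function names are hypothetical.

```python
import math


def copies_per_droplet(positive, total):
    """Poisson-corrected mean copies per droplet from droplet counts."""
    return -math.log(1.0 - positive / total)


def cn_single_reference(target_cpd, reference_cpd, reference_cn=2):
    """Copy number relative to one reference locus measured in the same reaction."""
    return reference_cn * target_cpd / reference_cpd


def cn_multigene(target_cpd, reference_cpds, reference_cn=2):
    """Copy number relative to the geometric mean of several reference loci."""
    log_mean = sum(math.log(c) for c in reference_cpds) / len(reference_cpds)
    return reference_cn * target_cpd / math.exp(log_mean)


# Hypothetical droplet counts: one target locus and three other autosomal loci.
target = copies_per_droplet(3_650, 14_000)
references = [copies_per_droplet(k, 14_000) for k in (2_540, 2_610, 2_480)]
print(round(cn_single_reference(target, references[0]), 2))
print(round(cn_multigene(target, references), 2))
```

Averaging over several loci dampens the effect of a single poorly performing reference assay, which is the motivation for not selecting one reference locus a priori.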

geNorm - multigene normalization

• geNorm – cited more than 8000 times

Vandesompele et al., Genome Biology, 2002

Case 2 – multigene normalization

• Average deviation from integer copy numbers between different normalization strategies

[Figure: deviation from integer CN (y-axis 0.000 to 0.080) for Case 5, Case 6, Case 9, Case 16, Case 19, Case 20 and Controls 1 to 4; multigene normalization vs. RPP30 normalization]

           multigene normalization    RPP30 normalization
Average    0.015                      0.037
SD         0.008                      0.025


• Results show that normalization using other autosomes improves accuracy of copy numbers

• Normalization based on absolute autosomal counts reduces running cost by 50%

Advanced digital PCR data-analysis

• Vynck et al., submitted

• GLMM framework (R and Shiny web app)

• handles replicate wells

• multiple reference gene normalization

• automatic selection and application of stable reference genes
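The conclusions of this deck name a complementary log-log link for this framework. Why that link suits partition data can be shown in a few lines: under Poisson loading, the cloglog of the positive fraction equals the log of the copies per partition, so differences on the link scale are log ratios. The sketch below only demonstrates this identity; it does not fit a GLMM, and the input values are hypothetical.

```python
import math


def cloglog(p):
    """Complementary log-log transform: log(-log(1 - p))."""
    return math.log(-math.log(1.0 - p))


def positive_fraction(cpd):
    """Expected fraction of positive partitions for `cpd` mean copies per partition."""
    return 1.0 - math.exp(-cpd)


target_cpd, reference_cpd = 0.30, 0.20  # hypothetical concentrations
log_ratio = cloglog(positive_fraction(target_cpd)) - cloglog(positive_fraction(reference_cpd))
print(math.exp(log_ratio))  # recovers target_cpd / reference_cpd up to rounding
```

Because the ratio is additive on this scale, replicate effects and multiple reference genes can enter the model as ordinary linear terms.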

Results from oncogene detection in cell-free DNA from plasma

• 20 samples, 3 replicates each, ~ 14,000 droplets, negative fraction 80-90%, 95% CI

[Figure: results with 95% CI (y-axis 0 to 5; labeled levels 1.0, 2.0, 3.0) for samples W95, X1802, X2311, K578, R611, W17, X2323, X2545, Z198, S571, X2601, K585, S130, S494, X1314, X1562, X1659, X2597, S638, X1987]

Comparison of plasma cfDNA and fresh frozen tumor DNA

• in 8/10, there was a perfect agreement on oncogene amplification status

• in 2/10, there is no agreement

• fresh frozen is only marginally elevated (tumor heterogeneity)

• tumor DNA 2.068 (95% CI 2.017-2.121) → elevated

• cfDNA 2.009 (95% CI 1.933-2.089) → normal

Narrower CI with proper statistical processing of replicates

[Figure: GLMM (y-axis) vs. meta-analysis (x-axis); both axes from 0.25% to 1024% in twofold steps]

Narrower CI with GLMM statistical processing of replicates

[Figure: GLMM (y-axis) vs. meta-analysis (x-axis); axis ticks 1, 2, 4; legend: 3:4 copies oncogene:reference gene (tumor); cfDNA without evidence of oncogene amplification; cfDNA with signs of oncogene amplification (p<0.05)]

• Jacobs et al., BMC Bioinformatics, 2014

Partition misclassification has largest impact on accuracy and precision

Interactive tool to inspect sources of variance on absolute quantification

http://users.ugent.be/~bkjacobs/dPCR_VarComp/index.html

Development of a framework for objective partition classification

• Stochastic clustering approach that matches the intuition

• Using the raw data from the QX100

• Multistep approach

• cluster center location (expectation maximization)

• remove the rotation

• univariate projection on each channel

• robustly fit a normal null distribution on the negative peak

• calculate the posterior probability to be negative with respect to the channel for each droplet

• combine both channels

Find cluster centers and remove rotation

Fit the null distribution and calculate posterior probability of the negatives

[Figure: two panels, "no rain" and "rain"]

• red = fitted distribution of the negatives

• black = entire distribution

• probability negative droplet = red/black in the projected point of the droplet
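The red/black ratio can be sketched for a single channel. This is a simplified illustration, not the framework itself: the cluster-center step is replaced by a crude mid-range split (no expectation maximization, no rotation removal), the negative peak is fitted robustly with median and MAD, and the overall distribution is a Gaussian kernel density with a hypothetical bandwidth. It assumes both negative and positive droplets are present.

```python
import math
import random
import statistics


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_negative_peak(amplitudes):
    """Robust normal fit of the lower amplitude group (stand-in for the EM step)."""
    midpoint = (min(amplitudes) + max(amplitudes)) / 2.0
    lower = [a for a in amplitudes if a < midpoint]
    mu = statistics.median(lower)
    mad = statistics.median([abs(a - mu) for a in lower])
    sigma = 1.4826 * mad                   # MAD-to-SD factor for a normal distribution
    weight = len(lower) / len(amplitudes)  # estimated share of negative droplets
    return mu, sigma, weight


def posterior_negative(x, amplitudes, mu, sigma, weight, bandwidth):
    """P(negative | amplitude x) = weighted null density / overall density (red / black)."""
    null = weight * normal_pdf(x, mu, sigma)
    overall = sum(normal_pdf(x, a, bandwidth) for a in amplitudes) / len(amplitudes)
    if overall == 0.0:                     # far away from every observed droplet
        return 0.0
    return min(1.0, null / overall)


# Simulated amplitudes: 300 negative and 100 positive droplets (hypothetical values).
random.seed(1)
droplets = ([random.gauss(1000, 50) for _ in range(300)]
            + [random.gauss(5000, 200) for _ in range(100)])
mu, sigma, weight = fit_negative_peak(droplets)
bandwidth = 0.5 * sigma                    # hypothetical kernel bandwidth choice
for x in (1000, 1200, 5000):
    print(x, round(posterior_negative(x, droplets, mu, sigma, weight, bandwidth), 3))
```

Droplets in the rain between the two peaks receive intermediate probabilities instead of being forced to one side of a hard threshold.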

Combine channels and label clusters based on max probability

Gene copy number quantification on digested high quality DNA

Inhibition due to cDNA carryover

Oncogene amplification in cfDNA

Single channel data for low concentration target

Work in progress

• better handling of outlier droplets with lower than negative amplitude (deviating droplet volumes?)

• use combined estimated distribution of no template reactions instead of theoretical normal distribution

General conclusions (1)

• ddPCR is a great tool for copy number analysis

• no need for reference sample with known copy number

• better accuracy and precision compared to qPCR

• Points of attention

• restriction digest is required to quantify linked loci (e.g. tandem repeats)

• Remaining challenges

• non-integer copy numbers for difficult samples

• further improve accuracy and precision to meet NIPT requirements (for instance smaller amplicon size)

General conclusions (2)

• Power analysis is important (and easy)

• interactive tool

• Mathematical framework for combining replicates, selecting reference genes, and multigene normalization

• latent variable, complementary log-log link, GLMM

• Vynck et al., submitted

• Statistical framework for automated (objective) droplet classification

• Jacobs et al., work in progress

Tips & tricks

Template input (1)

• ~1 copy per droplet (CPD) (highest precision is at 1.59)

• range of 1-100 000 copies / 20 µl ddPCR reaction

• 0.00005 - 5 CPD

[Figure: 95% confidence interval fraction (y-axis, -0.15 to 0.15) vs. fraction positive droplets (0.1 to 0.9) with the corresponding copies per droplet (0.11, 0.22, 0.36, 0.51, 0.69, 0.92, 1.20, 1.61, 2.30); series: 1 well (20,000 droplets) and 3 wells merged]
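The pairing of positive fractions with copies per droplet, and the precision optimum near 1.59 CPD, both follow from Poisson statistics. The sketch below assumes binomial counting noise only and a delta-method confidence interval; the function names are illustrative.

```python
import math


def copies_per_droplet(fraction_positive):
    """Poisson correction: mean copies per droplet from the positive fraction."""
    return -math.log(1.0 - fraction_positive)


def relative_ci_halfwidth(cpd, droplets, z=1.96):
    """Approximate 95% CI half-width of the concentration, relative to the estimate."""
    neg = math.exp(-cpd)
    return z * math.sqrt((1.0 - neg) / (droplets * neg)) / cpd


for f in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9):
    cpd = copies_per_droplet(f)
    print(f"{f:.1f}  {cpd:.2f}  ±{relative_ci_halfwidth(cpd, 20_000):.3%}")

# Search a fine grid for the CPD with the narrowest relative interval.
grid = [i / 1000 for i in range(100, 5001)]
best = min(grid, key=lambda c: relative_ci_halfwidth(c, 20_000))
print(best)  # close to the 1.59 quoted above
```

The optimum is shallow, so loading anywhere around 1 to 2 copies per droplet costs little precision, while very low or very high occupancy widens the interval quickly.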

Template input (2)

• maximum 25 v/v% unpurified digested gDNA or undiluted cDNA to prevent inhibition (test using your own reagents)

• DNA digest is required for gene copy number analysis, especially for linked loci (not required for FFPE and cell-free DNA)

• integrity of DNA/RNA is as important for dPCR as for qPCR

• Vermeulen et al., Nucleic Acids Research, 2011

ddPCR assay design guidelines

• in house primerXL design pipeline

• primer3 based

• avoid SNPs (Lefever et al., Clinical Chemistry, 2013)

• avoid secondary structures (UNAFold)

• assess specificity (BiSearch / Bowtie)

• target: FAM-IBFQ, reference HEX-IBFQ

• amplicon length <70 nt if possible

• primer Tm: 61-63 °C

• probe Tm: 64-68 °C (65 opt)

• probe length: 14-25 nt (18 opt)

• HaeIII-compatible amplicons
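The numeric guidelines above can be turned into a simple check for candidate assays. This is a minimal sketch and not part of the primerXL pipeline: the `Assay` class and `guideline_violations` function are hypothetical, and melting temperatures are assumed to have been calculated elsewhere.

```python
from dataclasses import dataclass


@dataclass
class Assay:
    amplicon_length: int  # nt
    primer_tms: tuple     # °C, forward and reverse primer
    probe_tm: float       # °C
    probe_length: int     # nt


def guideline_violations(assay):
    """Return the guidelines from the list above that the assay does not meet."""
    issues = []
    if assay.amplicon_length >= 70:
        issues.append("amplicon length should be <70 nt if possible")
    if not all(61 <= tm <= 63 for tm in assay.primer_tms):
        issues.append("primer Tm outside 61-63 °C")
    if not 64 <= assay.probe_tm <= 68:
        issues.append("probe Tm outside 64-68 °C")
    if not 14 <= assay.probe_length <= 25:
        issues.append("probe length outside 14-25 nt")
    return issues


print(guideline_violations(Assay(65, (61.5, 62.8), 65.0, 18)))
print(guideline_violations(Assay(110, (58.0, 62.0), 70.0, 30)))
```

SNP avoidance, secondary structure, specificity and HaeIII compatibility need sequence-level tools and are outside the scope of this check.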

Separation of + and - droplets depends on amplicon & probe length

• for amplicons >100 bp, positive intensities drop

• rise in negatives as probe length increases (> 25 nt)

Gradient PCR allows selection of optimal annealing temperature

• gradient from 55-65 °C

• optimal Ta, specificity check

Duplex test validation

• same quantification result as in singleplex

• orthogonal droplet clusters in 2D plot

• orthogonality of duplex assay can be improved by

• Tm matching between target and reference assay

• Droplet PCR Supermix (#186-3023) (adding more resources)
