
Real-time PCR Expression Profiling

Mikael Kubista

Mikel.kubista@tataa.com

Expression profiling

Genes/samples that behave similarly are identified by their expression patterns

Input data:

1) Expression of genes (2 or more)

2) In many samples (2 or more)

3) As a function of time, drug load, genetic makeup, etc.

Multivariate data

Multiway data

B cell lymphoma

The immunoglobulin light chain constant region has two versions, κ and λ, that are expressed in 60% and 40% of B cells, respectively, in healthy individuals.

In non-Hodgkin lymphoma the 60:40 expression ratio is altered due to clonality.
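The ratio argument can be illustrated with a minimal sketch that computes the κ fraction from κ and λ cDNA copy numbers and flags samples far from the expected 60:40 split. The function names and the tolerance of 0.20 are hypothetical values chosen for illustration only; they are not the classification criteria used in the study cited on the scatter plot slide.

```python
def kappa_fraction(kappa_copies, lambda_copies):
    """Fraction of light-chain cDNA that is kappa (0.6 expected in healthy samples)."""
    total = kappa_copies + lambda_copies
    if total <= 0:
        raise ValueError("total copy number must be positive")
    return kappa_copies / total


def deviates_from_normal(kappa_copies, lambda_copies, expected=0.60, tolerance=0.20):
    """Flag a sample whose kappa fraction is far from the expected value.

    `tolerance` is an assumed, illustrative cutoff, not a validated threshold.
    """
    return abs(kappa_fraction(kappa_copies, lambda_copies) - expected) > tolerance


# Example: a balanced sample and a strongly kappa-dominated sample.
balanced = deviates_from_normal(6e4, 4e4)
skewed = deviates_from_normal(1e6, 1e3)
```

In practice a cutoff would have to be derived from measured healthy and lymphoma samples, as in the scatter plot classification.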

[Figure labels: constant region, variable regions]

Kappa & lambda raw data

Classification in scatter plot

[Figure: classification scatter plots. Axes: number of IgL cDNA amplifications (ticks 12 to 36) and number of IgL cDNA (ticks 10^2 to 10^7). Regions labeled "Positive".]

Ståhlberg et al., Clin. Chem. 49, 51-59 (2003).

Yeast metabolism

Experimental design

• Four strains of yeast: Wt, Hxt7, Tm6 and Null

• Expression over time after glucose addition: 0 – 60 min

• Expression of genes:

– Ref: BACT, IPPI, PDA

– Glycolysis: TPI, PGK, PDC, ADH1

– Glycogenesis: FBP, MDH2, SUC2, ADH2

– Unknown: ADH3, ADH4, ADH5, ADH6

– HSP, CYC, MIG

Data pre-treatment

1. Correct for off-scale measurements (primer-dimers)

2. Compensate for variations between runs (inter-plate calibration)

3. Assay efficiency correction

4. Normalize with spike (efficiency variation between samples)

5. Normalize to the same amount of sample

6. Average QPCR technical repeats

7. Normalize with reference genes

8. Average technical repeats

9. Normalize with reference samples (paired test)

10. Calculate relative quantities

11. Convert data to log scale (fold changes)

12. Mean-center/autoscale
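A few of the steps above (efficiency correction, reference gene normalization, log conversion, mean-centering and autoscaling) can be sketched with NumPy as follows. This is a simplified illustration, not the pipeline used for the slides: the function names, the genes × samples layout, and the convention that efficiency 1.0 means perfect doubling per cycle are assumptions.

```python
import numpy as np


def log2_relative_expression(cq, efficiency, ref_rows):
    """Return log2 expression relative to the mean of the reference genes.

    cq         : genes x samples array of Cq values
    efficiency : per-gene PCR efficiency (1.0 = 100 %, i.e. doubling per cycle)
    ref_rows   : row indices of the reference genes
    """
    cq = np.asarray(cq, dtype=float)
    eff = np.asarray(efficiency, dtype=float)
    # Efficiency correction: log2 of the starting amount is -Cq * log2(1 + E).
    log_q = -cq * np.log2(1.0 + eff)[:, None]
    # Reference gene normalization: subtract the per-sample mean of the references.
    ref = log_q[ref_rows].mean(axis=0)
    return log_q - ref


def mean_center(x):
    """Subtract each gene's mean across samples."""
    x = np.asarray(x, dtype=float)
    return x - x.mean(axis=1, keepdims=True)


def autoscale(x):
    """Mean-center each gene and divide by its standard deviation.

    Genes with zero variance across samples must be excluded beforehand.
    """
    centered = mean_center(x)
    return centered / centered.std(axis=1, ddof=1, keepdims=True)


# Toy data: one target gene and one reference gene in three samples.
cq = [[20, 22, 24], [18, 18, 18]]
rel = log2_relative_expression(cq, efficiency=[1.0, 1.0], ref_rows=[1])
scaled_target = autoscale(rel[:1])
```

Mean-centering keeps the magnitude of each gene's response, while autoscaling gives every gene equal weight in later multivariate analysis; which one is appropriate depends on the question being asked.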

Wild type – temporal responses

[Figure: wild-type temporal responses, with curves labeled induced genes, repressed genes, ADH3-6, HSP12 and CYC1]

WT – Autoscaled temporal responses

[Figure: autoscaled temporal responses, Group 1]

[Figure: samples plotted in the space spanned by the axes Gene1, Gene2 and Gene3]

The line of best fit through the data (least squares fit to all points, minimizing the perpendicular distances) is the direction of greatest variance. It defines the first Principal Component (PC1).

PC1 = C11×Gene1 + C12×Gene2 + C13×Gene3

C11 is the importance of Gene1 in defining PC1 (“loading”).

The samples in the new space are described by their distances from the center along PC1 (“score”).

Sample = Sample(Gene1, Gene2, Gene3)


Often PC1 is not sufficient to describe the data. PC2 is defined as the vector perpendicular to PC1 that accounts for most of the remaining variance.

PC2 = C21×Gene1 + C22×Gene2 + C23×Gene3

One can go on calculating as many PCs as there are genes, but it is not meaningful. 2 or 3 PCs are usually sufficient to account for the information in the data, and such a low-dimensional space is readily visualized.
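The loadings and scores described above can be computed with a singular value decomposition. The sketch below is a generic NumPy illustration, assuming a samples × genes matrix; the function name `pca` and its return layout are choices made here and do not reflect the software used for the slides.

```python
import numpy as np


def pca(data, n_components=2):
    """Principal component analysis of a samples x genes matrix.

    Returns (scores, loadings, explained):
      scores    : samples x n_components, position of each sample along each PC
      loadings  : n_components x genes, the coefficients C11, C12, ... of each PC
      explained : fraction of total variance captured by each returned PC
    """
    x = np.asarray(data, dtype=float)
    centered = x - x.mean(axis=0)            # move the origin to the data center
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = vt[:n_components]             # orthonormal directions of greatest variance
    scores = centered @ loadings.T           # project samples onto the PCs
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, loadings, explained[:n_components]


# Toy data: four samples whose three genes vary along a single direction,
# so PC1 alone accounts for all of the variance.
toy = [[1, 2, 3], [2, 4, 6], [3, 6, 9], [4, 8, 12]]
scores, loadings, explained = pca(toy, n_components=2)
```

The sign of each loading vector is arbitrary, so scores plots from different programs may appear mirrored without changing the interpretation.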

First three score vectors in PCA

PC1 vs. PC2 scores plot (WT)

PC1 vs. PC2 vs. PC3 scores plot (WT)

HXT – HXT7 mutant

HXT – TM6*

HXT – null

Matrix augmentation

Identifying stable reference gene candidates

Optimum number of reference genes

Bias (intergroup variability)

Gene        WT      TM6     HXT7     Null
TPI    -0.8949    0.018  -0.7199   1.5968
SUC      1.152  -0.9102    0.952  -1.1938
PGK    -1.8543  -0.1664  -0.3793   2.3999
PDC    -1.0355  -0.2102  -0.7105   1.9562
PDA    -0.1121   0.0195   0.1379  -0.0454
MIG     0.7051  -0.9445   0.1301   0.1093
MDH     1.6489   0.8617   0.5614   -3.072
IPPI   -0.1418   0.1836   0.0582  -0.1001
HSP     0.0614   0.3367  -0.1136  -0.2845
FBP     1.6707   0.6336   0.1207  -2.4251
CYC    -0.1418  -0.2664   0.1332   0.2749
ACT1   -0.3199   0.0555   0.0301   0.2343
ADH6    -0.248  -0.2977   -0.073   0.6187
ADH5   -0.6824   0.3305   0.2676   0.0843
ADH4   -0.2761  -0.1758  -0.0636   0.5155
ADH3    0.1364   0.2617   0.1989   -0.597
ADH2    1.0114   0.4367   0.4364  -1.8845
ADH1   -0.6793  -0.1664  -0.9668   1.8124
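One simple way to read the bias table is to rank genes by their largest absolute bias across the four strains; genes with small values vary little between groups and are candidates for reference genes. The sketch below uses the table values as given. The ranking criterion (maximum absolute bias) is an assumption made for illustration and is not necessarily the criterion applied on the slide.

```python
# Intergroup bias per gene, in the column order WT, TM6, HXT7, Null.
bias = {
    "TPI":  [-0.8949, 0.018, -0.7199, 1.5968],
    "SUC":  [1.152, -0.9102, 0.952, -1.1938],
    "PGK":  [-1.8543, -0.1664, -0.3793, 2.3999],
    "PDC":  [-1.0355, -0.2102, -0.7105, 1.9562],
    "PDA":  [-0.1121, 0.0195, 0.1379, -0.0454],
    "MIG":  [0.7051, -0.9445, 0.1301, 0.1093],
    "MDH":  [1.6489, 0.8617, 0.5614, -3.072],
    "IPPI": [-0.1418, 0.1836, 0.0582, -0.1001],
    "HSP":  [0.0614, 0.3367, -0.1136, -0.2845],
    "FBP":  [1.6707, 0.6336, 0.1207, -2.4251],
    "CYC":  [-0.1418, -0.2664, 0.1332, 0.2749],
    "ACT1": [-0.3199, 0.0555, 0.0301, 0.2343],
    "ADH6": [-0.248, -0.2977, -0.073, 0.6187],
    "ADH5": [-0.6824, 0.3305, 0.2676, 0.0843],
    "ADH4": [-0.2761, -0.1758, -0.0636, 0.5155],
    "ADH3": [0.1364, 0.2617, 0.1989, -0.597],
    "ADH2": [1.0114, 0.4367, 0.4364, -1.8845],
    "ADH1": [-0.6793, -0.1664, -0.9668, 1.8124],
}


def max_abs_bias(values):
    """Largest absolute bias of a gene over all strains."""
    return max(abs(v) for v in values)


# Genes ordered from most stable (smallest bias) to least stable.
ranked = sorted(bias, key=lambda gene: max_abs_bias(bias[gene]))

for gene in ranked:
    print(f"{gene:5s} {max_abs_bias(bias[gene]):.4f}")
```

Dedicated reference gene tools also take intragroup variation into account, so a ranking like this is only a first screen.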


Augmented PC1 vs. PC2 scores plot

Color = presumed function

Symbol = strain

Augmented hierarchical clustering

Forced Self Organized Map

FBP1, ADH2, MDH2 (all strains), SUC2 (HXT7), HSP12 (HXT7 and TM6*), ADH3 (WT and HXT7) and ADH5 (WT)

SUC2 (TM6*), HSP12 (WT) and ADH4 (HXT7)

ADH1, PDC1, TPI1 (all strains), MIG1 (WT and HXT7), PGK1 (WT), ADH4 (TM6*)

CYC1 (all strains), SUC2 (WT), ADH5 (HXT7 and TM6*), ADH3 (TM6*)

PGK1 (TM6*)

MIG1 (HXT7), PGK1 (HXT7), ADH4 (WT), ADH6 (all strains)

Behavior of the genes in three strains

Acknowledgements

Gothenburg University

Anders Ståhlberg

Karin Elbing

Lawrence Livermore Laboratory

Björn Sjögreen

Institute of Biotechnology

Radek Sindelka

Vlasta Ctrnacta

David Svec

Vendula Rusnakova

Razi University

Jahan Ghasemi

A Coruna University

José Manuel Andrade

Ales Tichopad

TATAA Molecular Diagnostics

Ales Tichopad

MultiD Analyses

Amin Forootan

Daniel Lindh

Anders Bergkvist

Symposium in Prague

Developments in Gene Expression Profiling

May 25-28, 2009

(www.qpcrsymposium.eu)
