Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Upload: leslie-todd

Post on 20-Jan-2016


Page 1: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Arrays against time

Transcriptomics ‘101’

Wuhan 2011 CCC

Page 2: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Samples: WT, Mutant, Over expressed, Other species

Transcription assay: Northerns

Extract target RNA

YFG

Label probe + hybridise

Quantitate

Next gene

Page 3: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

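Problems with Northerns:

• Slow (time consuming)
• Hard (technically challenging)
• "Pick up the sesame seeds but lose the watermelon" (Chinese idiom: chasing small details while missing the big picture)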

Page 4: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Systems biology networks: we want to look at lots of transcripts.

Page 5: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

AtRegNet (gene regulation network)

AraCyc + other metabolomics data

Arabidopsis gene network (Ma et al. Genome Research 2007)

Arabidopsis

Page 6: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Merged Network Proteins (red) Metabolites (blue) & Genes (green)

19392 nodes and 72715 edges

"Pick up the sesame seeds but lose the watermelon" (do not miss the big picture for the details)

Page 7: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Samples: WT, Mutant, Over expressed, Other species

Northerns: a few genes at a time.

Extract target RNA

YFG

Label probe + hybridise

Quantitate

Next gene

Again and again and…

Page 8: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Mass transcript profiling: Transcriptomics

• Sequencing ESTs (Déjà vu?)
• Differential display (random 5’ primers + fixed polyA primers)
• Microarrays

Page 9: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Probe preparation

• Acquire or generate probes: ‘All the genes you want’
• Spot

Target preparation

• Extract RNA from your Control AND your Experimental plant
• Label cDNA from sample 1 RNA… and sample 2 RNA

Page 10: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Hybridise & Scan

Identify ‘spots’, remove background, produce ‘red/green’ ratios

• Link ratio to relative abundance.
• Link spot to gene.
• Link genes to each other.
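The ‘red/green’ ratio step can be sketched as follows. This is a minimal illustration only, assuming per-spot background subtraction and the common convention of reporting the ratio on a log2 scale. The function name `log2_ratio`, the `floor` guard and the example intensities are assumptions and do not come from the slides.

```python
import math

def log2_ratio(red_fg, red_bg, green_fg, green_bg, floor=1.0):
    """Background-corrected log2(red/green) for one spot.

    Assumption: fg/bg are the foreground and local background
    intensities measured for the spot in each channel.
    """
    # Remove background; clamp to a small positive floor so the ratio
    # stays defined when background exceeds foreground.
    red = max(red_fg - red_bg, floor)
    green = max(green_fg - green_bg, floor)
    return math.log2(red / green)

# Hypothetical spot: red 900 over background 100, green 300 over background 100.
ratio = log2_ratio(900, 100, 300, 100)
print(ratio)  # 2.0, i.e. four-fold more red than green signal
```

On the log2 scale a two-fold increase and a two-fold decrease are symmetric around zero (+1 and -1), which is why the ratio is usually reported this way.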

Page 11: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Arrays: How do you make them?

Page 12: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Arrayers

Page 13: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Pins

• Pin type: blunt, ring, quill, coated…
• Breaking: bending, sticking
• Consistency of spots: ‘coffee-cup’, splash, drip
• Contamination: carry-over, dust, hairs, crystals
• Etc. etc.…

Page 14: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Slides

• Cracking
• Splitting
• Exfoliating
• Fluorescing
• Coatings: hydrophobic, hydrophilic, correctly aged poly-lysine (a bit of an art)
• Home-made vs bought (cost of internal vs external quality control)
• Scan before coating, scan after coating, scan after arraying, scan after hyb-ing: all part of QC
• Etc… etc…

Page 15: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

The finished spotted array

Before processing, we have a LOT of spots

Page 16: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

After processing, we have a LOT of objective data

Example Hybridisation

Page 17: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

What biological questions can you answer with arrays?

Page 18: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Sorting out gene families

5 hormone response gene family members on the microarray, in different experiments:

1. +hormone vs ctrl hyb
2. Normal vs mutant hyb
3. Root vs shoot hyb

Page 19: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

What goes on the slide?

The original choice was: mass amplification of cDNAs identified by partial sequence (ESTs)

Page 20: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

However… duplication in genomes is a real problem

Figure panels: Human, Plant, Yeast

Page 21: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Apart from wholesale duplication

Gene families (# of members as a proportion of the genome):

Members     Unique  2      3   4     5     >5
Proportion  35%     12.5%  7%  4.4%  3.6%  37.4%

Conservation between genes:

• 37% of genes are highly conserved (TBLASTX E < 10^-30)
• 10% more are partially conserved (TBLASTX E < 10^-5)

Page 22: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

ESTs have inherent problems

Gene of interest

On the slide:

• Spot 1: Example EST sequence 1
• Spot 2: Homologous EST sequence 2
• Spot 3: Dissimilar EST sequence 3

Labelled target may hybridise similarly to each

Page 23: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Better solutions:

• GSTs (gene specific tags)
• Oligo arrays
• Affymetrix GeneChips
• RNA-seq???

Page 24: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Selection of Expression Probes

Figure labels: Sequence (5’ to 3’), Probes, Perfect Match, Mismatch, Chip
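On Affymetrix expression arrays each Perfect Match (PM) probe is paired with a Mismatch (MM) probe that differs only at the central base. A minimal sketch, assuming the usual design in which the 13th base of the 25-mer is swapped for its complement. The helper `mismatch_probe` and the sequence below are made up for illustration and are not taken from a real array design.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def mismatch_probe(perfect_match):
    """Return the MM partner of a PM probe by complementing its central base."""
    middle = len(perfect_match) // 2  # index 12 (the 13th base) for a 25-mer
    swapped = COMPLEMENT[perfect_match[middle]]
    return perfect_match[:middle] + swapped + perfect_match[middle + 1:]

# Hypothetical 25-mer, not taken from any real array design.
pm = "ATGGCTAGCTAGGATCCGATTACGA"
mm = mismatch_probe(pm)
print(pm)
print(mm)
```

The MM probe is intended to estimate non-specific hybridisation: a target that binds PM and MM equally well is probably not the intended transcript.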

Page 25: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Affymetrix Wafer and Chip Format

• Chip: 1.28 cm
• Feature: 5 - 50 µm by 5 - 50 µm
• Millions of identical oligonucleotide probes per feature
• 49 - 400 chips/wafer
• Up to ~3,000,000 features/chip

Page 26: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Probe cells of an Affymetrix GeneChip contain millions of identical 25-mers

25-mer

Page 27: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Photolithographic Synthesis

Figure labels: Lamp, Mask, Chip

Page 28: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Synthesis of Ordered Oligonucleotide Arrays

One nucleotide at a time.


Page 29: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Procedures for Target Preparation

RNA-AAAA (RNA with poly(A) tail)

RNA Quality control

Page 30: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Procedures for Target Preparation

1. RNA-AAAA
2. cDNA
3. IVT (Biotin-UTP, Biotin-CTP)
4. Biotin-labeled transcripts (B B B B)
5. Fragment (heat, Mg2+)
6. Fragmented cRNA (B B B B)
7. Hybridise (16 hours)
8. Wash & Stain
9. Scan

Page 31: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

GeneChip® Expression Analysis: Hybridization and Staining

Figure labels: Array, cRNA Target, Hybridized Array, Ab detection

Page 32: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Affymetrix software derives the intensity for each probe from the 75% quantile of the pixel values in each box.
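As a rough illustration of the 75% quantile step, the sketch below summarises a small box of pixel values. The pixel values, the function name `probe_intensity` and the interpolation method are assumptions; the vendor software may compute the quantile differently.

```python
import statistics

def probe_intensity(box):
    """Summarise one probe cell as the 75% quantile of its pixel values."""
    pixels = [value for row in box for value in row]
    # quantiles(..., n=4) returns the three quartile cut points; the last
    # one is the 75% quantile. The "inclusive" method is an assumption.
    return statistics.quantiles(pixels, n=4, method="inclusive")[2]

# Hypothetical 3 x 3 box with one very bright pixel (for example a dust speck).
box = [
    [100, 110, 105],
    [120, 400, 115],
    [108, 112, 118],
]
print(probe_intensity(box))  # 118.0
```

A quantile is far less sensitive to a single bright pixel than the mean, which for this box is pulled up to about 143.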

Page 33: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

The intensities of the multiple probes within a probeset are combined into ONE measure of expression

Expression Measure
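The idea of "many probes, one number" can be sketched with a deliberately simple stand-in: the median of the log2 probe intensities. This is not the Affymetrix or RMA summarisation algorithm, only an illustration; the function name and the intensities are hypothetical.

```python
import math
import statistics

def summarise_probeset(intensities):
    """Combine the probe intensities of one probeset into a single value.

    Stand-in method: median of log2 intensities. This is NOT the Affymetrix
    or RMA algorithm, only an illustration of the principle.
    """
    return statistics.median(math.log2(value) for value in intensities)

# Hypothetical probeset of five probes; the last probe is an outlier.
probeset = [128, 256, 512, 256, 8192]
expression = summarise_probeset(probeset)
print(expression)  # 8.0
```

A robust summary such as the median keeps one badly behaved probe from dominating the expression measure.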

Page 34: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Chips need to be normalised against each other.

Each chip is a different colour in this graph.

Their intensities are not coincident.

To compare chips, they need to be comparable.

Page 35: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

RMA uses normalisation at the probe level

Raw intensities by probe:

        PA  PB  PC  PD  PE
Chip 1  1   2   4   3   5
Chip 2  7   2   5   3   1
Chip 3  5   3   4   2   9

Order by ranks (sort each chip):

Chip 1  1   2   3   4   5
Chip 2  1   2   3   5   7
Chip 3  2   3   4   5   9

Average the intensities at each rank:

Chip 1  1.33  2.33  3.33  4.67  7
Chip 2  1.33  2.33  3.33  4.67  7
Chip 3  1.33  2.33  3.33  4.67  7

Reorder by probe:

        PA    PB    PC    PD    PE
Chip 1  1.33  2.33  4.67  3.33  7
Chip 2  7     2.33  4.67  3.33  1.33
Chip 3  4.67  2.33  3.33  1.33  7
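The rank, average and reorder steps of the worked example can be reproduced with a short script. This is a minimal sketch in plain Python, assuming no tied values within a chip; the function name `quantile_normalise` is an illustration and not the Bioconductor implementation. Values are rounded to two decimals for printing.

```python
def quantile_normalise(chips):
    """Quantile-normalise a list of chips (each a list of probe intensities)."""
    n_probes = len(chips[0])
    # Order by ranks: sort each chip independently.
    sorted_chips = [sorted(chip) for chip in chips]
    # Average the intensities at each rank across chips.
    rank_means = [
        sum(chip[rank] for chip in sorted_chips) / len(chips)
        for rank in range(n_probes)
    ]
    # Reorder by probe: each probe receives the mean of the rank it held.
    normalised = []
    for chip in chips:
        order = sorted(range(n_probes), key=lambda i: chip[i])
        row = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            row[probe_index] = rank_means[rank]
        normalised.append(row)
    return normalised

# Probes PA..PE on three chips, as in the example.
chips = [
    [1, 2, 4, 3, 5],
    [7, 2, 5, 3, 1],
    [5, 3, 4, 2, 9],
]
normalised = quantile_normalise(chips)
for name, row in zip(["Chip 1", "Chip 2", "Chip 3"], normalised):
    print(name, [round(value, 2) for value in row])
```

After normalisation every chip contains exactly the same set of values, so the intensity distributions coincide and only the assignment of values to probes differs between chips.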

Page 36: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

R / Bioconductor training

AffylmGUI training

Xspecies analysis training

Normalisation, filtering and annotation

.CDF, filtering, stats and annotation

RMA Normalisation

Page 37: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Sequencing: current / next gen / future

Sequencing is likely to complement arrays in the future

Page 38: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Standard (Sanger) sequencing

Figure labels: Template, Primer

Random ddNTP termination.

Label can be added to the:

• Primer
• ddNTP, or
• Incorporated dNTPs

Page 39: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

454 sequencing (images by Roche)

Sample Input and Fragmentation: Genomic DNA or BACs are fractionated into small, 300- to 800-basepair fragments.

Library Preparation: Short adaptors (A and B) - specific for both the 3' and 5' ends - are added to each single stranded fragment.

One Fragment = One Bead: Each fragment of the single-stranded DNA library is immobilized individually onto beads in a water-in-oil mixture.

emPCR (Emulsion PCR) Amplification: Each unique fragment is amplified in parallel to several million per bead.

One Bead = One Read: The clonally amplified fragments are loaded onto a PicoTiterPlate device for sequencing. Only one bead per well.

Auto fluidics flows individual nucleotides in a fixed order across the hundreds of thousands of wells containing one bead. Addition of a nucleotide results in a chemiluminescent signal.
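The fixed-order nucleotide flow can be illustrated with an idealised simulation for a single well. This is a sketch under stated assumptions: the flow order "TACG", a signal exactly proportional to the number of bases incorporated, and the function name `flowgram` are all illustrative choices, and real signals are noisy.

```python
def flowgram(read, flow_order="TACG", max_flows=100):
    """Idealised signal per nucleotide flow for the strand being synthesised.

    Each flow offers one nucleotide; the signal is the number of
    consecutive bases incorporated (0 if the next base does not match).
    """
    signals = []
    position = 0
    flow = 0
    while position < len(read) and flow < max_flows:
        base = flow_order[flow % len(flow_order)]
        run = 0
        # A homopolymer run is incorporated within a single flow.
        while position < len(read) and read[position] == base:
            run += 1
            position += 1
        signals.append((base, run))
        flow += 1
    return signals

# Hypothetical 4-base read.
print(flowgram("AGGT"))
# [('T', 0), ('A', 1), ('C', 0), ('G', 2), ('T', 1)]
```

Because a run of identical bases produces one stronger signal rather than several separate ones, the length of long homopolymers has to be inferred from signal intensity.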

Page 40: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Solexa sequencing I

Series of images taken from www.illumina.com

Page 41: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Solexa sequencing II

Page 42: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Solexa sequencing III

Page 43: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

But the future may be even faster……

• http://www.pacificbiosciences.com/aboutus/video-gallery

• Note: the server may disallow a direct link. Try pasting the URL into a browser, then click the SMRT Biology Overview in the video-gallery archive.

Page 44: Arrays against time Transcriptomics ‘101’ Wuhan 2011 CCC

Rubber sequencing