![Page 1: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/1.jpg)
Genomics Technologies Suitable for Population Studies
Michael Snyder
December 18, 2007
GGTTCCAAAAGTTTATTGGATGCCGTTTCAGTACATTTATCGTTTGCTTTGGATGCCCTAATTAAAAGTGACCCTTTCAAACTGAAATTCATGATACACCAATGGATATCCTTAGTCGATAAAATTTGCGAGTACTTTCAAAGCCAAATGAAATTATCTATGGTAGACAAAACATTGACCAATTTCATATCGATCCTCCTGAATTTATTGGCGTTAGACACAGTTGGTATATTTCAAGTGACAAGGACAATTACTTGGACCGTAATAGATTTTTTGAGGCTCAGCAAAAAAGAAAATGGAAATACGAGATTAATAATGTCATTAATAAATCAATTAATTTTGAAGTGCCATTGTTTTAGTGTTATTGATACGCTAATGCTTATAAAAGAAGCATGGAGTTACAACCTGACAATTGGCTGTACTTCCAATGAGCTAGTACAAGACCAATTATCACTGTTTGATGTTATGTCAAGTGAACTAATGAACCATAAACTTGGTCATTCG
![Page 2: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/2.jpg)
Genomics Technologies
1)Mapping DNA sequence variationvariation
2) Mapping transcripts2) Mapping transcripts
3) Mapping regulatory information) g g y
![Page 3: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/3.jpg)
Genome Tiling Arrays
25 or 50mer
Massively Parallel Sequencing
AGTTCACCTAAGA…
CTTGAATGCCGAT…
GTCATTCCGCAAT…
![Page 4: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/4.jpg)
Human Variation:SNP Mapping
Methods:
Affymetrix 6.0900K SNPs + CNVs
Illumina 1M SNP + CNV1M SNPs + CNVs
Sequencing basedSequencing based Methods
![Page 5: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/5.jpg)
Mapping Structural Variation in Humans
- Thought to be Common12% of the genome12% of the genome (Redon et al. 2006)
- Likely involved in phenotype y p ypvariation and disease
- Most methods for detection are CNVs
low resolution (>50 kb)
![Page 6: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/6.jpg)
High Resolution Comparative Genomic Hybridization
Chromosome 22Chromosome 22
Array Based Methods:BAC arraysBAC arrays
Affy & Illumina ~50kb
Nimblegen 400 K Array
Person 1
igna
l
Nimblegen 400 K Array
1 array (>60kb)
8 array set (>10 kb?)
Person 2Si
8 array set (>10 kb?)
2.1 M feature arrays
1 array (>5 kb?)
Chromosome Position
1 array (>5 kb?)Urban et al. (2006) PNAS
![Page 7: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/7.jpg)
High Resolution-Paired-End Mapping (HR-PEM)
CircularizeShear to 3 kbAdaptor ligation Bio Bio
Genomic DNA Fragments
Bio
Random Cleavage Bio
Bio
Bio
454 M i l P ll l S i
200-300bp
454 Massively Parallel Sequencing (250bp reads, 400 000 reads/run)
Map paired ends to human reference genomeMap paired ends to human reference genome
![Page 8: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/8.jpg)
Summary of PEM Results
NA15510 (European?, female)
NA18505 (Yoruba, female)
# of sequence reads > 10 M > 21 M# of sequence reads > 10 M. > 21 M.
Paired ends uniquely mapped > 4.2 M. > 8.6 M.
Fold coverage ~ 2.1x ~ 4.3x
Predicted Structural 478 839Variants*IndelsInversion breakpoints
47842751
83975881
Estimated total variants*genome wide
759 902genome-wide
*at this resolution
![Page 9: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/9.jpg)
Size distribution of Structural Variants[NA15510 excluding complex rearrangements and inversion breakpoints]
400
450
400
[NA15510, excluding complex rearrangements, and inversion breakpoints]
300
350
400
300
400
bin
150
200
250
200
ount
for e
ach
50
100
150
100
Co
0 1 2 3 4 5 6 7 8 9 10
x 104
0
>10020 40 60 800Sizes of predicted deletions and insertions [kb]
![Page 10: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/10.jpg)
Chromatin Immunoprecipitation p50
p65
Crosslink with
p50p65
Sonicate
Formaldehyde
p50
p65IP with
Specific AbSpecific Ab
Reverse Crosslinkingg
Label DNA and
Hybridize toMassive
DNAHybridize to Microarray
DNA Sequencing
![Page 11: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/11.jpg)
Pol II: Chromosome 22 S f T kSummary of Tracks
HeLa S3 Rep1
HeLa S3 Rep2HeLa S3 Rep2
HeLa S3 Rep3
HeLa S3 Input
QuickTime™ and a decompressor
are needed to see this picture.
HEK293 Rep1
HEK293 Rep2
HEK293 Rep3QuickTime™ and a
decompressorare needed to see this picture.
HEK293 Input
K562 Rep2
K562 Rep3QuickTime™ and a
NB4 Rep1
decompressorare needed to see this picture.
K562 Input
QuickTime™ and a decompressor
are needed to see this picture.
NB4 Rep2
NB4 Rep3
![Page 12: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/12.jpg)
EpigenomicsEpigenomics
1)Mapping Chromatin Modifications
2)Mapping DNA Methylation
![Page 13: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/13.jpg)
Mapping Transcribed Regions
25 or 36mer
• Probe Affymetrix or Nimblegen Arrays with cDNA
• >50% of transcribed regions not annotated
![Page 14: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/14.jpg)
ENCODE Project: Transcribed and Regulatory Regions
![Page 15: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/15.jpg)
F t Di tiFuture Directions
1) Whole genome sequencing.
2) Whole genome mapping of transcribed regions and regulatory information inregions and regulatory information in small samples.
3) Proteomics approaches.
![Page 16: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/16.jpg)
Technologies for Large Population Studies: Possible Recommendations
1) High density SNPs1) High density SNPs2) High resolution CNVs/SVs 2 kb or larger3) Whole genome gene expression (coding and3) Whole genome gene expression (coding and
noncoding). What tissues?4) Sera proteome and metabolomics analysis4) Sera proteome and metabolomics analysis.
Other fluids?5) Whole genome TF/Methylation mapping. 5) o e ge o e / et y at o app g
Which ones? What tissues?
![Page 17: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/17.jpg)
![Page 18: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/18.jpg)
STAT1 x 3 Reps & Array (IRF1)y ( )
Array
B1
B2B2
B3
Mapped > 40K STAT1 Binding Sites
![Page 19: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/19.jpg)
STAT1 ChIP-Seq on IFN Treated HeLa Cells using Illumina Sequencing
with the University of British ColumbiaChIP-chip (NimbleGen 2.1 M feature arrays)ChIP-seq (15.1 M mapped reads for the IFNγ)
with the University of British Columbia
ChIP-ChIP: IFNγ/ unstim
ChIP-seq: IFNγ
ChIP-seq: unstim
Robertson et al. 2007
![Page 20: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/20.jpg)
ENCODE Pilot PhaseENCODE Pilot Phase(Analyze 1% of the Human Genome)
Mapping transcribed regions 13 E i t13 Experiments
Mapping regulatory information >150 ChIP Chip experiments>150 ChIP Chip experimentsDNAase hypersensitive sites
![Page 21: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/21.jpg)
Massively Parallel SequencingMassively Parallel Sequencing Technologies
454400K d400K reads 250 b
Illumina and ABI>40M reads 36 b
![Page 22: Genomics Technologies Suitable for Population Studies · 2008-02-21 · Genomics Technologies 1)Mapping DNA sequence variation 2) Mapping transcripts2) Mapping transcripts 3))ggy](https://reader034.vdocuments.pub/reader034/viewer/2022043011/5fa60020e6e3176fe8740d7d/html5/thumbnails/22.jpg)
Distribution of SVs
**
*
VCFS
*