viral metagenomics (cabbio 20150629 buenos aires)
Bas E. Dutilh
Bacteriófagos: Aspectos básicos y moleculares. Aplicaciones Biotecnológicas
Buenos Aires, June 29th 2015
Viral metagenomics
Metagenomics
Sample
Filter
Microbes or viruses
Viruses and phages
Who/what is there?
Metagenomics 1.0: database mapping
• General database
• Custom database
  – MetaHIT human gut catalogue
  – Omnibus of Marine Genes
Taxonomic and functional profiling
Sunagawa et al. Science 2015
Metagenomics 2.0: genomes from metagenomes
• Reference databases fail for most environmental metagenomes
  – “Dark matter”: sequences not in database
• Homology searches fail for many short sequencing reads
  – Fast read alignment tools place an upper limit on evolutionary distance
• Interpretation fails for taxonomic and functional metagenomic profiles
  – Do functions co-occur in a genome?
  – To describe interactions between species you need species/genome-level resolution
• Solution: assembly and binning of (draft) genomes from metagenomes
Biological “dark matter”
“Depending on how they are viewed, the unknowns can represent either a formidable challenge or a treasure trove for virus discovery.”
Mokili, Rohwer and Dutilh Curr. Opin. Virology 2012
Unknowns in viral metagenomes
Mokili, Rohwer and Dutilh Curr. Opin. Virology 2012
[Figure: sample layout for four families (F1 to F4), each with samples M, T1 and T2]
Reyes et al. Nature 2010
Pathway presence profiles
Virome reads mapping to viral database
Reyes et al. Nature 2010
[Figure: number of contigs (log scale, 1 to 10,000) versus number of samples contributing reads to contig (1 to 12)]
De novo assembled contigs
6,988 de novo cross-contigs
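The contig-sharing histogram on this slide can be sketched in a few lines. This is a minimal illustration, assuming a table of read counts per contig and per sample is already available from mapping reads back to the cross-assembly; the names `read_counts`, `samples_per_contig` and `contig_histogram`, and the toy data, are hypothetical and not taken from the talk.

```python
from collections import Counter

def samples_per_contig(read_counts):
    """Map each contig to the number of samples that contributed at least one read."""
    return {contig: sum(1 for n in counts if n > 0)
            for contig, counts in read_counts.items()}

def contig_histogram(read_counts):
    """Count contigs by the number of contributing samples."""
    return Counter(samples_per_contig(read_counts).values())

# Hypothetical toy data: reads per sample for three contigs across four samples.
toy_counts = {
    "contig_a": [12, 0, 0, 0],   # private to one sample
    "contig_b": [3, 5, 0, 1],    # shared by three samples
    "contig_c": [7, 2, 9, 4],    # shared by all four samples
}

histogram = contig_histogram(toy_counts)
for n_samples in sorted(histogram):
    print(n_samples, histogram[n_samples])
```

Contigs at the right-hand end of such a histogram (present in many samples) are the cross-contigs of interest, such as the 220 contigs found in ≥9 samples.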
220 contigs present in ≥9 samples
[Figure: average depth of contigs across the samples F1M, F1T1, F1T2, F2M, F2T1, F2T2, F3M, F3T1, F3T2, F4M, F4T1, F4T2, grouped as Family 1 to Family 4]
crAssphage
• 97,065 bp
• 29.3% G+C
• 80 ORFs
• Caudovirales
Dutilh et al. Nat. Comm. 2014
Dutilh et al. Nat. Comm. 2014
[Figure: y-axis in %; x-axis: genomic position, 0 – 97,065 nt]
2,906 metagenomes (940): blastn, ≥95% identity, ≥75 bp hit, unambiguously aligned
2,906 metagenomes, 1,193 phages (861): blastn, ≥95% identity, ≥75 bp hit, unambiguously aligned
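The hit criteria listed here can be applied as a simple filter over tabular blastn output. The sketch below assumes the standard tab-separated layout (`-outfmt 6`), in which the first four columns are query, subject, percent identity and alignment length. The reading of "unambiguously aligned" as "all passing hits of a read point to a single subject" is an assumption made for illustration, and `unambiguous_hits` is a hypothetical helper, not the pipeline used in the paper.

```python
import csv
import io
from collections import defaultdict

MIN_IDENTITY = 95.0  # percent identity threshold from the slide
MIN_LENGTH = 75      # minimum hit length in bp from the slide

def unambiguous_hits(handle):
    """Return {read: subject} for reads whose passing hits all point to one subject."""
    subjects = defaultdict(set)
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        identity, length = float(row[2]), int(row[3])
        if identity >= MIN_IDENTITY and length >= MIN_LENGTH:
            subjects[query].add(subject)
    # Assumption: "unambiguously aligned" means exactly one subject passes the thresholds.
    return {q: next(iter(s)) for q, s in subjects.items() if len(s) == 1}

# Hypothetical toy hits; only the first four tabular columns are included here.
rows = [
    ("read1", "crAssphage", "98.0", "100"),  # passes
    ("read2", "crAssphage", "96.0", "80"),   # passes, but read2 also hits phageX
    ("read2", "phageX", "97.0", "90"),
    ("read3", "crAssphage", "90.0", "100"),  # identity too low
    ("read4", "phageX", "99.0", "60"),       # hit too short
]
toy_table = "\n".join("\t".join(r) for r in rows)
hits = unambiguous_hits(io.StringIO(toy_table))
print(hits)
```

Counting the reads kept per subject and per metagenome would then give the raw numbers behind a ubiquity/abundance comparison.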
Ubiquity/abundance plot
Dutilh et al. Nat. Comm. 2014
Virome reads mapping to crAssphage
Viral database vs crAssphage
Virome reads mapping to viral database
Dick et al. Genome Biol. 2009
Binning
Paired-end reads
Iverson et al. Science 2012
Binning: Hi-C sequencing
Beitel et al. PeerJ 2014
• Cross-link physically proximal DNA (formalin)
• Restrict with enzyme
• Ligate restricted sites
• Sequence ligation sites
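Once the ligation junctions are sequenced, each Hi-C read pair whose two ends map to different contigs is evidence that those contigs were in the same cell. A minimal sketch of how such pairs could be turned into bins follows. It assumes the read pairs have already been mapped to contigs, and it uses a plain link-count threshold with connected components, which is a simplification for illustration rather than the clustering used by Beitel et al. The names `link_counts`, `cluster_contigs` and `min_links` are hypothetical.

```python
from collections import Counter

def link_counts(pairs):
    """Count Hi-C read pairs whose two ends map to different contigs."""
    counts = Counter()
    for contig_a, contig_b in pairs:
        if contig_a != contig_b:
            counts[tuple(sorted((contig_a, contig_b)))] += 1
    return counts

def cluster_contigs(pairs, min_links=2):
    """Group contigs connected by at least min_links Hi-C read pairs (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in pairs:          # register every contig, linked or not
        find(a)
        find(b)
    for (a, b), n in link_counts(pairs).items():
        if n >= min_links:      # weakly supported links are ignored
            parent[find(a)] = find(b)

    clusters = {}
    for contig in list(parent):
        clusters.setdefault(find(contig), set()).add(contig)
    return sorted(sorted(members) for members in clusters.values())

# Hypothetical toy data: the contig each end of a Hi-C read pair maps to.
toy_pairs = [
    ("c1", "c2"), ("c1", "c2"),                # two links between c1 and c2
    ("c2", "c3"),                              # a single, unsupported link
    ("c3", "c4"), ("c3", "c4"), ("c3", "c4"),  # three links between c3 and c4
    ("c1", "c1"),                              # within-contig pair, uninformative here
]
bins = cluster_contigs(toy_pairs)
print(bins)
```

The threshold guards against spurious ligation between DNA from different cells; in real data the link counts would also need normalising for contig length and abundance.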
K-mer binning
Ghai et al. Sci. Rep. 2011
K-mer (k=4) maps (ESOM)
Abe et al. DNA Res. 2005
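The input to a k=4 map is a tetranucleotide frequency vector per contig. The sketch below computes such a vector in plain Python; it does not implement the ESOM itself. The names `KMERS` and `kmer_frequencies` and the toy sequence are hypothetical, and reverse-complement k-mers are not merged, which some tools do.

```python
from itertools import product

K = 4
# All 4^4 = 256 tetranucleotides in a fixed order, so vectors are comparable.
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]

def kmer_frequencies(sequence):
    """Return the relative frequency of every tetranucleotide in the sequence."""
    counts = dict.fromkeys(KMERS, 0)
    seq = sequence.upper()
    for i in range(len(seq) - K + 1):
        kmer = seq[i:i + K]
        if kmer in counts:  # windows containing N or other symbols are skipped
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[kmer] / total if total else 0.0 for kmer in KMERS]

# Hypothetical toy contig; real contigs should be several kb for stable profiles.
profile = kmer_frequencies("ACGTACGT")
print(len(profile), profile[KMERS.index("ACGT")])
```

Contigs from the same genome tend to have similar vectors, so clustering or mapping these vectors places them close together.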
Depth binning
Speth et al. Frontiers Microbiol 2012
Depth profile binning
Wrighton et al. Science 2012
[Figure: depth profile across samples F1M, F1T1, F1T2, F2M, F2T1, F2T2, F3M, F3T1, F3T2, F4M, F4T1, F4T2]
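Depth profile binning rests on the idea that contigs from the same genome rise and fall together across samples. A minimal sketch follows, assuming the average depth of each contig in each sample is already known. The greedy seed-based grouping, the `threshold` value and the function names are assumptions chosen for illustration and are not the method of the cited papers.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long depth profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0

def bin_by_depth(profiles, threshold=0.9):
    """Greedy binning: a contig joins the first bin whose seed profile it correlates with."""
    bins = []  # list of (seed profile, member names)
    for name, depth in profiles.items():
        for seed, members in bins:
            if pearson(seed, depth) >= threshold:
                members.append(name)
                break
        else:
            bins.append((depth, [name]))  # no match: this contig seeds a new bin
    return [members for _, members in bins]

# Hypothetical toy data: average depth of four contigs in four samples.
toy_profiles = {
    "contig_1": [10, 0, 5, 20],
    "contig_2": [20, 1, 11, 39],  # roughly twice contig_1 in every sample
    "contig_3": [0, 15, 8, 1],
    "contig_4": [1, 30, 15, 2],   # roughly twice contig_3 in every sample
}
depth_bins = bin_by_depth(toy_profiles)
print(depth_bins)
```

Correlation is used rather than absolute depth because contigs of one genome can differ in depth by a constant factor while still sharing the same pattern across samples. The more samples are available, the better such patterns discriminate between genomes.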
Differential sampling and DNA extraction
Partial nitritation/anammox reactor (600 m³)
[Figure: reactor schematic with marks at 0.2 m, 1.4 m, 2.6 m, 3.8 m and 5.0 m; eight samples (1 to 8) described by sample location, sample treatment (untreated or washed granules) and DNA isolation (organic extraction or Powersoil kit)]
Speth et al. submitted