toast 2015 qiime_talk2

QIIME: Quantitative Insights Into Microbial Ecology (part 2). Thomas Jeffries, Federico M. Lauro, Grazia Marina Quero, Tiziano Minuzzo. The Omics Analysis Sydney Tutorial, Australian Museum, 23rd-24th February 2015


Page 1: Toast 2015 qiime_talk2

QIIME: Quantitative Insights Into Microbial Ecology (part 2)

Thomas Jeffries, Federico M. Lauro, Grazia Marina Quero, Tiziano Minuzzo

The Omics Analysis Sydney Tutorial

Australian Museum 23rd-24th February 2015

Page 2: Toast 2015 qiime_talk2

Recap

• Rarefied, Chimera Filtered O.T.U. table:

Summarize Taxonomy

Use for β-Diversity e.g. Bray-Curtis clustering in PRIMER

• Phylogenetic tree

Use for phylogenetic β-Diversity e.g. UniFrac

Page 3: Toast 2015 qiime_talk2

Visualizing diversity 1 – community composition

• Summarizes taxa at hierarchical taxonomic levels:

summarize_taxa_through_plots.py -i otu_table_even146.biom -o wf_taxa_summary -m my_mapping_file.txt

Page 4: Toast 2015 qiime_talk2

Hands on – community composition

What taxa are present?

summarize_taxa_through_plots.py -i moving_pictures_tutorial-1.8.0/illumina/otus_denovo/otu_table_even138.biom -o moving_pictures_tutorial-1.8.0/illumina/otus_denovo/wf_taxa_summary -m moving_pictures_tutorial-1.8.0/illumina/combined_mapping_file.txt

Page 5: Toast 2015 qiime_talk2

Visualizing diversity 1 – community composition

• Input your final O.T.U table and your mapping file

• Summarizes taxa relative abundance at hierarchical taxonomic levels: Linnaean (K,P,C,O,F,G,S) (makes spreadsheets)

• Can be opened in Excel, R, PRIMER etc. so you can do what you want with them

summarize_taxa.py -i otu_table_even.biom -o /taxa -m my_mapping_file.txt
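As a rough illustration of what a taxonomy summary involves, the Python sketch below collapses a tiny made-up OTU table to a chosen rank and converts the counts to relative abundance. The OTU IDs, counts, taxonomy strings and the function name `summarize_taxa` are assumptions for illustration only; this is not QIIME's own code.

```python
from collections import defaultdict

# Hypothetical toy data: OTU id -> (taxonomy string, counts per sample)
OTU_TABLE = {
    "OTU_1": ("k__Bacteria; p__Firmicutes; c__Bacilli", {"S1": 30, "S2": 10}),
    "OTU_2": ("k__Bacteria; p__Firmicutes; c__Clostridia", {"S1": 10, "S2": 10}),
    "OTU_3": ("k__Bacteria; p__Proteobacteria; c__Gammaproteobacteria", {"S1": 60, "S2": 80}),
}


def summarize_taxa(otu_table, level):
    """Collapse OTU counts to a rank (1 = kingdom, 2 = phylum, ...) and
    return relative abundances per sample."""
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for taxonomy, sample_counts in otu_table.values():
        ranks = [rank.strip() for rank in taxonomy.split(";")]
        label = ";".join(ranks[:level])  # keep only the ranks down to `level`
        for sample, n in sample_counts.items():
            counts[label][sample] += n
            totals[sample] += n
    return {
        label: {sample: n / totals[sample] for sample, n in per_sample.items()}
        for label, per_sample in counts.items()
    }


phylum_summary = summarize_taxa(OTU_TABLE, level=2)
for label, abundances in sorted(phylum_summary.items()):
    print(label, abundances)
```

Each row of the result corresponds to one line of the kind of spreadsheet described above: a taxon label followed by its proportion in every sample.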

Page 6: Toast 2015 qiime_talk2

Visualizing diversity 2 – phylogenetic beta-diversity

• β-diversity compares diversity between the samples in your study

• i.e. make a matrix of overall similarity between each pair of samples, which can be visualized – what diversity is shared?

• Abundance-based metrics, e.g. Bray-Curtis (differences in the abundance of taxa between samples): I generally compute these in PRIMER etc., but they can also be calculated in QIIME
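For readers who want to see what an abundance-based matrix looks like, here is a minimal Python sketch of the Bray-Curtis dissimilarity (sum of absolute count differences divided by the sum of all counts) applied to every pair of samples. The sample names and counts are invented for the example.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two equally ordered count vectors."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator if denominator else 0.0


def distance_matrix(samples):
    """Return sorted sample names and the square matrix of pairwise dissimilarities."""
    names = sorted(samples)
    matrix = [[bray_curtis(samples[p], samples[q]) for q in names] for p in names]
    return names, matrix


# Hypothetical counts for three taxa in three samples
SAMPLES = {"gut": [10, 0, 5], "skin": [0, 10, 5], "tongue": [10, 0, 5]}

names, matrix = distance_matrix(SAMPLES)
for name, row in zip(names, matrix):
    print(name, [round(value, 3) for value in row])
```

A value of 0 means two samples have identical counts and 1 means they share no taxa; the resulting square matrix is the kind of object that ordination or clustering then works from.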

Page 7: Toast 2015 qiime_talk2

Visualizing diversity 2 – phylogenetic beta-diversity

• Divergence-based measures: communities are considered more related if the taxa they contain are more closely related.

• UniFrac (unweighted, qualitative): measures the phylogenetic distance between sets of taxa in a tree, as the fraction of the total phylogenetic branch length that is unique to one sample or the other, i.e. not shared between samples (Lozupone et al, 2011, ISMEJ)

• Weighted UniFrac (quantitative): Variation of UniFrac that accounts for changes in relative abundance of lineages between communities
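The two UniFrac variants can be sketched on a toy tree. In the Python below the tree is reduced to a list of branches, each with a length and the set of tips beneath it; the tree, the tip names and the counts are made up, and the weighted version is the raw (unnormalized) form, so treat this as an illustration of the idea rather than QIIME's implementation.

```python
# Toy tree ((A:1,B:1):1,(C:1,D:1):1) as (branch length, tips below the branch)
BRANCHES = [
    (1.0, {"A"}), (1.0, {"B"}), (1.0, {"A", "B"}),
    (1.0, {"C"}), (1.0, {"D"}), (1.0, {"C", "D"}),
]


def unweighted_unifrac(branches, x, y):
    """Fraction of observed branch length that leads to only one of the two samples."""
    unique = total = 0.0
    for length, tips in branches:
        in_x = any(x.get(tip, 0) > 0 for tip in tips)
        in_y = any(y.get(tip, 0) > 0 for tip in tips)
        if in_x or in_y:
            total += length
            if in_x != in_y:
                unique += length
    return unique / total if total else 0.0


def weighted_unifrac(branches, x, y):
    """Raw weighted UniFrac: branch lengths weighted by the difference in the
    proportion of each sample's sequences found below the branch."""
    x_total, y_total = sum(x.values()), sum(y.values())
    score = 0.0
    for length, tips in branches:
        x_fraction = sum(x.get(tip, 0) for tip in tips) / x_total
        y_fraction = sum(y.get(tip, 0) for tip in tips) / y_total
        score += length * abs(x_fraction - y_fraction)
    return score


# Hypothetical samples: tip -> sequence count
SAMPLE_X = {"A": 5, "B": 5}
SAMPLE_W = {"A": 1, "C": 9}

print(unweighted_unifrac(BRANCHES, SAMPLE_X, SAMPLE_W))
print(weighted_unifrac(BRANCHES, SAMPLE_X, SAMPLE_W))
```

The unweighted value only asks whether a branch is present in each sample, whereas the weighted value also grows when the same lineage is present in both samples at very different relative abundance.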

Page 8: Toast 2015 qiime_talk2

Visualizing diversity 2 – phylogenetic beta-diversity

• Why do we care about UniFrac?

• Because abundance differences in closely related taxa may not have as big an implication as diversity shifts in more divergent taxa (i.e. most metrics treat each taxon equally)

• Determine if community differences are concentrated within particular lineages of the phylogenetic tree.

Page 9: Toast 2015 qiime_talk2

Visualizing phylogenetic beta-diversity

• Cluster environments to determine whether there are environmental factors (such as temperature or salinity, body location) that group communities together

• It is also very discriminating and makes pretty pictures

Human Microbiome Project Consortium, 2012, Nature

Lozupone & Knight, 2007, PNAS

Caporaso et al, 2012, PNAS

Page 10: Toast 2015 qiime_talk2

Visualizing diversity 3 – phylogenetic beta-diversity

beta_diversity.py -i otu_table_rarefied.biom -m weighted_unifrac -o beta_div -t rep_set.tre

• Takes your final O.T.U. table and your phylogenetic tree and makes a distance matrix based on UniFrac

• Can be imported into PRIMER etc. or used in QIIME to make plots, e.g. 2D PCoA or 3D Emperor plots

beta_diversity_through_plots.py -i otu_table.biom -m my_mapping_file.txt -o wf_bdiv_even146/ -t rep_phylo.tre -e 146

• Will automate the process (note that -e rarefies, so use the unrarefied table), but the plots can sometimes be dodgy depending on which Python packages you have installed
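Rarefying to an even depth, as the -e option does in the workflow above, amounts to randomly subsampling the same number of sequences from every sample. The Python sketch below shows the idea for one sample; the counts, the function name `rarefy` and the fixed seed are assumptions made for the example, and returning `None` for a sample that is too shallow is this sketch's own convention.

```python
import random


def rarefy(counts, depth, seed=0):
    """Subsample `depth` sequences without replacement from one sample.

    Returns None when the sample has fewer than `depth` sequences.
    """
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if len(pool) < depth:
        return None
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    rarefied = {otu: 0 for otu in counts}
    for otu in rng.sample(pool, depth):
        rarefied[otu] += 1
    return rarefied


# Hypothetical sample with 200 sequences spread over three OTUs
SAMPLE = {"OTU_1": 120, "OTU_2": 60, "OTU_3": 20}

even = rarefy(SAMPLE, depth=146)
print(even, sum(even.values()))
```

Because the subsample is random, rare OTUs can drop out at low depths, which is one reason the choice of depth matters.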

Page 11: Toast 2015 qiime_talk2

Visualizing diversity 3 – Emperor plots

• A cool and useful way to visualize UniFrac is to use Emperor PCoA plots:

• Generate the principal coordinates from your distance matrix:

principal_coordinates.py -i beta_div.txt -o beta_div_coords.txt

• Make a plot from the coordinates and mapping file:

make_emperor.py -i beta_div_coords.txt -m mappingfile.txt -o emperor

• Makes an .html file – open it in the Chrome browser: you can colour by any of the factors in your mapping file and also export .png etc.
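To show what turning a distance matrix into principal coordinates means, here is a small pure-Python sketch of classical PCoA that recovers only the first axis: the squared distances are double-centred and the leading eigenvector is found by power iteration. This is a simplification chosen to avoid external libraries (a full PCoA returns every axis), and the distance matrix is a made-up example of three samples lying on a line.

```python
import math


def gower_center(dist):
    """Double-centre the matrix of -0.5 * squared distances."""
    n = len(dist)
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(a[i]) / n for i in range(n)]
    col_mean = [sum(a[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_mean) / n
    return [[a[i][j] - row_mean[i] - col_mean[j] + grand_mean for j in range(n)]
            for i in range(n)]


def first_axis(dist, iterations=200):
    """Coordinates of each sample on the first principal coordinate axis."""
    b = gower_center(dist)
    n = len(b)
    vec = [float(i + 1) for i in range(n)]  # arbitrary starting vector
    for _ in range(iterations):
        nxt = [sum(b[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            break
        vec = [v / norm for v in nxt]
    # Rayleigh quotient gives the eigenvalue for the unit-length eigenvector
    eigenvalue = sum(vec[i] * b[i][j] * vec[j] for i in range(n) for j in range(n))
    scale = math.sqrt(max(eigenvalue, 0.0))
    return [v * scale for v in vec]


# Hypothetical distances between three samples at positions 0, 1 and 3 on a line
DISTANCES = [
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
]

coords = first_axis(DISTANCES)
print([round(c, 3) for c in coords])
```

In this toy case the spacing between the coordinates reproduces the original distances; with real UniFrac matrices the first few axes only approximate them, which is why plots report the percentage of variation each axis explains.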

Page 12: Toast 2015 qiime_talk2

Hands on – diversity analysis

core_diversity_analyses.py -o moving_pictures_tutorial-1.8.0/illumina/otus/cd258/ --suppress_alpha_diversity -i moving_pictures_tutorial-1.8.0/illumina/otus/otu_table_mc2_w_tax_no_pynast_failures.biom -m moving_pictures_tutorial-1.8.0/illumina/combined_mapping_file.txt -t moving_pictures_tutorial-1.8.0/illumina/otus/rep_set.tre -e 258 -c "SampleType,days_since_epoch"

Page 13: Toast 2015 qiime_talk2

O.T.U. networks

make_otu_network.py -i otu_table.biom -m map.txt -o otu_network

• Makes a network displaying which OTUs are shared between samples

• Input for Cytoscape: http://qiime.org/tutorials/making_cytoscape_networks.html

• Network tutorial…
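Conceptually, an OTU network of this kind links each sample to every OTU observed in it, so OTUs connected to more than one sample are the shared ones. The Python sketch below builds such an edge list from a toy table; the table, the function names and the edge format are assumptions for illustration and do not reproduce the files that make_otu_network.py writes.

```python
# Hypothetical toy table: OTU id -> counts per sample
OTU_TABLE = {
    "OTU_1": {"S1": 4, "S2": 0},
    "OTU_2": {"S1": 2, "S2": 7},
    "OTU_3": {"S1": 0, "S2": 1},
}


def otu_network_edges(otu_table):
    """Return (sample, otu, count) edges for every OTU observed in a sample."""
    edges = []
    for otu, sample_counts in sorted(otu_table.items()):
        for sample, count in sorted(sample_counts.items()):
            if count > 0:
                edges.append((sample, otu, count))
    return edges


def shared_otus(edges):
    """OTUs that are connected to more than one sample."""
    samples_by_otu = {}
    for sample, otu, _count in edges:
        samples_by_otu.setdefault(otu, set()).add(sample)
    return sorted(otu for otu, samples in samples_by_otu.items() if len(samples) > 1)


edges = otu_network_edges(OTU_TABLE)
for edge in edges:
    print(edge)
print("shared:", shared_otus(edges))
```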

Page 14: Toast 2015 qiime_talk2

Hands on – making a network

make_otu_network.py -i moving_pictures_tutorial-1.8.0/illumina/otus_denovo/otu_table.biom -o moving_pictures_tutorial-1.8.0/illumina/otus_denovo/otu_network -m moving_pictures_tutorial-1.8.0/illumina/combined_mapping_file.txt

Visualizing in Cytoscape:

http://qiime.org/tutorials/making_cytoscape_networks.html

Page 15: Toast 2015 qiime_talk2

• That’s the core workflow… QIIME has many other functions:

• http://qiime.org/1.8.0/scripts/

• Useful functions for manipulating sequence data (eg filtering, sorting, format changes)

• Stats…..

Page 16: Toast 2015 qiime_talk2

Software references: QIIME Caporaso et al 2010. QIIME allows analysis of high-throughput community sequencing data. Nature Methods 7(5): 335-336.

UCLUST Edgar RC. 2010. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26(19):2460-2461.

BLAST Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol 215(3):403-410.

Greengenes McDonald et al 2012. An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea. ISME J 6(3): 610–618.

RDP Classifier Wang Q, Garrity GM, Tiedje JM, Cole JR. 2007. Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microb 73(16): 5261-5267.

PyNAST Caporaso JG et al 2010. PyNAST: a flexible tool for aligning sequences to a template alignment. Bioinformatics 26:266-267.

ChimeraSlayer Haas BJ, Gevers D, Earl AM, Feldgarden M, Ward DV, Giannoukos G, et al. 2011. Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons. Genome Research 21:494-504.

MUSCLE Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32(5):1792-1797.

FastTree Price MN, Dehal PS, Arkin AP. 2010. FastTree 2-Approximately Maximum-Likelihood Trees for Large Alignments. PLoS One 5(3).

UNIFRAC Lozupone C, Knight R. 2005. UniFrac: a new phylogenetic method for comparing microbial communities. Appl Environ Microbiol 71(12): 8228-8235.

Emperor Vazquez-Baeza Y, Pirrung M, Gonzalez A, Knight R. 2013. Emperor: A tool for visualizing high-throughput microbial community data. Gigascience 2(1):16.

Page 17: Toast 2015 qiime_talk2

-OMICs are not the answer…

…unless you are asking the right question

Page 18: Toast 2015 qiime_talk2

Sampling design - Temporal

Page 19: Toast 2015 qiime_talk2

Sampling design - Spatial

Page 20: Toast 2015 qiime_talk2
Page 21: Toast 2015 qiime_talk2
Page 22: Toast 2015 qiime_talk2
Page 23: Toast 2015 qiime_talk2

Network Analysis Terminology

NODE – a variable

EDGE – a connection / interaction between 2 variables

MODULE – a defined set of nodes and edges

We can describe any interactions….

Page 24: Toast 2015 qiime_talk2

Network Analysis

Page 25: Toast 2015 qiime_talk2

Network Analysis

Page 26: Toast 2015 qiime_talk2

Network Analysis

Page 27: Toast 2015 qiime_talk2

Network Analysis

Page 28: Toast 2015 qiime_talk2

New trends in networks:

• Scale-free networks – “hubs” are nodes with a high degree of connectivity, e.g. Google, or keystone bacteria that strongly correlate with environmental variables

• Comparative network analysis – i.e. resilience and connectivity

• What is the minimum amount of information we need to predict microbial community dynamics? …remote sensing, models etc.
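The notion of a hub can be made concrete by counting each node's degree, i.e. how many edges touch it. The Python sketch below does this for a small made-up edge list; the node names and the degree cut-off of 3 are arbitrary assumptions, since what counts as a hub depends on the network.

```python
from collections import Counter

# Hypothetical edges: each pair is a connection between two nodes
EDGES = [
    ("OTU_1", "OTU_2"),
    ("OTU_1", "OTU_3"),
    ("OTU_1", "temperature"),
    ("OTU_1", "salinity"),
    ("OTU_2", "OTU_3"),
    ("OTU_4", "salinity"),
]


def node_degrees(edges):
    """Count how many edges touch each node."""
    degrees = Counter()
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees


def hubs(edges, min_degree):
    """Nodes whose degree is at least `min_degree` (an arbitrary cut-off)."""
    return sorted(node for node, degree in node_degrees(edges).items()
                  if degree >= min_degree)


degrees = node_degrees(EDGES)
print(degrees.most_common())
print("hubs:", hubs(EDGES, min_degree=3))
```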

Page 29: Toast 2015 qiime_talk2

Some other software to play with…

PRIMER-E http://www.primer-e.com/

PhyloSift http://phylosift.wordpress.com/

GroopM http://minillinim.github.io/GroopM/

ARB Everything you needed to know from Ramon

R http://www.r-project.org/

Cytoscape http://www.cytoscape.org/

Page 30: Toast 2015 qiime_talk2

Acknowledgements

Ziggy Marzinelli – Experimental Design

Jason Woodhouse & Mark Brown – Network Analysis

Contact:

[email protected]

[email protected] (UNSW – Sydney)

[email protected] (NTU – Singapore)