Bioinformatics, genomics and transcriptomic case study
11. Dec. 2018
Arnar Pálsson
Institute of biology
2
Acknowledgements
Baldur Kristjánsson
Jóhannes Guðbrandsson
Dagný Ásta Rúnarsdóttir
Ian Dworkin, McMaster Uni.
Sudarshan Chari, Princeton Uni.
3
Genomics and bioinformatics: courses and workshops
● LIF659M 6 ECTS
– Basic concepts of genomics - introduction to bioinformatics
● TÖL504M 6 ECTS
– the algorithmic aspects of bioinformatics
● Next-Gen Sequence Analysis Workshop
– https://angus.readthedocs.io/en/2018/
● EMBL courses
– https://www.embl.de/training/events/
● Wellcome Trust courses
– https://coursesandconferences.wellcomegenomecampus.org/our-events/
● edX
– https://www.edx.org/learn/bioinformatics
4
Principles for transcriptome (genomic) analyses
● There are a few basic points to always keep in mind:
– Biological replication.
– Design experiment to avoid confounding variables.
– Sample individuals (within treatment) randomly!
● For experimental design:
– Quinn & Keough: Experimental Design and Data Analysis for Biologists
5
The basics of experimental design
● There are a few basic points to always keep in mind:
– Biological replication (as much as you can afford) is extremely important. Robustly identifying differentially expressed (DE) genes requires statistical power.
– (Note: this is not how many reads you have for a gene within a sample, but how many biologically/statistically independent samples you have per treatment.)
– Technical replication does not help with statistical power (i.e. don’t split a single sample and run it as two libraries).
6
Biological replication gives far more statistical power than increased sequencing depth within a biological sample!
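A minimal sketch of why this holds, assuming counts follow a negative binomial model with variance mu + phi * mu^2 (the model family used by tools such as DESeq2). The function name and the numbers (mean count 100, dispersion 0.1) are illustrative assumptions, not values from the case study. Deeper sequencing raises mu and shrinks only the sampling-noise term 1/mu; more replicates divide the whole variance, including the biological term phi.

```python
import math

def cv_of_group_mean(n_replicates, mean_count, dispersion):
    """Coefficient of variation of a group's mean count.

    Assumes negative binomial counts: Var = mu + dispersion * mu**2,
    so CV^2 of the mean = (1/mu + dispersion) / n_replicates.
    """
    variance = mean_count + dispersion * mean_count ** 2
    return math.sqrt(variance / n_replicates) / mean_count

# Hypothetical gene: 100 reads on average, biological dispersion 0.1
baseline = cv_of_group_mean(3, 100, 0.1)    # 3 replicates
deeper = cv_of_group_mean(3, 1000, 0.1)     # 3 replicates, 10x the depth
more_reps = cv_of_group_mean(6, 100, 0.1)   # 6 replicates, same depth

print(f"baseline  CV = {baseline:.3f}")
print(f"10x depth CV = {deeper:.3f}")
print(f"2x reps   CV = {more_reps:.3f}")
```

Under these assumptions, doubling the replicates tightens the estimate more than a tenfold increase in depth, because depth cannot reduce the biological variance between individuals.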
7
Design your experiment to avoid confounding your different treatments (sex, nutrition) with each other or with technical variables (lane within a flow cell, variation between flow cells).
• Make diagrams/tables of your experimental design, or use a randomized design.
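One way to build such a design table is sketched below. It assumes samples are barcoded so that several can share a lane, and it spreads every treatment evenly over the lanes in random order, so that lane is never confounded with treatment. The function name, the sample labels, the four sex-by-nutrition groups and the lane count are hypothetical.

```python
import random
from collections import Counter

def assign_lanes(samples, n_lanes, seed=1):
    """Randomized block assignment of (sample_id, treatment) pairs to lanes.

    Within each treatment the samples are shuffled, then dealt out to the
    lanes in turn from a random starting lane.
    """
    rng = random.Random(seed)
    by_treatment = {}
    for sample_id, treatment in samples:
        by_treatment.setdefault(treatment, []).append(sample_id)
    lane_of = {}
    for treatment in sorted(by_treatment):
        ids = by_treatment[treatment]
        rng.shuffle(ids)                 # random order within treatment
        offset = rng.randrange(n_lanes)  # random starting lane
        for i, sample_id in enumerate(ids):
            lane_of[sample_id] = (i + offset) % n_lanes
    return lane_of

# Hypothetical design: 2 sexes x 2 diets, 4 biological replicates each
treatments = ("female_fed", "female_starved", "male_fed", "male_starved")
samples = [(f"{t}_{r}", t) for t in treatments for r in range(1, 5)]
lane_of = assign_lanes(samples, n_lanes=2)

# Design table: number of samples per treatment and lane
design = Counter((t, lane_of[s]) for s, t in samples)
for (t, lane), n in sorted(design.items()):
    print(t, "lane", lane, "samples", n)
```

Printing the table before going to the bench makes an unbalanced or confounded layout visible while it can still be fixed.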
8
Auer & Doerge, Genetics 2010: http://www.genetics.org/content/185/2/405.figures-only
9
A simple truth
There is no technology nor statistical wizardry that can save a poorly planned experiment. The only truly failed experiment is a poorly planned one.
● To consult the statistician after an experiment is finished is often merely to ask him(her) to conduct a post mortem examination. He(she) can perhaps say what the experiment died of.
● Ronald Fisher
Case study: compensation of wing defects in flies
[Figure: the four groups in the study: FVW, Base, CAS, NASC]
10
11
Sampling and library preparation
● Crawling 3rd instar larvae – close to pupation
● Wing discs, epithelial bags of 30,000-50,000 cells
● MagMAX™-96 for Microarrays Total RNA
– Bioanalyzer, RNA integrity and quantity
● Illumina: TruSeq® RNA Sample Prep kit V2
– mRNA to cDNA
– MiSeq (Askja), HiSeq (Decode)
12
Biological and technical variance
Computational analysis
• Quality control by FastQC
• Sequenced reads were mapped to the Drosophila transcriptome/genome and counted
i. Kallisto
ii. STAR-HTSeq
• Statistical analyses were done in R
i. Sleuth
ii. DESeq2
• Estimate and correct for variance/over-dispersion (DESeq2)
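To make "over-dispersion" concrete, here is a simplified method-of-moments sketch. It is not DESeq2's procedure, which fits a model per gene and shrinks the estimates using information from all genes. It assumes counts that are already normalized for library size and a negative binomial variance of mu + phi * mu^2; the function name and the example counts are made up.

```python
from statistics import mean, variance

def moment_dispersion(counts):
    """Method-of-moments estimate of negative binomial dispersion phi.

    Solves Var = mu + phi * mu**2 for phi using the sample mean and the
    sample variance; values below zero (no over-dispersion) are clipped to 0.
    """
    mu = mean(counts)
    if mu == 0:
        return 0.0
    var = variance(counts)  # sample variance (n - 1 denominator)
    return max((var - mu) / mu ** 2, 0.0)

# Made-up normalized counts for one gene in four biological replicates
over_dispersed = moment_dispersion([90, 110, 150, 60])
no_extra_variance = moment_dispersion([10, 10, 10, 10])

print(f"phi (variable gene) = {over_dispersed:.3f}")
print(f"phi (constant gene) = {no_extra_variance:.3f}")
```

With only a handful of replicates such per-gene estimates are very noisy, which is one reason to leave this step to a dedicated package.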
13
14
Conclusion
● Design experiments carefully
– Transcriptome, ChIP-seq, methyl-seq, whatever...
– Avoid confounding variables; use blocks, with randomization within them
– Account for technical factors (kits, people, isolations, lanes...)
– Analyse whole datasets: account for variance / correct for overdispersion
● Transcriptome analysis
– More biological replicates (4-10)
– Drop groups rather than sacrifice biological replicates!
– List your predictions beforehand
– Set up specific contrasts
– Analyze data on transcript and gene level
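As an illustration of what a specific contrast is: a contrast is a weighted sum of group means whose weights add up to zero, written down before looking at the data. The sketch below computes only the point estimate. A real analysis would apply the contrast to fitted model coefficients and report its standard error. The log2 expression values are made up, and the two contrasts are examples rather than the ones used in the case study.

```python
def contrast(group_means, weights):
    """Weighted sum of group means; the weights must sum to zero."""
    if abs(sum(weights.values())) > 1e-9:
        raise ValueError("contrast weights must sum to zero")
    return sum(w * group_means[g] for g, w in weights.items())

# Made-up mean log2 expression of one gene in the four groups
log2_means = {"FVW": 8.0, "Base": 6.0, "CAS": 7.5, "NASC": 6.5}

# Example 1: CAS versus Base (a log2 fold change)
cas_vs_base = contrast(log2_means, {"CAS": 1, "Base": -1})

# Example 2: average of CAS and NASC versus Base
selected_vs_base = contrast(log2_means, {"CAS": 0.5, "NASC": 0.5, "Base": -1})

print("CAS - Base:", cas_vs_base)
print("mean(CAS, NASC) - Base:", selected_vs_base)
```

Each listed prediction can be turned into one such set of weights, which keeps the number of tests small and tied to the hypotheses.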
What do we expect? Theoretical response
15
[Figure: six panels of expected expression (y-axis) across FVW, Base, CAS and NASC (x-axis), one per scenario: No compensation, Wing compensation, Wing indirect compensation, Full compensation, Fitness compensation, Fitness indirect compensation.]