Bioinformatics, genomics and transcriptomic case study 11. Dec. 2018 Arnar Pálsson Institute of biology


Page 1: Bioinformatics, genomics and transcriptomic case studylifvisindi.hi.is/sites/lifvisindi.hi.is/files/bioinffyrirlestur2018short_ap.pdf · Sudarshan Chari, Princeton Uni. 3 Genomics

Bioinformatics, genomics and transcriptomic case study

11. Dec. 2018

Arnar Pálsson

Institute of biology


Acknowledgements

Baldur Kristjánsson

Jóhannes Guðbrandsson

Dagný Ásta Rúnarsdóttir

Ian Dworkin, McMaster Uni.

Sudarshan Chari, Princeton Uni.


Genomics and bioinformatics: Courses and workshops

● LIF659M 6 ECTS

– Basic concepts of genomics - introduction to bioinformatics

● TÖL504M 6 ECTS

– the algorithmic aspects of bioinformatics

● Next-Gen Sequence Analysis Workshop

– https://angus.readthedocs.io/en/2018/

● EMBL courses

– https://www.embl.de/training/events/

● Wellcome trust courses

– https://coursesandconferences.wellcomegenomecampus.org/our-events/

● EDX

– https://www.edx.org/learn/bioinformatics


Principles for transcriptome (genomic) analyses

● There are a few basic points to always keep in mind:

– Biological replication.

– Design experiment to avoid confounding variables.

– Sample individuals (within treatment) randomly!

● For experimental design:

– Quinn & Keough: Experimental Design and Data Analysis for Biologists


The basics of experimental design

● There are a few basic points to always keep in mind:

– Biological replication (as much as you can afford) is extremely important. Robustly identifying differentially expressed (DE) genes requires statistical power.

– (Note: this is not how many reads you have for a gene within a sample, but how many biologically/statistically independent samples you have per treatment.)

– Technical replication does not help with statistical power (i.e. don’t split a single sample and run it as two libraries).


Biological replication gives far more statistical power than increased sequencing depth within a biological sample!
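The point above can be illustrated with a toy variance model. This is only a sketch under stated assumptions, not a power analysis from the lecture: it assumes each sample's measurement carries biological noise plus technical (sampling) noise whose variance shrinks in proportion to sequencing depth. The function name and the two standard deviations are hypothetical values chosen for illustration.

```python
import math


def se_of_group_mean(n_bio, depth_factor, sigma_bio=1.0, sigma_tech=0.5):
    """Standard error of a treatment-group mean in a toy two-level model.

    Assumption: per-sample variance = sigma_bio**2 (biological)
    + sigma_tech**2 / depth_factor (technical, reduced by deeper sequencing).
    Averaging over n_bio independent samples divides that variance by n_bio.
    """
    per_sample_var = sigma_bio**2 + sigma_tech**2 / depth_factor
    return math.sqrt(per_sample_var / n_bio)


baseline = se_of_group_mean(n_bio=3, depth_factor=1)
deeper = se_of_group_mean(n_bio=3, depth_factor=2)     # twice the reads per sample
more_reps = se_of_group_mean(n_bio=6, depth_factor=1)  # twice the samples

print(f"3 replicates, 1x depth: SE = {baseline:.3f}")
print(f"3 replicates, 2x depth: SE = {deeper:.3f}")
print(f"6 replicates, 1x depth: SE = {more_reps:.3f}")
```

In this model, extra depth only shrinks the technical term, while extra biological replicates shrink both terms, so doubling replicates always reduces the standard error more than doubling depth.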


Design your experiment to avoid confounding your different treatments (sex, nutrition) with each other or with technical variables (lane within a flow cell, variation between flow cells).

• Make diagrams/tables of your experimental design, or use a randomized design.
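One way to get such a design table is to generate it. The sketch below is a minimal example of a randomized block layout, assuming a balanced experiment in which every lane receives exactly one sample from each treatment, so that lane is not confounded with treatment. The function name, treatment labels and sample IDs are hypothetical.

```python
import random


def assign_lanes(samples_by_treatment, seed=None):
    """Randomized block design: each lane gets one sample from every treatment.

    samples_by_treatment maps a treatment name to a list of sample IDs.
    All treatments must have the same number of samples (one per lane).
    """
    rng = random.Random(seed)
    sizes = {len(ids) for ids in samples_by_treatment.values()}
    if len(sizes) != 1:
        raise ValueError("balanced design expected: equal replicates per treatment")
    n_lanes = sizes.pop()
    lanes = [[] for _ in range(n_lanes)]
    for treatment, ids in samples_by_treatment.items():
        shuffled = list(ids)
        rng.shuffle(shuffled)  # random lane order within each treatment
        for lane, sample_id in zip(lanes, shuffled):
            lane.append((treatment, sample_id))
    return lanes


# Hypothetical 2 x 2 design (sex x nutrition) with 3 biological replicates each.
design = {
    "female_high": ["FH1", "FH2", "FH3"],
    "female_low": ["FL1", "FL2", "FL3"],
    "male_high": ["MH1", "MH2", "MH3"],
    "male_low": ["ML1", "ML2", "ML3"],
}

for lane_number, lane in enumerate(assign_lanes(design, seed=2018), start=1):
    print(f"lane {lane_number}: {lane}")
```

Fixing the seed makes the layout reproducible, which helps when the table needs to be shared with whoever prepares the libraries.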


Auer & Doerge, Genetics 2010: http://www.genetics.org/content/185/2/405.figures-only


A simple truth

There is no technology nor statistical wizardry that can save a poorly planned experiment. The only truly failed experiment is a poorly planned one.

● To consult the statistician after an experiment is finished is often merely to ask him(her) to conduct a post mortem examination. He(she) can perhaps say what the experiment died of.

(Ronald Fisher)


Case study: compensation of wing defects in flies

[Figure: groups labelled FVW, Base, CAS and NASC]


Sampling and library preparation

● Crawling 3rd instar larvae – close to pupation

● Wing-discs, epithelial bags of 30-50,000 cells

● MagMAX™-96 for Microarrays Total RNA

– Bioanalyzer, RNA integrity and quantity

● Illumina: TruSeq® RNA Sample Prep kit V2

– mRNA to cDNA

– MiSeq (Askja), HiSeq (Decode)


Biological and technical variance


Computational analysis

• Quality control by FastQC

• Sequenced reads were mapped to the Drosophila transcriptome/genome and counted

i. Kallisto

ii. STAR-HTSeq

• Statistical analyses were done in R

i. Sleuth

ii. DESeq2

• Estimate and correct for variance/over-dispersion (DESeq2)
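To make over-dispersion concrete: DESeq2 models counts with a negative binomial distribution, in which the variance is mu + alpha * mu^2, so alpha = 0 reduces to Poisson noise. The sketch below is not DESeq2's estimator (DESeq2 fits dispersions with shrinkage across genes, in R); it is a plain method-of-moments illustration in Python. The function name and the replicate counts are made up for the example.

```python
from statistics import mean, variance


def moment_dispersion(counts):
    """Method-of-moments estimate of alpha in Var = mu + alpha * mu**2.

    Returns 0.0 when the sample variance does not exceed the mean,
    i.e. when there is no over-dispersion relative to Poisson noise.
    """
    mu = mean(counts)
    if mu == 0:
        return 0.0
    var = variance(counts)  # sample variance (n - 1 denominator)
    return max(0.0, (var - mu) / mu**2)


# Hypothetical counts for one gene across five biological replicates.
tight = [98, 103, 95, 104, 100]        # variance below the mean
overdispersed = [40, 180, 75, 130, 75]  # variance far above the mean

print("tight replicates:        alpha =", moment_dispersion(tight))
print("overdispersed replicates: alpha =", moment_dispersion(overdispersed))
```

Both example genes have the same mean count (100), so the difference in alpha comes entirely from how much the replicates disagree, which is why the estimate needs several biological replicates to be meaningful.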


Conclusion

● Design experiments carefully

– Transcriptome, ChIP-seq, methyl-seq, whatever...

– Avoid confounding variables, use blocks, randomization within

– Account for technical factors (kits, people, isolations, lanes...)

– Analyse whole datasets: account for variance / correct for overdispersion

● Transcriptome analysis

– More biological replicates (4-10)

– Drop groups rather than sacrifice bio. replicates!

– List your predictions beforehand

– Set up specific contrasts

– Analyze data on transcriptome and gene level


What do we expect? Outline: theoretical response

[Figure: six panels, each plotting Expression (y-axis) for FVW, Base, CAS and NASC (x-axis). Panel titles: No compensation; Wing compensation; Wing indirect compensation; Full compensation; Fitness compensation; Fitness indirect compensation.]