agccaagcagcaaagttttgctgctgtttatttttgtagctcttactatattctacttttacca ... · 2017. 5. 9. · deep cnns...


Page 1: AGCCAAGCAGCAAAGTTTTGCTGCTGTTTATTTTTGTAGCTCTTACTATATTCTACTTTTACCA ... · 2017. 5. 9. · Deep CNNs can predict and interpret effects of disease-associated genetic variants in relevant

AGCCAAGCAGCAAAGTTTTGCTGCTGTTTATTTTTGTAGCTCTTACTATATTCTACTTTTACCATTGAAAATATTGAGGAAGTTATTTATATTTCTATTTTTTATATATTATATATTTTATGTATTTTAATATTACTATTACACATAATTATTTTTTATATATATGAAGTACCAATGACTTCCTTTTCCAGAGCAATAATGAAATTTCACAGTATGAAAATGGAAGAAATCAATAAAATTATACGTGACCTGTGGCGAAGTACCTATCGTGGACAAGGTGAGTACCATGGTGTATCACAAATGCTCTTTCCAAAGCCCTCTCCGCAGCTCTTCCCCTTATGACCTCTCATCATGCCAGCATTACCTCCCTGGACCCCTTTCTAAGCATGTCTTTGAGATTTTCTAAGAATTCTTATCTTGGCAACATCTTGTAGCAAGAAAATGTAAAGTTTTCTGTTCCAGAGCCTAACAGGACTTACATATTTGACTGCAGTAGGCATTATATTTAGCTGATGACATAATAGGTTCTGTCATAGTGTAGATAGGGATAAGCCAAAATGCAATAAGAAAAACCATCCAGAGGAAACTCTTTTTTTTTTCTTTTTCTTTTTTTTTTTTCCAGATGGAGTCTCGCACTTCTCTGTCACCCGGGCTGGAGCGCAGTGGTGCAATCTTGGCTCACTGCAACCTCCACCTCCTGGGTTCAGGTGATTCTCCCACCTCAGCCTCCCGAGTAGTAGCTGGAATTACAGGTGCGCGCTCCCACACCTGGCTAATTTTTTGTATTCTTAGTAGAGATGGGGTTTCACCATGTTGGCCAGGCTGGTCTCAAACTCCTGCCCTCAGGTGATCTGCCCACCTTGGCCTCCCAGTGTTGGGTTTACAGGCGTGAGCCACCGCGCCTGGCCTGGAGGAAACTCTTAACAGGGAAACTAAGAAAGAGTTGAGGCTGAGGAACTGGGGCATCTGGGTTGCTTCTGGCCAGACCACCAGGCTCTTGAATCCTCCCAGCCAGAGAAAGAGTTTCCACACCAGCCATTGTTTTCCTCTGGTAATGTCAGCCTCATCTGTTGTTCCTAGGCTTACTTGATATGTTTGTAAATGACAAAAGGCTACAGAGCATAGA

Deep learning approaches to decode the human genome

Anshul Kundaje

Genetics, Computer Science

Stanford University

http://anshul.kundaje.net

Page 2:

TGCCAAGCAGCAAAGTTTTGCTGCTGTTTATTTTTGTAGCTCTTACTATATTCTACTTTTACCATTGAAAATATTGAGGAAGTTATTTATATTTCTATTTTTTATATATTATATATTTTATGTATTTTAATATTACTATTACACATAATTATTTTTTATATATATGAAGTACCAATGACTTCCTTTTCCAGAGCAATAATGAAATTTCACAGTATGAAAATGGAAGAAATCAATAAAATTATACGTGACCTGTGGCGAAGTACCTATCGTGGACAAGGTGAGTACCATGGTGTATCACAAATGCTCTTTCCAAAGCCCTCTCCGCAGCTCTTCCCCTTATGACCTCTCATCATGCCAGCATTACCTCCCTGGACCCCTTTCTAAGCATGTCTTTGAGATTTTCTAAGAATTCTTATCTTGGCAACATCTTGTAGCAAGAAAATGTAAAGTTTTCTGTTCCAGAGCCTAACAGGACTTACATATTTGACTGCAGTAGGCATTATATTTAGCTGATGACATAATAGGTTCTGTCATAGTGTAGATAGGGATAAGCCAAAATGCAATAAGAAAAACCATCCAGAGGAAACTCTTTTTTTTTTCTTTTTCTTTTTTTTTTTTCCAGATGGAGTCTCGCACTTCTCTGTCACCCGGGCTGGAGCGCAGTGGTGCAATCTTGGCTCACTGCAACCTCCACCTCCTGGGTTCAGGTGATTCTCCCACCTCAGCCTCCCGAGTAGTAGCTGGAATTACAGGTGCGCGCTCCCACACCTGGCTAATTTTTTGTATTCTTAGTAGAGATGGGGTTTCACCATGTTGGCCAGGCTGGTCTCAAACTCCTGCCCTCAGGTGATCTGCCCACCTTGGCCTCCCAGTGTTGGGTTTACAGGCGTGAGCCACCGCGCCTGGCCTGGAGGAAACTCTTAACAGGGAAACTAAGAAAGAGTTGAGGCTGAGGAACTGGGGCATCTGGGTTGCTTCTGGCCAGACCACCAGGCTCTTGAATCCTCCCAGCCAGAGAAAGAGTTTCCACACCAGCCATTGTTTTCCTCTGGTAATGTCAGCCTCATCTGTTGTTCCTAGGCTTACTTGATATGTTTGTAAATGACAAAAGGCTACAGAGCATAGGTTCCTCTAAAATATTCTTCTTCCTGTGTCAGATATTGAATACATAGAAATACGGTCTGATGCCGATGAAAATGTATCAGCTTCTGATAAAAGGCGGAATTATAACTACCGAGTGGTGATGCTGAAGGGAGACACAGCCTTGGATATGCGAGGACGATGCAGTGCTGGACAAAAGGCAGGTATCTCAAAAGCCTGGGGAGCCAACTCACCCAAGTAACTGAAAGAGAGAAACAAACATCAGTGCAGTGGAAGCACCCAAGGCTACACCTGAATGGTGGGAAGCTCTTTGCTGCTATATAAAATGAATCAGGCTCAGCTACTATTATT …………

2003

~ 3 billion nucleotides

The Human Genome Project

Page 3:

Population sequencing to identify disease-associated genetic variants

Statistically significant association?

Oxford Nanopore technology

Page 4:

~ 3 billion nucleotides

TGCCAAGCAGCAAAGTTTTGCTGCTGTTTATTTTTGTAGCTCTTACTATATTCTACTTTTACCATTGAAAATATTGAGGAAGTTATTTATATTTCTATTTTTTATATATTATATATTTTATGTATTTTAATATTACTATTACACATAATTATTTTTTATATATATGAAGTACCAATGACTTCCTTTTCCAGAGCAATAATGAAATTTCACAGTATGAAAATGGAAGAAATCAATAAAATTATACGTGACCTGTGGCGAAGTACCTATCGTGGACAAGGTGAGTACCATGGTGTATCACAAATGCTCTTTCCAAAGCCCTCTCCGCAGCTCTTCCCCTTATGACCTCTCATCATGCCAGCATTACCTCCCTGGACCCCTTTCTAAGCATGTCTTTGAGATTTTCTAAGAATTCTTATCTTGGCAACATCTTGTAGCAAGAAAATGTAAAGTTTTCTGTTCCAGAGCCTAACAGGACTTACATATTTGACTGCAGTAGGCATTATATTTAGCTGATGACATAATAGGTTCTGTCATAGTGTAGATAGGGATAAGCCAAAATGCAATAAGAAAAACCATCCAGAGGAAACTCTTTTTTTTTTCTTTTTCTTTTTTTTTTTTCCAGATGGAGTCTCGCACTTCTCTGTCACCCGGGCTGGAGCGCAGTGGTGCAATCTTGGCTCACTGCAACCTCCACCTCCTGGGTTCAGGTGATTCTCCCACCTCAGCCTCCCGAGTAGTAGCTGGAATTACAGGTGCGCGCTCCCACACCTGGCTAATTTTTTGTATTCTTAGTAGAGATGGGGTTTCACCATGTTGGCCAGGCTGGTCTCAAACTCCTGCCCTCAGGTGATCTGCCCACCTTGGCCTCCCAGTGTTGGGTTTACAGGCGTGAGCCACCGCGCCTGGCCTGGAGGAAACTCTTAACAGGGAAACTAAGAAAGAGTTGAGGCTGAGGAACTGGGGCATCTGGGTTGCTTCTGGCCAGACCACCAGGCTCTTGAATCCTCCCAGCCAGAGAAAGAGTTTCCACACCAGCCATTGTTTTCCTCTGGTAATGTCAGCCTCATCTGTTGTTCCTAGGCTTACTTGATATGTTTGTAAATGACAAAAGGCTACAGAGCATAGGTTCCTCTAAAATATTCTTCTTCCTGTGTCAGATATTGAATACATAGAAATACGGTCTGATGCCGATGAAAATGTATCAGCTTCTGATAAAAGGCGGAATTATAACTACCGAGTGGTGATGCTGAAGGGAGACACAGCCTTGGATATGCGAGGACGATGCAGTGCTGGACAAAAGGCAGGTATCTCAAAAGCCTGGGGAGCCAACTCACCCAAGTAACTGAAAGAGAGAAACAAACATCAGTGCAGTGGAAGCACCCAAGGCTACACCTGAATGGTGGGAAGCTCTTTGCTGCTATATAAAATGAATCAGGCTCAGCTACTATTATT …………

Function?

Decoding genome function

Page 5:

ACCAGTTACGACGG

TCAGGGTACTGATA

CCCCAAACCGTTGA

CCGCATTTACAGAC

GGGGTTTGGGTTTT

GCCCCACACAGGTA

CGTTAGCTACTGGT

TTAGCAATTTACCG

TTACAACGTTTACA

GGGTTACGGTTGGG

ATTTGAAAAAAAGT

TTGAGTTGGTTTTT

TCACGGTAGAACGT

ACCTTACAAA…………

One genome many cell types

http://www.roadmapepigenomics.org/

Page 6:

Biochemical markers of cell-type specific functional elements

Active gene / Repressed gene

Protein

https://www.broadinstitute.org/news/1504

Control elements

99% non-coding

1.5% protein coding

Page 7:

100s of Cell-Types/Tissues

Figure axis: 100s of cell types and tissues

NIH funded collaborative consortia

Machine learning, probabilistic models, deep learning

Identifying tissue-specific control elements

Interpreting disease-associated genetic variation

Learning sequence code of control elements

Page 8:

Active control elements / Active genes

Repressed elements

• ~20,000 genes

• ~2 million novel putative control elements!

• cell-type specific activity

A comprehensive functional annotation of the human genome

Page 9:

2M control elements show highly modular tissue-specific activity

• ~20,000 genes

• ~2 million novel putative control elements!

• modular tissue-specific activity!

Figure: activity of 2M control elements across 100s of tissues (active vs. inactive)

Page 10:

Decoding DNA words and grammars that specify tissue-specific control elements

Regulatory proteins bind DNA words (landing pads) in control elements!

‘Motif Discovery’

Page 11:

Learning discriminative DNA words from tissue-specific control element sequences

Training input sequences (X)

Classification function F(X)

Training output labels (Y)

Class = +1: sequences of control elements active in Tissue 1

Class = -1: sequences of control elements NOT active in Tissue 1 but active in other tissues

‘Training’ means learning the function F(X) from multiple input, output pairs (X,Y)
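The training setup above can be sketched with a much simpler learner than the deep CNNs in this talk. The following is a minimal illustration only: it swaps in a perceptron over k-mer ("DNA word") counts, and the sequences, labels, and function names are all hypothetical, chosen so that the positive class contains the word GATA.

```python
from collections import Counter


def kmer_counts(seq, k=4):
    """Count every overlapping k-mer ("DNA word") in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def predict(weights, bias, feats):
    """F(X): the sign of a weighted sum of k-mer counts."""
    score = bias + sum(weights.get(kmer, 0.0) * n for kmer, n in feats.items())
    return 1 if score > 0 else -1


def train_perceptron(pairs, k=4, max_epochs=1000):
    """Learn F from (sequence, label) pairs using the perceptron update rule."""
    weights, bias = {}, 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for seq, label in pairs:
            feats = kmer_counts(seq, k)
            if predict(weights, bias, feats) != label:
                # Move the weights toward the correct label for this example.
                for kmer, n in feats.items():
                    weights[kmer] = weights.get(kmer, 0.0) + label * n
                bias += label
                mistakes += 1
        if mistakes == 0:  # every training pair is classified correctly
            break
    return weights, bias


# Hypothetical toy data: +1 = active in Tissue 1, -1 = active elsewhere only.
training_pairs = [
    ("CCGATAAC", +1),
    ("TTGATATG", +1),
    ("AGATACGT", +1),
    ("CCGTTAAC", -1),
    ("TTCACGTG", -1),
    ("ACGTACGT", -1),
]

weights, bias = train_perceptron(training_pairs)
predictions = [predict(weights, bias, kmer_counts(s)) for s, _ in training_pairs]
```

A linear model over fixed-length words cannot capture word combinations or spacing, which is part of the motivation for the convolutional models on the following slides.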

Page 12:

C G A T A A C C G A T A T

Learned pattern detectors

One-hot encoded input: DNA sequence represented as ones and zeros

Later layers build on patterns of previous layer

Binary Output: Active (1) vs Inactive (0)

Deep convolutional neural network (CNN) on DNA sequence inputs

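One-hot encoding of the input sequence (columns A, C, G, T):

C  0 1 0 0
G  0 0 1 0
A  1 0 0 0
T  0 0 0 1
A  1 0 0 0
A  1 0 0 0
C  0 1 0 0
C  0 1 0 0
G  0 0 1 0
A  1 0 0 0
T  0 0 0 1
A  1 0 0 0
T  0 0 0 1

Is seq. active in cell type 1?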
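As a small sketch of the one-hot encoding on this slide, the function below maps each base to a row of four ones and zeros in A, C, G, T column order. The function name `one_hot` is an illustrative choice, not taken from the talk.

```python
BASES = "ACGT"  # column order used on the slide


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T)."""
    rows = []
    for base in seq.upper():
        if base not in BASES:
            raise ValueError(f"unexpected base: {base!r}")
        rows.append([1 if base == b else 0 for b in BASES])
    return rows


encoded = one_hot("CGATAACCGATAT")
for base, row in zip("CGATAACCGATAT", encoded):
    print(base, row)
```

Each row contains exactly one 1, so the encoded sequence is a length-by-4 matrix that a convolutional layer can scan along the sequence axis.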

Page 13:

Multi-task deep CNNs learn discriminative DNA word pattern detectors

Millions of input sequences of control elements

Convolutional layers: neurons learn DNA word pattern detectors

Score sequence using filters

Deeper conv. layers learn DNA word combinations (grammars)

Multi-task learning: Is seq. active in cell type 1? Is seq. active in cell type 2? ... Is seq. active in cell type 100?

Prediction accuracy: mean auROC = 0.82, mean auPRC = 0.65

Similar to Kelley et al. 2016 (Basset) and Zhou et al. 2015 (DeepSEA)
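To make "score sequence using filters" concrete, here is a minimal sketch of a single convolutional filter scanning a one-hot encoded sequence. The filter is set by hand to match the word GATA (in a trained CNN the weights are learned), and the bias of -3 and all function names are assumptions made for illustration.

```python
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of floats in A, C, G, T order."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq]


def conv1d_relu(x, kernel, bias):
    """Slide one filter along the sequence and apply a ReLU to each score."""
    width = len(kernel)
    activations = []
    for start in range(len(x) - width + 1):
        total = bias
        for offset in range(width):
            for ch in range(len(BASES)):
                total += kernel[offset][ch] * x[start + offset][ch]
        activations.append(max(0.0, total))
    return activations


# Hand-set filter: weight 1 on each base of GATA, so the raw score is the
# number of matching positions in the window. With bias -3, only a perfect
# 4-of-4 match survives the ReLU.
gata_filter = one_hot("GATA")
activations = conv1d_relu(one_hot("CGATAACCGATAT"), gata_filter, bias=-3.0)
pooled = max(activations)  # max-pooling over positions
print(activations, pooled)
```

In a multi-task network, many such filters feed deeper layers that combine them, and a final layer emits one activity prediction per cell type.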

Page 14:

C G A T A A C C G A T A T

Is seq. active in cell type 1?

Is seq. active in cell type 2?

How can we identify important parts of the input sequences?

In-silico mutagenesis

• inefficient

• misleading results due to saturation/buffering

(Figure: input sequence A ? G T A C T C G T, with the mutated position marked “?”)

Alipanahi et al. 2015; Zhou & Troyanskaya 2015; Kelley et al. 2016
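The two drawbacks listed above can be seen in a small sketch of in-silico mutagenesis. Both scoring functions are hypothetical stand-ins for a trained model: one counts GATA occurrences, the other saturates once a single GATA is present.

```python
BASES = "ACGT"


def count_gata(seq):
    """Number of (possibly overlapping) GATA occurrences in seq."""
    return sum(seq[i:i + 4] == "GATA" for i in range(len(seq) - 3))


def linear_model(seq):
    """Toy model whose output grows with every GATA copy."""
    return float(count_gata(seq))


def saturating_model(seq):
    """Toy model that is already maximal with one GATA copy."""
    return min(1.0, float(count_gata(seq)))


def in_silico_mutagenesis(seq, model):
    """Score every single-base substitution relative to the original."""
    baseline = model(seq)
    n_calls = 1
    effects = []
    for pos, ref in enumerate(seq):
        row = {}
        for alt in BASES:
            if alt == ref:
                continue
            mutant = seq[:pos] + alt + seq[pos + 1:]
            row[alt] = model(mutant) - baseline
            n_calls += 1
        effects.append(row)
    return effects, n_calls


seq = "CGATAACCGATAT"  # contains two GATA copies
linear_effects, n_calls = in_silico_mutagenesis(seq, linear_model)
saturated_effects, _ = in_silico_mutagenesis(seq, saturating_model)
```

The scan needs three model evaluations per position in addition to the baseline, which is the inefficiency. With the saturating model, every single mutation leaves the output unchanged because the second GATA copy buffers the loss of the first, so both words look unimportant.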

Page 15:

C G A T A A C C G A T A T

Is seq. active in cell type 1?

Is seq. active in cell type 2?

Efficient “Backpropagation” based approaches

(Figure: one-hot encoded input for the sequence above, the output “Is seq. active in cell type 1?”, and highlighted bases G A T A C C G A A)

Gradient based methods

• Saliency maps (Simonyan 2013)

• Deconv networks (Zeiler, Fergus 2013)

• Guided backprop (Springenberg 2014)

• Layerwise relevance propagation (Bach 2015)

• Integrated gradients (Sundararajan 2016)

Avanti Shrikumar, Peyton Greenside

DeepLIFT

Shrikumar et al. Learning Important Features Through Propagating Activation Differences

https://arxiv.org/abs/1704.02685

CODE: https://github.com/kundajelab/deeplift
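The difference between gradient-based scores and a difference-from-reference score can be sketched on a tiny saturating network. This is a simplified illustration of the Rescale-rule idea (multiplier = change in output / change in input across a nonlinearity), not the API of the kundajelab/deeplift package. The toy network, the all-zero reference, and the function names are assumptions.

```python
def relu(z):
    return max(0.0, z)


def forward(i1, i2):
    """Toy network: y = 1 - ReLU(1 - i1 - i2), which saturates at y = 1."""
    z = 1.0 - i1 - i2
    h = relu(z)
    return z, h, 1.0 - h


def gradient_times_input(x):
    """Gradient of y with respect to each input, multiplied by the input."""
    z, _, _ = forward(*x)
    dh_dz = 1.0 if z > 0 else 0.0
    dy_dx = (-1.0) * dh_dz * (-1.0)  # chain rule: dy/dh * dh/dz * dz/dx
    return [dy_dx * xi for xi in x]


def rescale_contributions(x, ref):
    """Difference-from-reference contributions using a Rescale-style multiplier."""
    z, h, _ = forward(*x)
    z0, h0, _ = forward(*ref)
    dz = z - z0
    # Across the ReLU, use (change in output) / (change in input) instead of
    # the local gradient; fall back to the gradient if the input is unchanged.
    m_relu = (h - h0) / dz if dz != 0 else (1.0 if z > 0 else 0.0)
    multiplier = (-1.0) * m_relu * (-1.0)
    return [multiplier * (xi - ri) for xi, ri in zip(x, ref)]


x, ref = (1.0, 1.0), (0.0, 0.0)
grad_scores = gradient_times_input(x)
rescale_scores = rescale_contributions(x, ref)
delta_y = forward(*x)[2] - forward(*ref)[2]
```

At the input (1, 1) the ReLU is switched off, so the gradient is zero and gradient-based scores assign no importance to either input. The reference-based contributions instead sum to the change in output between the reference and the input.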

Page 16:

DeepLIFT identifies combinatorial grammars of DNA words defining tissue-specific control elements!

Shrikumar et al. https://arxiv.org/abs/1704.02685

CODE: https://github.com/kundajelab/deeplift

Page 17:

Distinct combinations of DNA words can activate the same control element in different tissues

Peyton Greenside

Figure: control element sequence (axis: position along sequence)

Panels: DeepLIFT scores, Tissue: Blood stem cells; DeepLIFT scores, Tissue: Red blood cells

DNA word labels: SPI1; Gata (Rc), Gata (Rc), Gata, SPI1

Validation experiment results: SPI1 protein binding data; GATA1 protein binding data

Page 18:

Decoding tissue-specific combinatorial grammars in millions of genomic control elements!

Peyton Greenside

Page 19:

MoDISCO: Identifying recurring DNA words across control elements

Insight: filter contributions are resolved at the nucleotide level

Figure: Δprob tracks for Sequence 1, Sequence 2, and Sequence 3

Avanti Shrikumar, Peyton Greenside

Page 20:

We learn 1000s of known and novel DNA words defining tissue-specific control elements!

Page 21:

Can deep CNNs trained on control elements be useful for understanding disease-associated genetic variants?

> 1000 population sequencing studies of diverse diseases

> 90% of complex disease-associated variants are not in genes. Highly enriched in control elements!

Page 22:

Deep CNNs can predict and interpret effects of disease-associated genetic variants in relevant tissue context

Original prediction: 0.558, 0.528, 0.554, 0.969, 0.960, 0.889

Mutated prediction: 0.543, 0.583, 0.557, 0.926, 0.900, 0.756

Difference (Percent): -1.5%, +5.4%, 0.3%, -4.3%, -5.9%, -13.2%

Figure panels: unstimulated coronary smooth muscle cells; stimulated coronary smooth muscle cells from patients

Breaking the ‘C’ results in a significant drop in the probability of an active control element!

• Breaks the ‘C’ in the TGACTCA DNA word, which is the binding site for an important protein (AP1).

• Variant specifically manifests in stimulated cells

A genetic variant C -> T strongly associated with coronary heart disease
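The differences on this slide appear to be changes in predicted probability expressed in percentage points (mutated minus original, times 100). That reading is an assumption; the sketch below recomputes the changes from the listed predictions. Three recomputed values differ from the slide by 0.1, presumably because the slide's predictions are rounded to three decimals.

```python
# Values copied from the slide; column meanings are not labeled in the transcript.
original = [0.558, 0.528, 0.554, 0.969, 0.960, 0.889]
mutated = [0.543, 0.583, 0.557, 0.926, 0.900, 0.756]
reported = [-1.5, 5.4, 0.3, -4.3, -5.9, -13.2]


def percentage_point_change(ref_prob, alt_prob):
    """Change in predicted probability of activity, in percentage points."""
    return (alt_prob - ref_prob) * 100.0


changes = [percentage_point_change(o, m) for o, m in zip(original, mutated)]
largest_drop = min(range(len(changes)), key=lambda i: changes[i])

for i, (c, r) in enumerate(zip(changes, reported)):
    print(f"column {i + 1}: recomputed {c:+.1f}, slide {r:+.1f}")
```

Scoring a variant this way only requires two forward passes per model output, one for the reference allele and one for the alternate allele.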

Page 23:

Future of personalized medicine

Personal genome sequences

Personal functional genomic data

Electronic medical records / Clinical data / biometrics / Literature mining

Longitudinal data

Domain-specific machine learning + AI

Rapid interpretation of personal genomes

Data-driven personal diagnosis (cause rather than symptoms)

Drug target identification and design

Optimal treatment regimens

Page 24:

How to train your DRAGONN

Deep RegulAtory GenOmic Neural Nets

http://kundajelab.github.io/dragonn/

Interactive Cloud based tutorials on deep learning on genomic sequence

Johnny Israeli

Page 25:

Acknowledgements

Will Greenleaf

Chuan Sheng Foo

Johnny Israeli

Avanti Shrikumar

Peyton Greenside

Chris Probert

Irene Kaplow

Kundaje Lab members

Funding: R01ES02500902, U41-HG007000-04S1, U01HG007919-02 (GGR)

Conflict of Interest: Deep Genomics (SAB), Epinomics (SAB)