the role of machine learning in modelling the cell

The Role of Machine Learning in Modelling the Cell. John Hawkins ARC Centre for Complex Systems University of Queensland Australia


Post on 19-Jun-2015


Page 1: The role of machine learning in modelling the cell

The Role of Machine Learning in Modelling the Cell.

John Hawkins

ARC Centre for Complex Systems

University of Queensland

Australia

Page 2: The role of machine learning in modelling the cell

Overview of Talk

Overview of cell biology

Modelling the cell

Subcellular localisation signals

Machine Learning in General

Neural networks

  Feed Forward versus Recurrent

Page 3: The role of machine learning in modelling the cell

Cell Biology – Quick and Dirty

Membrane bound Organelles

Nucleus

DNA -> RNA -> Protein

Transport, e.g.

  Mitochondria

  Peroxisome

Modification, e.g.

  Disulphide Bond Formation

  Glycosylation

Page 4: The role of machine learning in modelling the cell

Cell Feedback

At a particular time point a set of genes will be expressed.

These do not remain constant; instead, the emerging picture is that

  There is some essential cycle of gene expression

  With a capacity to indulge in alternative pathways of expression under external stimulus.

The pattern of expression is implemented by protein and RNA feedback onto the genes.

Page 5: The role of machine learning in modelling the cell

Modelling the cell

Ideally we would like to model the cell from the level of a 3D physical simulation.

Currently this is infeasible.

So numerous approaches are taken to form abstractions

  Gene Regulatory Networks

  Differential equation models of particular pathways

  Machine learning models of particular processes

Page 6: The role of machine learning in modelling the cell

Biological Sequences

Many important biological molecules are polymers.

Thus representable as a sequence of discrete symbols.

Sequence $M = [m_1, m_2, \ldots, m_n]$ where:

  DNA: $m_i \in \{A, T, G, C\}$

  RNA: $m_i \in \{A, U, G, C\}$

  Protein: $m_i \in \{G, A, V, L, I, P, S, C, T, M, D, E, H, K, R, N, Q, F, Y, W\}$
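As a rough illustration of how such symbol sequences can be turned into numeric input for a learning algorithm, the sketch below one-hot encodes a sequence over one of the three alphabets listed above. The function name `one_hot` and the choice of one-hot encoding are assumptions made for this example; the talk does not say which encoding was used.

```python
# Alphabets as listed on the slide (symbol order is arbitrary but fixed).
DNA = "ATGC"
RNA = "AUGC"
PROTEIN = "GAVLIPSCTMDEHKRNQFYW"


def one_hot(sequence, alphabet):
    """Encode each symbol as a 0/1 vector with a single 1 (hypothetical helper)."""
    index = {symbol: i for i, symbol in enumerate(alphabet)}
    encoded = []
    for symbol in sequence:
        if symbol not in index:
            raise ValueError(f"symbol {symbol!r} is not in the alphabet")
        row = [0] * len(alphabet)
        row[index[symbol]] = 1
        encoded.append(row)
    return encoded


dna_encoded = one_hot("GAT", DNA)
```

Each residue becomes a vector whose length equals the alphabet size, so a protein of n residues becomes an n x 20 matrix under this scheme.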

Page 7: The role of machine learning in modelling the cell

Information Content

How much information in a linear sequence?

Two crucial elements to function

  Physical/chemical properties

  Molecular shape

Each residue has well known properties

Denaturation (Anfinsen, 1973).

Sequence defines arrangement of chemical properties, which in turn defines folding.

Page 8: The role of machine learning in modelling the cell

Biological Patterns

Motifs – General term for patterns

Numerous Definitions & Visualisations

  PROSITE Patterns – Regular Expression

  PROSITE Profiles – Probability Matrix

  LOGOs
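To make the link between PROSITE patterns and regular expressions concrete, here is a minimal sketch that translates a simplified subset of the PROSITE pattern syntax into a Python regular expression. The function name `prosite_to_regex` is hypothetical, and only the basic elements are handled (`x`, `[..]`, `{..}`, repeat counts, and `<` / `>` anchors on elements without a repeat count). The pattern used is the well-known N-glycosylation site pattern, and the sequence is a made-up toy string.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simplified PROSITE pattern into a Python regex (sketch only)."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Split an element such as "x(2,4)" into its core and optional counts.
        match = re.fullmatch(r"([^()]+)(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError(f"cannot parse element: {element!r}")
        core, low, high = match.groups()
        prefix = suffix = ""
        if core.startswith("<"):  # anchored to the N-terminus
            prefix, core = "^", core[1:]
        if core.endswith(">"):  # anchored to the C-terminus
            suffix, core = "$", core[:-1]
        if core == "x":  # any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):  # excluded residues
            core = "[^" + core[1:-1] + "]"
        # Bracketed sets and single letters are already valid regex syntax.
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + suffix)
    return "".join(parts)


regex = prosite_to_regex("N-{P}-[ST]-{P}")
hits = [m.start() for m in re.finditer(regex, "AANGTAANPSA")]
```

In the toy sequence the motif is found once; the second N is followed by P, which the pattern excludes.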

Page 9: The role of machine learning in modelling the cell

Peroxisomal Localisation

Predominantly controlled by a C-terminal sequence called the PTS1 signal.

Roughly 12 residues long

Known dependencies between locations
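Since the signal sits at the C-terminus and spans roughly 12 residues, a predictor only needs to look at the tail of each protein. A minimal sketch of extracting that window follows; the function name, the padding symbol and the toy sequences are assumptions for illustration, not details from the talk.

```python
def c_terminal_window(sequence, length=12, pad="-"):
    """Return the last `length` residues, left-padded if the protein is shorter."""
    window = sequence[-length:]
    return window.rjust(length, pad)


# Toy sequences, not real proteins.
tail = c_terminal_window("MKTAYSKL", length=3)
padded = c_terminal_window("SKL", length=5)
```

Padding on the left keeps the final residue in the same position of the window for every protein, which matters if position-specific dependencies are to be learned.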

Page 10: The role of machine learning in modelling the cell

Nuclear Export

Some proteins move continuously between the nucleus and cytoplasm of the cell.

Either as:

  Transporters

  Regulators

Page 11: The role of machine learning in modelling the cell

Machine Learning

Requires a set of examples, with

  Raw input, sequence data, and

  Known classes that the machine should predict

In essence Function Approximation

  Start with a general parametrised function over the input data

  Adjust the parameters until the output of the function is a good approximation to the known classes of the examples.
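The two steps above can be sketched with about the simplest parametrised function available: a weighted sum passed through a logistic function, with the parameters adjusted by gradient descent. This is an illustrative toy under assumed settings (learning rate, number of epochs, and a made-up two-feature data set), not the models described in the talk.

```python
import math


def predict(weights, bias, x):
    """Parametrised function: logistic of a weighted sum of the inputs."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def fit(examples, labels, lr=0.5, epochs=2000):
    """Adjust the parameters by stochastic gradient descent on the log loss."""
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            error = predict(weights, bias, x) - y  # gradient w.r.t. the weighted sum
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Toy examples with known classes (class 1 if either feature is present).
examples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 1]

weights, bias = fit(examples, labels)
predictions = [round(predict(weights, bias, x)) for x in examples]
```

After training, the rounded outputs reproduce the known classes of the examples, which is what "a good approximation" means here.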

Page 12: The role of machine learning in modelling the cell

Bias

Bias is generally unavoidable (Mitchell, 1980)

Three Sources of Bias

  Input Encoding

  Function Structure (Architecture)

  Parameter adjustment algorithm (learning)

Page 13: The role of machine learning in modelling the cell

Neural Networks

Graphical Model consisting of layers of nodes connected by weights

Feed forward neural networks

  Fixed input window

  Signal propagates in a single pass through the layers

Recurrent Neural Networks

  Signal processed in parts

  Recurrent connections maintain a memory state

  Output generated after processing the last piece of the input signal

Page 14: The role of machine learning in modelling the cell

Simple Neural Networks

FFNN: $O_h = S(W_1 \cdot I_1 + W_2 \cdot I_2 + b)$

RNN: $O_h = S(W_1 \cdot I_2 + W_2 \cdot S(W_1 \cdot I_1 + b) + b)$
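The two expressions above can be written out directly for scalar inputs. In this sketch S is assumed to be the logistic sigmoid (the slide does not define it), and the function names are hypothetical. In the recurrent version W1 weights the current input and W2 weights the previous hidden state, so two inputs are consumed one after the other rather than side by side.

```python
import math


def S(z):
    """Squashing function; assumed here to be the logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))


def ffnn_hidden(i1, i2, w1, w2, b):
    """Feed forward: both inputs are seen at once through separate weights."""
    return S(w1 * i1 + w2 * i2 + b)


def rnn_hidden(inputs, w1, w2, b):
    """Recurrent: inputs are processed in order, carrying a memory state."""
    state = 0.0  # empty memory before the first input
    for x in inputs:
        state = S(w1 * x + w2 * state + b)
    return state


# Arbitrary example values.
i1, i2, w1, w2, b = 0.5, -1.0, 0.8, 0.3, 0.1
ff_out = ffnn_hidden(i1, i2, w1, w2, b)
rnn_out = rnn_hidden([i1, i2], w1, w2, b)
```

Because the same weights are reused at every step, `rnn_hidden` accepts an input list of any length, whereas `ffnn_hidden` is tied to a fixed window of two inputs.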

Page 15: The role of machine learning in modelling the cell

RNNs in Bioinformatics

Bi-Directional RNN

Page 16: The role of machine learning in modelling the cell

Applications

We have applied these techniques to predicting subcellular localisation to:

  Endoplasmic Reticulum

  Mitochondria

  Chloroplast

  Peroxisome

http://pprowler.imb.uq.edu.au

Working with whole genome data and wet lab biologists to use these tools for data mining.

Page 17: The role of machine learning in modelling the cell

The End…

?