From Neurons to Neural Networks


Page 1: From Neurons to Neural Networks

From Neurons to Neural Networks

Jeff Knisley, East Tennessee State University

Mathematics of Molecular and Cellular Biology Seminar

Institute for Mathematics and its Applications, April 2, 2008

Page 2: From Neurons to Neural Networks

Outline of the Talk

Brief Description of the Neuron
A “Hot-Spot” Dendritic Model

Classical Hodgkin-Huxley (HH) Model
A Recent Approach to HH Nonlinearity

Artificial Neural Nets (ANN’s)
1957 – 1969: Perceptron Models
1980’s – soon: MLP’s and Others
1990’s – : Neuromimetic (Spiking) Neurons

Page 3: From Neurons to Neural Networks

Components of a Neuron

Dendrites

Soma

Nucleus

Axon

Myelin Sheaths

Synaptic Terminals

Page 4: From Neurons to Neural Networks

Pre-Synaptic to Post-Synaptic

If the threshold is exceeded, then the neuron “fires,” sending a signal along its axon.

Page 5: From Neurons to Neural Networks

Signal Propagation along Axon

The signal is electrical
Membrane depolarization from a resting potential of -70 mV
Myelin acts as an insulator

Propagation is electro-chemical
Sodium channels open at breaks in the myelin
Much higher external sodium ion concentrations
Potassium ions “work against” sodium
Chloride and other influences are also very important
Rapid depolarization at these breaks
The signal travels faster than if it were only electrical

Page 6: From Neurons to Neural Networks

Signal Propagation along Axon

[Figure: alternating bands of membrane charge (+ + + / − − −) undergoing successive charge reversals that propagate along the axon]

Page 7: From Neurons to Neural Networks

Action Potentials

Sodium ion channels open and close, which causes potassium ion channels to open and close.

Page 8: From Neurons to Neural Networks

Action Potentials

[Figure: a model “spike” compared with an actual recorded spike train]

Page 9: From Neurons to Neural Networks

Post-Synaptic May Be Subthreshold

Signals decay at the soma if they are below a certain threshold.

Models begin with a section of a dendrite.

Page 10: From Neurons to Neural Networks

Derivation of the Model

Some Assumptions
Assume the neuron separates R³ into 3 regions: interior (i), exterior (e), and the boundary membrane surface (m).
Assume E_l is the electric field and B_l is the magnetic flux density, where l = e, i.
Maxwell’s equations, with magnetic induction assumed negligible:

$$\nabla \times \mathbf{E}_l = -\frac{\partial \mathbf{B}_l}{\partial t} \approx \mathbf{0}, \qquad l = i, e,$$

so each field is a gradient: E_e = –∇V_e and E_i = –∇V_i for potentials V_l, l = i, e.

Page 11: From Neurons to Neural Networks

Current Densities j_i and j_e

Let σ_l = conductivity 2-tensor, l = i, e.
Intracellular: homogeneous, small radius. Extracellular: ion populations!

Ohm’s law (local):

$$\mathbf{j}_l = \sigma_l \mathbf{E}_l$$

Intracellular: $\nabla \cdot \mathbf{j}_i = 0$, so $\nabla^2 V_i = 0$.

Charges (ions) collect on the outside of the boundary surface (especially Na⁺), where I_m denotes the membrane currents. Thus,

$$\mathbf{j}_e \cdot \mathbf{n} = I_m \qquad \text{and} \qquad \nabla \cdot \left(\sigma_e \nabla V_e\right) = -I_m.$$

[Figure: cylindrical section of dendrite from 0 to L, with intracellular current density j_i and extracellular current density j_e]

Page 12: From Neurons to Neural Networks

Assume: Circular Cross-sections

Let V = V_i – V_e – V_rest be the membrane potential difference, and let R_m, R_i, C be the membrane resistance, intracellular resistance, and membrane capacitance, respectively. Let I_syn be a “catch-all” for ion channel activity.

The membrane current is

$$I_m = C\,\frac{\partial V}{\partial t} + \frac{V}{R_m} + I_{ion},$$

which, combined with the axial current balance, gives Lord Kelvin’s cable equation:

$$\frac{\partial}{\partial x}\!\left(\frac{d}{4 R_i}\,\frac{\partial V}{\partial x}\right) = C\,\frac{\partial V}{\partial t} + \frac{V}{R_m} + I_{syn}$$
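A minimal numerical sketch of this cable equation, using a forward-Euler, central-difference discretization with sealed ends; all parameter values and the midpoint injection site are illustrative choices, not values from the talk:

```python
import numpy as np

# Illustrative (non-physiological) parameters; units are arbitrary.
d, Ri, Rm, C = 0.004, 10.0, 1.0, 1.0   # diameter, axial resistivity, membrane resistance, capacitance
L, T = 1.0, 25.0                        # cable length, simulated time
nx, nt = 101, 1000
dx, dt = L / (nx - 1), T / nt

V = np.zeros(nx)            # membrane potential relative to rest, so V(x, 0) = 0
Isyn = np.zeros(nx)
Isyn[nx // 2] = 0.5         # steady synaptic current injected at the midpoint (hypothetical)

a = d / (4.0 * Ri)          # coefficient of the second spatial derivative
for _ in range(nt):
    Vxx = np.empty(nx)
    Vxx[1:-1] = (V[2:] - 2.0 * V[1:-1] + V[:-2]) / dx**2
    Vxx[0]  = 2.0 * (V[1]  - V[0])  / dx**2   # sealed end: dV/dx = 0
    Vxx[-1] = 2.0 * (V[-2] - V[-1]) / dx**2   # sealed end: dV/dx = 0
    # Forward-Euler step of  C dV/dt = (d / 4Ri) d2V/dx2 - V/Rm + Isyn
    V += (dt / C) * (a * Vxx - V / Rm + Isyn)

print("peak depolarization:", round(V.max(), 4))
```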

Page 13: From Neurons to Neural Networks

Dimensionless Cables

Let

$$\lambda = \sqrt{\frac{R_m\, d}{4 R_i}}, \qquad X = \frac{x}{\lambda}, \qquad \tau_m = R_m C \ (\text{constant}),$$

so that

$$\frac{\partial^2 V}{\partial X^2} = \tau_m\,\frac{\partial V}{\partial t} + V + R_m I_{syn}.$$

Tapered cylinders: use Z instead of X and a taper constant K:

$$\frac{\partial^2 V}{\partial Z^2} + K\,\frac{\partial V}{\partial Z} = \tau_m\,\frac{\partial V}{\partial t} + V + R_m I_{syn}$$

Page 14: From Neurons to Neural Networks

Rall’s Theorem for Untapered

If at each branching the parent diameter and the daughter cylinder diameters satisfy

$$d_{parent}^{3/2} = \sum_{j \in \text{daughters}} d_j^{3/2},$$

then the dendritic tree can be reduced to a single equivalent cylinder.

Equivalent Cylinder
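A small sketch of checking the 3/2-power matching condition; the diameters are made-up examples:

```python
def rall_matched(parent_d, daughter_ds, tol=1e-6):
    """Check Rall's 3/2-power rule: d_parent^(3/2) equals the sum of daughter d^(3/2)."""
    return abs(parent_d ** 1.5 - sum(d ** 1.5 for d in daughter_ds)) < tol

# Two equal daughters of an (illustrative) 2-micron parent must each have
# diameter 2 / 2^(2/3) ~ 1.26 microns for the tree to collapse to one cylinder.
print(rall_matched(2.0, [2.0 / 2 ** (2 / 3)] * 2))   # True
print(rall_matched(2.0, [1.0, 1.0]))                  # False
```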

Page 15: From Neurons to Neural Networks

Dendritic Models

[Figure: the full arbor model and the tapered equivalent cylinder, each attached to the soma]

Page 16: From Neurons to Neural Networks

Tapered Equivalent Cylinder

Rall’s theorem (modified for taper) allows us to collapse to an equivalent cylinder

Assume “hot spots” at x0, x1, …, xm

[Figure: equivalent cylinder attached to the soma, with hot spots at positions 0 ≤ x₀ < x₁ < … < x_m ≤ l]

Page 17: From Neurons to Neural Networks

Ion Channel Hot Spots

(Poznanski) I_j is the current due to the ion channel(s) at the jth hot spot.

The Green’s function G(x, x_j, t) is the solution of the hot-spot equation with I_j as a point source and the other currents set to 0, together with the boundary and initial conditions; that is, the Green’s function solves the equivalent cylinder model.

$$\frac{R_m d}{4 R_i}\,\frac{\partial^2 V}{\partial x^2} = R_m C\,\frac{\partial V}{\partial t} + V + \sum_{j=1}^{n} I_j(t)\,\delta(x - x_j)$$

Page 18: From Neurons to Neural Networks

Equivalent Cylinder Model (Iion = 0)

$$\frac{\partial^2 V}{\partial X^2} = \frac{\partial V}{\partial t} + V, \qquad 0 < X < L,\; t > 0$$

$$\frac{\partial V}{\partial X}(L, t) = 0 \qquad \text{(no current through the end)}$$

$$\tanh(L)\left(\tau_s\,\frac{\partial V}{\partial t}(0, t) + V(0, t)\right) = \frac{\partial V}{\partial X}(0, t) \qquad \text{(soma boundary condition)}$$

$$V(X, 0) = V_{ss}(X) \qquad \text{(steady state from a constant current)}$$

Soma under voltage clamp: V(0, t) = V_clamp.

For the tapered equivalent cylinder model, the equation is of the form

$$\frac{\partial V}{\partial t} = \frac{\partial^2 V}{\partial Z^2} + F(Z)\,\frac{\partial V}{\partial Z} - V$$

Page 19: From Neurons to Neural Networks

Properties

The spectrum consists solely of non-negative eigenvalues.
The eigenvectors are orthogonal under voltage clamp.
The eigenvectors are not orthogonal in the original problem.

Solutions are multi-exponential decays:

$$V(X, t) = \sum_{k=1}^{\infty} C_k(X)\, e^{-t/\tau_k}$$

Linear models are useful for subthreshold activation, assuming the nonlinearities (I_ion) are not arbitrarily close to the soma (and there are no electric field (ephaptic) effects).
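Since the subthreshold response is a sum of decaying exponentials, here is a sketch of recovering time constants from a synthetic trace; the two time constants, weights, and noise level are invented, and scipy's curve_fit is my choice of fitting tool, not something from the talk:

```python
import numpy as np
from scipy.optimize import curve_fit

def two_exp(t, c1, tau1, c2, tau2):
    """V(t) = c1*exp(-t/tau1) + c2*exp(-t/tau2): a two-term version of the expansion above."""
    return c1 * np.exp(-t / tau1) + c2 * np.exp(-t / tau2)

t = np.linspace(0, 10, 400)                           # time in ms (illustrative)
v = two_exp(t, 1.0, 4.0, 0.4, 0.8)                    # synthetic "recording": taus of 4 ms and 0.8 ms
v_noisy = v + 0.005 * np.random.default_rng(0).normal(size=t.size)

params, _ = curve_fit(two_exp, t, v_noisy, p0=[1, 3, 0.5, 1])
print("recovered (c1, tau1, c2, tau2):", np.round(params, 2))
```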

Page 20: From Neurons to Neural Networks

Somatic Voltage Recording

[Figure: somatic voltage recording over 0–10 ms, with labeled features: experimental artifact, multi-exponential decay, ionic channel effects, and saturation to steady state]

Page 21: From Neurons to Neural Networks

Hodgkin-Huxley: Ionic Currents

1963 Nobel Prize in Medicine
The cable equation plus ionic currents (I_syn)
From numerous voltage clamp experiments with the squid giant axon (0.5–1.0 mm in diameter)
Produces action potentials

Ionic channels:
n = potassium activation variable
m = sodium activation variable
h = sodium inactivation variable

Page 22: From Neurons to Neural Networks

Hodgkin-Huxley Equations

$$\frac{d}{4 R_i}\,\frac{\partial^2 V}{\partial x^2} = C\,\frac{\partial V}{\partial t} + g_l\,(V - V_l) + \bar{g}_K\, n^4\, (V - V_K) + \bar{g}_{Na}\, m^3 h\, (V - V_{Na})$$

$$\frac{\partial n}{\partial t} = \alpha_n (1 - n) - \beta_n n, \qquad \frac{\partial m}{\partial t} = \alpha_m (1 - m) - \beta_m m, \qquad \frac{\partial h}{\partial t} = \alpha_h (1 - h) - \beta_h h$$

where any V with a subscript is constant, any g with a bar is constant, and each of the α’s and β’s is of a similar form, e.g.

$$\alpha_n(V) = \frac{10 - V}{100\left(e^{(10 - V)/10} - 1\right)}, \qquad \beta_n(V) = \frac{e^{-V/80}}{8}$$
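A space-clamped (single-compartment) forward-Euler integration of these equations, using the standard Hodgkin-Huxley rate functions consistent with the α_n and β_n above; the injected current amplitude and the spike-counting threshold are arbitrary choices:

```python
import numpy as np

# Standard HH constants (mV, mS/cm^2, uF/cm^2), with V measured from rest.
C, gNa, gK, gl = 1.0, 120.0, 36.0, 0.3
VNa, VK, Vl = 115.0, -12.0, 10.6

def an(V): return 0.01 * (10 - V) / (np.exp((10 - V) / 10) - 1)
def bn(V): return 0.125 * np.exp(-V / 80)
def am(V): return 0.1 * (25 - V) / (np.exp((25 - V) / 10) - 1)
def bm(V): return 4.0 * np.exp(-V / 18)
def ah(V): return 0.07 * np.exp(-V / 20)
def bh(V): return 1.0 / (np.exp((30 - V) / 10) + 1)

dt, steps = 0.01, 5000            # 50 ms at 0.01 ms resolution
V, n, m, h = 0.0, 0.32, 0.05, 0.6
I = 10.0                           # injected current density (arbitrary choice)
spikes = 0
for _ in range(steps):
    Iion = gK * n**4 * (V - VK) + gNa * m**3 * h * (V - VNa) + gl * (V - Vl)
    Vnew = V + dt / C * (I - Iion)
    n += dt * (an(V) * (1 - n) - bn(V) * n)
    m += dt * (am(V) * (1 - m) - bm(V) * m)
    h += dt * (ah(V) * (1 - h) - bh(V) * h)
    if V < 50 <= Vnew:             # crude spike count: upward crossing of 50 mV
        spikes += 1
    V = Vnew
print("action potentials in 50 ms:", spikes)
```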

Page 23: From Neurons to Neural Networks

HH combined with “Hot Spots”

The solution to the equivalent cylinder with hot spots is

$$V(x, t) = V_{initial}(x, t) + \sum_{j=0}^{n} \int_0^t G(x, x_j, t - \tau)\, I_j(\tau)\, d\tau$$

where I_j is the restriction of V to the jth “hot spot.” At a hot spot, V satisfies an ODE of the form

$$C\,\frac{\partial V}{\partial t} = -\,g_l\,(V - V_l) - \bar{g}_K\, n^4\, (V - V_K) - \bar{g}_{Na}\, m^3 h\, (V - V_{Na}),$$

where m, n, and h are functions of V.

Page 24: From Neurons to Neural Networks

Brief description of an Approach to HH ion channel nonlinearities

Goal: accessible approximations that still produce action potentials.

This can be addressed using linear embedding, which is closely related to the method of Turning Variables. It maps a finite-degree polynomially nonlinear dynamical system into an infinite-degree linear system.
The resulting infinite-dimensional linear system is as unmanageable as the original nonlinear equation:
Non-normal operators with continua of eigenvalues
Difficult to project back to the nonlinear system (convergence and stability are thorny)
But the approach still has some value (it produces action potentials).

Page 25: From Neurons to Neural Networks

The Hot-Spot Model “Qualitatively”

$$V(0, t) = \sum_{j=0}^{n} \int_0^t G(0, x_j, t - \tau)\, I_j(\tau)\, d\tau$$

The I_j are inputs from other neurons and from ion channels; the Green’s function G comes from the subthreshold model (Rall equivalent cylinder or full arbor).

Key features: summation of synaptic inputs. If V(0, t) is large, an action potential travels down the axon.

Page 26: From Neurons to Neural Networks

Artificial Neural Network (ANN)

An ANN is made of artificial neurons, each of which:
Sums inputs x_i from other neurons
Compares the sum to a threshold
Sends a signal to other neurons if the sum is above threshold

Synapses have weights, which
Model relative ion collections
Model the efficacy (strength) of the synapse

Page 27: From Neurons to Neural Networks

Artificial Neuron

w_ij = synaptic weight between the ith and jth neurons
θ_j = threshold of the jth neuron
σ = “firing” function that maps the state to the output (a nonlinear firing function)

$$s_i = \sum_j w_{ij}\, x_j, \qquad x_i = \sigma(s_i - \theta_i)$$

[Figure: inputs x₁, x₂, x₃, …, x_n feeding the ith neuron through weights w_i1, w_i2, w_i3, …, w_in]
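A direct translation of this artificial neuron into code, with a logistic sigmoid as the firing function; the sample weights, threshold, and inputs are made up:

```python
import math

def fire(state):
    """Sigmoidal 'firing function' mapping the internal state to an output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-state))

def neuron_output(weights, inputs, threshold):
    """x_i = sigma(s_i - theta_i), where s_i = sum_j w_ij * x_j."""
    s = sum(w * x for w, x in zip(weights, inputs))
    return fire(s - threshold)

# Hypothetical example: three inputs from other neurons
print(neuron_output(weights=[0.5, -1.2, 0.8], inputs=[1.0, 0.3, 0.7], threshold=0.2))
```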

Page 28: From Neurons to Neural Networks

First Generation: 1957 - 1969

These are best understood in terms of classifiers:
Partition a data space into regions containing data points of the same classification.
The regions are predictions of the classification of new data points.

Page 29: From Neurons to Neural Networks

Simple Perceptron Model

Given 2 classes – Reference and Sample.

The firing function (activation function) has only two values, 0 or 1:

$$\text{Output} = \begin{cases} 1 & \text{if the input is from the Sample} \\ 0 & \text{if the input is from the Reference} \end{cases}$$

“Learning” is by incremental updating of the weights w₁, w₂, …, w_n using a linear learning rule.
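A sketch of the incremental perceptron update on a fabricated, linearly separable two-class data set; the learning rate, data, and epoch count are illustrative:

```python
import numpy as np

# Fabricated 2-D points: class 1 ("Sample") clusters high, class 0 ("Reference") low.
X = np.array([[2.0, 2.5], [2.5, 2.0], [3.0, 3.0], [0.0, 0.5], [0.5, 0.0], [1.0, 0.5]])
y = np.array([1, 1, 1, 0, 0, 0])

w, theta, eta = np.zeros(2), 0.0, 0.1
for _ in range(20):                                   # a few passes over the data
    for xi, target in zip(X, y):
        out = 1 if w @ xi - theta > 0 else 0          # hard 0/1 firing function
        w += eta * (target - out) * xi                # linear learning rule
        theta -= eta * (target - out)                 # threshold updated the same way

print("weights:", w, "threshold:", theta)
print("predictions:", [(1 if w @ xi - theta > 0 else 0) for xi in X])
```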

Page 30: From Neurons to Neural Networks

Perceptron Limitations

Cannot do XOR (1969, Minsky and Papert)
Data must be linearly separable

1970’s: ANN’s “Wilderness Experience” – only a handful working, and very “un-neuron-like”

Page 31: From Neurons to Neural Networks

Support Vector Machine: a Perceptron on a Feature Space

Data is projected into a high-dimensional feature space and separated with a hyperplane.
The choice of feature space (kernel) is key.
Predictions are based on the location of the hyperplane.

Page 32: From Neurons to Neural Networks

Second Generation: 1981 - Soon

Big ideas from other fields:
J. J. Hopfield compares neural networks to Ising spin glass models and uses statistical mechanics to prove that ANN’s minimize a total energy functional.
Cognitive psychology provides new insights into how neural networks learn.

Big ideas from math:
Kolmogorov’s Theorem

Page 33: From Neurons to Neural Networks

Firing Functions are Sigmoidal

$$\sigma_j(s_j) = \frac{1}{1 + e^{-(s_j - \theta_j)}}$$

Page 34: From Neurons to Neural Networks

3 Layer Neural Network

Output layer
Hidden layer (usually much larger)
Input layer

The output layer may consist of a single neuron.

Page 35: From Neurons to Neural Networks

Multilayer Network

[Figure: inputs x₁, x₂, x₃, …, x_n feed N hidden sigmoidal units σ(w_j·x − t_j), whose outputs are combined linearly:]

$$\text{out} = \sum_{j=1}^{N} \alpha_j\, \sigma\!\left(\mathbf{w}_j \cdot \mathbf{x} - t_j\right)$$

Page 36: From Neurons to Neural Networks

Hilbert’s Thirteenth Problem

Original: “Are there continuous functions of 3 variables that are not representable by a superposition of compositions of functions of 2 variables?”

Modern: Can a continuous function of n variables on a bounded domain of n-space be written as sums of compositions of functions of 1 variable?

Page 37: From Neurons to Neural Networks

Kolmogorov’s Theorem

Modified version: any continuous function f of n variables can be written as

$$f(s_1, \ldots, s_n) = \sum_{i=1}^{2n+1} h\!\left(\sum_{j=1}^{n} w_{ij}\, g_j(s_j)\right),$$

where only h and the w’s depend on f (that is, the g’s are fixed).

Page 38: From Neurons to Neural Networks

Cybenko (1989)

Let σ be any continuous sigmoidal function, and let x = (x₁, …, x_n). If f is absolutely integrable over the n-dimensional unit cube, then for all ε > 0 there exists a (possibly very large) integer N and vectors w₁, …, w_N such that

$$\left|\, f(\mathbf{x}) - \sum_{j=1}^{N} \alpha_j\, \sigma\!\left(\mathbf{w}_j^{T} \mathbf{x} - \theta_j\right) \right| < \varepsilon,$$

where α₁, …, α_N and θ₁, …, θ_N are fixed parameters.
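A one-dimensional numerical illustration of this statement: fix random inner weights w_j and offsets θ_j, then choose the α_j by least squares so the finite sigmoid sum approximates a target f. The target function, the value of N, and the random-feature shortcut are my choices and not part of Cybenko's construction:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = lambda z: 1.0 / (1.0 + np.exp(-z))      # a continuous sigmoidal function

f = lambda x: np.sin(2 * np.pi * x)             # target on the unit interval (illustrative)
x = np.linspace(0, 1, 200)

N = 40                                          # "possibly very large" number of terms
w = rng.normal(scale=10.0, size=N)              # random inner weights w_j
theta = rng.uniform(0, 10, size=N)              # random offsets theta_j
Phi = sigma(np.outer(x, w) - theta)             # columns are sigma(w_j * x - theta_j)

alpha, *_ = np.linalg.lstsq(Phi, f(x), rcond=None)   # pick alpha_j by least squares
approx = Phi @ alpha
print("max |f - sum_j alpha_j sigma(w_j x - theta_j)| =", round(np.abs(f(x) - approx).max(), 4))
```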

Page 39: From Neurons to Neural Networks

Multilayer Network (MLP’s)

[Figure: multilayer network with inputs x₁, …, x_n, hidden units σ(w_j·x − t_j), j = 1, …, N, and output]

$$\text{out} = \sum_{j=1}^{N} \alpha_j\, \sigma\!\left(\mathbf{w}_j \cdot \mathbf{x} - t_j\right)$$

Page 40: From Neurons to Neural Networks

ANN as a Universal Classifier

Designs a function f : Data → Classes
Example: f(Red) = 1, f(Blue) = 0
The support of f, supp(f), defines the regions
Data is used to train (i.e., design) the function f

Page 41: From Neurons to Neural Networks

Example – Predicting Trees that are or are not RNA-like

D         d-t       d-a       d-L       d-D       Lamb-2    E-ratio   Randic
0.333333  0.666667  0.666667  0.5       0.666667  0.2679    0.8       2.914214
0.333333  0.5       0.5       0.5       0.666667  0.3249    1         2.770056
0.5       0.5       0.5       0.5       0.5       0.382     1         2.80806
0.166667  0.333333  0.5       0.833333  0.833333  1         2         2.236068
0.333333  0.333333  0.333333  0.666667  0.666667  0.4384    1.2       2.642734
0.333333  0.333333  0.333333  0.666667  0.666667  0.4859    1.4       2.56066

(Each tree is classified as RNA-Like or Not-RNA-Like.)

Construct graphical invariants
Train the ANN using known RNA trees
Predict the others
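One way to act on this recipe with current tools is the scikit-learn sketch below; the feature rows are the invariant vectors from the table, but the 0/1 labels and the train/predict split are purely hypothetical placeholders for the known RNA / non-RNA assignments:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Graphical invariants (D, d-t, d-a, d-L, d-D, Lamb-2, E-ratio, Randic) from the table.
X = np.array([
    [0.333333, 0.666667, 0.666667, 0.5,      0.666667, 0.2679, 0.8, 2.914214],
    [0.333333, 0.5,      0.5,      0.5,      0.666667, 0.3249, 1.0, 2.770056],
    [0.5,      0.5,      0.5,      0.5,      0.5,      0.382,  1.0, 2.80806 ],
    [0.166667, 0.333333, 0.5,      0.833333, 0.833333, 1.0,    2.0, 2.236068],
    [0.333333, 0.333333, 0.333333, 0.666667, 0.666667, 0.4384, 1.2, 2.642734],
    [0.333333, 0.333333, 0.333333, 0.666667, 0.666667, 0.4859, 1.4, 2.56066 ],
])
y = np.array([1, 1, 1, 0, 0, 0])   # hypothetical labels: 1 = RNA-like, 0 = not RNA-like

# Train a small multilayer perceptron on the "known" trees, then predict the rest.
clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)
clf.fit(X[:4], y[:4])               # pretend the first four trees are the known ones
print("predicted classes for the remaining trees:", clf.predict(X[4:]))
```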

Page 42: From Neurons to Neural Networks

2nd Generation: Phenomenal Success

Data mining of micro-array data
Stock and commodities trading: ANN’s are an important part of “computerized trading”
Post office mail sorting

This tiny 3-Dimensional Artificial Neural Network, modeled after neural networks in the human brain, is helping machines better visualize their surroundings.

Page 43: From Neurons to Neural Networks

The Mars Rovers

An ANN decides between “rough” and “smooth”
“Rough” and “smooth” are ambiguous
Learning is via many “examples”

And a neural network can lose up to 10% of its neurons without significant loss in performance!

Page 44: From Neurons to Neural Networks

ANN Limitations

Overfitting: e.g., if the training set is “unbalanced”
Mislabeled data can lead to slow (or no) convergence, or to incorrect results
Hard margins: no “fuzzing” of the boundary

[Figure: overfitting may produce isolated regions]

Page 45: From Neurons to Neural Networks

Problems on the Horizon

Limitations are becoming very limiting:
Trained networks are often poor learners (and self-learners are hard to train)
In real neural networks, more neurons imply better networks (not so in ANN’s)
Temporal data is problematic – ANN’s have no concept, or a poor concept, of time
“Hybridized ANN’s” are becoming the rule
SVM’s are probably the tool of choice at present
SOFM’s, Fuzzy ANN’s, Connectionism

Page 46: From Neurons to Neural Networks

Third Generation: 1997 -

Back to biology: Spiking Neural Networks (SNN)
Asynchronous, action-potential-driven ANN’s have been around for some time
SNN’s show “promise,” but results beyond current ANN’s have been elusive
Simulating the actual HH equations (neuromimetic) has to date not been enough
Time is both a promise and a curse

A possible approach: use current dendritic models to modify existing ANN’s.

Page 47: From Neurons to Neural Networks

ANN’s with Multiple Time Scales

An SNN that reduces to an ANN and preserves the Kolmogorov theorem. The solution to the equivalent cylinder with hot spots is

$$V(0, t) = V_{initial}(0, t) + \sum_{j=0}^{n} \int_0^t G(0, x_j, t - \tau)\, I_j(\tau)\, d\tau,$$

where I_j is the restriction of V to the jth “hot spot.”

Equivalent artificial neuron:

$$s_i(t) = \sum_j \int_0^t \omega_{ij}(t - \tau)\, x_j(\tau)\, d\tau$$

Page 48: From Neurons to Neural Networks

Incorporating MultiExponentials

G(0, x, t) is often a multi-exponential decay. In terms of the time constants τ_k,

$$s_i(t) = \sum_{j=1}^{n} \int_0^t \left(\sum_k w_{jk}\, e^{-(t-u)/\tau_k}\right) x_j(u)\, du$$

The w_jk are synaptic “weights.”
The τ_k come from electrotonic and morphometric data:
Rate of taper, length of dendrites
Branching, capacitance, resistance

Page 49: From Neurons to Neural Networks

Approximation and Simplification

If x_j(u) ≈ 1 or x_j(u) ≈ 0, then

$$s_i(t) \approx \sum_{j=1}^{n} \sum_k w_{jk}\, \tau_k \left(1 - e^{-t/\tau_k}\right) x_j(t)$$

A special case (τ_k a constant):

$$s_i = \sum_{j=1}^{n} \left(w_j + p_j\left(1 - e^{-t/\tau_k}\right)\right) x_j$$

t = 0 yields the standard neural net model:
The standard neural net is the initial steady state
Modify it with a time-dependent transient

Page 50: From Neurons to Neural Networks

Artificial Neuron

" "firing function that maps state to output

i i ix s

1x2x3x

nx

..

.

thj threshold of j neuron

Nonlinear firing function

j

n

j

ktijiji xepws

1

1

wij, pij = synaptic weights

wi1, pi1

win, pin
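A sketch of this neuron with steady-state and transient weights; the weights w, transients p, time constant τ, and inputs are arbitrary numbers chosen only to show the drift from the t = 0 perceptron toward the t = ∞ perceptron:

```python
import numpy as np

def transient_neuron(x, w, p, theta, t, tau=1.0):
    """s_i(t) = sum_j (w_ij + p_ij (1 - exp(-t/tau))) x_j, passed through a sigmoid."""
    s = np.sum((w + p * (1.0 - np.exp(-t / tau))) * x)
    return 1.0 / (1.0 + np.exp(-(s - theta)))

x = np.array([1.0, 0.5, 0.0])          # inputs from other neurons (made up)
w = np.array([0.4, -0.3, 0.9])         # steady-state weights (the t = 0 perceptron)
p = np.array([0.2, 0.5, -0.1])         # transient weights; t -> infinity gives weights w + p

for t in (0.0, 0.5, 2.0, 10.0):
    print(f"t = {t:4.1f}  output = {transient_neuron(x, w, p, theta=0.1, t=t):.4f}")
```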

Page 51: From Neurons to Neural Networks

Steady State and Transient

Sensitivity and soft margins:
t = 0 is a perceptron with weights w_ij
t = ∞ is a perceptron with weights w_ij + p_ij
For all t in (0, ∞), it is a traditional ANN with weights between w_ij and w_ij + p_ij
The transient is a perturbation scheme
Many predictions over time (soft margins)

Algorithm:
Partition the training set into subsets
Train at t = 0 for the initial subset
Train at t > 0 values for the other subsets

Page 52: From Neurons to Neural Networks

Training the Network

Define an energy function

$$E = \frac{1}{2} \sum_{i=1}^{n} \left(t_i - y_i\right)^2,$$

where the t_i are the target values to be “learned” and the y_i are the network outputs.
Neural networks minimize energy.
The “information” in the network is equivalent to the minima of the total squared energy function.

Page 53: From Neurons to Neural Networks

Back Propagation

Minimize the energy: choose the w_j and θ_j so that

$$\frac{\partial E}{\partial w_{ij}} = 0, \qquad \frac{\partial E}{\partial \theta_j} = 0.$$

In practice, this is hard.

Back propagation with a continuous sigmoidal firing function:
Feed forward, calculate E, modify the weights
Repeat until E is sufficiently close to 0

$$w_{jk}^{new} = w_{jk} + \eta\, \delta_j\, x_k, \qquad \theta_j^{new} = \theta_j - \eta\, \delta_j, \qquad \delta_j = y_j (1 - y_j)(t_j - y_j)$$
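A compact back-propagation loop for a one-hidden-layer sigmoidal network trained on XOR (the problem a bare perceptron cannot solve); the architecture, learning rate, and iteration count are arbitrary choices, and the thresholds play the role of the θ_j above:

```python
import numpy as np

sigma = lambda z: 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)          # XOR targets

rng = np.random.default_rng(0)
W1, th1 = rng.normal(size=(2, 4)), rng.normal(size=4)    # input -> hidden weights and thresholds
W2, th2 = rng.normal(size=(4, 1)), rng.normal(size=1)    # hidden -> output
eta = 0.5

for _ in range(20000):
    # Feed forward
    H = sigma(X @ W1 - th1)
    Y = sigma(H @ W2 - th2)
    # Back propagate: delta_j = y_j (1 - y_j)(t_j - y_j), then chain rule to the hidden layer
    d2 = (T - Y) * Y * (1 - Y)
    d1 = (d2 @ W2.T) * H * (1 - H)
    W2 += eta * H.T @ d2;  th2 -= eta * d2.sum(axis=0)
    W1 += eta * X.T @ d1;  th1 -= eta * d1.sum(axis=0)

print("energy E =", round(0.5 * np.sum((T - Y) ** 2), 4))
print("outputs:", np.round(Y.ravel(), 3))                # should approach 0, 1, 1, 0
```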

Page 54: From Neurons to Neural Networks

Back Propagation with Transient

Train the network initially (choose the w_j and θ_j).
Each “synapse” is then given a transient weight p_ij.

Algorithm addressing over-fitting/sensitivity:
The weights w_j must be given random initial values
The weights p_ij are also given random initial values
Separate training of the w_j, θ_j and the p_ij ameliorates over-fitting during the training sequence

The transient weights are trained by back propagation with updates of the form

$$p_{j,\mathrm{output}}^{new} = p_{j,\mathrm{output}} + \eta\, \delta_j \left(1 - e^{-t/\tau_k}\right), \qquad p_{j,\mathrm{hidden}}^{new} = p_{j,\mathrm{hidden}} + \eta\, \delta_j\, x_j \left(1 - e^{-t/\tau_k}\right)$$

Page 55: From Neurons to Neural Networks

Observations/Results

Spiking does occur, but only if the network is properly “initiated,” and the spikes only resemble action potentials.

This is one approach to SNN’s:
Not likely to be the final word
Other real-neuron features may be necessary (e.g., tapering axons can limit the frequency of action potentials; also branching!)

This approach does show promise in handling temporal information.

Page 56: From Neurons to Neural Networks

Any Questions?

Thank you!