From Molecular Computing to Molecular Programming

Post on 02-Jan-2016



From Molecular Computing to Molecular Programming

Masami Hagiya (萩谷 昌己)

JSPS Project on Molecular Computing

• Funded by the Japan Society for the Promotion of Science (JSPS)
• Research for the Future Program
  – biocomputing field (生命情報, "life information"), chaired by Prof. Anzai
    • molecular computing
    • artificial cell (chemical IC)
    • evolutionary computation
    • signal transduction
    • complex systems
• October 1996 - March 2001
• Project Leader - Masami Hagiya (Computer Science)
• Members
  – Takashi Yokomori (Computer Science)
  – Masayuki Yamamura (Computer Science)
  – Masanori Arita (Genome Informatics)
  – Akira Suyama (Biophysics)
  – Yuzuru Husimi (Biophysics)
  – Kensaku Sakamoto (Biochemistry)
  – Shigeyuki Yokoyama (Biochemistry)


Goals of Molecular Computing
• Analyses and Applications of Computational Power of Biomolecules
  – Understanding Life from the Viewpoint of Computation
    • computational mystery of life
    • origin of life, wet artificial life
  – Engineering Applications (not restricted to computation)
    • combinatorial optimization
    • gene expression analysis
    • nanotechnology, nanomachine
    • cryptography
    • medical and pharmaceutical applications in the future
• New Computational Model, New Simulation Technology

Major Achievements
• Suyama’s Dynamic Programming DNA Computers
  – reduction of molecules by breadth-first search
  – automation by robots
• Sakamoto’s Hairpin Engines
  – Whiplash PCR and SAT Engine
  – molecular computation by hairpin formation
  – autonomous molecular computation
• Theoretical Studies by Yokomori’s Group
• Nishikawa’s Simulator for DNA computations
• Arita’s New Tool for Code Design
• Husimi’s 3SR-Based Evolutionary Reactor
• Yamamura’s Aqueous Computing (with Head)

Dynamic Programming DNA Computers

Adleman-Lipton Paradigm
• Adleman (Science 1994)
  – Solving Hamiltonian Path Problem by DNA
• Lipton, et al.
  – Solving SAT Problem by DNA
• Massively Parallel Computation by Molecules
  – Mainly for Combinatorial Optimization
  – Random Generation by Self-Assembly
    • solution candidate = DNA molecule
  – Selection by Molecular Biology Experiments

Scaling Up ⇒ Efforts to increase yields and reduce errors

Robot and Chemical IC

cf. Hamiltonian Path Problem by Adleman

Dynamic Programming DNA Computer
• “counting” (Ogihara and Ray)
• “dynamic programming” (Suyama)
• Iteration of Generation and Selection
  – Generation of Candidates of Partial Solutions
  – Selection of Solutions
• The order of computational complexity does not decrease, but the amount of necessary molecules is drastically reduced.
  – 3-SAT

Brute Force vs. Dynamic Programming DNA Computers
[Figure: two pipelines compared]
• Brute Force: generating the WHOLE solution space AT ONCE, then repeated selection, then the solution. Large pool size, low reaction rate.
• Dynamic Programming: generating a PARTIAL solution space STEP BY STEP, with selection after each generation step, then the solution. Small pool size, high reaction rate.

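3-CNF SAT Solution on DP DNA Computer
Problem: [equation: 3-CNF formula with 4 variables and 10 clauses; clauses 1 to 4 are over x1, x2, x3, clauses 5 and 7 over x1, x3, x4, clause 6 over x1, x2, x4, clauses 8 to 10 over x2, x3, x4; literal signs not recoverable]
Satisfiable: YES
Solution: {X1^T X2^T X3^F X4^F}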

Basic Operations for Dynamic Programming DNA Computers
• get (T, +s), get (T, -s): get DNA molecules with a subsequence s (without s) in a tube T
• append (T, s, e): append a subsequence s at the end of DNA molecules with a splint e in a tube T
• merge (T, T1, T2, …, Tn): merge DNA molecules in tubes T1, T2, …, Tn into a tube T
• amplify (T, T1, T2, …, Tn): amplify DNA molecules in a tube T and divide them into tubes T1, T2, …, and Tn
• detect (T): detect DNA molecules in a tube T
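The five operations can be mimicked in software to get a feel for how tubes are combined. The following is a minimal sketch under stated assumptions, not the laboratory protocol: a tube is modeled as a multiset of strands, a strand as a tuple of value symbols such as "X1T", and the splint e is reduced to the set of end symbols it can bridge. The argument layout (returning new tubes instead of filling named ones) and the `factor` parameter are illustrative choices that do not come from the slides.

```python
from collections import Counter

# A tube is a multiset of strands; a strand is a tuple of value symbols
# such as ("X1T", "X2F"). All chemistry (yields, errors) is abstracted away.

def get(tube, s, present=True):
    """get(T, +s) when present=True, get(T, -s) when present=False."""
    return Counter({st: n for st, n in tube.items() if (s in st) == present})

def append(tube, s, ends):
    """Append s to strands whose last symbol is in `ends`.

    `ends` stands in for the splint e: only strands the splint can bridge
    to s are extended; all others are dropped (a modeling assumption).
    """
    return Counter({st + (s,): n for st, n in tube.items() if st[-1] in ends})

def merge(*tubes):
    """Pour several tubes into one."""
    return sum(tubes, Counter())

def amplify(tube, n_tubes, factor=2):
    """Return n_tubes tubes, each holding `factor` copies of every strand."""
    return [Counter({st: n * factor for st, n in tube.items()})
            for _ in range(n_tubes)]

def detect(tube):
    """True if the tube contains at least one strand."""
    return sum(tube.values()) > 0

# Usage: start from all assignments to x1 and x2, keep those with x1 = T.
t2 = Counter({("X1T", "X2T"): 1, ("X1F", "X2T"): 1,
              ("X1T", "X2F"): 1, ("X1F", "X2F"): 1})
with_x1_true = get(t2, "X1T")
```

Returning fresh `Counter` objects keeps each operation free of side effects, which makes a sequence of tube manipulations easy to trace step by step.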

Implementation of Basic Operations

[Figure: laboratory steps implementing each operation]
• get (T, +s), get (T, -s): annealing, immobilization, cold wash (yields get (T, -s)), hot wash (yields get (T, +s))
• append (T, s, e): annealing and ligation with the splint e (Taq DNA ligase), immobilization and cold wash, hot wash
• amplify (T, T1, T2, …Tn): PCR, annealing, immobilization and cold wash, hot wash and divide into T1, T2, …Tn

DP algorithm for 3CNF-SAT on DNA Computers

function dna3sat(u_1, v_1, w_1, ..., u_m, v_m, w_m)
begin
  T_2 = {X1^T X2^T, X1^F X2^T, X1^T X2^F, X1^F X2^F};
  for k = 3 to n do
    amplify(T_{k-1}, T_w^T, T_w^F);
    for j = 1 to m do
      if w_j = x_k then
        T_w^F = getuvsat(T_w^F, u_j, v_j);
      end
      if w_j = ¬x_k then
        T_w^T = getuvsat(T_w^T, u_j, v_j);
      end
    end
    T^T = append(T_w^T, X_k^T, X_{k-1}^{T/F} X_k^T);
    T^F = append(T_w^F, X_k^F, X_{k-1}^{T/F} X_k^F);
    T_k = merge(T^T, T^F);
  end
  return detect(T_n);
end

function getuvsat(T, u, v)
begin
  T_u^T = get(T, +X_u^T);
  T_u^F' = get(T, -X_u^T);
  T_u^F = get(T_u^F', +X_u^F);   /* can be omitted */
  T_v^T = get(T_u^F, +X_v^T);
  T^T = merge(T_u^T, T_v^T);
  return T^T;
end

Number of operations: (n - 2) × (amplify + 2 × append + merge) + m × (3 × get + merge)

DP algorithm for 3CNF-SAT

[Equations: three clauses over x2, x3, x4; literal signs not recoverable]

k’s loop: k ranges over variable indices
  j’s loop: j ranges over clause indices
    if x_k is the 3rd literal of the j-th clause then
      remove those assignments which satisfy neither the 1st nor the 2nd literal
  append X_k^F to the remaining assignments
  (do similarly if ¬x_k is the 3rd literal)

X1^F X2^T X3^T
X1^F X2^F X3^T
X1^T X2^T X3^F
X1^T X2^F X3^F

k = 4

X1^T X2^T X3^F X4^F
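The loop structure described on this slide can be checked in software. The sketch below is an illustrative simulation, not the wet protocol: assignments are tuples of booleans, a tube is a set of assignments, and each clause is written as (u, v, w) with signed integers, where the third literal w is assumed to carry the largest variable index in its clause and that index is at least 3. The example formula `CLAUSES` is hypothetical, since the literal signs of the slide's own formula are not recoverable.

```python
from itertools import product

def satisfies(assignment, literal):
    """True if the signed literal (e.g. -3 for NOT x3) holds under assignment."""
    return assignment[abs(literal) - 1] == (literal > 0)

def getuvsat(tube, u, v):
    """Keep the assignments that satisfy literal u or literal v."""
    return {a for a in tube if satisfies(a, u) or satisfies(a, v)}

def dna3sat(n, clauses):
    """Grow partial assignments one variable at a time, pruning as we go.

    Assumes every clause (u, v, w) has |w| >= 3 and |u|, |v| < |w|.
    Returns the set of all satisfying assignments (detect = non-empty).
    """
    tube = set(product((True, False), repeat=2))       # T2: all values of x1, x2
    for k in range(3, n + 1):
        t_true, t_false = set(tube), set(tube)         # amplify into two tubes
        for u, v, w in clauses:
            if w == k:        # x_k = F would falsify w, so u or v must hold
                t_false = getuvsat(t_false, u, v)
            elif w == -k:     # x_k = T would falsify w, so u or v must hold
                t_true = getuvsat(t_true, u, v)
        # append X_k^T / X_k^F, then merge
        tube = {a + (True,) for a in t_true} | {a + (False,) for a in t_false}
    return tube

# Hypothetical 4-variable formula, used only to exercise the code.
CLAUSES = [(1, 2, 3), (-1, 2, -3), (1, -2, 4), (-1, -3, -4), (2, 3, 4)]
solutions = dna3sat(4, CLAUSES)
```

The tube never holds more than the surviving partial assignments for the first k variables, which is the reduction in pool size that the slides attribute to dynamic programming.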

Confirmation of the Solution by PCR (k=4)

[Equations for k = 4, partly recoverable]
  T_w^T, T_w^F = {X1^F X2^T X3^T, X1^F X2^F X3^T, X1^T X2^T X3^F, X1^T X2^F X3^F}
  for j = 1 to 10 do: getuvsat (four get operations followed by merge) is applied for clauses j = 5, ..., 10
  T^T = append(T_w^T, X4^T, X3^{T/F} X4^T) = { }
  T^F = append(T_w^F, X4^F, X3^{T/F} X4^F) = {X1^T X2^T X3^F X4^F}
[Figures: three detection traces, RFU versus elution time (min), with peaks labeled by value pairs of (X1, X3) and (X2, X3)]

An Amount of DNA for Computation: Brute Force vs. Dynamic Programming
100-variable 3-CNF SAT
• Adleman-Lipton’s Brute Force DNA Computers: 2×10^12 g of dsDNA (1×10^12 g of ssDNA)
• Dynamic Programming DNA Computers: 2×10^-3 g of dsDNA (1×10^-3 g of ssDNA)
[Figure: Number of Variables vs. Number of Molecules; labels: 4×10^16, 1×10^15, 4×10^13, n = 100]
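The order of magnitude of the brute-force figure can be sanity-checked with a short calculation. This is a rough sketch under assumptions that are not stated on the slide: one molecule per candidate assignment, 15 base pairs per variable (a hypothetical encoding length), and the common approximation of 660 g/mol per base pair of double-stranded DNA.

```python
AVOGADRO = 6.02214076e23        # molecules per mole
DSDNA_G_PER_MOL_PER_BP = 660    # common approximation for double-stranded DNA

def dsdna_mass_grams(molecules, length_bp):
    """Mass of `molecules` double-stranded DNA molecules of the given length."""
    return molecules * length_bp * DSDNA_G_PER_MOL_PER_BP / AVOGADRO

n_vars = 100
bases_per_variable = 15         # hypothetical, not taken from the slides

# Brute force needs at least one molecule for each of the 2**n assignments.
brute_force_grams = dsdna_mass_grams(2 ** n_vars, n_vars * bases_per_variable)
print(f"brute force: about {brute_force_grams:.1e} g of dsDNA")
```

Under these assumptions the result lands on the order of 10^12 g, the same order as the slide's brute-force estimate. Changing `bases_per_variable` scales the mass linearly, so the conclusion does not hinge on the exact encoding length.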

On Scaling Up the Size of Computations

• The size of random pools currently used:
  – 2^10 … 3^10
  Rapidly increasing.
• The number of molecules in a test tube:
  – 10^10 … 10^12
  We will soon reach the limit on the number of molecules in a test tube. That is, we will fully utilize the parallelism of molecules in a single tube.
⇒ Multiple Tubes, Chemical IC, Cells, etc.
… But reaching the limit is the current goal.

Robot for DNA Computing Based on MAGTRATION™

Magnetic Beads in MAGTRATION™

Automation of DNA Computations
• Robot for DNA Computing Based on MAGTRATION™

Automation of the Get Command
[Figure: annealing, immobilization, cold wash (yields get (T, -s)), hot wash (yields get (T, +s))]

protocol-level:
  [Instrument]
  [Reset Counter] 0
  [Home Position] 0
  [MJ-Open Lid]
  ・・・
  [Get1(0)]
  [Get2(1)]
  [Append(2)]
  ・・・
  [Exit]

script-level:
  (1-1-4) [MJ-Open Lid]
  Do 2
    _SEND "LID OPEN"
    Do 10
      _SEND "LID?"
      Wait_msec 500
      _CMP_GSTR "OPEN"
      IF_Goto EQ 0 ;open
      Wait_msec 1000
    Loop
  Loop
  ; Time out
  End
  ;open
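The script-level fragment appears to send an open command and then poll the device until it reports that the lid is open, giving up after a fixed number of attempts. The following is a sketch of that reading only: the exact semantics of `Do`, `_CMP_GSTR` and `IF_Goto` are not documented in the slides, and `send` is a hypothetical callable standing in for the instrument link.

```python
import time

def open_lid(send, retries=2, polls=10, poll_delay=0.5, retry_delay=1.0,
             sleep=time.sleep):
    """Return True once the device answers "OPEN", False on timeout.

    `send(command)` is a hypothetical function that transmits a command
    string and returns the device's reply as a string.
    """
    for _ in range(retries):                # Do 2
        send("LID OPEN")
        for _ in range(polls):              # Do 10
            reply = send("LID?")
            sleep(poll_delay)               # Wait_msec 500
            if reply == "OPEN":             # _CMP_GSTR "OPEN" / IF_Goto ;open
                return True
            sleep(retry_delay)              # Wait_msec 1000
    return False                            # ; Time out
```

Passing `sleep` as a parameter lets the function be exercised without real delays, for example with `sleep=lambda seconds: None`.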

Pascal/C-level:
  function dna3sat(u_1, v_1, w_1, ..., u_m, v_m, w_m), the same pseudocode as on the slide "DP algorithm for 3CNF-SAT on DNA Computers"

Programming in DNA Computer

Hairpin Engines

Autonomous Molecular Computing
• Adleman-Lipton Paradigm
  – Generation of Candidates = Autonomous Reaction
  – Selection of Solutions = Operations from Outside
• One-Pot Reaction ⇒ Autonomous Computation
  Computation by Successive Autonomous Reactions by Molecules
  – Winfree’s DNA Tile
  – Sakamoto’s Hairpin Engines
    • Whiplash PCR and SAT Engine

cf. Winfree’s DNA Tile

Hairpin Engines

• Molecular Computation by Hairpin Formation
  – Hairpin --- Typical Secondary Structure
• Whiplash PCR
  – DNA Automaton: State Machine by DNA
  – 5 Transitions in a Control Experiment
• SAT Engine
  – Selection by Hairpin Structures of DNA
  – 3-SAT: 6-Variable 10-Clause Formula

SAT Engine
• Sakamoto et al., Science, May 19, 2000.
• Selection by Hairpin Structures of DNA
  – digestion by restriction enzyme
  – exclusive PCR
• 3-SAT
  ssDNA consisting of literals, each selected from a clause
  complementary literal = complementary sequence
  detection of inconsistency ⇒ hairpin
• The essential part of the SAT computation is done by hairpin formation.
  – Autonomous Molecular Computation
[Figure: in a strand for (a∨b∨c)∧(¬d∨e∨¬f)∧ … ∧(¬c∨¬b∨a)∧ ..., the literals b and ¬b pair into a hairpin, which is removed by digestion by restriction enzyme and exclusive PCR]

Selection by Hairpin Structures
• Digestion by Restriction Enzyme
  – Hairpins are cut at the restriction site inserted in each literal sequence.
• Exclusive PCR
  – PCR is inefficient for hairpins.
  – In exclusive PCR, the solution is diluted in each cycle to keep the difference in amplification.
• The number of steps is independent of the number of variables or clauses.

Generation of Random Pool

(a∨b∨c)∧(d∨e∨f)∧(g∨h∨i)∧(j∨k∨l)

[Figure, shown in steps: chemically synthesized literal units arranged by clause as columns {a, b, c}, {d, e, f}, {g, h, i}, {j, k, l}; the last step carries numeric annotations and marks BstXI and BstNI restriction sites]

6-Variable 10-Clause Formula

(a∨b∨!c)∧(a∨c∨d)∧(a∨!c∨!d)∧(!a∨!c∨d)∧(a∨!c∨e)∧(a∨d∨!f)∧(!a∨c∨d)∧(a∨c∨!d)∧(!a∨!c∨!d)∧(!a∨c∨!d)

! = ¬
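The selection principle can be tried on this formula in software. The sketch below is an abstraction, not the chemistry: a strand is a tuple with one literal picked from each clause, and a strand is assumed to form a hairpin exactly when it contains some literal together with its complement. The function names are illustrative.

```python
from itertools import product

# The 6-variable, 10-clause formula from the slide; "!" marks negation.
FORMULA = [
    ("a", "b", "!c"), ("a", "c", "d"), ("a", "!c", "!d"), ("!a", "!c", "d"),
    ("a", "!c", "e"), ("a", "d", "!f"), ("!a", "c", "d"), ("a", "c", "!d"),
    ("!a", "!c", "!d"), ("!a", "c", "!d"),
]

def complement(literal):
    """Return the complementary literal: a <-> !a."""
    return literal[1:] if literal.startswith("!") else "!" + literal

def forms_hairpin(strand):
    """A strand folds if it holds a literal together with its complement."""
    chosen = set(strand)
    return any(complement(lit) in chosen for lit in chosen)

# Random pool: every way of picking one literal per clause (3**10 strands).
pool = list(product(*FORMULA))

# Selection: strands that form a hairpin are removed.
survivors = [strand for strand in pool if not forms_hairpin(strand)]

# Each surviving strand spells out a consistent set of literals.
assignments = {frozenset(strand) for strand in survivors}
print(len(pool), len(survivors), sorted(next(iter(assignments))))
```

Every surviving strand names the same set of literals, so the enumeration reports a single satisfying assignment for this formula.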

Solution of a 6-Variable 10-Clause Formula

Whiplash PCR
• DNA Automaton: State Machine by DNA
  – Polymerization of a Hairpin Form
  – Polymerization Stop
• Autonomous SIMD Computation of Boolean μ-formulas
• Solving NP-Complete Problems in O(1) Steps
  e.g., vertex cover:
  vertex cover candidate = transition table = ssDNA
  vertex cover = transition table that reaches the final state
• 5 Transitions in a Control Experiment
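The state-machine view of Whiplash PCR can be illustrated with a toy model. This is a sketch under simplifying assumptions: each strand carries its own transition table, the 3' end records the current state, and one thermal cycle performs at most one transition (hairpin formation, polymerization of the next state, then the stop). The table below is hypothetical and is not the one used in the control experiment.

```python
def whiplash_pcr(transitions, start, max_cycles):
    """Simulate successive state transitions on a single strand.

    transitions: (current_state, next_state) pairs encoded on the strand
                 (hypothetical encoding, one outgoing transition per state).
    Returns the list of states written at the 3' end, starting with `start`.
    """
    table = dict(transitions)
    history = [start]
    for _ in range(max_cycles):
        state = history[-1]
        if state not in table:
            break                      # no matching site: nothing to extend
        history.append(table[state])   # copy the next state, then stop
    return history

# Five successive transitions on a hypothetical chain of states.
chain = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
print(whiplash_pcr(chain, start=0, max_cycles=10))
```

Because every strand evaluates its own table, many different tables can run side by side in one tube, which is the sense in which the slides call the computation autonomous and SIMD.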

[Figure: Whiplash PCR in four steps, on a strand with blocks labeled x, A, B, C and annotations a, b, c]

5 Transitions in a Control Experiment
[Figure: labels 0 to 7]

A Perspective on Molecular Computing

• Structure Formation = Computation
  – probabilistic process
  – governed by thermodynamics and kinetics
• Molecular Computation as
  – Probabilistic (Randomized) Computation
  – It should be analyzed by
    • complexity theory (for randomized algorithms)
    • thermodynamics and kinetics.
    • cf. Winfree
• Computational Mystery of Life
  – Why is life so efficient computationally?
    • protein folding
    • gene regulation, signal transduction, etc.

Molecular Programming

Molecular Programming
• Designing and controlling biomolecular reactions
• Biomolecules (DNA, RNA, protein) --- combinatorial complexity
• Molecular program --- two parts
  – the part encoded in molecules themselves (e.g., DNA seq.)
    • We should go beyond simple code design.
  – the part implemented by a sequence of lab. operations
• Molecular programming ...
  – controls conformational change and self-assembly by coordinating the two parts.
• Various applications
  – gene expression analysis
  – nanotechnology and nanomachine
  – combinatorial chemistry

Simple Example of Molecular Programming

• PCR (Polymerase Chain Reaction)
  – primers
  – various parameters
    • high temperature and period
    • low temperature and period
  – polymerase (enzyme)
• Molecular programming in PCR ...
  – Designing primer sequences
  – Setting various parameters
  – Selecting polymerase

More Sophisticated Example

• Suyama’s Universal DNA Chip
  – ENCODE: conversion from mRNA transcripts to DCN (DNA-coded number)
  – AMPLIFY: amplification of DCN by PCR
  – DECODE: detection of DCN by the universal DNA chip (DNA capillary array)

[Figure: ENCODE, AMPLIFY and DECODE stages. Labels: target transcript i, DCN_i, A_i, s_i, e_i, biotin, SA magnetic beads, SD, ED, D1_i, D2_i, label, D1_j (j = 0, 1, …, n-1), D2_k, universal DNA chip]

More Sophisticated Example
• DNA Chip and Molecular Programming
  – ENCODE
  – AMPLIFY
  – ANALYSIS: information processing on DCN
    • Example: If G1 is expressed, G2 is not expressed, and G3 is expressed, then there is a danger of disease D1 and no danger of D2
  – DECODE
• Such rules (programs) can be represented by molecules!
  – Whiplash PCR or Sakakibara’s recent work
• Merits
  – No need for sequencing --- efficiency and confidentiality
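The ANALYSIS step applies rules of the kind given in the example to expression data. As a software analogy only (the slides propose encoding such rules in molecules, not in code), the rule can be written as data and evaluated against a set of expressed genes. The rule format and the function name are illustrative assumptions.

```python
# Each rule: (genes that must be expressed, genes that must not be, conclusions).
# The single rule below restates the example from the slide.
RULES = [
    ({"G1", "G3"}, {"G2"}, {"D1": True, "D2": False}),
]

def analyze(expressed, rules=RULES):
    """Return the disease flags implied by the set of expressed genes."""
    flags = {}
    for required, forbidden, conclusions in rules:
        if required <= expressed and not (forbidden & expressed):
            flags.update(conclusions)
    return flags

print(analyze({"G1", "G3"}))         # rule fires
print(analyze({"G1", "G2", "G3"}))   # G2 is expressed, so the rule does not fire
```

Only the conclusions leave the analysis, which mirrors the confidentiality merit named on the slide: the underlying expression profile does not need to be read out.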

Plan for the Next Proposal

• Being submitted to the Ministry of Education
• Has not yet been approved.
• 4 sub-proposals
  – Theory of Molecular Programming
  – Molecular Programming for Self-assembly
  – Molecular Programming by Evolution
  – Molecular Programming in Chemical IC
