DNA Self-Assembly

The complexity of self-assembled shapes

Self-Assembly

“The process by which an organized structure can spontaneously form from simple parts”

Tile Assembly Model: 2-D self-assembly of square units called tiles.

Promising applications: nanofabrication

Self-Assembly & Computation

Self-assembled “shape” = Output of computational process.

We are interested in two measures of shape complexity (defined later on): Kolmogorov complexity and tile complexity.

Tile Assembly Model

Grid of unit square locations (i, j), with i ∈ Z and j ∈ Z

Directions: d ∈ D = {N, E, S, W}

N(i, j) = (i, j+1)

W(i, j) = (i-1, j)

S(i, j) = (i, j-1)

E(i, j) = (i+1, j)

N = S⁻¹, S = N⁻¹

W = E⁻¹, E = W⁻¹

Bond types: σ ∈ Σ

A bond type describes a ‘side’ of a tile (in terms of its interaction with adjacent tiles).

Special bond type for no interaction: null

[Figure: legend showing how the bond types σ1, σ2, σ3 and null are drawn]

Tile types: t = (σN, σE, σS, σW) ∈ Σ⁴

A tile type is defined by its four bonds.

t1 = (σ1, null, σ1, null)

t2 = (null, null, σ2, σ3)

Special tile type: empty = (null, null, null, null)

t ∈ T, where T is a set of tile types

Tile (instance): t = (tile type, (i, j)) ∈ T × Z²

A tile t is defined by its tile type and its position (i, j) in the grid.

Example:

t = (t2, (2, 2))

u = (t2, (3, 1))

v = (t1, (1, 1))

[Figure: tiles t, u and v drawn on the (i, j) grid]

Helper functions

Let t = (tile type, (i, j)) = ((σN, σE, σS, σW), (i, j))

type(t) = (σN, σE, σS, σW)

pos(t) = (i, j)

bond_d(t) = σd, for d ∈ D

adj(t, u) = true if ∃ d ∈ D: pos(t) = d(pos(u))

Helper functions (example)

With t = (t2, (2, 2)), u = (t2, (3, 1)) and v = (t1, (1, 1)):

type(t) = t2, pos(t) = (2, 2), bond_S(t) = σ2

S(W(pos(t))) = pos(v) = (1, 1)

[Figure: tiles t, u and v on the (i, j) grid, with the bond-type legend σ1, σ2, σ3]
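The definitions up to this point can be sketched in a few lines of Python. This is only an illustrative encoding, not part of the original slides: the names `TileType`, `Tile`, `step`, `bond` and `adj` are assumptions, bond types are written as strings such as `"s1"` for σ1, and `None` stands in for the null bond.

```python
from typing import NamedTuple, Optional, Tuple

Pos = Tuple[int, int]
Bond = Optional[str]          # None plays the role of the null bond (assumption)

# Direction functions N, E, S, W written as offsets on the grid Z x Z.
DIRS = {"N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0)}

class TileType(NamedTuple):
    N: Bond
    E: Bond
    S: Bond
    W: Bond

class Tile(NamedTuple):
    type: TileType            # type(t)
    pos: Pos                  # pos(t)

def step(d: str, pos: Pos) -> Pos:
    """Apply direction function d to a location."""
    di, dj = DIRS[d]
    return (pos[0] + di, pos[1] + dj)

def bond(d: str, tile: Tile) -> Bond:
    """bond_d(t): the bond type on side d of tile t."""
    return getattr(tile.type, d)

def adj(t: Tile, u: Tile) -> bool:
    """True if pos(t) = d(pos(u)) for some direction d."""
    return any(t.pos == step(d, u.pos) for d in DIRS)

# The example tiles from the slides.
t1 = TileType("s1", None, "s1", None)
t2 = TileType(None, None, "s2", "s3")
t, u, v = Tile(t2, (2, 2)), Tile(t2, (3, 1)), Tile(t1, (1, 1))

print(bond("S", t))                           # s2
print(step("S", step("W", t.pos)) == v.pos)   # True
print(adj(t, u))                              # False: t and u only touch diagonally
```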

Configuration

A configuration is a set of tiles, with exactly one tile in every location (i, j)

For any configuration A, notation A(i, j) indicates the tile at location (i, j)

Practically, specify a set of non-empty tiles; all other tiles are implicitly empty.

Strength functions (definition)

Bond strength function g: Σ² → Z, written g(σ, σ´). Defined for all pairs of bond types (including null).

Tile strength function Γ(t, u): defined for adjacent tiles t and u. Equal to g(σ, σ´), where σ and σ´ are the bond types of the adjacent sides of t and u respectively.

Strength functions (example)

Example of g:

         σ1    σ2    σ3    null
σ1       1     0     0     0
σ2       0     7     0     0
σ3       0     0     2     0
null     0     0     0     0

[Figure: tiles t and u joined by a σ1 bond, and tiles u and v joined by a σ2 bond]

Γ(t, u) = g(σ1, σ1) = 1

Γ(u, v) = g(σ2, σ2) = 7

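Formally, where "adj(t, u) in direction d" means pos(t) = d(pos(u)):

\[
\Gamma(t, u) =
\begin{cases}
g(\mathrm{bond}_{d^{-1}}(t), \mathrm{bond}_{d}(u)) & \text{if } \mathrm{adj}(t, u) \text{ in direction } d \\
0 & \text{otherwise}
\end{cases}
\]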
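As a rough illustration of the tile strength function, the sketch below encodes the example g as a dictionary holding its diagonal and evaluates Γ for two tiles. The representation is an assumption made for this example only: a tile is a pair `(bonds, (i, j))` with `bonds` ordered (N, E, S, W), σ1 is written `"s1"`, and `None` is the null bond.

```python
DIRS = {"N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0)}
INVERSE = {"N": "S", "S": "N", "E": "W", "W": "E"}
SIDE = {"N": 0, "E": 1, "S": 2, "W": 3}      # index into (N, E, S, W)

# Diagonal of the example g; every off-diagonal entry is 0.
STRENGTH = {"s1": 1, "s2": 7, "s3": 2}

def g(a, b):
    """Bond strength: only matching, non-null bond types interact."""
    if a is None or a != b:
        return 0
    return STRENGTH[a]

def gamma(t, u):
    """Tile strength between tiles t and u; 0 when they are not adjacent."""
    (t_bonds, t_pos), (u_bonds, u_pos) = t, u
    for d, (di, dj) in DIRS.items():
        if t_pos == (u_pos[0] + di, u_pos[1] + dj):        # pos(t) = d(pos(u))
            # t touches u with its side d^-1, u touches t with its side d.
            return g(t_bonds[SIDE[INVERSE[d]]], u_bonds[SIDE[d]])
    return 0

# Hypothetical tiles: a and b share a σ1 bond, a and c share a σ2 bond.
a = (("s2", "s1", None, None), (0, 0))
b = ((None, None, None, "s1"), (1, 0))
c = ((None, None, "s2", None), (0, 1))

print(gamma(b, a), gamma(a, b))   # 1 1
print(gamma(c, a))                # 7
print(gamma(b, c))                # 0
```

Because g is symmetric, `gamma(t, u)` and `gamma(u, t)` agree, which the first line of output reflects.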

Strength function properties

g is symmetric

g(σ, null) = 0

g is non-negative

g is diagonal

Diagonal means that only matching bond types can interact!

Tile system T = (T, ts, g, τ)

A tile system T is defined by:

A set T of tile types

A seed tile ts, with type(ts) ∈ T

A strength function g

A threshold τ

Self-Assembly (definition)

Self-assembly is defined by a relation between configurations.

Suppose A and B are identical configurations, except for a tile t, which exists in B but not in A: A(pos(t)) = empty, B(pos(t)) = t

Self-assembly: A → B if Σ_{d ∈ D} Γ(t, A(d(pos(t)))) ≥ τ

Self-Assembly (example)

Contents of g:

         σ1    σ2    σ3    null
σ1       1     0     0     0
σ2       0     7     0     0
σ3       0     0     2     0
null     0     0     0     0

[Figure: Configuration A contains tiles x and y; Configuration B is A plus a tile t placed adjacent to both x and y]

A → B only if Γ(t, x) + Γ(t, y) = 1 + 7 ≥ τ
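The attachment rule can be checked mechanically. The sketch below is a minimal illustration under stated assumptions: a configuration is a dictionary from locations to bond tuples (N, E, S, W), missing keys are empty tiles, and the layout of x, y and t (x to the west of t, y to the south) is hypothetical, since the original figure is not available. The function names are also invented for this example.

```python
DIRS = {"N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0)}
INVERSE = {"N": "S", "S": "N", "E": "W", "W": "E"}
SIDE = {"N": 0, "E": 1, "S": 2, "W": 3}
STRENGTH = {"s1": 1, "s2": 7, "s3": 2}       # diagonal of the example g

def g(a, b):
    return STRENGTH[a] if a is not None and a == b else 0

def total_strength(config, bonds, pos):
    """Sum of the tile strengths between a candidate tile at pos and its neighbours."""
    total = 0
    for d, (di, dj) in DIRS.items():
        neighbour = config.get((pos[0] + di, pos[1] + dj))   # None means empty
        if neighbour is not None:
            # The candidate's side d faces the neighbour's side d^-1.
            total += g(bonds[SIDE[d]], neighbour[SIDE[INVERSE[d]]])
    return total

def can_attach(config, bonds, pos, tau):
    """True if adding the tile at the empty location pos is a valid step A -> B."""
    return pos not in config and total_strength(config, bonds, pos) >= tau

x = (None, "s1", None, None)     # exposes σ1 on its east side
y = ("s2", None, None, None)     # exposes σ2 on its north side
t = (None, None, "s2", "s1")     # σ2 on the south side, σ1 on the west side
A = {(0, 1): x, (1, 0): y}

print(total_strength(A, t, (1, 1)))      # 8, that is 1 + 7
print(can_attach(A, t, (1, 1), tau=8))   # True
print(can_attach(A, t, (1, 1), tau=9))   # False
```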

Transitive Closure

The reflexive transitive closure of → is denoted as →*. (That is, we say that A →* B if we can keep adding tiles to A and reach B: A → A2 → A3 → ... → B.)

We are interested in self-assemblies from a single seed tile ts!

Assemblies for a tile system T

Prod(T) = {A such that {ts} →* A}: all the configurations reachable from ts in T.

Term(T) = {A ∈ Prod(T) such that ∄ B ≠ A: A →* B}: all the terminal assemblies reachable from ts.

T uniquely produces A if Term(T) = {A}.

Threshold τ

A common choice is τ = 2, where the strength function ranges over {0, 1, 2}.

Systems with τ = 1 and a strength function ranging over {0, 1} are rather limited.

Example of T = (T, ts, g, τ)

ts = (ts, (2, 4)), g = I = diag{1}, τ = 1

Set T of tile types:

       N     E     S     W
ts     -     -     -     σ1
t1     -     σ1    -     σ2
t2     -     σ2    σ3    -
t3     σ3    -     σ4    -
t4     σ4    σ5    -     -
t5     -     σ6    -     σ5
t6     -     -     σ7    σ6
t7     σ7    -     σ8    -
t8     σ8    -     -     σ9
t9     -     σ9    -     σ10
t10    -     σ10   -     -

[Figure: first assembly steps: t1 attaches to ts through σ1, then t2 attaches to t1 through σ2]

What does it assemble to?

[Figure sequence: next to the tile-type table, the assembly grows one tile at a time from ts: first t1 (through σ1), then t2 (through σ2), and so on]
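One way to answer the question is to simulate the system. The sketch below is a simplified greedy simulation, not a full implementation of the model: it assumes g is the identity on non-null bonds and τ = 1 (as on the slide), places the first tile type that can attach at each empty location, and does not check that the result is unique. The function names and the string encoding (`"s1"` for σ1, `None` for null) are assumptions.

```python
DIRS = {"N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0)}
INVERSE = {"N": "S", "S": "N", "E": "W", "W": "E"}
SIDE = {"N": 0, "E": 1, "S": 2, "W": 3}

# Tile types from the slide, written as (N, E, S, W).
TILES = {
    "ts":  (None, None, None, "s1"),
    "t1":  (None, "s1", None, "s2"),
    "t2":  (None, "s2", "s3", None),
    "t3":  ("s3", None, "s4", None),
    "t4":  ("s4", "s5", None, None),
    "t5":  (None, "s6", None, "s5"),
    "t6":  (None, None, "s7", "s6"),
    "t7":  ("s7", None, "s8", None),
    "t8":  ("s8", None, None, "s9"),
    "t9":  (None, "s9", None, "s10"),
    "t10": (None, "s10", None, None),
}

def strength(config, bonds, pos):
    """Total strength at pos with g = identity: one unit per matching bond."""
    total = 0
    for d, (di, dj) in DIRS.items():
        name = config.get((pos[0] + di, pos[1] + dj))
        if name is not None and bonds[SIDE[d]] is not None:
            total += bonds[SIDE[d]] == TILES[name][SIDE[INVERSE[d]]]
    return total

def assemble(seed="ts", seed_pos=(2, 4), tau=1, max_tiles=100):
    """Keep adding attachable tiles until nothing more can attach."""
    config = {seed_pos: seed}
    grew = True
    while grew and len(config) < max_tiles:
        grew = False
        for (i, j) in list(config):
            for di, dj in DIRS.values():
                pos = (i + di, j + dj)
                if pos in config:
                    continue
                for name, bonds in TILES.items():
                    if strength(config, bonds, pos) >= tau:
                        config[pos] = name
                        grew = True
                        break
    return config

def render(config):
    """Draw occupied locations as '#', with larger j (north) at the top."""
    xs = [i for i, _ in config]
    ys = [j for _, j in config]
    return "\n".join(
        "".join("#" if (i, j) in config else "." for i in range(min(xs), max(xs) + 1))
        for j in range(max(ys), min(ys) - 1, -1)
    )

print(render(assemble()))
```

Under these assumptions the simulation places 11 tiles, one of each type, in the outline of a letter "S".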

Shape scaling

Coordinated* shape of assembly A: S_A = {(i, j) such that A(i, j) ≠ empty}

(This is a single connected component)

For a set of locations S and c ∈ Z+, define a c-scaling of S:

\[
S^c = \{ (i, j) \text{ such that } (\lfloor i/c \rfloor, \lfloor j/c \rfloor) \in S \}
\]

(This is a magnification of S by a factor of c)

*shape within a fixed coordinate system
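A c-scaling is easy to compute directly: every location of S becomes a c × c block. The following sketch assumes a shape is a Python set of integer pairs; the function name `scale` is an illustrative choice.

```python
def scale(shape, c):
    """Return the c-scaling of a set of (i, j) locations.

    Each location (i, j) is replaced by the c x c block of locations whose
    coordinates floor-divide back to (i, j).
    """
    return {(c * i + a, c * j + b)
            for (i, j) in shape
            for a in range(c)
            for b in range(c)}

S = {(0, 0), (1, 0), (1, 1)}
S2 = scale(S, 2)

print(len(S2))                                       # 12
# Membership check straight from the definition (// is floor division).
print(all((i // 2, j // 2) in S for (i, j) in S2))   # True
```

Python's `//` rounds toward negative infinity, so the check also matches the floor in the definition for negative coordinates.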

Shape equivalence

Coordinated shapes S1 and S2 are scale-equivalent if ∃ c, d ∈ Z+ such that S1^c = S2^d.

Coordinated shapes S1 and S2 are translation-equivalent if they can be made equal by translation.

We write S1 ≅ S2 if ∃ c, d ∈ Z+ such that S1^c is translation-equivalent to S2^d.

Shape equivalence (example)

[Figure: three coordinated shapes S1, S2 and S3]

S1 is translation-equivalent to S2

S2 is scale-equivalent to S3

S1 ≅ S3
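The equivalences can be tested on small shapes. The sketch below is illustrative only: shapes are sets of integer pairs, the helper names are assumptions, and the search for scaling factors is bounded by a hypothetical `max_factor`, so a `False` answer only means that no factors were found within that bound.

```python
def scale(shape, c):
    """c-scaling: each location becomes a c x c block."""
    return {(c * i + a, c * j + b)
            for (i, j) in shape for a in range(c) for b in range(c)}

def normalize(shape):
    """Translate a shape so that its bounding box starts at (0, 0)."""
    min_i = min(i for i, _ in shape)
    min_j = min(j for _, j in shape)
    return frozenset((i - min_i, j - min_j) for i, j in shape)

def translation_equivalent(s1, s2):
    return normalize(s1) == normalize(s2)

def equivalent(s1, s2, max_factor=8):
    """Bounded search for c, d such that s1^c is translation-equivalent to s2^d."""
    for c in range(1, max_factor + 1):
        for d in range(1, max_factor + 1):
            if len(s1) * c * c != len(s2) * d * d:
                continue                      # sizes must agree first
            if translation_equivalent(scale(s1, c), scale(s2, d)):
                return True
    return False

S1 = {(0, 0), (1, 0), (0, 1)}                # a small corner shape
S2 = {(i + 3, j + 5) for (i, j) in S1}       # S1 shifted by (3, 5)
S3 = scale(S2, 2)                            # S2 magnified by 2

print(translation_equivalent(S1, S2))   # True
print(translation_equivalent(S2, S3))   # False
print(equivalent(S1, S3))               # True
```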

Equivalence class

Scale-equivalence, translation-equivalence and ≅ are equivalence relations.

The equivalence class of coordinated shapes under the relation “≅” is called the shape Ŝ. We say that Ŝ is the shape of assembly A if S_A ∈ Ŝ.

Tile complexity

The tile complexity of a coordinated shape S is the minimum number n of tile types needed by a tile system T to uniquely produce that shape.

Ksa(S) = min{n: ∃ T with |T| = n, {ts} →* S*}

The definition extends directly to a shape Ŝ.

*Formally, {ts} →* A, and S is the coordinated shape of A

Tile complexity (examples)

Let’s examine Stanford’s initials

[Figure: the letters “S” and “U” drawn as tile shapes, each with its seed tile ts marked]

Tile complexity (“S”)

Minimum Solution: Ksa=11

[Figure: the “S” shape, with the seed tile ts marked]

       N     E     S     W
ts     -     -     -     σ1
t1     -     σ1    -     σ2
t2     -     σ2    σ3    -
t3     σ3    -     σ4    -
t4     σ4    σ5    -     -
t5     -     σ6    -     σ5
t6     -     -     σ7    σ6
t7     σ7    -     σ8    -
t8     σ8    -     -     σ9
t9     -     σ9    -     σ10
t10    -     σ10   -     -

As complex as it can get (need as many tile types as tiles)!

Tile complexity (“U”)

Possible solution: n=11

[Figure: “U” shape with bottom row t1, ts, t6; left leg t2, t3, t4, t5 stacked above t1; right leg t7, t8, t9, t10 stacked above t6]

       N     E     S     W
ts     -     σ6    -     σ1
t1     σ2    σ1    -     -
t2     σ3    -     σ2    -
t3     σ4    -     σ3    -
t4     σ5    -     σ4    -
t5     -     -     σ5    -
t6     σ7    -     -     σ6
t7     σ8    -     σ7    -
t8     σ9    -     σ8    -
t9     σ10   -     σ9    -
t10    -     -     σ10   -

Can we do better?

Tile complexity (“U”)

Optimal solution: Ksa=7

[Figure: “U” shape with bottom row t1, ts, t6; both legs are now built from t2, t3, t4, t5]

       N     E     S     W
ts     -     σ6    -     σ1
t1     σ2    σ1    -     -
t2     σ3    -     σ2    -
t3     σ4    -     σ3    -
t4     σ5    -     σ4    -
t5     -     -     σ5    -
t6     σ2    -     -     σ6

Notice that the leg formed by t2 to t5 has the same shape as the leg formed by t7 to t10.

Modify t6 by placing bond type σ2 on its north side (instead of σ7).

The assembly will then reuse t2 to t5 for the second leg, so t7 to t10 are no longer needed!
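The reuse argument can be checked with a small simulation. As before, this is a simplified greedy sketch under assumptions: g is taken to be the identity on non-null bonds with τ = 1, the seed is placed at an arbitrary location (1, 0), uniqueness is not verified, and the names and string encoding are illustrative.

```python
DIRS = {"N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0)}
INVERSE = {"N": "S", "S": "N", "E": "W", "W": "E"}
SIDE = {"N": 0, "E": 1, "S": 2, "W": 3}

# The seven tile types, written as (N, E, S, W); t6 now carries σ2 on its north side.
U_TILES = {
    "ts": (None, "s6", None, "s1"),
    "t1": ("s2", "s1", None, None),
    "t2": ("s3", None, "s2", None),
    "t3": ("s4", None, "s3", None),
    "t4": ("s5", None, "s4", None),
    "t5": (None, None, "s5", None),
    "t6": ("s2", None, None, "s6"),
}

def attachable(config, tiles, pos, tau):
    """Return the first tile type that can attach at pos, or None."""
    for name, bonds in tiles.items():
        total = 0
        for d, (di, dj) in DIRS.items():
            neighbour = config.get((pos[0] + di, pos[1] + dj))
            if neighbour is not None and bonds[SIDE[d]] is not None:
                total += bonds[SIDE[d]] == tiles[neighbour][SIDE[INVERSE[d]]]
        if total >= tau:
            return name
    return None

def assemble(tiles, seed="ts", seed_pos=(1, 0), tau=1, max_tiles=100):
    config = {seed_pos: seed}
    grew = True
    while grew and len(config) < max_tiles:
        grew = False
        for (i, j) in list(config):
            for di, dj in DIRS.values():
                pos = (i + di, j + dj)
                if pos not in config:
                    name = attachable(config, tiles, pos, tau)
                    if name is not None:
                        config[pos] = name
                        grew = True
    return config

config = assemble(U_TILES)
rows = ["".join("#" if (i, j) in config else "." for i in range(3))
        for j in range(4, -1, -1)]
print("\n".join(rows))
print(len(config), "tiles,", len(set(config.values())), "tile types")
```

In this run both legs end up labeled t2 to t5, so 11 locations are covered with only 7 tile types.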

Binary Counter

Ksa=?

[Figure: grid of tiles labeled 0 or 1 in which successive rows spell out successive binary numbers; the seed tile ts and tiles labeled L and U are marked]

Binary Counter

[Figure: the counter grid with an arrow marked “Assembly grows in this direction”; on one tile, the sides with matching “inputs” and the sides carrying “outputs” are marked]

The type of a tile is determined by two input conditions, and the tile forwards two outputs.

Binary Counter

[Figure: the counter grid]

I am the same as my SOUTH, unless it’s time to FLIP!

But how to determine if it’s time to FLIP?

Time to flip?

[Figure: the counter grid with each digit colored black or red]

Black = same as SOUTH

Red = different from SOUTH, because it’s time to flip!

In each row, flipping runs from the right up to and including the first tile that becomes 1, so the FLIP signal propagates naturally EAST-to-WEST.

Binary Counter

[Figure: a tile of the counter grid with its four sides labeled FLIP input, FLIP output, NUM input and NUM output]

Binary Counter

[Figure: the counter grid, with the rule table below filled in one entry at a time]

Value shown by a tile, given its two inputs:

            Flip OFF    Flip ON
NUM = 0     0           1
NUM = 1     1           0
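The rule table can be exercised by computing each counter row from the row to its south. This is a sketch of the logic only, not of the tiles themselves, and it rests on two assumptions: the easternmost tile of every row receives FLIP ON, and a tile forwards FLIP ON to its western neighbour exactly when it flipped a 1 into a 0 (the usual carry of a binary increment). The function name `next_row` is illustrative.

```python
def next_row(row):
    """Compute the row above `row` (a list of bits, most significant bit first)."""
    out = []
    flip = 1                       # assumption: the rightmost tile always flips
    for num in reversed(row):      # the FLIP signal travels east to west
        value = num ^ flip         # same as SOUTH unless it is time to flip
        flip = flip & num          # keep flipping only while a 1 was turned into 0
        out.append(value)
    return out[::-1]

rows = [[0, 0, 0, 0]]
for _ in range(8):
    rows.append(next_row(rows[-1]))

# Print with the first row at the bottom, since the assembly grows northward.
for row in reversed(rows):
    print(" ".join(str(bit) for bit in row))

print(next_row([0, 1, 1, 1]))      # [1, 0, 0, 0]
```

Each of the four (NUM, FLIP) input combinations corresponds to one entry of the table, which is why four rule tiles are enough for the interior of the counter.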

Binary Counter

Ksa=7

[Figure: the seven tile types used by the counter (the seed ts, the tiles L and U, and four rule tiles labeled 1, 0, 0 and 1) shown next to the assembled counter grid]

Variant of Sierpinski Triangle

Ksa=7

[Figure: grid of tiles labeled 0 or 1 forming a Sierpinski-triangle-like pattern]

Compute the tile complexity?

Given a coordinated shape S, can I compute its complexity Ksa(S)? (Combinatorial optimization problem)

Yes, but it’s hard (NP-complete)!

Turing Machine

A Turing Machine is a simple computation model which can be used to compute any computable function.

Most common form:

1-dimensional infinite tape with 0s and 1s

Head located at some position on the tape

Internal state, program

Can read/write tape, move head

Kolmogorov Complexity

The Kolmogorov complexity of a binary string x with respect to a universal Turing machine U is the length of the smallest program p that outputs x: K_U(x) = min{|p|: U(p) = x}

The Kolmogorov complexity of a shape S is the length of the smallest program s that outputs it as a list of locations encoded in some binary form <S>: K(S) = min{|s|: U(s) = <S>} (directly extended for Ŝ)

Complexity theorem

∃ a0, b0, a1, b1 such that ∀ Ŝ:

a0K(Ŝ)+b0 ≤ Ksa(Ŝ)logKsa(Ŝ) ≤ a1K(Ŝ)+b1

Therefore K(Ŝ) = Θ(n log n), where n = Ksa(Ŝ)

Proof for first part

Create a Turing machine that simulates self-assembly.

Write a fixed-size program p0 to do the assembly of a tile system T that reproduces Ŝ.

Final program p = p0 + <T>

<T> is the binary encoding of tile system T. What is the size of <T>?

In other words, how many bits are needed to describe T?

Size of <T> = <(T, ts, g, τ)>

T contains n = Ksa(Ŝ) tile types.

Each tile type is specified by four bonds.

There are at most 4n different bond types.

For every tile type, 4 log(4n) bits.

Therefore, T is specified in 4n log(4n) bits. Note: ts is included in T.

g, τ are not directly needed; for each tile type, specify, for each of the 16 subsets of D, whether the tile can bind on that subset of sides (one bit per subset). Needs 16n bits.
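The bit count can be made concrete with a little arithmetic. The sketch below assumes base-2 logarithms rounded up to whole bits, which the slides do not specify; the function name `encoding_bits` is illustrative.

```python
def encoding_bits(n):
    """Bits used by the encoding described above for n tile types.

    Assumes each bond is named with ceil(log2(4n)) bits, since there are
    at most 4n distinct bond types.
    """
    bond_bits = (4 * n - 1).bit_length()      # ceil(log2(4n)) for n >= 1
    tile_type_bits = 4 * n * bond_bits        # four bonds per tile type
    binding_bits = 16 * n                     # one bit per subset of D, per tile type
    return tile_type_bits + binding_bits

for n in (7, 11, 100):
    print(n, encoding_bits(n))
```

For n = 11 this gives 4 · 11 · 6 + 16 · 11 = 440 bits, and the total grows as n log n, which is the form the bound takes.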

Proof (cont)

Therefore the final size of the program is at most c·n log n + d, for constants c and d.

K(Ŝ) = min over programs that output Ŝ ≤ c·n log n + d

Thus:

a0K(Ŝ)+b0 ≤ Ksa(Ŝ)logKsa(Ŝ)

Compute the tile complexity? (v2)

Given a shape class Ŝ, can I compute its complexity Ksa(Ŝ)?

No, Ksa(Ŝ) is uncomputable!

In other words, language L is undecidable:

L = { (ℓ, n) s.t. ℓ = <S> for some S ∈ Ŝ and Ksa(Ŝ) ≤ n }
