network motifs: simple building blocks of complex networksckingsf/bioinfo-lectures/netmotifs.pdf ·...

27
Network Motifs: Simple Building Blocks of Complex Networks Milo et al., Science, 2002.

Upload: vanque

Post on 06-Jun-2018

215 views

Category:

Documents


0 download

TRANSCRIPT

Network Motifs: Simple Building Blocks of Complex Networks

Milo et al., Science, 2002.

Beyond Degree Distribution & Diameter

Network Motifs: Consider all possible ways to connect 3 nodes with directed edges:

(Milo et al., Science, 2002)

Finding Over-represented Subgraphs

For each possible motif M:Let cM be the number of times M occurs in graph G.Estimate pM = Pr[# occurrences ≥ cM] when edges are shuffled.Output M if pM < 0.01 and cM > 4.

To generate a random graph for the 3-node motifs:

a b

c d

a b

c d

a b

c d

a b

c d

Single and double edges swapped separately:

To Generate Random Graphs With a Given Distribution of (n-1)-node subgraphs:

Define an “energy” on a vector of occurrences of motifs:

Energy(Vrand) =�

M

|Vreal,M − Vrand,M |(Vreal,M − Vrand,M )

When Vrand = Vreal, the energy is 0.

Start with a randomized network.Until Energy is small:

Make a random swap.If the swap reduces the energy, keep itOtherwise, keep it with probability exp(-ΔE/T)

+

I

I

I

E

“Information processing” networks tend to use the same motifs

Other networks each had their own distinct collection of motifs.

Feed forward, e.g.: filter out transient signals.

(Milo et al., Science, 2002)

Quickly Finding Motifs858L

Network Motif Discovery Using Subgraph Enumeration and

Symmetry-Breaking Grochow & Kellis, RECOMB 2007

Backtracking (Recursive) Algorithm to Find Network Motifs

G =

H = (Small query graph)

(Large Network)

Def. Node g supports node h if the degrees of g and h are compatible.

Backtracking (Recursive) Algorithm to Find Network Motifs

G =

H = (Small query graph)

(Large Network)

Def. Node g supports node h if the degrees of g and h are compatible.

f : VH → VG

Basic Algorithm:

For each node g ∈ GFor each node h ∈ HIf h can’t support g: continue

Let f = {(g!h)}L = Extend(f, G, H)For q in L: Output image of q

Remove g from G

q : VH → VG

For every possible mapping of a single node from G to H

f is a partial map that maps g to h.

Then grow this partial map into many full maps

No need to consider g again (since we tried all its possible matches already)

Extend(f, G, H):

If domain(f) = H: return [f]

Let m = some node in N(domain(f))For each node u ∈ N(f(domain(f))):

If adding (m!u) to f keeps f as a valid isomorphism then:Extend(f∪{(m!u)}, G, H)

g

h

domain(f)

N(domain(f))

f(domain(f))

N(f(domain(f)))

m

u

u

Base case

Choose a node in HTry to map it to G

Extend(f, G, H):

If domain(f) = H: return [f]

Let m = some node in N(domain(f))For each node u ∈ N(f(domain(f))):

If adding (m!u) to f keeps f as a valid isomorphism then:Extend(f∪{(m!u)}, G, H)

g

h

domain(f)

N(domain(f))

f(domain(f))

N(f(domain(f)))

m

u

u

Base case

Choose a node in HTry to map it to G

Speed-up #1

• Every time we can choose a node, we pick the one that is “most constrained”:

- Pick the node that already has the most mapped neighbors

- If there are ties, choose the node with the highest degree

- If there are still ties, choose the node with highest 2nd order degree (total degree of the neighbors)

• Just a heuristic --- doesn’t hurt because we can pick the nodes in any order we want

- if a map that we are building can’t be completed, we want to know sooner rather than later.

Automorphisms & Orbits

C

A

B

D

E

F

C

A

B

D

E

F

Def. An automorphism is an isomorphism from a graph to itself.

Orbit of a node u is the set of nodes that u is mapped to under some automorphism

Automorphisms & Orbits

C

A

B

D

E

F

C

A

B

D

E

F

Def. An automorphism is an isomorphism from a graph to itself.

Orbit of a node u is the set of nodes that u is mapped to under some automorphism

Main Speedup (#2)

1

2

3

6Number each node of G

4

53

Main Speedup (#2)

1

2

3

6Number each node of G

f : VH → VG

A mapping f induces a numbering on H

4

53

Main Speedup (#2)

1

2

3

6Number each node of G

f : VH → VG

A mapping f induces a numbering on H4

5

3

Main Speedup (#2)

1

2

3

6Number each node of G

f : VH → VG

A mapping f induces a numbering on H4

5

3A

B

C

A < min{B,C}

C < min{B}

If we add these constraints, we get only one possible mapping

Adding Constraints, Larger Example

(Figure from Grochow & Kellis, 2007)

Basic Algorithm, differences for symmetry breaking

For each node g ∈ GFor each node h ∈ H s.t. we haven’t considered q ∈ Orbit(h):If h can’t support g: continue

Let f = {(g!h)}L = Extend(f, G, H, CH)For q in L: Output image of q

Remove g from G

q : VH → VG

Extend(f, G, H), symmetry breaking differences

If domain(f) = H: return [f]

Let m = some node in N(domain(f))For each node u ∈ N(f(domain(f))):

If adding (m!u) to f keeps f as a valid isomorphism and (m!u) obeys the constraints then:Extend(f∪{(m!u)}, G, H)

g

h

domain(f)

N(domain(f))

f(domain(f))

N(f(domain(f)))

m

u

u

Results: Running Time

(Figure from Grochow & Kellis, 2007)

Results: Benefit of Symmetry Breaking

Really Large “motifs”? Meaningful?

Occurred 27,720 times in the real yeast PPI

network (but rarely in a random network)

Really just a subgraph of this part of the yeast

PPI: choose 4 nodes from the clique and 3 nodes from the oval.

(Figure from Grochow & Kellis, 2007)

Other Advantages

• Since symmetry breaking ensures each match is output only once, they don’t need to keep track of which graphs they’ve already output

- save a lot of space

• Can be parallelized better

Spiritual Similarity to Color Coding

• Color Coding: make distinguishable things looks the same

• Symmetry Breaking: make indistinguishable things look different.