Adaptive annealing: a near-optimal connection between sampling and counting
Daniel Štefankovič (University of Rochester), Santosh Vempala and Eric Vigoda (Georgia Tech)

Post on 19-Dec-2015


Page 2:

Counting

independent sets
spanning trees
matchings
perfect matchings
k-colorings

Page 3:

(approx) counting $\Leftrightarrow$ sampling
Valleau, Card '72 (physical chemistry), Babai '79 (for matchings and colorings), Jerrum, Valiant, V. Vazirani '86

the outcome of the JVV reduction: random variables $X_1, X_2, \ldots, X_t$ such that

1) $E[X_1 X_2 \cdots X_t]$ = "WANTED"
2) the $X_i$ are easy to estimate: $\frac{V[X_i]}{E[X_i]^2} = O(1)$ (squared coefficient of variation, SCV)

Page 4:

(approx) counting $\Leftrightarrow$ sampling

1) $E[X_1 X_2 \cdots X_t]$ = "WANTED"
2) the $X_i$ are easy to estimate: $\frac{V[X_i]}{E[X_i]^2} = O(1)$

Theorem (Dyer-Frieze '91): $O(t^2/\varepsilon^2)$ samples ($O(t/\varepsilon^2)$ from each $X_i$) give a $(1 \pm \varepsilon)$-estimator of "WANTED" with probability $\ge 3/4$.
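The product estimator behind this theorem can be sketched in a few lines. This is a minimal illustration, not the construction from the cited paper: the per-factor sample count `16 * t / eps**2` uses a hypothetical constant, and the Bernoulli factors stand in for the $X_i$ (they satisfy $V[X_i]/E[X_i]^2 \le 1$ because each mean is at least 1/2).

```python
import random
import statistics


def product_estimator(samplers, eps, rng):
    """Estimate E[X1]*...*E[Xt] by multiplying per-factor sample means."""
    t = len(samplers)
    # O(t / eps^2) samples per factor; the constant 16 is an assumption.
    m = max(1, int(16 * t / eps ** 2))
    estimate = 1.0
    for draw in samplers:
        estimate *= statistics.fmean(draw(rng) for _ in range(m))
    return estimate


def make_bernoulli(p):
    """Toy factor: X = 1 with probability p, else 0 (so E[X] = p)."""
    return lambda rng: 1.0 if rng.random() < p else 0.0


rng = random.Random(0)
means = [0.5, 0.6, 0.7, 0.8]          # hypothetical values of E[X_i]
samplers = [make_bernoulli(p) for p in means]

wanted = 1.0
for p in means:
    wanted *= p

estimate = product_estimator(samplers, eps=0.1, rng=rng)
print(f"wanted = {wanted:.4f}, estimate = {estimate:.4f}")
```

The total number of samples is $t$ times the per-factor count, which is where the $O(t^2/\varepsilon^2)$ in the theorem comes from.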

Page 5:

JVV for independent sets

GOAL: given a graph G, estimate the number of independent sets of G

# independent sets = $\frac{1}{P(\cdot)}$ [the argument of $P$ is a picture of an independent set of G that was not preserved; for a uniformly random independent set, $1/P(\emptyset)$ equals this count]

Page 6:

JVV for independent sets

$P(A \cap B) = P(A)\,P(B \mid A)$

$P(\cdot) = P(\cdot)\,P(\cdot)\,P(\cdot)\,P(\cdot)$ [graph pictures with "?" vertex labels not preserved]; the four factors are labelled $X_1, X_2, X_3, X_4$

$X_i \in [0,1]$ and $E[X_i] \ge \frac{1}{2}$ $\Rightarrow$ $\frac{V[X_i]}{E[X_i]^2} = O(1)$
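The chain-rule decomposition can be checked by brute force on a tiny graph. The sketch below assumes one particular decomposition (the pictures on the slide are lost, so this may differ from the one shown): the probability that a uniformly random independent set is empty is split into factors "vertex $v_i$ is not in the set, given that $v_1,\ldots,v_{i-1}$ are not". The 4-cycle is an illustrative choice, and the exact enumeration stands in for the sampler.

```python
from itertools import combinations


def independent_sets(vertices, edges):
    """Brute-force list of all independent sets, as frozensets."""
    result = []
    for r in range(len(vertices) + 1):
        for subset in combinations(vertices, r):
            s = set(subset)
            if not any(u in s and v in s for u, v in edges):
                result.append(frozenset(s))
    return result


vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]   # 4-cycle, illustrative only
all_sets = independent_sets(vertices, edges)

# factor i = P(v_i not in S | v_1, ..., v_{i-1} not in S), S uniform.
factors = []
p_empty = 1.0
for i, v in enumerate(vertices):
    conditioned = [s for s in all_sets
                   if all(u not in s for u in vertices[:i])]
    factor = sum(1 for s in conditioned if v not in s) / len(conditioned)
    factors.append(factor)
    p_empty *= factor

count = 1.0 / p_empty
print(factors, count)
```

Each factor is at least 1/2 because removing $v_i$ from a set that contains it maps independent sets with $v_i$ injectively to independent sets without it.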

Page 8:

JVV: If we have a sampler oracle:
SAMPLER ORACLE: graph G $\to$ random independent set of G
then FPRAS using $O(n^2)$ samples.

ŠVV: If we have a sampler oracle:
SAMPLER ORACLE: graph G, $\beta$ $\to$ set from gas-model Gibbs at $\beta$
then FPRAS using $O^*(n)$ samples.

Page 9:

Application: independent sets

$O^*(|V|)$ samples suffice for counting
Cost per sample (Vigoda '01, Dyer-Greenhill '01): time = $O^*(|V|)$ for graphs of degree $\le 4$.
Total running time: $O^*(|V|^2)$.

Page 10:

Other applications (total running time)

matchings: $O^*(n^2 m)$ (using Jerrum, Sinclair '89)
spin systems: Ising model $O^*(n^2)$ for $\beta < \beta_C$ (using Marinelli, Olivieri '95)
k-colorings: $O^*(n^2)$ for $k > 2\Delta$ (using Jerrum '95)

Page 11:

easy = hot

hard = cold

Page 12:

Hamiltonian [picture not preserved; values shown: 0, 1, 2, 4]

Page 13:

Hamiltonian $H : \Omega \to \{0,\ldots,n\}$
Big set = $\Omega$
Goal: estimate $|H^{-1}(0)|$
$|H^{-1}(0)| = E[X_1] \cdots E[X_t]$

Page 14:

Distributions between hot and cold (Gibbs distributions)

$\mu_\beta(x) \propto \exp(-\beta H(x))$
$\beta$ = inverse temperature
$\beta = 0$: hot, uniform on $\Omega$
$\beta = \infty$: cold, uniform on $H^{-1}(0)$

Page 15:

Distributions between hot and cold

$\mu_\beta(x) = \frac{\exp(-\beta H(x))}{Z(\beta)}$

Normalizing factor = partition function:
$Z(\beta) = \sum_{x \in \Omega} \exp(-\beta H(x))$

Page 16:

Partition function
$Z(\beta) = \sum_{x \in \Omega} \exp(-\beta H(x))$

have: $Z(0) = |\Omega|$
want: $Z(\infty) = |H^{-1}(0)|$

Page 17:

Assumption: we have a sampler oracle for $\mu_\beta(x) = \frac{\exp(-\beta H(x))}{Z(\beta)}$
SAMPLER ORACLE: graph G, $\beta$ $\to$ subset of V from $\mu_\beta$

Page 20:

Assumption: we have a sampler oracle for $\mu_\beta(x) = \frac{\exp(-\beta H(x))}{Z(\beta)}$

$W \sim \mu_\beta$, $X = \exp(H(W)(\beta - \alpha))$

can obtain the following ratio:
$E[X] = \sum_{s \in \Omega} \mu_\beta(s)\,X(s) = \frac{Z(\alpha)}{Z(\beta)}$
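The identity $E[X] = Z(\alpha)/Z(\beta)$ for $X = \exp(H(W)(\beta-\alpha))$, $W \sim \mu_\beta$, is easy to confirm numerically. The sketch below uses a hypothetical six-element $\Omega$ with arbitrary Hamiltonian values, and `random.choices` plays the role of the sampler oracle; none of these names come from the slides.

```python
import math
import random

# Hypothetical toy instance: Omega = {0,...,5}, H[x] is the value H(x).
H = [0, 1, 1, 2, 3, 3]


def Z(beta):
    """Partition function Z(beta) = sum_x exp(-beta * H(x))."""
    return sum(math.exp(-beta * h) for h in H)


def mu(beta):
    """Gibbs distribution mu_beta as a list of probabilities."""
    z = Z(beta)
    return [math.exp(-beta * h) / z for h in H]


beta, alpha = 0.3, 0.9
ratio = Z(alpha) / Z(beta)

# Exact expectation of X under mu_beta.
exact = sum(p * math.exp(h * (beta - alpha)) for p, h in zip(mu(beta), H))

# Monte Carlo estimate, with random.choices standing in for the oracle.
rng = random.Random(1)
draws = rng.choices(range(len(H)), weights=mu(beta), k=20000)
sampled = sum(math.exp(H[w] * (beta - alpha)) for w in draws) / len(draws)

print(ratio, exact, sampled)
```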

Page 21:

Our goal restated

Partition function: $Z(\beta) = \sum_{x \in \Omega} \exp(-\beta H(x))$
Goal: estimate $Z(\infty) = |H^{-1}(0)|$

$Z(\infty) = \frac{Z(\beta_1)}{Z(\beta_0)} \cdot \frac{Z(\beta_2)}{Z(\beta_1)} \cdots \frac{Z(\beta_t)}{Z(\beta_{t-1})} \cdot Z(0)$

$\beta_0 = 0 < \beta_1 < \beta_2 < \cdots < \beta_t = \infty$

Page 22:

Our goal restated

$Z(\infty) = \frac{Z(\beta_1)}{Z(\beta_0)} \cdot \frac{Z(\beta_2)}{Z(\beta_1)} \cdots \frac{Z(\beta_t)}{Z(\beta_{t-1})} \cdot Z(0)$

Cooling schedule: $\beta_0 = 0 < \beta_1 < \beta_2 < \cdots < \beta_t = \infty$

How to choose the cooling schedule? Minimize length, while satisfying
$\frac{V[X_i]}{E[X_i]^2} \le O(1)$, where $E[X_i] = \frac{Z(\beta_i)}{Z(\beta_{i-1})}$
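A small numerical check of the telescoping product may help. The sketch assumes a hypothetical seven-element $\Omega$ and an arbitrary, unoptimized schedule; it uses exact ratios $Z(\beta_i)/Z(\beta_{i-1})$ where an algorithm would plug in estimates of $E[X_i]$.

```python
import math

# Hypothetical toy instance; |H^{-1}(0)| = 2 and |Omega| = 7.
H = [0, 0, 1, 2, 2, 3, 5]


def Z(beta):
    """Partition function; states with H = 0 contribute 1 even at beta = inf."""
    return sum(1.0 if h == 0 else math.exp(-beta * h) for h in H)


# Illustrative cooling schedule from beta_0 = 0 to beta_t = infinity.
schedule = [0.0, 0.5, 1.0, 2.0, 4.0, math.inf]

estimate = Z(0.0)                      # Z(0) = |Omega| is known
for prev, nxt in zip(schedule, schedule[1:]):
    estimate *= Z(nxt) / Z(prev)       # in practice: an estimate of E[X_i]

print(estimate)
```

Every intermediate $Z(\beta_i)$ cancels, so the product equals $Z(\infty) = |H^{-1}(0)|$ for any schedule; the choice of schedule only affects how hard each ratio is to estimate.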

Page 23:

Parameters: A and n

$Z(0) = A$
$H : \Omega \to \{0,\ldots,n\}$

$Z(\beta) = \sum_{x \in \Omega} \exp(-\beta H(x)) = \sum_{k=0}^{n} a_k e^{-\beta k}$, where $a_k = |H^{-1}(k)|$
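To make the coefficients $a_k$ concrete, the sketch below computes them for independent sets of a 4-cycle. It assumes that $\Omega$ is the set of all vertex subsets and that $H(S)$ counts the edges with both endpoints in $S$, which is consistent with $A = 2^V$ and $n = E$ but is not spelled out on the slide.

```python
import math
from collections import Counter
from itertools import combinations

vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]   # 4-cycle, illustrative only


def H(subset):
    """Assumed Hamiltonian: number of edges inside the subset."""
    s = set(subset)
    return sum(1 for u, v in edges if u in s and v in s)


# a[k] = |H^{-1}(k)|, counted over all 2^|V| subsets.
a = Counter()
for r in range(len(vertices) + 1):
    for subset in combinations(vertices, r):
        a[H(subset)] += 1


def Z(beta):
    """Z(beta) = sum_k a_k * exp(-beta * k)."""
    return sum(ak * math.exp(-beta * k) for k, ak in a.items())


print(dict(a), Z(0.0), Z(50.0))
```

Here $Z(0) = 2^4$ counts all subsets, and $Z(\beta)$ approaches $a_0$, the number of independent sets, as $\beta$ grows.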

Page 24:

Parameters: $Z(0) = A$, $H : \Omega \to \{0,\ldots,n\}$

                     A        n
independent sets     $2^V$    E
matchings            $V!$     V
perfect matchings    $V!$     V
k-colorings          $k^V$    E

Page 25:

Previous cooling schedules

$Z(0) = A$, $H : \Omega \to \{0,\ldots,n\}$
$\beta_0 = 0 < \beta_1 < \beta_2 < \cdots < \beta_t = \infty$

"Safe steps":
$\beta \to \beta + 1/n$
$\beta \to \beta\,(1 + 1/\ln A)$
$\ln A \to \infty$

Cooling schedules of length
$O(n \ln A)$
$O((\ln n)(\ln A))$ (Bezáková, Štefankovič, Vigoda, V. Vazirani '06)

Page 26:

No better fixed schedule possible

$Z(0) = A$, $H : \Omega \to \{0,\ldots,n\}$

$Z_a(\beta) = \frac{A}{1+a}\,(1 + a\,e^{-\beta n})$

A schedule that works for all $Z_a$ (with $a \in [0, A-1]$) has LENGTH $\Omega((\ln n)(\ln A))$

Page 27:

Parameters: $Z(0) = A$, $H : \Omega \to \{0,\ldots,n\}$

Our main result: can get adaptive schedule of length $O^*((\ln A)^{1/2})$
Previously: non-adaptive schedules of length $\Omega^*(\ln A)$

Page 28:

Existential part

Lemma: for every partition function there exists a cooling schedule of length $O^*((\ln A)^{1/2})$

Main result in existential form: there exists (in place of "can get") an adaptive schedule of length $O^*((\ln A)^{1/2})$

Page 29:

Express SCV using partition function (going from $\beta$ to $\alpha$)

$W \sim \mu_\beta$, $X = \exp(H(W)(\beta - \alpha))$
$E[X] = \frac{Z(\alpha)}{Z(\beta)}$
$\frac{E[X^2]}{E[X]^2} = \frac{Z(2\alpha-\beta)\,Z(\beta)}{Z(\alpha)^2} \le C$
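The second-moment formula follows from the same calculation that gives $E[X]$. A short derivation, using only the definitions of $\mu_\beta$, $Z$ and $X$ from the earlier slides:

```latex
\begin{align*}
E[X^2] &= \sum_{s \in \Omega} \mu_\beta(s)\, e^{2H(s)(\beta-\alpha)}
        = \frac{1}{Z(\beta)} \sum_{s \in \Omega} e^{-\beta H(s) + 2H(s)(\beta-\alpha)} \\
       &= \frac{1}{Z(\beta)} \sum_{s \in \Omega} e^{-(2\alpha-\beta) H(s)}
        = \frac{Z(2\alpha-\beta)}{Z(\beta)}, \\[4pt]
\frac{E[X^2]}{E[X]^2}
       &= \frac{Z(2\alpha-\beta)}{Z(\beta)} \cdot \frac{Z(\beta)^2}{Z(\alpha)^2}
        = \frac{Z(2\alpha-\beta)\,Z(\beta)}{Z(\alpha)^2}.
\end{align*}
```

Since $V[X]/E[X]^2 = E[X^2]/E[X]^2 - 1$, bounding this ratio by a constant $C$ is the same as bounding the SCV by $C - 1$.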

Page 30:

Proof: let $f(\gamma) = \ln Z(\gamma)$.

$\frac{E[X^2]}{E[X]^2} = \frac{Z(2\alpha-\beta)\,Z(\beta)}{Z(\alpha)^2} \le C$

Taking logarithms, this condition reads $\frac{f(2\alpha-\beta) + f(\beta)}{2} - f(\alpha) \le C'$, where $C' = (\ln C)/2$.

[picture of $f$ with the points $\beta$, $\alpha$ and $2\alpha-\beta$ not preserved]

Page 31:

Proof: $f(\gamma) = \ln Z(\gamma)$
f is decreasing
f is convex
$f'(0) \ge -n$
$f(0) \le \ln A$

either $f$ or $f'$ changes a lot:
Let $K := \Delta f$. Then $\Delta(\ln |f'|) \ge \frac{1}{K}$.

Page 32:

$f : [a,b] \to \mathbb{R}$, convex, decreasing, can be "approximated" using
$\sqrt{(f(a)-f(b)) \ln \frac{f'(a)}{f'(b)}}$
segments

Page 36:

Proof: technicality: getting to $2\alpha - \beta$

[picture not preserved; labelled points $\beta_i$, $\beta_{i+1}$, $\beta_{i+2}$, $\beta_{i+3}$ and $2\alpha-\beta$]

$\ln \ln A$ extra steps

Page 37:

Existential $\to$ Algorithmic

Existential: there exists an adaptive schedule of length $O^*((\ln A)^{1/2})$
Algorithmic: can get adaptive schedule of length $O^*((\ln A)^{1/2})$

Page 38:

Algorithmic construction

Our main result: using a sampler oracle for $\mu_\beta(x) = \frac{\exp(-\beta H(x))}{Z(\beta)}$ we can construct a cooling schedule of length
$\le 38\,(\ln A)^{1/2}(\ln \ln A)(\ln n)$

Total number of oracle calls
$\le 10^7\,(\ln A)\,(\ln \ln A + \ln n)^7 \ln(1/\delta)$

Pages 39-42:

Algorithmic construction

current inverse temperature $\beta$
ideally move to $\alpha$ such that
$B_1 \le \frac{E[X^2]}{E[X]^2} \le B_2$, where $E[X] = \frac{Z(\alpha)}{Z(\beta)}$

upper bound $B_2$: X is "easy to estimate"
lower bound $B_1$: we make progress (assuming $B_1 > 1$)
need to construct a "feeler" for this

Page 44:

Algorithmic construction

current inverse temperature $\beta$
ideally move to $\alpha$ such that
$B_1 \le \frac{E[X^2]}{E[X]^2} \le B_2$, where $E[X] = \frac{Z(\alpha)}{Z(\beta)}$

need to construct a "feeler" for this:
$\frac{E[X^2]}{E[X]^2} = \frac{Z(\beta)}{Z(\alpha)} \cdot \frac{Z(2\alpha-\beta)}{Z(\alpha)}$

bad "feeler"

Page 45:

Rough estimator for $\frac{Z(\beta)}{Z(\alpha)}$

$Z(\beta) = \sum_{k=0}^{n} a_k e^{-\beta k}$

For $W \sim \mu_\beta$ we have $P(H(W)=k) = \frac{a_k e^{-\beta k}}{Z(\beta)}$

Page 46:

Rough estimator for $\frac{Z(\beta)}{Z(\alpha)}$

$Z(\beta) = \sum_{k=0}^{n} a_k e^{-\beta k}$

For $W \sim \mu_\beta$ we have $P(H(W)=k) = \frac{a_k e^{-\beta k}}{Z(\beta)}$
For $U \sim \mu_\alpha$ we have $P(H(U)=k) = \frac{a_k e^{-\alpha k}}{Z(\alpha)}$

If $H(X)=k$ is likely at both $\alpha$ and $\beta$ $\Rightarrow$ rough estimator

Page 47:

Rough estimator for $\frac{Z(\beta)}{Z(\alpha)}$

For $W \sim \mu_\beta$ we have $P(H(W)=k) = \frac{a_k e^{-\beta k}}{Z(\beta)}$
For $U \sim \mu_\alpha$ we have $P(H(U)=k) = \frac{a_k e^{-\alpha k}}{Z(\alpha)}$

$\frac{P(H(U)=k)}{P(H(W)=k)}\, e^{k(\alpha-\beta)} = \frac{Z(\beta)}{Z(\alpha)}$

Page 48:

Rough estimator for $\frac{Z(\beta)}{Z(\alpha)}$

$Z(\beta) = \sum_{k=0}^{n} a_k e^{-\beta k}$

For $W \sim \mu_\beta$ we have $P(H(W) \in [c,d]) = \frac{\sum_{k=c}^{d} a_k e^{-\beta k}}{Z(\beta)}$

Page 49:

Rough estimator for $\frac{Z(\beta)}{Z(\alpha)}$

If $|\alpha-\beta| \cdot |d-c| \le 1$ then
$\frac{1}{e} \cdot \frac{Z(\beta)}{Z(\alpha)} \le \frac{P(H(U) \in [c,d])}{P(H(W) \in [c,d])}\, e^{c(\alpha-\beta)} \le e \cdot \frac{Z(\beta)}{Z(\alpha)}$

We also need $P(H(U) \in [c,d])$ and $P(H(W) \in [c,d])$ to be large.
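The interval bound can be checked numerically on a toy partition function. The coefficients `a` below are hypothetical, and the probabilities are computed exactly instead of being estimated from samples of $W \sim \mu_\beta$ and $U \sim \mu_\alpha$ as an algorithm would have to do.

```python
import math

# Hypothetical coefficients a_k = |H^{-1}(k)| for k = 0..4.
a = [7, 4, 4, 0, 1]


def Z(beta):
    return sum(ak * math.exp(-beta * k) for k, ak in enumerate(a))


def prob_interval(beta, c, d):
    """P(H(W) in [c, d]) for W ~ mu_beta."""
    return sum(a[k] * math.exp(-beta * k) for k in range(c, d + 1)) / Z(beta)


beta, alpha = 0.4, 0.8
target = Z(beta) / Z(alpha)

# Interval [1, 2]: |alpha - beta| * |d - c| = 0.4 <= 1.
c, d = 1, 2
rough = (prob_interval(alpha, c, d) / prob_interval(beta, c, d)
         * math.exp(c * (alpha - beta)))

# Single point [1, 1]: the estimator is exact.
exact = (prob_interval(alpha, 1, 1) / prob_interval(beta, 1, 1)
         * math.exp(1 * (alpha - beta)))

print(target, rough, exact)
```

With a single value $k$ the identity is exact; widening to an interval $[c,d]$ trades accuracy (a factor of at most $e$) for a larger probability of hitting the interval.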

Page 50:

Split $\{0,1,\ldots,n\}$ into $h \le 4(\ln n) \ln A$ intervals
$[0], [1], [2], \ldots, [c, c(1+1/\ln A)], \ldots$

for any inverse temperature $\beta$ there exists an interval $I$ with $P(H(W) \in I) \ge \frac{1}{8h}$

We say that $I$ is HEAVY for $\beta$
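One way to build such a family of intervals is sketched below. The function name and the rounding rule (taking the floor of $c(1+1/\ln A)$ and never letting an interval be empty) are assumptions; the slide only gives the general shape of the intervals.

```python
import math


def split_into_intervals(n, A):
    """Cover {0,...,n} with [0] and then intervals [c, floor(c*(1+1/ln A))].

    Small values of c give singletons; larger c give geometrically
    growing intervals. Rounding details are an assumption.
    """
    growth = 1.0 + 1.0 / math.log(A)
    intervals = [(0, 0)]
    c = 1
    while c <= n:
        d = min(n, max(c, math.floor(c * growth)))
        intervals.append((c, d))
        c = d + 1
    return intervals


intervals = split_into_intervals(100, 2 ** 20)   # illustrative n and A
print(len(intervals), intervals[:5], intervals[-3:])
```

Because the intervals partition $\{0,\ldots,n\}$, at least one of them has probability at least $1/h$ under any $\mu_\beta$, so a heavy interval in the slide's sense (threshold $1/(8h)$) always exists.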

Page 52:

Algorithm

repeat:
find an interval $I$ which is heavy for the current inverse temperature $\beta$
see how far $I$ is heavy (until some $\beta^*$)
use the interval $I$ for the feeler $\frac{Z(\beta)}{Z(\alpha)} \cdot \frac{Z(2\alpha-\beta)}{Z(\alpha)}$

either
* make progress, or
* eliminate the interval $I$, or
* make a "long move"
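The idea of an adaptive schedule can be illustrated with a much simpler stand-in for the algorithm above. The sketch below is not the feeler-based procedure from the slides: it assumes the partition function is known exactly (hypothetical coefficients `a`) and greedily moves from $\beta$ to the largest $\alpha$ with $E[X^2]/E[X]^2 \le B$, found by bisection. The threshold `B = 2.0` is an arbitrary choice.

```python
import math

# Hypothetical coefficients a_k = |H^{-1}(k)|; Z(0) = 16, Z(inf) = 7.
a = {0: 7, 1: 4, 2: 4, 4: 1}


def Z(beta):
    # k == 0 is handled separately so that beta = inf is well defined.
    return sum(ak if k == 0 else ak * math.exp(-beta * k)
               for k, ak in a.items())


def second_moment_ratio(beta, alpha):
    """E[X^2]/E[X]^2 for a step beta -> alpha, via the partition function."""
    return Z(2 * alpha - beta) * Z(beta) / Z(alpha) ** 2


def greedy_schedule(B=2.0, max_steps=1000):
    """Greedy schedule using exact Z (a simplification, not the feeler)."""
    schedule = [0.0]
    for _ in range(max_steps):
        beta = schedule[-1]
        if second_moment_ratio(beta, math.inf) <= B:
            schedule.append(math.inf)      # can jump straight to the end
            return schedule
        # The ratio is nondecreasing in alpha because ln Z is convex,
        # so bracket the threshold by doubling and then bisect.
        lo, hi = beta, beta + 1.0
        while second_moment_ratio(beta, hi) <= B:
            lo, hi = hi, beta + 2.0 * (hi - beta)
        for _ in range(60):
            mid = (lo + hi) / 2.0
            if second_moment_ratio(beta, mid) <= B:
                lo = mid
            else:
                hi = mid
        schedule.append(lo)
    raise RuntimeError("schedule did not terminate")


schedule = greedy_schedule()
print(schedule)
```

The difficulty the slides address is that $Z$ is not available: the ratio has to be probed through samples, which is what the heavy intervals and the feeler are for.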

Page 53:

if we have sampler oracles for $\mu_\beta$ then we can get adaptive schedule of length $t = O^*((\ln A)^{1/2})$

independent sets: $O^*(n^2)$ (using Vigoda '01, Dyer-Greenhill '01)
matchings: $O^*(n^2 m)$ (using Jerrum, Sinclair '89)
spin systems: Ising model $O^*(n^2)$ for $\beta < \beta_C$ (using Marinelli, Olivieri '95)
k-colorings: $O^*(n^2)$ for $k > 2\Delta$ (using Jerrum '95)
