probabilistic inference lecture 4 – part 1
Post on 23-Feb-2016
Probabilistic Inference
Lecture 4 – Part 1
M. Pawan Kumar
pawan.kumar@ecp.fr
Slides available online http://cvc.centrale-ponts.fr/personnel/pawan/
Recap
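[Figure: a chain of three variables $V_a$, $V_b$, $V_c$ with observed data $d_a$, $d_b$, $d_c$. Each variable can take Label $l_0$ or Label $l_1$; a cost is attached to every label and to every pair of labels of neighbouring variables.]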
D : Observed data (image)
V : Unobserved variables
L : Discrete, finite label set
Labeling $f : V \to L$
$V_a$ is assigned $l_{f(a)}$
Recap
[Figure: the same three-variable MRF, with the unary potential $\theta_{a;f(a)}$ and the pairwise potential $\theta_{bc;f(b)f(c)}$ marked.]
$Q(f; \theta) = \sum_a \theta_{a;f(a)} + \sum_{(a,b)} \theta_{ab;f(a)f(b)}$
Recap
[Figure: the same three-variable MRF.]
$Q(f; \theta) = \sum_a \theta_{a;f(a)} + \sum_{(a,b)} \theta_{ab;f(a)f(b)}$
$f^* = \arg\min_f Q(f; \theta)$
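The energy and its minimiser can be sketched in a few lines of Python. This is only an illustration: the potentials below are hypothetical numbers (the values in the slide's figure are not recoverable from the transcript), and the brute-force search is practical only for tiny models.

```python
from itertools import product

# Hypothetical potentials for a chain a - b - c with two labels (0 and 1).
# These numbers are illustrative assumptions, not the values on the slide.
unary = {
    "a": [5.0, 2.0],
    "b": [2.0, 4.0],
    "c": [3.0, 6.0],
}
pairwise = {
    ("a", "b"): [[0.0, 1.0], [1.0, 0.0]],
    ("b", "c"): [[0.0, 3.0], [2.0, 1.0]],
}


def energy(f):
    """Q(f; theta): sum of unary terms plus sum of pairwise terms."""
    q = sum(unary[a][f[a]] for a in unary)
    q += sum(theta[f[a]][f[b]] for (a, b), theta in pairwise.items())
    return q


def map_estimate():
    """Brute-force f* = argmin_f Q(f; theta) over all |L|^|V| labelings."""
    names = list(unary)
    best = None
    for labels in product(range(2), repeat=len(names)):
        f = dict(zip(names, labels))
        if best is None or energy(f) < energy(best):
            best = f
    return best


f_star = map_estimate()
print(f_star, energy(f_star))
```

The enumeration visits all $|L|^{|V|}$ labelings, which is why the rest of the lecture turns to relaxations.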
Outline
• Convex Optimization
• Integer Programming Formulation
• Convex Relaxations
• Comparison
• Generalization of Results
Mathematical Optimization
min $g_0(x)$ (objective function)
s.t. $g_i(x) \le 0$ (inequality constraints)
$h_i(x) = 0$ (equality constraints)
x is a feasible point if $g_i(x) \le 0$, $h_i(x) = 0$
x is a strictly feasible point if $g_i(x) < 0$, $h_i(x) = 0$
Feasible region - set of all feasible points
Convex Optimization
min g0(x)
s.t. gi(x) ≤ 0
hi(x) = 0
Objective function
Inequality constraints
Equality constraints
Objective function is convex
Feasible region is convex
Convex set? Convex function?
Convex Set
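Line segment with endpoints $x_1$ and $x_2$: $c x_1 + (1 - c) x_2$, $c \in [0,1]$
A set is convex if, for all line segments with endpoints in the set, all points on the line segment lie within the set
Non-Convex Set
[Figure: a set in which part of the line segment between two of its points $x_1$ and $x_2$ lies outside the set.]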
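The line-segment definition can be probed numerically. The sketch below is a sampling check, not a proof; the two example sets (a unit disc and a ring) and the helper names are assumptions chosen for illustration.

```python
import random


def in_disc(p):
    """Unit disc: a convex set."""
    return p[0] ** 2 + p[1] ** 2 <= 1.0


def in_annulus(p):
    """Ring between radii 0.5 and 1: a non-convex set."""
    return 0.25 <= p[0] ** 2 + p[1] ** 2 <= 1.0


def segment_point(x1, x2, c):
    """Point c*x1 + (1 - c)*x2 on the line segment, for c in [0, 1]."""
    return tuple(c * u + (1.0 - c) * v for u, v in zip(x1, x2))


def find_violation(member, trials=20000, seed=0):
    """Search for endpoints inside the set whose segment leaves the set."""
    rng = random.Random(seed)
    for _ in range(trials):
        x1 = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        x2 = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        if not (member(x1) and member(x2)):
            continue
        c = rng.random()
        if not member(segment_point(x1, x2, c)):
            return x1, x2, c
    return None


print(find_violation(in_disc))
print(find_violation(in_annulus))
```

Finding no violation does not establish convexity; finding one does establish non-convexity.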
Examples of Convex Sets
• Line segment between $x_1$ and $x_2$
• Line through $x_1$ and $x_2$
• Hyperplane: $a^T x - b = 0$
• Halfspace: $a^T x - b \le 0$
• Second-order cone: $\|x\| \le t$
• Semidefinite cone: $\{X \mid X \succeq 0\}$, i.e. $a^T X a \ge 0$ for all $a \in R^n$; all eigenvalues of X are non-negative
Convexity of the semidefinite cone: if $a^T X_1 a \ge 0$ and $a^T X_2 a \ge 0$, then $a^T (c X_1 + (1 - c) X_2) a \ge 0$ for $c \in [0,1]$
Operations that Preserve Convexity
• Intersection (e.g. polyhedron / polytope)
• Affine transformation: $x \to Ax + b$
Convex Function
[Figure: graph of $g(x)$ with the chord joining the points at $x_1$ and $x_2$; the blue point always lies above the red point.]
$g(c x_1 + (1 - c) x_2) \le c\, g(x_1) + (1 - c)\, g(x_2)$ for all $c \in [0,1]$
Domain of g(.) has to be convex
-g(.) is concave
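Convex Function
Once-differentiable functions: $g(y) + \nabla g(y)^T (x - y) \le g(x)$
[Figure: graph of $g(x)$ with the tangent $g(y) + \nabla g(y)^T (x - y)$ at the point $(y, g(y))$ lying below the graph.]
Twice-differentiable functions: $\nabla^2 g(x) \succeq 0$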
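Both differential conditions can be checked on a concrete function. The sketch below uses softplus, $g(x) = \log(1 + e^x)$, which is an example chosen here rather than one from the slides; it evaluates the tangent gap and the second derivative on a small grid.

```python
import math


def g(x):
    """Softplus log(1 + e^x), a smooth convex function on the real line."""
    return math.log1p(math.exp(x))


def grad_g(x):
    """Derivative of softplus: the logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def tangent_gap(x, y):
    """g(x) - [g(y) + g'(y) (x - y)]; non-negative when g is convex."""
    return g(x) - (g(y) + grad_g(y) * (x - y))


def second_derivative(x):
    """g''(x) = sigmoid(x) * (1 - sigmoid(x)), which is never negative."""
    s = grad_g(x)
    return s * (1.0 - s)


grid = [i / 4.0 for i in range(-20, 21)]
worst_gap = min(tangent_gap(x, y) for x in grid for y in grid)
smallest_curvature = min(second_derivative(x) for x in grid)
print(worst_gap, smallest_curvature)
```

The worst gap occurs at $x = y$, where the tangent touches the graph.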
Convex Function and Convex Sets
x
g(x)
Epigraph of a convex function is a convex set
Examples of Convex Functions
Linear function: $a^T x$
p-Norm functions: $(|x_1|^p + |x_2|^p + \dots + |x_n|^p)^{1/p}$, $p \ge 1$
Quadratic functions: $x^T Q x$, $Q \succeq 0$
Operations that Preserve Convexity
Non-negative weighted sum: $w_1 g_1(x) + w_2 g_2(x) + \dots$
Example: $x^T Q x + a^T x + b$, $Q \succeq 0$
Operations that Preserve Convexity
Pointwise maximum: $\max\{g_1(x), g_2(x), \dots\}$
Pointwise minimum of concave functions is concave
Linear Programming
General form: min $g_0(x)$ s.t. $g_i(x) \le 0$, $h_i(x) = 0$
Linear program:
min $g_0^T x$ (linear function)
s.t. $g_i^T x \le 0$ (linear constraints)
$h_i^T x = 0$ (linear constraints)
Quadratic Programming
General form: min $g_0(x)$ s.t. $g_i(x) \le 0$, $h_i(x) = 0$
Quadratic program:
min $x^T Q x + a^T x + b$ (quadratic function)
s.t. $g_i^T x \le 0$ (linear constraints)
$h_i^T x = 0$ (linear constraints)
Second-Order Cone Programming
General form: min $g_0(x)$ s.t. $g_i(x) \le 0$, $h_i(x) = 0$
Second-order cone program:
min $g_0^T x$ (linear function)
s.t. $x^T Q_i x + a_i^T x + b_i \le 0$ (quadratic constraints)
$h_i^T x = 0$ (linear constraints)
Semidefinite Programming
General form: min $g_0(x)$ s.t. $g_i(x) \le 0$, $h_i(x) = 0$
Semidefinite program:
min $Q \bullet X$ (linear function)
s.t. $X \succeq 0$ (semidefinite constraints)
$A_i \bullet X = 0$ (linear constraints)
Outline
• Convex Optimization
• Integer Programming Formulation
• Convex Relaxations
• Comparison
• Generalization of Results
Integer Programming Formulation
[Figure: two variables $V_1$ and $V_2$, each with Label ‘0’ and Label ‘1’, annotated with unary and pairwise costs.]
Unary Cost
Unary cost vector $u = [5\ 2\ ;\ 2\ 4]^T$ (5 is the cost of $V_1 = 0$, 2 is the cost of $V_1 = 1$)
Labelling m = {1, 0}
Label vector $x = [-1\ 1\ ;\ 1\ -1]^T$ ($-1$ because $V_1 \ne 0$, $1$ because $V_1 = 1$)
Recall that the aim is to find the optimal x
Integer Programming Formulation
Unary Cost
Unary cost vector $u = [5\ 2\ ;\ 2\ 4]^T$
Labelling m = {1, 0}
Label vector $x = [-1\ 1\ ;\ 1\ -1]^T$
Sum of unary costs $= \frac{1}{2} \sum_i u_i (1 + x_i)$
Integer Programming Formulation
Pairwise Cost
Labelling m = {1, 0}
Pairwise cost matrix P, with rows and columns ordered as ($V_1 = 0$, $V_1 = 1$, $V_2 = 0$, $V_2 = 1$):
$$P = \begin{pmatrix} 0 & 0 & 0 & 3 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 3 & 0 & 0 & 0 \end{pmatrix}$$
For example, the cost of $V_1 = 0$ and $V_1 = 0$ is 0, the cost of $V_1 = 0$ and $V_2 = 0$ is 0, and the cost of $V_1 = 0$ and $V_2 = 1$ is 3
Sum of pairwise costs
$= \frac{1}{4} \sum_{ij} P_{ij} (1 + x_i)(1 + x_j)$
$= \frac{1}{4} \sum_{ij} P_{ij} (1 + x_i + x_j + x_i x_j)$
$= \frac{1}{4} \sum_{ij} P_{ij} (1 + x_i + x_j + X_{ij})$
$X = x x^T$, $X_{ij} = x_i x_j$
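The two cost formulas can be evaluated for the labelling m = {1, 0}. This is a sketch under stated assumptions: the index order of `u`, `x` and `P` is ($V_1 = 0$, $V_1 = 1$, $V_2 = 0$, $V_2 = 1$), and `P` here stores each pairwise cost only once (upper triangle) so that the double sum counts every pair of labels once. Both choices are made for this illustration and are not confirmed by the transcript.

```python
# Index order: (V1 = 0, V1 = 1, V2 = 0, V2 = 1).
u = [5.0, 2.0, 2.0, 4.0]

# Assumption: each pairwise cost is stored once, in the upper triangle,
# so the double sum over (i, j) counts every pair of labels a single time.
P = [
    [0.0, 0.0, 0.0, 3.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]


def label_vector(m, num_labels=2):
    """x_i = +1 if the variable takes that label, -1 otherwise."""
    x = []
    for label in m:
        x.extend(1.0 if k == label else -1.0 for k in range(num_labels))
    return x


def unary_cost(x):
    """(1/2) sum_i u_i (1 + x_i)."""
    return 0.5 * sum(ui * (1.0 + xi) for ui, xi in zip(u, x))


def pairwise_cost(x):
    """(1/4) sum_ij P_ij (1 + x_i)(1 + x_j)."""
    n = len(x)
    return 0.25 * sum(P[i][j] * (1 + x[i]) * (1 + x[j])
                      for i in range(n) for j in range(n))


def pairwise_cost_lifted(x):
    """Same sum, written with the lifted variable X_ij = x_i x_j."""
    n = len(x)
    X = [[x[i] * x[j] for j in range(n)] for i in range(n)]
    return 0.25 * sum(P[i][j] * (1 + x[i] + x[j] + X[i][j])
                      for i in range(n) for j in range(n))


x = label_vector([1, 0])
print(x, unary_cost(x), pairwise_cost(x), pairwise_cost_lifted(x))
```

The factor $(1 + x_i)/2$ acts as a 0/1 indicator, so only the costs of the selected labels survive in either sum.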
Integer Programming Formulation
Constraints
• Uniqueness Constraint
$\sum_{i \in V_a} x_i = 2 - |L|$
• Integer Constraints
$x_i \in \{-1, 1\}$
$X = x x^T$
Integer Programming Formulation
$x^* = \arg\min \frac{1}{2} \sum_i u_i (1 + x_i) + \frac{1}{4} \sum_{ij} P_{ij} (1 + x_i + x_j + X_{ij})$ (convex)
s.t. $\sum_{i \in V_a} x_i = 2 - |L|$ (convex)
$x_i \in \{-1, 1\}$ (non-convex)
$X = x x^T$ (non-convex)
Outline
• Convex Optimization
• Integer Programming Formulation
• Convex Relaxations
  – Linear Programming (LP-S)
  – Semidefinite Programming (SDP-L)
  – Second Order Cone Programming (SOCP-MS)
• Comparison
• Generalization of Results
LP-S
Schlesinger, 1976
$x^* = \arg\min \frac{1}{2} \sum_i u_i (1 + x_i) + \frac{1}{4} \sum_{ij} P_{ij} (1 + x_i + x_j + X_{ij})$
s.t. $\sum_{i \in V_a} x_i = 2 - |L|$ (retain convex part)
$x_i \in \{-1, 1\}$ (relax non-convex constraint)
$X = x x^T$ (relax non-convex constraint)
Relax $x_i \in \{-1, 1\}$ to $x_i \in [-1, 1]$
Relax $X = x x^T$ to
$X_{ij} \in [-1, 1]$
$1 + x_i + x_j + X_{ij} \ge 0$
$\sum_{j \in V_b} X_{ij} = (2 - |L|)\, x_i$
LP-S (Schlesinger, 1976)
$x^* = \arg\min \frac{1}{2} \sum_i u_i (1 + x_i) + \frac{1}{4} \sum_{ij} P_{ij} (1 + x_i + x_j + X_{ij})$
s.t. $\sum_{i \in V_a} x_i = 2 - |L|$
$x_i \in [-1, 1]$
$X_{ij} \in [-1, 1]$
$1 + x_i + x_j + X_{ij} \ge 0$
$\sum_{j \in V_b} X_{ij} = (2 - |L|)\, x_i$
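A quick way to see that LP-S is a relaxation is to check that every integer labelling, lifted to $X = x x^T$, satisfies the LP-S constraints, while some non-integer points do too. The sketch below does this by enumeration; the function names are hypothetical, and the marginalisation constraint is checked against every variable block, which is an assumption that is at least valid for integer points.

```python
from itertools import product


def label_vector(labeling, num_labels):
    """x_i = +1 for the chosen label of each variable, -1 otherwise."""
    x = []
    for label in labeling:
        x.extend(1 if k == label else -1 for k in range(num_labels))
    return x


def outer(x):
    """Lifted matrix X = x x^T."""
    return [[xi * xj for xj in x] for xi in x]


def satisfies_lp_s(x, X, num_vars, num_labels, tol=1e-9):
    """Check the LP-S constraints on a candidate point (x, X)."""
    h = num_labels
    blocks = [range(a * h, (a + 1) * h) for a in range(num_vars)]
    n = len(x)
    for block in blocks:  # uniqueness: sum_{i in V_a} x_i = 2 - |L|
        if abs(sum(x[i] for i in block) - (2 - h)) > tol:
            return False
    for i in range(n):
        if not -1 - tol <= x[i] <= 1 + tol:
            return False
        for j in range(n):
            if not -1 - tol <= X[i][j] <= 1 + tol:
                return False
            if 1 + x[i] + x[j] + X[i][j] < -tol:
                return False
        for block in blocks:  # sum_{j in V_b} X_ij = (2 - |L|) x_i
            if abs(sum(X[i][j] for j in block) - (2 - h) * x[i]) > tol:
                return False
    return True


all_integer_ok = all(
    satisfies_lp_s(label_vector(f, 3), outer(label_vector(f, 3)), 3, 3)
    for f in product(range(3), repeat=3)
)
# A fractional point for two binary variables: x = 0 and X = 0.
fractional_ok = satisfies_lp_s([0.0] * 4, [[0.0] * 4 for _ in range(4)], 2, 2)
print(all_integer_ok, fractional_ok)
```

The fractional point is feasible for LP-S but not for the integer program, which is exactly what makes the feasible region larger.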
SDP-L
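Lasserre, 2000
$x^* = \arg\min \frac{1}{2} \sum_i u_i (1 + x_i) + \frac{1}{4} \sum_{ij} P_{ij} (1 + x_i + x_j + X_{ij})$
s.t. $\sum_{i \in V_a} x_i = 2 - |L|$ (retain convex part)
$x_i \in \{-1, 1\}$ (relax non-convex constraint)
$X = x x^T$ (relax non-convex constraint)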
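Retain the convex part and relax $x_i \in \{-1, 1\}$ to $x_i \in [-1, 1]$ (Lasserre, 2000)
With $x_i \in \{-1, 1\}$, the constraint $X = x x^T$ can be written as
$$\begin{pmatrix} 1 & x^T \\ x & X \end{pmatrix} = \begin{pmatrix} 1 \\ x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \begin{pmatrix} 1 & x_1 & x_2 & \cdots & x_n \end{pmatrix}$$
This matrix satisfies
• $X_{ii} = 1$ (convex)
• Positive semidefinite (convex)
• Rank = 1 (non-convex)
SDP-L keeps $X_{ii} = 1$ and positive semidefiniteness, and drops the rank constraint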
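Schur’s Complement
$$\begin{pmatrix} A & B \\ B^T & C \end{pmatrix} = \begin{pmatrix} I & 0 \\ B^T A^{-1} & I \end{pmatrix} \begin{pmatrix} A & 0 \\ 0 & C - B^T A^{-1} B \end{pmatrix} \begin{pmatrix} I & A^{-1} B \\ 0 & I \end{pmatrix}$$
$$\begin{pmatrix} A & B \\ B^T & C \end{pmatrix} \succeq 0 \iff A \succ 0,\ C - B^T A^{-1} B \succeq 0$$
Applied to SDP-L:
$$\begin{pmatrix} 1 & x^T \\ x & X \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ x & I \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & X - x x^T \end{pmatrix} \begin{pmatrix} 1 & x^T \\ 0 & I \end{pmatrix}$$
$$\begin{pmatrix} 1 & x^T \\ x & X \end{pmatrix} \succeq 0 \iff X - x x^T \succeq 0$$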
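In the scalar case the Schur complement statement reduces to: the matrix $\begin{pmatrix} 1 & x \\ x & X \end{pmatrix}$ is positive semidefinite exactly when $X - x^2 \ge 0$. The sketch below checks this on a grid, using the closed-form smallest eigenvalue of a symmetric 2x2 matrix. The helper names are hypothetical, and a small tolerance is an assumption to absorb rounding.

```python
import math


def min_eig_sym_2x2(a, b, d):
    """Smallest eigenvalue of the symmetric matrix [[a, b], [b, d]]."""
    return 0.5 * (a + d) - math.sqrt(0.25 * (a - d) ** 2 + b * b)


def lifted_is_psd(x, X, tol=1e-9):
    """Is [[1, x], [x, X]] positive semidefinite?"""
    return min_eig_sym_2x2(1.0, x, X) >= -tol


def schur_is_nonneg(x, X, tol=1e-9):
    """Is the Schur complement X - x*x non-negative?"""
    return X - x * x >= -tol


def agree_on_grid():
    """Compare both tests on a grid, skipping points on the boundary."""
    for i in range(-8, 9):
        for k in range(-4, 17):
            x, X = i / 4.0, k / 4.0
            if abs(X - x * x) < 1e-6:
                continue  # boundary case X = x^2: eigenvalue is exactly 0
            if lifted_is_psd(x, X) != schur_is_nonneg(x, X):
                return False
    return True


print(agree_on_grid())
```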
SDP-L (Lasserre, 2000)
$x^* = \arg\min \frac{1}{2} \sum_i u_i (1 + x_i) + \frac{1}{4} \sum_{ij} P_{ij} (1 + x_i + x_j + X_{ij})$
s.t. $\sum_{i \in V_a} x_i = 2 - |L|$
$x_i \in [-1, 1]$
$X_{ii} = 1$, $X - x x^T \succeq 0$
Accurate, but inefficient
SOCP Relaxation
$x^* = \arg\min \frac{1}{2} \sum_i u_i (1 + x_i) + \frac{1}{4} \sum_{ij} P_{ij} (1 + x_i + x_j + X_{ij})$
s.t. $\sum_{i \in V_a} x_i = 2 - |L|$
$x_i \in [-1, 1]$
$X_{ii} = 1$, $X - x x^T \succeq 0$
Derive SOCP relaxation from the SDP relaxation
Further Relaxation
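1-D Example
$X - x x^T \succeq 0$ becomes $X - x^2 \ge 0$
For two semidefinite matrices, Frobenius inner product is non-negative: if $A \succeq 0$ and $B \succeq 0$ then $A \bullet B \ge 0$
$x^2 \le X$
SOC of the form $\|v\|^2 \le st$, with $v = x$ and $st = X \cdot 1$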
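2-D Example
$$X = \begin{pmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{pmatrix} = \begin{pmatrix} 1 & X_{12} \\ X_{12} & 1 \end{pmatrix}$$
$$x x^T = \begin{pmatrix} x_1 x_1 & x_1 x_2 \\ x_2 x_1 & x_2 x_2 \end{pmatrix} = \begin{pmatrix} x_1^2 & x_1 x_2 \\ x_1 x_2 & x_2^2 \end{pmatrix}$$
$$X - x x^T = \begin{pmatrix} 1 - x_1^2 & X_{12} - x_1 x_2 \\ X_{12} - x_1 x_2 & 1 - x_2^2 \end{pmatrix}$$
For each $C_k \succeq 0$, $(X - x x^T) \bullet C_k \ge 0$:
• $C_1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ gives $x_1^2 \le 1$, i.e. $-1 \le x_1 \le 1$
• $C_2 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$ gives $x_2^2 \le 1$, i.e. $-1 \le x_2 \le 1$
• $C_3 = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$ gives $(x_1 + x_2)^2 \le 2 + 2 X_{12}$, an SOC of the form $\|v\|^2 \le st$
• $C_4 = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}$ gives $(x_1 - x_2)^2 \le 2 - 2 X_{12}$, an SOC of the form $\|v\|^2 \le st$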
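The four inequalities of the 2-D example follow from expanding the Frobenius inner product $(X - x x^T) \bullet C_k$. The sketch below confirms the expansions numerically at random points; the function names are hypothetical.

```python
import random


def frobenius(A, B):
    """Frobenius inner product of two 2x2 matrices."""
    return sum(A[i][j] * B[i][j] for i in range(2) for j in range(2))


def slack_matrix(x1, x2, X12):
    """X - x x^T for the 2-D example, using X_11 = X_22 = 1."""
    off = X12 - x1 * x2
    return [[1 - x1 * x1, off], [off, 1 - x2 * x2]]


C = [
    [[1, 0], [0, 0]],    # C1
    [[0, 0], [0, 1]],    # C2
    [[1, 1], [1, 1]],    # C3
    [[1, -1], [-1, 1]],  # C4
]


def closed_forms(x1, x2, X12):
    """Right-hand side minus left-hand side of each derived inequality."""
    return [
        1 - x1 ** 2,
        1 - x2 ** 2,
        2 + 2 * X12 - (x1 + x2) ** 2,
        2 - 2 * X12 - (x1 - x2) ** 2,
    ]


def max_mismatch(trials=1000, seed=0):
    """Largest gap between the inner products and the closed forms."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x1, x2, X12 = (rng.uniform(-1, 1) for _ in range(3))
        S = slack_matrix(x1, x2, X12)
        for Ck, expected in zip(C, closed_forms(x1, x2, X12)):
            worst = max(worst, abs(frobenius(S, Ck) - expected))
    return worst


print(max_mismatch())
```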
SOCP Relaxation
Consider a matrix $C_1 = U U^T \succeq 0$
$(X - x x^T) \bullet C_1 \ge 0$
$\|U^T x\|^2 \le X \bullet C_1$
SOC of the form $\|v\|^2 \le st$
Continue for $C_2, C_3, \dots, C_n$
Kim and Kojima, 2000
SOCP Relaxation
How many constraints for SOCP = SDP?
Infinite. For all $C \succeq 0$
Specify constraints similar to the 2-D example, for each pair $x_i$, $x_j$ with $X_{ij}$:
$(x_i + x_j)^2 \le 2 + 2 X_{ij}$
$(x_i - x_j)^2 \le 2 - 2 X_{ij}$
SOCP-MS (Muramatsu and Suzuki, 2003)
Start from the SDP relaxation:
$x^* = \arg\min \frac{1}{2} \sum_i u_i (1 + x_i) + \frac{1}{4} \sum_{ij} P_{ij} (1 + x_i + x_j + X_{ij})$
s.t. $\sum_{i \in V_a} x_i = 2 - |L|$
$x_i \in [-1, 1]$
$X_{ii} = 1$, $X - x x^T \succeq 0$
Replace the last line with
$(x_i + x_j)^2 \le 2 + 2 X_{ij}$, $(x_i - x_j)^2 \le 2 - 2 X_{ij}$
Specified only when $P_{ij} \ne 0$
Outline
• Convex Optimization
• Integer Programming Formulation
• Convex Relaxations
• Comparison
• Generalization of Results
Kumar, Kolmogorov and Torr, JMLR 2010
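Dominating Relaxation
A dominates B if, for all MAP estimation problems (u, P), the optimal value of A ≥ the optimal value of B
Dominating relaxations are better
Equivalent Relaxations
A and B are equivalent if A dominates B and B dominates A, i.e. their optimal values are equal for all MAP estimation problems (u, P)
Strictly Dominating Relaxation
A strictly dominates B if A dominates B and B does not dominate A, i.e. the optimal value of A > the optimal value of B for at least one MAP estimation problem (u, P)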
SOCP-MS (Muramatsu and Suzuki, 2003)
$(x_i + x_j)^2 \le 2 + 2 X_{ij}$, $(x_i - x_j)^2 \le 2 - 2 X_{ij}$
• $P_{ij} \ge 0$: $X_{ij} = \frac{(x_i + x_j)^2}{2} - 1$
• $P_{ij} < 0$: $X_{ij} = 1 - \frac{(x_i - x_j)^2}{2}$
SOCP-MS is a QP
Same as QP by Ravikumar and Lafferty, 2005
SOCP-MS ≡ QP-RL
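The elimination of $X_{ij}$ can be sketched as follows: at the optimum the constraint that opposes the sign of $P_{ij}$ is tight, so $X_{ij}$ becomes a quadratic function of $x_i$ and $x_j$. The function names below are hypothetical; the check confirms that at integer points the substituted value coincides with $x_i x_j$.

```python
def x_ij_at_optimum(p_ij, x_i, x_j):
    """Value of X_ij picked by SOCP-MS once the tight constraint is known."""
    if p_ij >= 0:
        # Minimisation pushes X_ij down to its lower bound.
        return (x_i + x_j) ** 2 / 2.0 - 1.0
    # Negative weight: X_ij is pushed up to its upper bound.
    return 1.0 - (x_i - x_j) ** 2 / 2.0


def qp_pairwise_term(p_ij, x_i, x_j):
    """Pairwise objective term with X_ij eliminated: quadratic in x."""
    X_ij = x_ij_at_optimum(p_ij, x_i, x_j)
    return 0.25 * p_ij * (1.0 + x_i + x_j + X_ij)


for p in (2.0, -2.0):
    for xi in (-1.0, 1.0):
        for xj in (-1.0, 1.0):
            print(p, xi, xj, x_ij_at_optimum(p, xi, xj), xi * xj)
```

At fractional points the two branches differ, for instance at $x_i = x_j = 0$ they give $-1$ and $1$ respectively.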
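LP-S vs. SOCP-MS
Differ in the way they relax $X = x x^T$
LP-S:
$X_{ij} \in [-1, 1]$
$1 + x_i + x_j + X_{ij} \ge 0$
$\sum_{j \in V_b} X_{ij} = (2 - |L|)\, x_i$
SOCP-MS:
$(x_i + x_j)^2 \le 2 + 2 X_{ij}$
$(x_i - x_j)^2 \le 2 - 2 X_{ij}$
[Figure: the feasibility regions F(LP-S) and F(SOCP-MS).]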
LP-S vs. SOCP-MS
• LP-S strictly dominates SOCP-MS
• LP-S strictly dominates QP-RL
• Where have we gone wrong?
• A Quick Recap !
Recap of SOCP-MS
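[Figure: a single edge joining $x_i$ and $x_j$, with pairwise variable $X_{ij}$.]
$C = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$: $(x_i + x_j)^2 \le 2 + 2 X_{ij}$
$C = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}$: $(x_i - x_j)^2 \le 2 - 2 X_{ij}$
Can we use different C matrices?
Can we use a different subgraph?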
Outline
• Convex Optimization
• Integer Programming Formulation
• Convex Relaxations
• Comparison
• Generalization of Results
  – SOCP Relaxations on Trees
  – SOCP Relaxations on Cycles
Kumar, Kolmogorov and Torr, JMLR 2010
SOCP Relaxations on Trees
Choose any arbitrary tree
Choose any arbitrary $C \succeq 0$
Repeat over trees to get relaxation SOCP-T
LP-S strictly dominates SOCP-T
LP-S strictly dominates QP-T
SOCP Relaxations on Cycles
Choose an arbitrary even cycle with $P_{ij} \ge 0$ OR $P_{ij} \le 0$
Choose any arbitrary $C \succeq 0$
Repeat over even cycles to get relaxation SOCP-E
LP-S strictly dominates SOCP-E
LP-S strictly dominates QP-E
SOCP Relaxations on Cycles
• True for odd cycles with Pij ≤ 0
• True for odd cycles with Pij ≤ 0 for only one edge
• True for odd cycles with Pij ≥ 0 for only one edge
• True for all combinations of above cases
The SOCP-C Relaxation
Include all LP-S constraints
True SOCP
[Figure: a cycle over three variables a, b and c, showing the pairwise potentials of the edges (a,b), (b,c) and (c,a). Edges (a,b) and (c,a) are submodular; edge (b,c) is non-submodular.]
Frustrated Cycle
[Figure: the same frustrated cycle, together with the LP-S solution for its unary and pairwise variables.]
LP-S Solution: Objective Function = 0
Define an SOC Constraint using $C = \mathbf{1}$ (the all-ones matrix)
$(x_i + x_j + x_k)^2 \le 3 + 2 (X_{ij} + X_{jk} + X_{ki})$
[Figure: the same frustrated cycle, together with the SOCP-C solution, whose values are fractional.]
SOCP-C Solution: Objective Function = 0.75
SOCP-C strictly dominates LP-S
The SOCP-Q Relaxation
Include all cycle inequalities
True SOCP
[Figure: a clique over four variables a, b, c and d.]
Clique of size n
Define an SOCP Constraint using $C = \mathbf{1}$ (the all-ones matrix)
$(\sum_i x_i)^2 \le n + \sum_{i \ne j} X_{ij}$
SOCP-Q strictly dominates LP-S
SOCP-Q strictly dominates SOCP-C
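The all-ones constraint can be checked by expanding $(X - x x^T) \bullet \mathbf{1}$ with $X_{ii} = 1$. The sketch below is an illustration with hypothetical names: it verifies that the constraint is tight at every integer point, and that it cuts off one hand-picked fractional point (an assumption for illustration, not a solution taken from the slides).

```python
from itertools import product


def outer(x):
    """Lifted matrix X = x x^T."""
    return [[xi * xj for xj in x] for xi in x]


def clique_slack(x, X):
    """n + sum_{i != j} X_ij - (sum_i x_i)^2; must be >= 0 to be feasible."""
    n = len(x)
    off_diag = sum(X[i][j] for i in range(n) for j in range(n) if i != j)
    return n + off_diag - sum(x) ** 2


# Tight at every integer point of a clique of size 4.
integer_slacks = {clique_slack(list(x), outer(x))
                  for x in product((-1, 1), repeat=4)}

# Hypothetical fractional point on a 3-cycle: x = 0, every X_ij = -1.
x_frac = [0.0, 0.0, 0.0]
X_frac = [[1.0 if i == j else -1.0 for j in range(3)] for i in range(3)]
print(integer_slacks, clique_slack(x_frac, X_frac))
```

For three variables the sum over ordered pairs equals $2 (X_{ij} + X_{jk} + X_{ki})$, which recovers the cycle constraint.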
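4-Neighbourhood MRF
Test SOCP-C
50 binary MRFs of size 30x30
$u \sim \mathcal{N}(0, 1)$
$P \sim \mathcal{N}(0, \sigma^2)$
[Figure: results on the 4-neighbourhood MRFs for $\sigma = 2.5$.]
8-Neighbourhood MRF
Test SOCP-Q
50 binary MRFs of size 30x30
$u \sim \mathcal{N}(0, 1)$
$P \sim \mathcal{N}(0, \sigma^2)$
[Figure: results on the 8-neighbourhood MRFs for $\sigma = 1.125$.]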
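A test problem of this kind could be generated as sketched below. The parameterisation is an assumption made here: one scalar unary weight per grid site and one scalar pairwise weight per edge. The slides do not say how the binary MRFs were stored, and the function name is hypothetical.

```python
import random


def random_grid_mrf(size=30, sigma=2.5, eight_connected=False, seed=0):
    """Random binary MRF on a size x size grid.

    Unary weights are drawn from N(0, 1) and pairwise weights from
    N(0, sigma^2); random.gauss takes the standard deviation, i.e. sigma.
    """
    rng = random.Random(seed)
    unary = {(r, c): rng.gauss(0.0, 1.0)
             for r in range(size) for c in range(size)}
    offsets = [(0, 1), (1, 0)]          # right and down neighbours
    if eight_connected:
        offsets += [(1, 1), (1, -1)]    # the two diagonals
    pairwise = {}
    for r in range(size):
        for c in range(size):
            for dr, dc in offsets:
                r2, c2 = r + dr, c + dc
                if 0 <= r2 < size and 0 <= c2 < size:
                    pairwise[((r, c), (r2, c2))] = rng.gauss(0.0, sigma)
    return unary, pairwise


unary4, pairwise4 = random_grid_mrf(sigma=2.5)
unary8, pairwise8 = random_grid_mrf(sigma=1.125, eight_connected=True)
print(len(unary4), len(pairwise4), len(pairwise8))
```

Changing `seed` over 50 values would give a set of 50 problems of the size quoted above.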
Conclusions
• Large class of SOCP/QP dominated by LP-S
• New SOCP relaxations dominate LP-S
• But better LP relaxations exist