1
Solving Segmentation and Dense Correspondence Problems using Graph Cuts
(Japanese title: Applying Graph Cuts to Image Region and Correspondence Estimation Problems)
Tatsunori TANIAI, The University of Tokyo
CRESET Symposium on MRF and Deep Learning at Waseda University
January 13, 2016
2
Self-Introduction
Tatsunori TANIAI / 谷合竜典 (2nd year of PhD course at the University of Tokyo)
• Specialties: Optimization and its applications in computer vision
• Personal history
PhD
Master
Bachelor
Kosen
2014.4
2012.4
2009.4
National Institute of Technology, Tokyo College (東京高専)
Research Internship at Microsoft Research Asia (Advisor: Dr. Yasuyuki Matsushita)
Research Internship at Microsoft Research Redmond (Advisor: Dr. Sudipta Sinha)
Transferred to the University of Tokyo
Microsoft Research Asia Fellow 2015
Joined Naemura Laboratory (University of Tokyo)
Joined Yoichi Sato Laboratory (University of Tokyo) / JSPS Young Research Fellow
Now
Graduation (expected)
2017.4
3
Collaborators
Prof. Yoichi Sato at Univ. of Tokyo (Ph.D. advisor)
Prof. Yasuyuki Matsushita at Osaka Univ. (mentor at MSRA)
Dr. Sudipta Sinha at MSR Redmond (mentor at MSR)
Prof. Takeshi Naemura at Univ. of Tokyo (Bachelor-Master advisor)
4
Before going on to my talk…
• My talk today is only about “my” past and on-going projects
– Higher-order MRF optimization for low level vision [CVPR 2015]
– Continuous MRF optimization for stereo matching [CVPR 2014] [Submitted to PAMI]
– Joint dense correspondence and cosegmentation [Submitted to CVPR 2016]
• My talk contains confidential information
• PLEASE do not expect too much from the deep learning part
5
Overview
• Introduction
• Higher-order MRF optimization for low level vision [CVPR 2015]
• Continuous MRF optimization for stereo matching [CVPR 2014]
• Joint dense correspondence and cosegmentation [on-going]
6
MAP-MRF Inference
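Unknown: foreground mask $X$ (labels $X = 1$ and $X = 0$)
Observed data: $D$
Probability (posterior, from observation likelihood and prior):
$\max_X P(X \mid D) = P(D \mid X)\,P(X) / P(D)$
Energy (after applying $-\log(\cdot)$):
$\min_X E(X) = \phi(D \mid X) + \psi(X)$
• $\phi(D \mid X)$: color distributions from user scribbles
• $\psi(X)$: spatial smoothness
$E(X) = \sum_i \phi_i(X_i) + \sum_{ij} \phi_{ij}(X_i, X_j) + \sum_C \phi_C(X_C)$
• Unary + pairwise terms: pairwise/1st order MRF
• Unary + pairwise + higher-order terms: higher-order MRF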
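The step from probability to energy can be written out as a short derivation. This is a sketch that assumes $\phi$ and $\psi$ are defined as the negative log-likelihood and negative log-prior, which the slide implies but does not state explicitly.

```latex
\begin{align*}
\arg\max_X P(X \mid D)
  &= \arg\min_X \bigl[-\log P(X \mid D)\bigr] \\
  &= \arg\min_X \bigl[-\log P(D \mid X) - \log P(X) + \log P(D)\bigr] \\
  &= \arg\min_X \bigl[\underbrace{-\log P(D \mid X)}_{\phi(D \mid X)}
     + \underbrace{-\log P(X)}_{\psi(X)}\bigr]
\end{align*}
```

The term $\log P(D)$ does not depend on $X$, so it can be dropped without changing the minimizer.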
7
Important Properties in MAP-MRF by Graph Cuts
Make problems easier:
• Submodular (discrete convexity)
• Pairwise (function form)
• Binary (label space)
With all three: "the promised land" (optimally solved)
Make problems harder:
• Non-submodular (discrete non-convexity)
• Higher-order (function form)
• Multi-label (label space)
Decompose/convert hard problems into easy problems.
8
Graph Cuts for MAP-MRF Inference
Optimal inference of binary MRFs by graph cuts [Kolmogorov & Zabih, PAMI 04]
• Pairwise: $E(X) = \sum_i \phi_i(X_i) + \sum_{ij} \phi_{ij}(X_i, X_j)$
• Binary: $X_i \in \{0, 1\}$
• Submodular: $\phi(0,0) + \phi(1,1) \le \phi(1,0) + \phi(0,1)$
  e.g.) Potts model $\phi(X_i, X_j) = |X_i - X_j|$
Equivalent to the max-flow min-cut problem (min-cut between the source terminal and the sink terminal).
Approximate inference of multi-label MRFs by α-expansion algorithm [Boykov+, PAMI 01]
• Pairwise: $E(X) = \sum_i \phi_i(X_i) + \sum_{ij} \phi_{ij}(X_i, X_j)$
• Multi-label: $X_i \in \{0, 1, \cdots, K\}$
for each $\alpha \in \{0, 1, \cdots, K\}$:
  $X^{t+1} = \arg\min E(X)$ where $X_i \in \{X_i^t, \alpha\}$ (solved by GC)
Image courtesy: N. Komodakis, P. Torr, V. Kolmogorov, Y. Boykov, “Discrete Optimization in Computer Vision”, Tutorial at ICCV 2007
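The α-expansion loop can be sketched on a toy problem. This is only an illustration under stated assumptions: the MRF is a 1-D chain with a Potts term, and each binary expansion move is solved by exhaustive search, which stands in for the min-cut used in practice. The function names and the example costs are hypothetical.

```python
from itertools import product

def energy(labels, unary, lam):
    """Pairwise MRF energy on a 1-D chain with a Potts smoothness term."""
    data = sum(unary[i][x] for i, x in enumerate(labels))
    smooth = sum(lam * (labels[i] != labels[i + 1])
                 for i in range(len(labels) - 1))
    return data + smooth

def expansion_move(labels, alpha, unary, lam):
    """Best labeling in which each node keeps its label or switches to alpha.
    Exhaustive search stands in for the min-cut that solves this binary problem."""
    best, best_e = list(labels), energy(labels, unary, lam)
    for switch in product([0, 1], repeat=len(labels)):
        cand = [alpha if s else x for s, x in zip(switch, labels)]
        e = energy(cand, unary, lam)
        if e < best_e:
            best, best_e = cand, e
    return best

def alpha_expansion(unary, lam, num_labels, max_sweeps=10):
    """Cycle over all labels until a full sweep no longer lowers the energy."""
    labels = [0] * len(unary)
    for _ in range(max_sweeps):
        before = energy(labels, unary, lam)
        for alpha in range(num_labels):
            labels = expansion_move(labels, alpha, unary, lam)
        if energy(labels, unary, lam) >= before:
            break
    return labels

# Hypothetical unary costs: 5 nodes, 3 labels (lower is better).
unary = [[0, 5, 5], [0, 5, 5], [5, 0, 5], [5, 5, 0], [5, 5, 0]]
result = alpha_expansion(unary, lam=1, num_labels=3)
```

Because each move only accepts a strictly lower energy, the energy never increases from one move to the next.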
9
Overview
• Introduction
• Higher-order MRF optimization for low level vision [CVPR 2015]
• Continuous MRF optimization for stereo matching [CVPR 2014]
• Joint dense correspondence and cosegmentation [on-going]
Non-submodular / Higher-order / Binary
Non-submodular / Pairwise / Binary
10
Binary Energy Minimization
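Binary variables in two forms:
• Find a binary labeling: $s_p \in \{0, 1\}$
• Find a region: $S = \{p \mid s_p = 1\}$
Examples: image segmentation, image binarization
[Figure: region $S$ ($s_p = 1$) inside the image domain $\Omega$ ($s_p = 0$ elsewhere)]
$E(S) = Q(S) + R(S) = \sum_p m_p s_p + \sum_{pq} m_{pq} s_p s_q + R(S)$
• $Q(S)$: quadratic
• $R(S)$: higher-order (our focus)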
11
Approximation Approach
$E(S) = Q(S) + R(S) \simeq Q(S) + \langle h, S \rangle$, where $\langle h, S \rangle = \sum_p h_p s_p$
($Q$: quadratic, $R$: higher-order, $\langle h, S \rangle$: linear approximation)
Repeat until convergence:
1. Approximate $R(S)$ by a linear function.
2. Minimize the approximated $E(S)$ by graph cuts: $S^{t+1} = \arg\min_S E(S)$
How to find good linear coefficients $h_p$?
• Gradient descent approach
• Bound optimization approach (our focus)
12
Gradient Descent Approach
Fast Trust Region [Gorelick+ CVPR ’13, ‘14]
Taylor-based linear approximation + trust region $\|S - S^t\| < \tau$
• Locally approximates $E(S)$ at current $S^t$
• May worsen solutions
[Figure: energy $E(S)$ over the solution space, with $S^t$ and $S^{t+1}$ marked]
13
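Our Approach (Bound Optimization)
Piecewise-linear upper bounds $E(S \mid S^t) \ge E(S)$, updated in a coarse-to-fine manner.
• Globally approximates $E(S)$ using current $S^t$
• Never worsens solutions: $E(S^{t+1}) \le E(S^t)$
[Figure: energy over the solution space, showing $E(S)$, the bounds $E(S \mid S^t)$ and $E(S \mid S^{t+1})$, and the solutions $S^t$ and $S^{t+1}$]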
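The "never worsens" property follows from the usual bound-optimization argument. The chain below is a sketch that assumes the bound is tight at the current solution, $E(S^t \mid S^t) = E(S^t)$, and that $S^{t+1}$ minimizes the bound; the slide states only the upper-bound property.

```latex
\begin{align*}
E(S^{t+1}) &\le E(S^{t+1} \mid S^t) && \text{(upper bound)} \\
           &\le E(S^{t} \mid S^t)   && \text{($S^{t+1}$ minimizes the bound)} \\
           &=   E(S^{t})            && \text{(assumed tightness at $S^t$)}
\end{align*}
```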
14
Contributions
• Achieve state-of-the-art performance
• Generalize previous bound optimization methods:
– Submodular Supermodular Procedure [Narasimhan+ UAI ‘05]
– Bhattacharyya Measure Graph Cuts [Ayed+ CVPR ‘10]
– Auxiliary Cuts [Ayed+ CVPR ‘13]
– Local Submodular Approximation (AUX) [Gorelick+ CVPR ‘14]
– Parametric Pseudo Bound Cuts [Tang+ ECCV ‘14]
15
Overview
• Introduction
• Higher-order MRF optimization for low level vision [CVPR 2015]
– Revising SSP
– Proposed method
– Experiments
• Continuous MRF optimization for stereo matching [CVPR 2014]
• Joint dense correspondence and cosegmentation [on-going]
16
Submodular-Supermodular Procedure (SSP) [Narasimhan+, UAI05]
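SSP: a minimization method for supermodular functions $R(S)$
Examples of $R(S)$:
• Area-size constraints: $R(S) = (v_0 - |S|)^2$
• $L_p$-distance between histograms: $R(S) = \sum_{z \in \{bins\}} |hist_z^0 - hist_z(S)|^p$, i.e. size constraints for each color bin
Supermodular (similar to convex functions); minimization is NP-hard.
[Figure: $R(S)$ plotted from $\emptyset$ to $\Omega$]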
17
Greedy Approximation by Permutation
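What if we know how likely each $s_i$ is “1”? Then we can:
1. Permute nodes by their likelihoods (ordering $\sigma$)
2. Compute a linear approximation function $\langle h, S \rangle \simeq R(S)$ as energy transitions along the permutation $\sigma$.
[Figure: nodes $\sigma_1, \ldots, \sigma_5$ sorted by likelihood, the nested sets $S = \{\ldots\}$ and coefficients $h = \{\ldots\}$ built along the ordering, and $R(S)$ plotted from $\emptyset$ to $\Omega$]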
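The permutation-based linear approximation can be sketched as follows. This is an illustration rather than the authors' code: the function names are hypothetical, and the example uses the area-size term $R(S) = (v_0 - |S|)^2$ with an assumed $v_0 = 2$ and an arbitrary ordering $\sigma$.

```python
from itertools import combinations

def permutation_bound(R, sigma):
    """Linear coefficients h from energy transitions along the ordering sigma:
    h[sigma[k]] = R(S_k) - R(S_{k-1}), with S_k = {sigma[0], ..., sigma[k]}."""
    h, chain, prev = {}, set(), R(frozenset())
    for p in sigma:
        chain.add(p)
        cur = R(frozenset(chain))
        h[p] = cur - prev
        prev = cur
    return h

def linear_value(R, h, S):
    """R(empty set) + <h, S>: the linear surrogate evaluated at S."""
    return R(frozenset()) + sum(h[p] for p in S)

# Hypothetical example: area-size constraint with v0 = 2 on five nodes.
V0 = 2
def area_term(S):
    return (V0 - len(S)) ** 2

nodes = [0, 1, 2, 3, 4]
sigma = [3, 1, 4, 0, 2]          # assumed likelihood ordering
h = permutation_bound(area_term, sigma)
```

For a supermodular $R$, this surrogate is an upper bound on $R(S)$ for every subset and coincides with $R$ on the nested sets along $\sigma$, so its quality depends on how well $\sigma$ predicts the solution.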
18
Permutation 𝜎 by Distance [Rother+, CVPR06]
Likelihood: distance from the boundary of the current $S$ ($S^t$)
[Figure: nodes $\sigma_1, \ldots, \sigma_5$ sorted by likelihood]
Really reliable?
19
Overview
• Introduction
• Higher-order MRF optimization for low level vision [CVPR 2015]
– Revising SSP
– Proposed method
– Experiments
• Continuous MRF optimization for stereo matching [CVPR 2014]
• Joint dense correspondence and cosegmentation [on-going]
20
Grouped Permutation
• Ignore unreliable permutation $\sigma$ by grouping
• Use finer bounds as iterations proceed
Comparison:
• SSP [UAI 05]: sort $\sigma$; always finest approximation
• Proposed: sort & group; adaptive “coarse-to-fine”
• AC [CVPR 13]: does not require permutation $\sigma$; always coarsest approximation
[Figure: nodes $\sigma_1, \ldots, \sigma_5$ sorted by likelihood for each method (figure labels: $\infty$, $= S^t$)]
21
Visual Comparison by 2 Variables
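$E(S) = 2|s_1 + s_2 - 1| + s_2$
[Figure: three plots over $(s_1, s_2)$ with the minimum marked]
• SSP (with true perm. $\sigma$: $s_1 \to s_2$)
• SSP (with wrong perm. $\sigma$: $s_2 \to s_1$)
• Our grouped bound
SSP bounds are tight only along $\sigma$.
Grouping yields better bounds when $\sigma$ is inaccurate.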
22
Overview
• Introduction
• Higher-order MRF optimization for low level vision [CVPR 2015]
– Revising SSP
– Proposed method
– Experiments
• Continuous MRF optimization for stereo matching [CVPR 2014]
• Joint dense correspondence and cosegmentation [on-going]
23
Image Segmentation Results
$E(S) = Q(S) + \sum_{z \in \{bins\}} (hist_z - hist_z(S))^2$
• $Q(S)$: pairwise smoothness term
• INPUT: RGB color histogram learned from ground truth.
[Plot: error rate (0% to 5%) vs. number of bins per channel (32 to 192). Legend: PROPOSED (listed twice), pPBC [Tang+ ECCV ‘14], AC [Ayed+ CVPR ‘13], FTR [Gorelick+ CVPR ‘13]]
24
Image Segmentation Results
L2-distance, RGB histograms with $64^3$ bins
[Figure panels: Ground truth, PROPOSED, pPBC [Tang+ ECCV ‘14], AC [Ayed+ CVPR ‘13], SSP [Narasimhan+ UAI ‘05], FTR [Gorelick+ CVPR ‘13]]
25
Image Segmentation Results
L2-distance, RGB histograms with $64^3$ bins
[Figure panels: Ground truth, PROPOSED, pPBC [Tang+ ECCV ‘14], AC [Ayed+ CVPR ‘13], SSP [Narasimhan+ UAI ‘05], FTR [Gorelick+ CVPR ‘13]]
26
Deconvolution
27
Overview
• Introduction
• Higher-order MRF optimization for low level vision [CVPR 2015]
• Continuous MRF optimization for stereo matching [CVPR 2014]
• Joint dense correspondence and cosegmentation
Non-convex / Pairwise / Continuous
28
Stereo Matching
Estimate depth $z$ (or disparity) by maximizing patch similarity.
[Figure: left and right views with axes $x$, $y$, $z$ and the corresponding point $p'$]
29
Over-Parameterized Stereo Matching [Bleyer+, BMVC 2011]
Estimate local tangent planes (depth $z$ + normal $n$) by maximizing patch similarity.
[Figure: left and right views with axes $x$, $y$, $z$ and the corresponding point $p'$]
30
Over-parameterized Stereo Formulation
Estimate $T_p(a, b, c)$ for each of densely overlapping patches via energy minimization on pairwise MRF models.
Over-parameterized disparity (disparity plane): $d_p = a_p u + b_p v + c_p$
$T_p = \begin{pmatrix} 1 - a_p & -b_p & -c_p \\ 0 & 1 & 0 \end{pmatrix}$
Pairwise MRF formulation:
$E(T) = \sum_p D_{photo}(T_p) + \sum_{p,q} R_{smooth}(T_p, T_q)$
• $D_{photo}$: patch-based photo-consistency term [Bleyer+ BMVC ‘11]
• $R_{smooth}$: curvature-based smoothness term [Olsson+ CVPR ‘13]; minimize abs( ) + abs( ) to enforce piecewise linear disparity
[Figure: disparity plotted against image coordinate $u$, with pixels $p$, $q$, the disparity $d_p$, and the plane $d_q = au + bv + c$ (figure labels: R, L)]
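A minimal sketch of the disparity-plane parameterization, assuming (as the matrix form suggests) that $T_p$ maps a homogeneous pixel $(u, v, 1)$ in one image to $(u - d_p, v)$ in the other. The function names and the example plane are hypothetical.

```python
def plane_disparity(plane, u, v):
    """Disparity d = a*u + b*v + c of the plane (a, b, c) at pixel (u, v)."""
    a, b, c = plane
    return a * u + b * v + c

def plane_transform(plane):
    """2x3 matrix T = [[1 - a, -b, -c], [0, 1, 0]] for the plane (a, b, c)."""
    a, b, c = plane
    return [[1.0 - a, -b, -c], [0.0, 1.0, 0.0]]

def warp(T, u, v):
    """Apply T to the homogeneous pixel (u, v, 1)."""
    x = (u, v, 1.0)
    return tuple(sum(T[r][k] * x[k] for k in range(3)) for r in range(2))

# Hypothetical plane and pixel.
plane = (0.5, 0.25, 2.0)
d = plane_disparity(plane, 4.0, 8.0)
u2, v2 = warp(plane_transform(plane), 4.0, 8.0)
```

Under this assumed sign convention, the first warped coordinate equals `u - d` and the row coordinate `v` is unchanged, so a fronto-parallel surface corresponds to `a = b = 0`.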
31
Overview
• Introduction
• Higher-order MRF optimization for low level vision [CVPR 2015]
• Continuous MRF optimization for stereo matching [CVPR 2014]
– Proposed method
– Experiments
– Fast implementation
• Joint dense correspondence and cosegmentation
32
Local Expansion Moves
• Conventional α-expansions [Boykov+ TPAMI ‘01]: fuse the current solution with a proposal α via graph cuts. Intractable due to our infinite label space.
• Our local α-expansions: spatially localized label-space searching, with many α's as proposals, fused via graph cuts.
33
Local Expansion Moves
Current solution → local α-expansion (disparity plane patch, 3x3 cells) → fusion via graph cuts → improved solution
• Choose α from the current solution, then perturb: $T_p + \Delta$
• Spatial propagation and randomized search, similarly to PatchMatch inference [Barnes+ ToG ‘09]
34
Overview
• Introduction
• Higher-order MRF optimization for low level vision [CVPR 2015]
• Continuous MRF optimization for stereo matching [CVPR 2014]
– Proposed method
– Experiments
– Fast implementation
• Joint dense correspondence and cosegmentation
35
Results for Middlebury Benchmark
1st rank even without post-processing (error rates by 0.5-pixel error threshold)
[Figure: disparity maps and error maps after 10 iterations and after post-processing]
Error map legend: white = correct, black = incorrect, gray = incorrect but occluded
36
Overview
• Introduction
• Higher-order MRF optimization for low level vision [CVPR 2015]
• Continuous MRF optimization for stereo matching [CVPR 2014]
– Proposed method
– Experiments
– Fast implementation
• Joint dense correspondence and cosegmentation
37
Parallelization of Local Expansion Moves
Scheduling (16 groups); rows: cell index j, columns: cell index i

j\i    0   1   2   3   4   5   6   7
0      0   1   2   3   0   1   2   3
1      4   5   6   7   4   5   6   7
2      8   9  10  11   8   9  10  11
3     12  13  14  15  12  13  14  15
4      0   1   2   3   0   1   2   3
5      4   5   6   7   4   5   6   7
6      8   9  10  11   8   9  10  11
7     12  13  14  15  12  13  14  15

[Figure: mutually-disjoint local expansion moves on a grid of cells (cell indices i, j = 0 to 8); the region of a local expansion move is 3x3 cells]
Divide into 16 groups of mutually-disjoint (parallelizable) local expansion moves.
Computations of patch-matching cost (data term), min-cut, etc. can be done in parallel.
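The 16-group schedule can be reproduced with a short sketch. The formula `4 * (j % 4) + (i % 4)` is read off the scheduling table above; the function names are hypothetical, and regions are not clipped at the image border.

```python
def group_of(i, j):
    """Group index of the local expansion move centered at cell (i, j)."""
    return 4 * (j % 4) + (i % 4)

def region(i, j):
    """The 3x3 cells covered by the move centered at cell (i, j)."""
    return {(i + di, j + dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)}

def schedule(width, height):
    """Map each of the 16 groups to the list of its center cells."""
    groups = {g: [] for g in range(16)}
    for j in range(height):
        for i in range(width):
            groups[group_of(i, j)].append((i, j))
    return groups

groups = schedule(8, 8)
```

Centers in the same group are at least 4 cells apart along i or j, so their 3x3 regions never overlap and the moves of one group can run in parallel.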
38
Fast Computation by Cost-Volume Filtering
Patch-based photo-consistency term: $\sum_p D_{photo}(T_p)$, with
$D_{photo}(T_p) = \sum_q W_{pq}\,|I_q - I'_{q'}|$
• $W$: adaptive window (bilateral-filter weight)
• Bottleneck: naïve computation of each matching cost $D_{photo}(T_p)$ is $O(|W|)$
[Figure: images $I$ and $I'$, pixel $p$ with its window $W$, the region of a local expansion move, and the union of matching windows (filtering region)]
Fast computation:
1. Compute raw matching costs of a filtering region
2. Apply edge-aware constant-time filtering, e.g.) guided image filtering [He+ ECCV 10, PAMI 13]
Computation of $D_{photo}(T_p) \simeq O(1)$
Related: PatchMatch filter [Lu+ CVPR 13]
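The idea of constant-time filtering can be illustrated with a box filter over a summed-area table. This is a deliberately simplified stand-in: it uses uniform window weights instead of the edge-aware (bilateral or guided) weights named on the slide, and all names are hypothetical.

```python
def integral_image(cost):
    """Summed-area table with one row and one column of zero padding."""
    h, w = len(cost), len(cost[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row = 0.0
        for x in range(w):
            row += cost[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row
    return sat

def box_filter(cost, radius):
    """Mean raw cost over a (2r+1)x(2r+1) window, clipped at the border.
    Each output pixel needs four table lookups regardless of the radius."""
    h, w = len(cost), len(cost[0])
    sat = integral_image(cost)
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        for x in range(w):
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
            out[y][x] = total / ((y1 - y0) * (x1 - x0))
    return out

# Hypothetical raw matching costs for a small filtering region.
raw = [[float((3 * x + 5 * y) % 7) for x in range(6)] for y in range(5)]
filtered = box_filter(raw, radius=1)
```

The raw costs are computed once per filtering region, after which the aggregated cost at every pixel costs a constant number of operations, independent of the window size.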
39
Running Time Comparison
[Plot: relative energy function value (0.62 to 0.68) vs. running time in seconds (0 to 600). Curves: LE-BF (CPUx1), LE-BF (CPUx4), LE-GF (CPUx1), LE-GF (CPUx4), LE-BF (GPU+CPUx4). Annotation: until 1900 s]
Speed-ups:
• Fast cost-volume filtering (5.3x)
• 1 to 4 CPU cores (3.5x)
• 4 CPU cores + GPU (19x)
40
Overview
• Introduction
• Higher-order MRF optimization for low level vision [CVPR 2015]
• Continuous MRF optimization for stereo matching [CVPR 2014]
• Joint dense correspondence and cosegmentation [on-going]
Non-convex / Pairwise / Continuous + α
41
Summary and References
• Higher-order MRF optimization for low level vision
– Piecewise-linear approximation bounds updated in a coarse-to-fine manner
  Taniai, Matsushita, Naemura: “Superdifferential Cuts for Binary Energies” [CVPR 2015]
• Continuous MRF optimization for stereo matching
– Local expansion moves for PatchMatch-like inference by GC (or originally “locally shared labels”)
  Taniai, Matsushita, Naemura: “Graph Cut based Continuous Stereo Matching using Locally Shared Labels” [CVPR 2014]
– Fast implementation by parallelization and local cost-volume filtering
  Taniai, Matsushita, Sato, Naemura: [Submitted to PAMI]
• Joint dense correspondence and cosegmentation
– Dynamic hierarchical regularization and two-pass optimization
  Taniai, Sinha, Sato [Submitted to CVPR 2016]
42
Thanks Again to My Collaborators
Prof. Yoichi Sato at Univ. of Tokyo (Ph.D. advisor)
Prof. Yasuyuki Matsushita at Osaka Univ. (mentor at MSRA)
Dr. Sudipta Sinha at MSR Redmond (mentor at MSR)
Prof. Takeshi Naemura at Univ. of Tokyo (Bachelor-Master advisor)
THANK YOU FOR LISTENING