Analysis of Local Affine Model
黃馨平, Electrical Engineering, third year
Color Transfer
• Times of day hallucination
• Photoshop
Video Color Transfer
• Per-frame color transfer
• Computationally intensive
• Times of day hallucination for a 3-min video
• 180 sec x 25 frames/sec x 50 sec/frame = 225,000 sec ≈ 62.5 hours
• Lack of temporal consistency
• Use local affine models
Data-driven Hallucination of Different Times of Day from a Single Outdoor Photo
• Synthesize an image at a different time of day from an input image
• Exploit a database of time-lapse videos of scenes seen as time passes
(1) Search the database for a video that looks like the input image
(2) Find a frame at the time of the input (the match frame) and another frame at the target time (the target frame)
(3) Warp these frames to get the warped match frame and the warped target frame
(4) Model the color transforms using local affine models learned from the warped match frame and the warped target frame
(5) Apply the transforms to the input to get the hallucinated image
Local Affine Model
• Models 𝐴𝑘 describe the color transforms between the warped match frame and the warped target frame
• The input should be transformed to the output using the same 𝐴𝑘
• Add a regularization term using a global affine model G
• A least-squares optimization
• For each local image block, compute an affine model
• Learn the color transformation between input and output
• The output should have the same structure as the input
• Simpler at a local scale
• Preserve the details
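The per-block fit described above can be sketched as a small regularized least-squares problem. This is only an illustration under assumptions: pixels are RGB rows, each block gets one 3x4 matrix, and the names fit_affine, apply_affine and gamma are hypothetical rather than taken from the slides.

```python
import numpy as np

def fit_affine(src, dst, G=None, gamma=0.0):
    """Fit a 3x4 model A minimizing ||A [src; 1] - dst||^2 + gamma * ||A - G||_F^2.

    src, dst: (N, 3) arrays holding the RGB pixels of one block.
    G: optional 3x4 global affine model; gamma: regularization weight (assumed name).
    If gamma > 0 and G is None, A is pulled toward zero instead.
    """
    X = np.hstack([src, np.ones((len(src), 1))])  # homogeneous colors, (N, 4)
    Y = np.asarray(dst, dtype=float)
    if gamma > 0:
        if G is None:
            G = np.zeros((3, 4))
        # The penalty is equivalent to four extra "observations" pulling A toward G.
        X = np.vstack([X, np.sqrt(gamma) * np.eye(4)])
        Y = np.vstack([Y, np.sqrt(gamma) * G.T])
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)     # B is A transposed, (4, 3)
    return B.T

def apply_affine(A, src):
    """Apply a 3x4 affine model to (N, 3) pixels."""
    return src @ A[:, :3].T + A[:, 3]
```

With gamma = 0 the fit is an ordinary least-squares solve; a large gamma makes every block model approach G, which is one way to keep nearly uniform blocks from producing unstable coefficients.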
Local Affine Model
[Figure: input and output blocks related by per-block affine models 𝐴1 ... 𝐴𝑘]
Local Affine Model
• Affine model vs. linear model
• Overlap W-1
• Overlap W/2
• Linearly interpolate pixel values weighted by the distance to the center of the block
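The W/2 overlap with distance-weighted interpolation can be illustrated in one dimension. The slides do not give the exact weights, so the tent (triangular) weights below are an assumption, and tent_weights and blend_blocks_1d are hypothetical names. With a step of W/2, the weights of two overlapping blocks sum to 1 at every interior pixel.

```python
import numpy as np

def tent_weights(W):
    """Weights that fall off linearly with the distance to the block center."""
    i = np.arange(W)
    return 1.0 - np.abs(i - (W - 1) / 2.0) / (W / 2.0)

def blend_blocks_1d(block_values, starts, W, length):
    """Blend per-block predictions; block b covers pixels starts[b] .. starts[b] + W - 1."""
    acc = np.zeros(length)
    wsum = np.zeros(length)
    w = tent_weights(W)
    for values, s in zip(block_values, starts):
        acc[s:s + W] += w * values
        wsum[s:s + W] += w
    # Normalizing also handles border pixels that are covered by only one block.
    return acc / wsum
```

For W = 8 the weights are 0.125, 0.375, 0.625, 0.875 and then the mirror image, so a pixel near a block border is dominated by the neighboring block whose center is closer.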
SLIC Superpixels
• Partition an image into multiple segments
• Pixels with the same label share certain characteristics
• A spatially localized version of k-means clustering
Simple Linear Iterative Clustering (SLIC)
• Each pixel is associated with a feature vector
• Initialize k-means with the center of each grid tile
• Use the Lloyd algorithm to refine the k-means centers and clusters iteratively
• Each pixel can be assigned only to the 2x2 centers of the grid tiles adjacent to the pixel
[Figure: (a) standard k-means, (b) SLIC]
• regionSize: nominal size of the regions (superpixels)
• regularizer: trade-off between clustering appearance and spatial regularization
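A minimal sketch of the idea, assuming a float image of shape (H, W, C): k-means on color plus scaled position, where each center only competes for pixels near it. Two simplifications are assumptions of this sketch, not statements about any library: the search window is a square of half-width region_size around each center (instead of the 2x2 adjacent tiles), and position is weighted by regularizer / region_size. The name slic_sketch is hypothetical.

```python
import numpy as np

def slic_sketch(image, region_size=8, regularizer=0.1, n_iter=5):
    """Label an (H, W, C) float image with superpixels via localized k-means."""
    H, W, _ = image.shape
    S = region_size
    lam = regularizer / S  # assumed weight of position relative to color
    ys, xs = np.mgrid[0:H, 0:W].astype(float)
    # Initialize one center at the middle of each grid tile.
    cy, cx = np.meshgrid(np.arange(S // 2, H, S), np.arange(S // 2, W, S), indexing="ij")
    pos = np.stack([cy.ravel(), cx.ravel()], axis=1).astype(float)  # (K, 2)
    col = image[cy.ravel(), cx.ravel()].astype(float)               # (K, C)
    labels = np.zeros((H, W), dtype=int)
    for _ in range(n_iter):
        best = np.full((H, W), np.inf)
        # Assignment step: each center only competes for pixels within S of it.
        for k in range(len(pos)):
            y, x = pos[k]
            y0, y1 = max(int(y) - S, 0), min(int(y) + S + 1, H)
            x0, x1 = max(int(x) - S, 0), min(int(x) + S + 1, W)
            d = ((image[y0:y1, x0:x1] - col[k]) ** 2).sum(axis=-1)
            d = d + lam ** 2 * ((ys[y0:y1, x0:x1] - y) ** 2 + (xs[y0:y1, x0:x1] - x) ** 2)
            closer = d < best[y0:y1, x0:x1]
            best[y0:y1, x0:x1][closer] = d[closer]
            labels[y0:y1, x0:x1][closer] = k
        # Update step (Lloyd): move each center to the mean of its pixels.
        for k in range(len(pos)):
            member = labels == k
            if member.any():
                pos[k] = ys[member].mean(), xs[member].mean()
                col[k] = image[member].mean(axis=0)
    return labels
```

A larger regularizer makes the position term dominate, so regions stay close to the initial square tiles; a smaller one lets region borders follow color edges.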
Comparison: regularizer (log0.1)
PSNR (dB)
img #  model   0     0.3   0.7   1     1.3   1.7   2     2.3   2.7   3
1      affine  46.9  46.9  46.9  46.9  46.9  46.9  46.8  46.7  46.7  46.6
       linear  45.3  45.3  45.3  45.3  45.3  45.3  45.2  45.2  45.2  45.1
2      affine  37.3  37.3  37.2  37.2  37.1  37.1  37.0  37.0  37.0  36.9
       linear  36.1  36.1  36.1  36.0  36.0  35.9  35.9  35.9  35.9  35.8
3      affine  37.0  37.1  37.1  37.1  37.2  37.1  37.1  37.1  37.0  37.0
       linear  36.1  36.1  36.2  36.2  36.2  36.2  36.2  36.1  36.1  36.1
4      affine  42.5  42.5  42.5  42.5  42.5  42.4  42.4  42.4  42.3  42.3
       linear  41.5  41.5  41.5  41.4  41.4  41.4  41.4  41.3  41.3  41.3
5      affine  35.4  35.4  35.3  35.2  35.1  34.9  34.7  34.5  34.3  34.2
       linear  34.4  34.4  34.3  34.1  34.0  33.8  33.6  33.4  33.1  33.0
6      affine  34.3  34.3  34.2  34.2  34.1  34.1  34.1  34.0  33.9  33.9
       linear  33.4  33.4  33.4  33.4  33.3  33.3  33.3  33.3  33.2  33.2
7      affine  36.9  36.8  36.7  36.7  36.6  36.5  36.4  36.3  36.3  36.2
       linear  35.4  35.3  35.2  35.2  35.2  35.1  35.1  35.0  35.0  34.9
Avg.           38.0  38.0  38.0  38.0  37.9  37.9  37.8  37.7  37.7  37.6
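For reference, the PSNR figures in these tables can be computed as below. This is the standard definition; the peak value of 255 (8-bit images) is an assumption, since the slides do not state the pixel range.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak is the maximum possible pixel value."""
    diff = np.asarray(reference, dtype=float) - np.asarray(estimate, dtype=float)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

Under this definition a gain of 1 dB corresponds to roughly a 21% reduction in mean squared error, which helps when reading the small differences between columns.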
Comparison of methods
PSNR (dB), block size = 8x8
method      overlapping W-1   overlapping W/2   superpixel
img #       linear  affine    linear  affine    linear  affine
1           46.1    47.8      46.2    47.7      45.2    46.7
2           36.6    37.9      36.7    37.7      35.9    37.0
3           36.7    37.6      36.6    37.5      36.1    37.1
4           41.9    43.0      41.8    42.9      41.3    42.3
5           35.5    36.5      35.6    36.4      33.3    34.4
6           34.3    35.3      34.4    35.4      33.3    34.0
7           36.6    38.0      36.4    37.6      35.0    36.3
8           43.0    44.8      43.0    44.6      41.9    43.3
9           39.2    40.4      39.2    40.2      38.5    39.6
10          36.9    37.9      36.9    37.7      36.1    36.9
11          40.9    42.7      40.8    42.4      39.0    40.3
12          34.6    40.9      34.5    40.8      36.0    41.1
13          39.5    43.2      38.9    42.5      44.0    49.8
14          43.7    49.6      43.4    49.0      45.6    52.7
15          39.3    44.6      39.1    44.1      42.1    47.8
16          44.4    54.9      43.5    54.4      49.0    55.7
Avg.        39.3    42.2      39.2    41.9      39.5    42.2
Rank        1                 3                 2
Complexity  wh                wh/16             wh/64
Transform Recipes for Efficient Cloud Photo Enhancement
• Limited computing power and battery life of mobile devices
• Cloud image processing applications which preserve the overall content of an image
• Minimize the time and energy cost of uploading and downloading
(1) Generate a compressed version of the input image
(2) Upload this image along with the histogram of the input
(3) Upsample it and apply histogram transfer to compute a proxy input
(4) Generate a proxy output
(5) Compute a compact recipe using the proxy input and the proxy output
(6) Download the recipe
(7) Apply it to the original input
• Process the input with a filter to produce the output
• Each recipe is specific to a given input-filter pair
Image Decomposition
• Multi-scale decomposition
• Work in a luminance-chrominance color space
• Model the chrominance transformation coarsely and the luminance transformation in more detail
• Split the input and the output into n+1 levels
• The first n levels represent the details at increasingly coarser scales
• The last level is the low-frequency residual, which affects a large area and contributes significantly to the final reconstruction
• Combined high-frequency data
[Figure: Laplacian stack. Layers 0~n-1: the details at increasingly coarser scales; layer n: the low-frequency residual; combined high-frequency data]
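A Laplacian stack of this kind can be sketched with NumPy alone. Every level keeps the full image resolution, each detail layer is the difference between two successively blurred images, and the last level is the low-frequency residual. The blur widths (sigma doubling per level), the edge padding, and the names gaussian_blur and laplacian_stack are assumptions of this sketch.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur of a 2-D array with edge padding."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(img, radius, mode="edge")
    # Filter along rows, then along columns; "valid" removes the padding again.
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def laplacian_stack(img, n):
    """Return n detail layers (fine to coarse) followed by the low-frequency residual."""
    levels = []
    current = np.asarray(img, dtype=float)
    for i in range(n):
        blurred = gaussian_blur(current, sigma=2.0 ** i)  # assumed blur schedule
        levels.append(current - blurred)                  # detail at scale i
        current = blurred
    levels.append(current)                                # layer n: residual
    return levels
```

Because the differences telescope, summing all n+1 levels gives back the original image exactly, so a transformation can be modeled per level without losing information.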
Compute Recipes (1)
• The low-frequency residual part of the transformation
• Chrominance transformations: affine function
Compute Recipes (2)
• Luminance transformations
• Affine function: brightness and contrast
• Multiplicative factor for each stack level: multiscale effects
• Multiplicative factor for the non-linearity terms
• Segment function
Compute Recipes (3)
[Equation: luminance model with three labeled groups of terms: affine function, multiplicative factor for each stack level, multiplicative factor for the non-linearity terms]
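One common way to realize a segment (piecewise-linear) function as regression terms is a hinge basis. The slides do not show the exact form, so this is only a sketch under assumptions: knots are evenly spaced over the luminance range, and hinge_features is a hypothetical name. Together with an affine term, p segments need p-1 hinge terms, each with its own multiplicative factor.

```python
import numpy as np

def hinge_features(lum, n_segments):
    """Columns max(lum - t_i, 0) for n_segments - 1 evenly spaced interior knots t_i."""
    lum = np.asarray(lum, dtype=float)
    knots = np.linspace(lum.min(), lum.max(), n_segments + 1)[1:-1]
    return np.maximum(lum[:, None] - knots[None, :], 0.0)

# Usage: fit a two-segment curve with an affine term plus one hinge term.
x = np.linspace(0.0, 1.0, 101)
target = np.abs(x - 0.5)                                   # kink at 0.5
F = np.column_stack([x, np.ones_like(x), hinge_features(x, 2)])
coef, *_ = np.linalg.lstsq(F, target, rcond=None)          # slope, offset, hinge factor
```

Adding segments adds columns to the regression, which is consistent with a model that fits better as the number of segments grows, at the price of a larger recipe.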
Lasso Regression
• Include a penalty term to constrain the size of the coefficients
• The penalty term Pα(β) interpolates between the L1 norm of β and the squared L2 norm of β
• As λ increases, the number of nonzero components of β decreases
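The effect of λ on sparsity can be checked with a small coordinate-descent solver. This sketch covers only the pure L1 end of the penalty (no squared L2 part), assumes centered data with no intercept, and minimizes (1/(2N))·||y − Xβ||² + λ·||β||₁; lasso_cd and soft_threshold are hypothetical names.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink v toward zero by t; values with |v| <= t become exactly zero."""
    return np.sign(v) * max(abs(v) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Lasso by cyclic coordinate descent (no intercept)."""
    N, p = X.shape
    beta = np.zeros(p)
    z = (X ** 2).sum(axis=0) / N
    for _ in range(n_iter):
        for j in range(p):
            # Residual with coordinate j left out of the prediction.
            r = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r / N
            beta[j] = soft_threshold(rho, lam) / z[j]
    return beta

# Usage: count nonzero coefficients for increasing lambda on synthetic data.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = X @ np.array([3.0, 0.0, 0.0, -2.0, 0.0]) + 0.1 * rng.standard_normal(200)
counts = [np.count_nonzero(lasso_cd(X, y, lam)) for lam in (0.001, 0.5, 5.0)]
```

The soft-threshold step is what sets coefficients exactly to zero, so the counts shrink as λ grows until every coefficient vanishes.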
Reconstructing
• Perform the same decomposition
• Apply the corresponding recipe coefficients to each term
• Up-sample the low-frequency residual term
• Linearly interpolate the other terms
Comparison: # segments
PSNR (dB)
img #  2     4     6     8     10    12    14    16    18    20
1      49.5  49.7  49.9  50.0  50.1  50.2  50.3  50.4  50.5  50.6
2      38.6  39.0  39.2  39.4  39.6  39.8  40.0  40.1  40.3  40.4
3      38.7  39.1  39.4  39.6  39.8  40.1  40.3  40.5  40.7  40.8
4      44.4  44.7  44.9  45.1  45.3  45.4  45.6  45.8  46.0  46.1
5      38.2  38.6  38.9  39.1  39.4  39.6  39.8  40.0  40.2  40.4
6      37.3  37.6  37.8  37.9  38.1  38.2  38.4  38.5  38.6  38.7
7      39.4  39.9  40.2  40.5  40.8  41.0  41.3  41.5  41.7  42.0
8      45.8  46.0  46.2  46.3  46.4  46.5  46.7  46.7  46.8  46.9
9      41.0  41.2  41.4  41.5  41.6  41.7  41.8  42.0  42.0  42.1
10     38.5  38.8  39.1  39.3  39.5  39.7  39.9  40.1  40.3  40.4
11     43.7  44.0  44.3  44.4  44.6  44.7  44.9  45.0  45.1  45.2
12     42.4  42.7  42.7  42.8  42.9  43.0  43.0  43.1  43.2  43.2
13     47.5  49.0  49.3  49.4  49.5  49.7  49.7  49.8  49.9  50.0
14     51.7  52.2  52.3  52.4  52.4  52.5  52.5  52.6  52.6  52.7
15     47.0  48.0  48.2  48.3  48.4  48.4  48.5  48.6  48.6  48.7
16     55.4  55.6  55.7  55.8  55.9  55.9  56.0  56.0  56.0  56.1
Avg.   43.7  44.1  44.3  44.5  44.6  44.8  44.9  45.0  45.2  45.3
Comparison
[Plot: PSNR (dB) versus # segments]
Comparison: # layers
PSNR (dB)
img #  2     3     4     5     6     7     8     9     10
1      51.1  50.0  50.1  50.3  50.5  50.7  50.8  50.9  51.0
2      40.1  39.4  39.6  39.8  40.0  40.2  40.3  40.5  40.6
3      40.1  39.6  39.8  40.0  40.2  40.4  40.5  40.7  40.9
4      45.7  45.0  45.3  45.5  45.7  45.9  46.2  46.4  46.5
5      39.9  39.1  39.4  39.8  40.1  40.5  40.7  41.0  41.3
6      39.2  37.9  38.1  38.5  38.7  38.7  38.9  39.1  39.3
7      41.5  40.4  40.8  41.2  41.6  41.9  42.1  42.4  42.7
8      48.1  46.4  46.4  46.6  46.7  46.9  47.0  47.1  47.1
9      42.5  41.5  41.6  41.8  41.9  42.0  42.1  42.2  42.3
10     40.3  39.4  39.5  39.8  40.0  40.3  40.5  40.7  40.9
11     46.0  44.6  44.6  44.8  45.0  45.2  45.4  45.5  45.6
12     42.9  42.8  42.9  43.0  43.0  43.1  43.1  43.2  43.2
13     47.0  47.5  49.5  50.9  51.5  51.7  51.8  51.8  51.9
14     51.6  51.4  52.4  52.8  53.0  53.1  53.1  53.1  53.2
15     47.7  47.6  48.4  48.7  48.8  48.9  48.9  49.0  49.0
16     56.4  55.8  55.9  56.0  56.1  56.2  56.2  56.3  56.3
Avg.   45.0  44.3  44.6  45.0  45.2  45.3  45.5  45.6  45.7
Comparison
[Plot: PSNR (dB) versus # layers]
Comparison of methods
PSNR (dB)
method  overlapping W-1   overlapping W/2   superpixel        Laplacian stack
img #   linear  affine    linear  affine    linear  affine
1       46.1    47.8      46.2    47.7      45.2    46.7      50.1
2       36.6    37.9      36.7    37.7      35.9    37.0      39.6
3       36.7    37.6      36.6    37.5      36.1    37.1      39.8
4       41.9    43.0      41.8    42.9      41.3    42.3      45.3
5       35.5    36.5      35.6    36.4      33.3    34.4      39.4
6       34.3    35.3      34.4    35.4      33.3    34.0      38.1
7       36.6    38.0      36.4    37.6      35.0    36.3      40.8
8       43.0    44.8      43.0    44.6      41.9    43.3      46.4
9       39.2    40.4      39.2    40.2      38.5    39.6      41.6
10      36.9    37.9      36.9    37.7      36.1    36.9      39.5
11      40.9    42.7      40.8    42.4      39.0    40.3      44.6
12      34.6    40.9      34.5    40.8      36.0    41.1      42.9
13      39.5    43.2      38.9    42.5      44.0    49.8      49.5
14      43.7    49.6      43.4    49.0      45.6    52.7      52.4
15      39.3    44.6      39.1    44.1      42.1    47.8      48.4
16      44.4    54.9      43.5    54.4      49.0    55.7      55.9
Avg.    39.3    42.2      39.2    41.9      39.5    42.2      44.6
Rank    2                 4                 3                 1
Modified Laplacian Stack Method (1)
• Remove the low-frequency residual
• Add a layer in the Laplacian stack and the high-frequency term
[Figure: layers 0~n-1 and layer n, with the combined high-frequency data]
Modified Laplacian Stack Method (2)
• Remove the non-linear terms
• Remove the Laplacian stack terms
• Remove the non-linear terms and the Laplacian stack terms
PSNR (dB) and relative PSNR (dB)
residual ratio: v
multiscale: v v v
non-linear: v v v
img #  PSNR (dB)  relative PSNR (dB)
1      50.1       +0.1   -0.7   -1.4   -2.4
2      39.6       +0.2   -1.1   -0.6   -1.9
3      39.8       +0.1   -1.4   -0.8   -2.4
4      45.3       +0.2   -1.0   -1.3   -2.4
5      39.4       +0.3   -1.1   -1.7   -3.0
6      38.1       -0.5   -1.1   -2.1   -2.7
7      40.8       +0.4   -1.6   -1.0   -3.1
8      46.4       +0.0   -0.8   -1.0   -1.9
9      41.6       +0.2   -0.7   -0.5   -1.4
10     39.5       +0.3   -1.0   -0.6   -1.8
11     44.6       +0.1   -1.2   -0.8   -2.1
12     42.9       +0.1   -1.0   -0.9   -2.1
13     49.5       +2.1   -5.3   +1.8   -7.0
14     52.4       +0.6   -2.6   +0.4   -3.4
15     48.4       +0.4   -3.2   +0.2   -4.2
16     55.9       +0.2   -0.8   -0.1   -1.5
Avg.   44.6       +0.3   -1.5   -0.6   -2.7
Rank   2          1      4      3      5
Comparison of methods
PSNR (dB)
method  overlapping W-1   overlapping W/2   superpixel        Laplacian stack   result
img #   linear  affine    linear  affine    linear  affine
1       46.1    47.8      46.2    47.7      45.2    46.7      50.1              50.2
2       36.6    37.9      36.7    37.7      35.9    37.0      39.6              39.8
3       36.7    37.6      36.6    37.5      36.1    37.1      39.8              39.9
4       41.9    43.0      41.8    42.9      41.3    42.3      45.3              45.5
5       35.5    36.5      35.6    36.4      33.3    34.4      39.4              39.7
6       34.3    35.3      34.4    35.4      33.3    34.0      38.1              37.6
7       36.6    38.0      36.4    37.6      35.0    36.3      40.8              41.2
8       43.0    44.8      43.0    44.6      41.9    43.3      46.4              46.4
9       39.2    40.4      39.2    40.2      38.5    39.6      41.6              41.8
10      36.9    37.9      36.9    37.7      36.1    36.9      39.5              39.8
11      40.9    42.7      40.8    42.4      39.0    40.3      44.6              44.7
12      34.6    40.9      34.5    40.8      36.0    41.1      42.9              43.0
13      39.5    43.2      38.9    42.5      44.0    49.8      49.5              51.6
14      43.7    49.6      43.4    49.0      45.6    52.7      52.4              53.0
15      39.3    44.6      39.1    44.1      42.1    47.8      48.4              48.8
16      44.4    54.9      43.5    54.4      49.0    55.7      55.9              56.1
Avg.    39.3    42.2      39.2    41.9      39.5    42.2      44.6              44.9
Rank    3                 5                 4                 2                 1
Future Work
Video color transfer
• Video color transfer using local affine models
• Find approximate nearest-neighbor matches of patches in the video to a set of reference patches in the first frame
• PatchMatch
• Ring intersection approximate nearest neighbor search
• Compute local affine models between the original first frame and the enhanced first frame of the video
• Apply the transforms of the approximate nearest-neighbor matches to the patches in the video
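The plan above can be sketched as follows. To keep the example short, an exhaustive nearest-neighbor search stands in for the approximate methods named in the list, and each reference patch is assumed to already have a 3x4 affine model fitted between the original and the enhanced first frame. The names nearest_reference and transfer_patches are hypothetical.

```python
import numpy as np

def nearest_reference(patches, ref_patches):
    """Index of the closest reference patch (squared L2 distance) for each patch."""
    q = patches.reshape(len(patches), -1).astype(float)
    r = ref_patches.reshape(len(ref_patches), -1).astype(float)
    d2 = ((q[:, None, :] - r[None, :, :]) ** 2).sum(axis=2)  # (num patches, num refs)
    return d2.argmin(axis=1)

def transfer_patches(patches, ref_patches, models):
    """Apply to each (h, w, 3) patch the 3x4 affine model of its nearest reference patch."""
    idx = nearest_reference(patches, ref_patches)
    out = np.empty_like(patches, dtype=float)
    for i, k in enumerate(idx):
        px = patches[i].reshape(-1, 3)
        hom = np.hstack([px, np.ones((len(px), 1))])          # homogeneous colors
        out[i] = (hom @ models[k].T).reshape(patches[i].shape)
    return out
```

Because every frame reuses models learned once on the first frame, the cost per frame is dominated by the matching step, which is why an approximate search matters in practice.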
Recipe Coefficients
• Use other regression methods to stabilize the local affine model coefficients
[Figure: lasso regression vs. pseudo inverse, shown for each entry of the affine model]
RR GR BR 1R
RG GG BG 1G
RB GB BB 1B
Reference
[1] Michaël Gharbi, YiChang Shih, Gaurav Chaurasia, Jonathan Ragan-Kelley, Sylvain Paris, Frédo Durand. Transform Recipes for Efficient Cloud Photo Enhancement. SIGGRAPH Asia 2015.
[2] YiChang Shih, Sylvain Paris, Frédo Durand, William T. Freeman. Data-driven Hallucination of Different Times of Day from a Single Outdoor Photo. SIGGRAPH Asia 2013.
[3] Radhakrishna Achanta, Appu Shaji, Kevin Smith, Aurelien Lucchi, Pascal Fua, Sabine Susstrunk. SLIC Superpixels Compared to State-of-the-art Superpixel Methods.
Appendix
Application
• Dehazing
• HDR Toning
• Color Harmonization
• Color Grading
• Color Constancy
• Auto Colors
Application - Times of Day Hallucination
Application – Photoshop
Closed-form Solution
• Solution by iterative method
• Define