Analysis of Local Affine Model v2

黃馨平 (Electrical Engineering, third year)

Color Transfer

Times of day hallucination

Photoshop

Video Color Transfer
• Per-frame color transfer
• Computationally intensive
• Times of day hallucination for a 3-min video: 180 sec x 25 frame/sec x 50 sec/frame = 225,000 sec, about 62.5 hours
• Lack of temporal consistency
• Use local affine models

Data-driven Hallucination of Different Times of Day from a Single Outdoor Photo
• Synthesize an image at a different time of day from an input image
• Exploit a database of time-lapse videos of scenes seen as time passes


(1) Search the database for a video that looks like the input image
(2) Find a frame at the time of the input and another frame at the target time
(3) Warp the frames to get the warped match frame and the warped target frame
(4) Model the color transforms using a local affine model learned from the warped match frame and the warped target frame
(5) Apply the transform to the input to get the hallucinated image

Local Affine Model
• Models 𝐴𝑘 describe the transforms between the warped match frame and the warped target frame
• The input should be transformed to the output using the same 𝐴𝑘
• Add a regularization term using a global affine model G
• A least-squares optimization
• For each local image block, compute an affine model
• Learn the color transformation between input and output
• The output should have the same structure as the input
• Simpler at a local scale
• Preserve the details
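The per-block least-squares fit can be sketched as follows. This is a minimal sketch under assumptions, not the method from the slides: it fits one 3x4 affine color matrix for a single block, and it uses a simple Frobenius-norm pull toward the global model G. The names `fit_affine`, `apply_affine`, and `gamma` are hypothetical.

```python
import numpy as np

def fit_affine(src, dst, G=None, gamma=0.0):
    """Fit a 3x4 affine color model A so that A @ [src; 1] ~= dst.

    src, dst: (N, 3) arrays of RGB pixels taken from one image block.
    G: optional 3x4 global affine model used as the regularization target.
    gamma: assumed weight of the ||A - G||^2 term (hypothetical name).
    """
    X = np.hstack([src, np.ones((src.shape[0], 1))]).T   # 4 x N, homogeneous
    Y = dst.T                                            # 3 x N
    if G is None:
        G = np.zeros((3, 4))
    lhs = X @ X.T + gamma * np.eye(4)                    # 4 x 4, symmetric
    rhs = Y @ X.T + gamma * G                            # 3 x 4
    # Normal equations: A @ lhs = rhs, solved without forming an inverse.
    return np.linalg.solve(lhs, rhs.T).T

def apply_affine(A, pixels):
    """Apply a 3x4 affine model to an (..., 3) array of pixels."""
    return pixels @ A[:, :3].T + A[:, 3]
```

With `gamma = 0` this reduces to an ordinary least-squares fit; as `gamma` grows, the block model is pulled toward G, which helps when a block contains too little color variation to determine all twelve coefficients.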

Local Affine Model

[Figure: input and output images, with per-block affine models 𝐴1 ... 𝐴𝑘]

Local Affine Model

[Equations: affine model and linear model]

• Overlap W-1
• Overlap W/2
• Linearly interpolate pixel values weighted by the distance to the center of the block
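The blending of overlapping blocks can be sketched as below. This is one possible reading of the bullet, not the implementation behind the slides: each block's prediction is weighted by a separable weight that falls off linearly with distance from the block center, and the weighted predictions are normalized by the accumulated weights. The `models` dictionary layout and the function names are assumptions.

```python
import numpy as np

def tent_weights(W):
    """2-D weights that fall off linearly with distance from the block center."""
    c = (W - 1) / 2.0
    w1 = 1.0 - np.abs(np.arange(W) - c) / (W / 2.0)   # strictly positive
    return np.outer(w1, w1)

def blend_blocks(image, models, W=8):
    """Apply one 3x4 affine model per W x W block and blend the overlaps.

    models: hypothetical dict mapping a block's top-left corner (by, bx)
    to its 3x4 affine matrix; a stride of W/2 gives the "Overlap W/2" case.
    """
    h, w, _ = image.shape
    out = np.zeros_like(image, dtype=float)
    wsum = np.zeros((h, w, 1))
    wt = tent_weights(W)[:, :, None]
    for (by, bx), A in models.items():
        block = image[by:by + W, bx:bx + W].astype(float)
        bh, bw = block.shape[:2]
        pred = block @ A[:, :3].T + A[:, 3]            # per-pixel affine map
        out[by:by + bh, bx:bx + bw] += wt[:bh, :bw] * pred
        wsum[by:by + bh, bx:bx + bw] += wt[:bh, :bw]
    return out / np.maximum(wsum, 1e-12)               # normalize the blend
```

Normalizing by the summed weights means that pixels covered by a single block, such as those near the image border, keep that block's prediction unchanged.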

SLIC Super-pixels
• Partition an image into multiple segments
• Pixels with the same label share certain characteristics
• A spatially localized version of k-means clustering

Simple Linear Iterative Clustering (SLIC)
• Each pixel is associated with a feature vector
• Initialize k-means with the center of each grid tile
• Use the Lloyd algorithm to refine the k-means centers and clusters iteratively
• Each pixel can be assigned only to the 2x2 centers of the grid tiles adjacent to the pixel

[Figure: (a) Standard k-means (b) SLIC]

• regionSize: nominal size of the regions (superpixels)
• regularizer: trade-off between clustering appearance and spatial regularization
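The role of the two parameters can be illustrated with a simplified clustering sketch. Several things here are assumptions rather than content from the slides: the feature vector is taken to be [λx, λy, I(x, y)] with λ = regularizer / regionSize, and every pixel is compared against every center instead of only the 2x2 adjacent grid-tile centers, so this is plain Lloyd k-means with SLIC-style features and initialization, not SLIC itself.

```python
import numpy as np

def slic_sketch(img, region_size=8, regularizer=0.1, n_iter=10):
    """Simplified SLIC-style clustering on a float image of shape (h, w, c).

    Simplification: every pixel is compared against every center, whereas
    SLIC only searches the 2x2 grid-tile centers adjacent to the pixel.
    """
    h, w, c = img.shape
    lam = regularizer / region_size            # assumed spatial weight
    ys, xs = np.mgrid[0:h, 0:w]
    feats = np.concatenate(
        [lam * xs[..., None], lam * ys[..., None], img], axis=2
    ).reshape(-1, c + 2)
    # Initialize one center at the middle of each grid tile.
    cy = np.arange(region_size // 2, h, region_size)
    cx = np.arange(region_size // 2, w, region_size)
    centers = np.array([feats[y * w + x] for y in cy for x in cx])
    for _ in range(n_iter):                    # Lloyd iterations
        d = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for k in range(len(centers)):
            members = feats[labels == k]
            if len(members):                   # leave empty clusters unchanged
                centers[k] = members.mean(axis=0)
    return labels.reshape(h, w)
```

A larger `regularizer` makes the spatial terms dominate, giving more compact, grid-like regions; a smaller value lets regions follow appearance boundaries more closely.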

Comparison

PSNR (dB) versus regularizer (log0.1)

img #  model    0     0.3   0.7   1     1.3   1.7   2     2.3   2.7   3
1      affine   46.9  46.9  46.9  46.9  46.9  46.9  46.8  46.7  46.7  46.6
1      linear   45.3  45.3  45.3  45.3  45.3  45.3  45.2  45.2  45.2  45.1
2      affine   37.3  37.3  37.2  37.2  37.1  37.1  37.0  37.0  37.0  36.9
2      linear   36.1  36.1  36.1  36.0  36.0  35.9  35.9  35.9  35.9  35.8
3      affine   37.0  37.1  37.1  37.1  37.2  37.1  37.1  37.1  37.0  37.0
3      linear   36.1  36.1  36.2  36.2  36.2  36.2  36.2  36.1  36.1  36.1
4      affine   42.5  42.5  42.5  42.5  42.5  42.4  42.4  42.4  42.3  42.3
4      linear   41.5  41.5  41.5  41.4  41.4  41.4  41.4  41.3  41.3  41.3
5      affine   35.4  35.4  35.3  35.2  35.1  34.9  34.7  34.5  34.3  34.2
5      linear   34.4  34.4  34.3  34.1  34.0  33.8  33.6  33.4  33.1  33.0
6      affine   34.3  34.3  34.2  34.2  34.1  34.1  34.1  34.0  33.9  33.9
6      linear   33.4  33.4  33.4  33.4  33.3  33.3  33.3  33.3  33.2  33.2
7      affine   36.9  36.8  36.7  36.7  36.6  36.5  36.4  36.3  36.3  36.2
7      linear   35.4  35.3  35.2  35.2  35.2  35.1  35.1  35.0  35.0  34.9
Avg.            38.0  38.0  38.0  38.0  37.9  37.9  37.8  37.7  37.7  37.6
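For reference, the PSNR values in these tables can be computed along the following lines. The slides do not state the peak value, so the 8-bit default of 255 is an assumption, as is the function name.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape.

    peak: assumed maximum pixel value (255 for 8-bit images).
    """
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)
    mse = np.mean((ref - est) ** 2)            # mean squared error
    if mse == 0:
        return float("inf")                    # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

Because the scale is logarithmic, a gain of 1 dB corresponds to a reduction of the mean squared error by roughly 21%.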

PSNR (dB), block size = 8x8

method      overlapping W-1   overlapping W/2   superpixel
img #       linear  affine    linear  affine    linear  affine
1           46.1    47.8      46.2    47.7      45.2    46.7
2           36.6    37.9      36.7    37.7      35.9    37.0
3           36.7    37.6      36.6    37.5      36.1    37.1
4           41.9    43.0      41.8    42.9      41.3    42.3
5           35.5    36.5      35.6    36.4      33.3    34.4
6           34.3    35.3      34.4    35.4      33.3    34.0
7           36.6    38.0      36.4    37.6      35.0    36.3
8           43.0    44.8      43.0    44.6      41.9    43.3
9           39.2    40.4      39.2    40.2      38.5    39.6
10          36.9    37.9      36.9    37.7      36.1    36.9
11          40.9    42.7      40.8    42.4      39.0    40.3
12          34.6    40.9      34.5    40.8      36.0    41.1
13          39.5    43.2      38.9    42.5      44.0    49.8
14          43.7    49.6      43.4    49.0      45.6    52.7
15          39.3    44.6      39.1    44.1      42.1    47.8
16          44.4    54.9      43.5    54.4      49.0    55.7
Avg.        39.3    42.2      39.2    41.9      39.5    42.2
Rank        1                 3                 2
Complexity  wh                wh/16             wh/64

Transform Recipes for Efficient Cloud Photo Enhancement
• Limited computing power and battery life of mobile devices
• Cloud image processing applications which preserve the overall content of an image
• Minimize the time and energy cost of uploading and downloading

(1) Generate a compressed version of the input image
(2) Upload this image along with the histogram of the original input
(3) Upsample and apply histogram transfer to compute a proxy input
(4) Generate a proxy output
(5) Compute a compact recipe using the proxy input and the proxy output
(6) Download the recipe
(7) Apply it to the original input

• Process the input with a filter to produce the output
• Each recipe is specific to a given input-filter pair

Image Decomposition
• Multi-scale decomposition
• Work in a luminance-chrominance color space
• Coarsely model the chrominance transformation and model the luminance transformation in more detail
• Split the input and the output into n+1 levels each (layers 0 to n)
• The first n levels represent the details at increasingly coarser scales
• The last level is the low-frequency residual, which covers a large area and significantly affects the final reconstruction
• Combined high-frequency data

[Figure: Laplacian stack. Layers 0~n-1: the details at increasingly coarser scales. Layer n: the low-frequency residual. Combined high-frequency data.]
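A Laplacian stack of this shape can be sketched as follows. This is an illustration under assumptions: a box blur with doubling radius stands in for whatever low-pass filter the slides used, the image is single-channel, and the function names are hypothetical. Each layer is the difference between two successive blur levels at full resolution, and the last blurred image is the low-frequency residual.

```python
import numpy as np

def blur(img, radius):
    """Separable box blur with edge padding (a stand-in for a Gaussian)."""
    k = np.ones(2 * radius + 1) / (2 * radius + 1)
    pad = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, pad)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def laplacian_stack(img, n):
    """Return n detail layers (fine to coarse) and the low-frequency residual."""
    layers, current = [], np.asarray(img, dtype=float)
    for i in range(n):
        smoothed = blur(current, radius=2 ** i)
        layers.append(current - smoothed)      # layer i: detail at scale i
        current = smoothed
    return layers, current                     # current is layer n (residual)
```

The layers telescope, so `sum(layers) + residual` reproduces the input exactly; the sum of the detail layers alone is one plausible reading of the "combined high-frequency data" on the slide.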

Compute Recipes (1)
• The low-frequency residual part of the transformation
• Chrominance transformations: affine function

Compute Recipes (2)
• Luminance transformations
• Affine function - brightness and contrast
• Multiplicative factor to each stack level - multiscale effects
• Multiplicative factor to non-linearity terms
• Segment function

Compute Recipes (3)

[Equation annotations: affine function; multiplicative factor to each stack level; multiplicative factor to non-linearity terms]

Lasso Regression
• Include a penalty term to constrain the size of the coefficients

• The penalty term Pα(β) interpolates between the L1 norm of β and the squared L2 norm of β

• As λ increases, the number of nonzero components of β decreases
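The sparsity effect of λ can be seen in a small coordinate-descent sketch. It covers only the pure L1 case and assumes the objective (1/2N)||y - Xβ||² + λ||β||₁ with no intercept; the function names are hypothetical and this is not necessarily the solver used for the slides.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t, clamping small values to exactly zero."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/2N)||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n          # per-feature curvature
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]     # residual without feature j
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
    return b
```

Under this objective every coefficient is exactly zero once λ reaches max|Xᵀy| / N, and coefficients enter the model as λ is lowered from there.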

Reconstructing
• Perform the same decomposition
• Apply the corresponding recipe coefficients to each term
• Up-sample the low-frequency residual term
• Linearly interpolate other terms

PSNR (dB) versus # segments

img #  2     4     6     8     10    12    14    16    18    20
1      49.5  49.7  49.9  50.0  50.1  50.2  50.3  50.4  50.5  50.6
2      38.6  39.0  39.2  39.4  39.6  39.8  40.0  40.1  40.3  40.4
3      38.7  39.1  39.4  39.6  39.8  40.1  40.3  40.5  40.7  40.8
4      44.4  44.7  44.9  45.1  45.3  45.4  45.6  45.8  46.0  46.1
5      38.2  38.6  38.9  39.1  39.4  39.6  39.8  40.0  40.2  40.4
6      37.3  37.6  37.8  37.9  38.1  38.2  38.4  38.5  38.6  38.7
7      39.4  39.9  40.2  40.5  40.8  41.0  41.3  41.5  41.7  42.0
8      45.8  46.0  46.2  46.3  46.4  46.5  46.7  46.7  46.8  46.9
9      41.0  41.2  41.4  41.5  41.6  41.7  41.8  42.0  42.0  42.1
10     38.5  38.8  39.1  39.3  39.5  39.7  39.9  40.1  40.3  40.4
11     43.7  44.0  44.3  44.4  44.6  44.7  44.9  45.0  45.1  45.2
12     42.4  42.7  42.7  42.8  42.9  43.0  43.0  43.1  43.2  43.2
13     47.5  49.0  49.3  49.4  49.5  49.7  49.7  49.8  49.9  50.0
14     51.7  52.2  52.3  52.4  52.4  52.5  52.5  52.6  52.6  52.7
15     47.0  48.0  48.2  48.3  48.4  48.4  48.5  48.6  48.6  48.7
16     55.4  55.6  55.7  55.8  55.9  55.9  56.0  56.0  56.0  56.1
Avg.   43.7  44.1  44.3  44.5  44.6  44.8  44.9  45.0  45.2  45.3

Comparison

[Figure: PSNR (dB) versus # segments]

PSNR (dB) versus # layers

img #  2     3     4     5     6     7     8     9     10
1      51.1  50.0  50.1  50.3  50.5  50.7  50.8  50.9  51.0
2      40.1  39.4  39.6  39.8  40.0  40.2  40.3  40.5  40.6
3      40.1  39.6  39.8  40.0  40.2  40.4  40.5  40.7  40.9
4      45.7  45.0  45.3  45.5  45.7  45.9  46.2  46.4  46.5
5      39.9  39.1  39.4  39.8  40.1  40.5  40.7  41.0  41.3
6      39.2  37.9  38.1  38.5  38.7  38.7  38.9  39.1  39.3
7      41.5  40.4  40.8  41.2  41.6  41.9  42.1  42.4  42.7
8      48.1  46.4  46.4  46.6  46.7  46.9  47.0  47.1  47.1
9      42.5  41.5  41.6  41.8  41.9  42.0  42.1  42.2  42.3
10     40.3  39.4  39.5  39.8  40.0  40.3  40.5  40.7  40.9
11     46.0  44.6  44.6  44.8  45.0  45.2  45.4  45.5  45.6
12     42.9  42.8  42.9  43.0  43.0  43.1  43.1  43.2  43.2
13     47.0  47.5  49.5  50.9  51.5  51.7  51.8  51.8  51.9
14     51.6  51.4  52.4  52.8  53.0  53.1  53.1  53.1  53.2
15     47.7  47.6  48.4  48.7  48.8  48.9  48.9  49.0  49.0
16     56.4  55.8  55.9  56.0  56.1  56.2  56.2  56.3  56.3
Avg.   45.0  44.3  44.6  45.0  45.2  45.3  45.5  45.6  45.7

Comparison

[Figure: PSNR (dB) versus # layers]

PSNR (dB)

method  overlapping W-1   overlapping W/2   superpixel        Laplacian stack
img #   linear  affine    linear  affine    linear  affine
1       46.1    47.8      46.2    47.7      45.2    46.7      50.1
2       36.6    37.9      36.7    37.7      35.9    37.0      39.6
3       36.7    37.6      36.6    37.5      36.1    37.1      39.8
4       41.9    43.0      41.8    42.9      41.3    42.3      45.3
5       35.5    36.5      35.6    36.4      33.3    34.4      39.4
6       34.3    35.3      34.4    35.4      33.3    34.0      38.1
7       36.6    38.0      36.4    37.6      35.0    36.3      40.8
8       43.0    44.8      43.0    44.6      41.9    43.3      46.4
9       39.2    40.4      39.2    40.2      38.5    39.6      41.6
10      36.9    37.9      36.9    37.7      36.1    36.9      39.5
11      40.9    42.7      40.8    42.4      39.0    40.3      44.6
12      34.6    40.9      34.5    40.8      36.0    41.1      42.9
13      39.5    43.2      38.9    42.5      44.0    49.8      49.5
14      43.7    49.6      43.4    49.0      45.6    52.7      52.4
15      39.3    44.6      39.1    44.1      42.1    47.8      48.4
16      44.4    54.9      43.5    54.4      49.0    55.7      55.9
Avg.    39.3    42.2      39.2    41.9      39.5    42.2      44.6
Rank    2                 4                 3                 1

Modified Laplacian Stack Method (1)
• Remove the low-frequency residual
• Add a layer to the Laplacian stack and the high-frequency term

[Figure: Layer 0~n-1, Layer n, and the combined high-frequency data]

Modified Laplacian Stack Method (2)
• Remove the non-linear terms
• Remove the Laplacian stack terms
• Remove the non-linear terms and the Laplacian stack terms

PSNR (dB) of the Laplacian stack method and relative PSNR (dB) of its variants

                PSNR (dB)  relative PSNR (dB)
residual ratio  v
multiscale      v          v      v
non-linear      v          v             v
img #
1               50.1       +0.1   -0.7   -1.4   -2.4
2               39.6       +0.2   -1.1   -0.6   -1.9
3               39.8       +0.1   -1.4   -0.8   -2.4
4               45.3       +0.2   -1.0   -1.3   -2.4
5               39.4       +0.3   -1.1   -1.7   -3.0
6               38.1       -0.5   -1.1   -2.1   -2.7
7               40.8       +0.4   -1.6   -1.0   -3.1
8               46.4       +0.0   -0.8   -1.0   -1.9
9               41.6       +0.2   -0.7   -0.5   -1.4
10              39.5       +0.3   -1.0   -0.6   -1.8
11              44.6       +0.1   -1.2   -0.8   -2.1
12              42.9       +0.1   -1.0   -0.9   -2.1
13              49.5       +2.1   -5.3   +1.8   -7.0
14              52.4       +0.6   -2.6   +0.4   -3.4
15              48.4       +0.4   -3.2   +0.2   -4.2
16              55.9       +0.2   -0.8   -0.1   -1.5
Avg.            44.6       +0.3   -1.5   -0.6   -2.7
Rank            2          1      4      3      5

PSNR (dB)

method  overlapping W-1   overlapping W/2   superpixel        Laplacian stack  result
img #   linear  affine    linear  affine    linear  affine
1       46.1    47.8      46.2    47.7      45.2    46.7      50.1             50.2
2       36.6    37.9      36.7    37.7      35.9    37.0      39.6             39.8
3       36.7    37.6      36.6    37.5      36.1    37.1      39.8             39.9
4       41.9    43.0      41.8    42.9      41.3    42.3      45.3             45.5
5       35.5    36.5      35.6    36.4      33.3    34.4      39.4             39.7
6       34.3    35.3      34.4    35.4      33.3    34.0      38.1             37.6
7       36.6    38.0      36.4    37.6      35.0    36.3      40.8             41.2
8       43.0    44.8      43.0    44.6      41.9    43.3      46.4             46.4
9       39.2    40.4      39.2    40.2      38.5    39.6      41.6             41.8
10      36.9    37.9      36.9    37.7      36.1    36.9      39.5             39.8
11      40.9    42.7      40.8    42.4      39.0    40.3      44.6             44.7
12      34.6    40.9      34.5    40.8      36.0    41.1      42.9             43.0
13      39.5    43.2      38.9    42.5      44.0    49.8      49.5             51.6
14      43.7    49.6      43.4    49.0      45.6    52.7      52.4             53.0
15      39.3    44.6      39.1    44.1      42.1    47.8      48.4             48.8
16      44.4    54.9      43.5    54.4      49.0    55.7      55.9             56.1
Avg.    39.3    42.2      39.2    41.9      39.5    42.2      44.6             44.9
Rank    3                 5                 4                 2                1

Future Work

Video color transfer
• Video color transfer using local affine models
• Find approximate nearest-neighbor matches of a video to a set of reference patches in the first frame
  • Patch match
  • Ring intersection approximate nearest neighbor search
• Compute local affine models between the original first frame and the enhanced first frame in the video
• Apply the transforms of the approximate nearest-neighbor matches to patches in the video

Recipe Coefficients
• Use other regression methods to stabilize the local affine model coefficients

[Figure: local affine model coefficients, lasso regression versus pseudo inverse; coefficient panels arranged as]

RR GR BR 1R
RG GG BG 1G
RB GB BB 1B

Reference
[1] Transform Recipes for Efficient Cloud Photo Enhancement. Michaël Gharbi, YiChang Shih, Gaurav Chaurasia, Jonathan Ragan-Kelley, Sylvain Paris, Frédo Durand. SIGGRAPH ASIA 2015.
[2] Data-driven Hallucination of Different Times of Day from a Single Outdoor Photo. YiChang Shih, Sylvain Paris, Frédo Durand, William T. Freeman. SIGGRAPH ASIA 2013.
[3] SLIC Superpixels Compared to State-of-the-art Superpixel Methods. Radhakrishna Achanta, Appu Shaji, Kevin Smith, Aurelien Lucchi, Pascal Fua, and Sabine Susstrunk.

Appendix

Application
• Dehazing
• HDR Toning
• Color Harmonization
• Color Grading
• Color Constancy
• Auto Colors

Application - Times of Day Hallucination

Application – Photoshop

Closed-form Solution
• Solution by iterative method

• Define
