Progressive Side Information Refinement with Non-Local Means Denoising in Distributed Video Coding

Wang, Pin-Hsiang 王品翔

Advisor: Prof. Wu, Ja-Ling 吳家麟

2011/10/13

CMLab, CSIE, NTU

Outline

Introduction and Motivation
DVC Architecture Overview
Proposed Side Information Refinement Framework
Experimental Results
Conclusions and Future Work

Emerging applications

Mobile camera phones
Wireless sensor networks
Video surveillance
Mobile video conferencing

These applications require a low-complexity, power-efficient encoder.

Conventional video coding (e.g. H.264/AVC, MPEG-2): inherently high-complexity encoder, low-complexity decoder.

Distributed video coding (DVC): a new video coding paradigm that shifts complexity from the encoder to the decoder.

Application of DVC

DVC to H.264 transcoder, running on cloud computational resources:
DVC encoder (low complexity) -> DVC encoded bitstream -> transcoder in the cloud -> H.264 encoded bitstream -> H.264 decoder (low complexity)

This makes the clients slimmer and thinner.

Distributed Video Coding

Conventional video coding paradigm:
Source X and Source Y -> Joint Encoder -> Joint Decoder -> X, Y
$R_X + R_Y \geq H(X, Y)$, using the statistical dependency between the sources.

Slepian-Wolf theorem:
Source X -> Encoder X, Source Y -> Encoder Y -> Joint Decoder -> X, Y
$R_X + R_Y \geq H(X, Y)$, although the statistical dependency is not exploited at the encoders.

Slepian-Wolf Theorem (1973, lossless coding)

Wyner-Ziv Theorem (1976, lossy coding)
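As a small worked illustration of the rate bound above, the sketch below computes the entropies for a hypothetical pair of correlated binary sources. The joint distribution is an assumption chosen for illustration and does not come from the slides. It shows that the joint entropy $H(X, Y)$ is smaller than $H(X) + H(Y)$ when the sources are dependent, and that $H(X|Y)$, the rate needed for X when Y is known at the decoder, is well below $H(X)$.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical joint distribution of two correlated binary sources:
# joint[x][y] = P(X = x, Y = y); X and Y agree 90% of the time.
joint = [[0.45, 0.05],
         [0.05, 0.45]]

p_x = [sum(row) for row in joint]          # marginal of X
p_y = [sum(col) for col in zip(*joint)]    # marginal of Y

h_x = entropy(p_x)
h_y = entropy(p_y)
h_xy = entropy([p for row in joint for p in row])
h_x_given_y = h_xy - h_y                   # chain rule: H(X|Y) = H(X,Y) - H(Y)

print(f"H(X) + H(Y) = {h_x + h_y:.3f} bits")
print(f"H(X, Y)     = {h_xy:.3f} bits")
print(f"H(X | Y)    = {h_x_given_y:.3f} bits")
```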

Distributed Video Coding

[Figure: Wyner-Ziv coding of source X. Encoder X consists of a quantizer and a channel encoder and transmits only parity bits. At the decoder, side information estimation produces the side information (SI) Y, which is treated as a corrupted version of X received through a virtual channel; the channel decoder takes the parity bits and Y as input.]

Statistical dependency is not exploited at the encoder. Correlation is exploited at the decoder side.

Wyner-Ziv Theorem (1976, lossy coding)

DVC is also called Wyner-Ziv video coding (WZVC).

Motivation

Encoder side: source X, the frame F(t).
Decoder side: the side information for F(t) is estimated from the past reference frame F(t-1) and the future reference frame F(t+1) by frame interpolation (decoder-side motion estimation).

Limitation: most reported WZ codecs have poor RD performance for high-motion sequences and large GOP sizes.

NLM-SIR framework for DVC

A Non-Local Means Side Information Refinement (NLM-SIR) framework for DVC is proposed. It aims to:
Improve the SI quality to achieve better rate-distortion (RD) performance in WZVC
Overcome some of the limitations of current SI estimation methods in WZVC

DVC Architecture Overview

Reference codec: DISCOVER codec (Distributed coding for video services)
X. Artigas et al., “The DISCOVER codec: architecture, techniques and evaluation”, PCS, 2007
Feedback-channel-based, transform-domain WZ codec

DISCOVER Codec Architecture (Ref. X. Artigas et al., PCS, 2007)

[Figure: block diagram of the DISCOVER codec.
WZ encoder: WZ frames $X_{WZ}$ -> DCT -> uniform quantizer -> bitplanes -> Slepian-Wolf encoder (LDPCA encoder, buffer, CRC-8 generation) -> WZ bitstream.
Key frames $X_K$ -> H.264/AVC intra encoder -> H.264/AVC intra decoder -> frame buffer -> decoded key frames.
WZ decoder: side information creation from $X'_P$ and $X'_F$ in the frame buffer -> $Y_{WZ}$ -> DCT -> $Y_{DCT}$ -> correlation noise modeling -> soft input -> Slepian-Wolf decoder (LDPCA decoder, CRC check, feedback channel to the encoder buffer) -> reconstruction -> IDCT -> decoded WZ frames.]

GOP size 2: Key frame, WZ frame, Key frame
GOP size 4: Key frame, WZ frame, WZ frame, WZ frame, Key frame
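The GOP structures shown on this slide can be sketched as a simple labeling rule. This is a minimal illustration, assuming that every frame whose index is a multiple of the GOP size is a key frame; the function name `frame_types` is hypothetical.

```python
def frame_types(num_frames, gop_size):
    """Label each frame index as 'Key' or 'WZ' for a fixed GOP size."""
    return ["Key" if i % gop_size == 0 else "WZ" for i in range(num_frames)]

gop2 = frame_types(3, 2)   # GOP size 2
gop4 = frame_types(5, 4)   # GOP size 4
print(gop2)
print(gop4)
```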

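Quantization

DCT coefficient bands: the coefficients at the same position in every 4x4 block are grouped into one band ($S_n^k$ is coefficient $k$ of block $n$).

DCT coefficient band $b_1 = \{ S_1^1, S_2^1, S_3^1, \ldots, S_N^1 \}$ (DC band)
DCT coefficient band $b_2 = \{ S_1^2, S_2^2, S_3^2, \ldots, S_N^2 \}$
...
DCT coefficient band $b_{16} = \{ S_1^{16}, S_2^{16}, S_3^{16}, \ldots, S_N^{16} \}$
(bands $b_2$ to $b_{16}$ are the AC bands)

Coefficient positions within block $n$ (Block 1, Block 2, Block 3, ...):

$S_n^1$    $S_n^2$    $S_n^6$    $S_n^7$
$S_n^3$    $S_n^5$    $S_n^8$    $S_n^{13}$
$S_n^4$    $S_n^9$    $S_n^{12}$   $S_n^{14}$
$S_n^{10}$   $S_n^{11}$   $S_n^{15}$   $S_n^{16}$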
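The band grouping on this slide can be sketched as follows. This is a minimal sketch: the table `ZIGZAG_4X4` is read off the block layout shown on the slide, while the function name and the toy input are assumptions made for illustration.

```python
# ZIGZAG_4X4[k] is the (row, col) position of band k+1 inside a 4x4 block,
# following the layout 1 2 6 7 / 3 5 8 13 / 4 9 12 14 / 10 11 15 16.
ZIGZAG_4X4 = [
    (0, 0), (0, 1), (1, 0), (2, 0),
    (1, 1), (0, 2), (0, 3), (1, 2),
    (2, 1), (3, 0), (3, 1), (2, 2),
    (1, 3), (2, 3), (3, 2), (3, 3),
]

def group_into_bands(blocks):
    """blocks: list of N 4x4 coefficient blocks (lists of rows).
    Returns 16 lists; bands[k][n] is coefficient k+1 of block n+1."""
    return [[block[r][c] for block in blocks] for (r, c) in ZIGZAG_4X4]

# Toy input: two blocks whose entries equal their band number.
layout = [[1, 2, 6, 7], [3, 5, 8, 13], [4, 9, 12, 14], [10, 11, 15, 16]]
bands = group_into_bands([layout, layout])
print(bands[0], bands[15])   # DC band and last AC band
```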



Bit plane Extraction

For each DCT coefficient band (taken in zig-zag order), the quantized coefficients are split into bit planes, from the most significant bit (MSB) to the least significant bit (LSB).

Each bit plane is channel encoded independently (LDPCA).

[Figure: example of quantized DC-band values and the resulting bit planes 1 to 5.]
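Bit plane extraction for one band can be sketched as below. The symbol values and the 3-bit depth are hypothetical, and the helper names are not taken from any codec; the second function only serves to check that the split is lossless.

```python
def extract_bitplanes(symbols, num_bits):
    """Split quantized symbols (non-negative ints) into bit planes, MSB first."""
    return [[(s >> b) & 1 for s in symbols]
            for b in range(num_bits - 1, -1, -1)]

def merge_bitplanes(planes):
    """Inverse operation: rebuild the symbols from MSB-first bit planes."""
    symbols = [0] * len(planes[0])
    for plane in planes:
        symbols = [(s << 1) | bit for s, bit in zip(symbols, plane)]
    return symbols

quantized_band = [4, 6, 7, 0, 6, 3, 1, 7]   # hypothetical 3-bit symbols
planes = extract_bitplanes(quantized_band, num_bits=3)
for i, plane in enumerate(planes, start=1):
    print(f"Bit plane {i}:", plane)
```

Each list in `planes` would then be passed to the channel encoder on its own.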




Side Information Creation: forward motion estimation


Side Information Creation: bidirectional motion estimation and compensation




Correlation Noise Modeling: the correlation noise is modeled by a Laplacian distribution.
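A zero-mean Laplacian model for the correlation noise can be sketched as follows. The relation between the Laplacian parameter and the variance is the standard one for this distribution; how the residual samples are obtained at the decoder is not shown on the slide, so the `residuals` argument is a hypothetical input.

```python
import math

def laplacian_alpha(residuals):
    """Laplacian parameter from the zero-mean residual variance:
    var = 2 / alpha**2, hence alpha = sqrt(2 / var)."""
    var = sum(r * r for r in residuals) / len(residuals)
    return math.sqrt(2.0 / var)

def laplacian_pdf(d, alpha):
    """f(d) = (alpha / 2) * exp(-alpha * |d|), with d = X - Y."""
    return 0.5 * alpha * math.exp(-alpha * abs(d))

alpha = laplacian_alpha([1.0, -1.0, 1.0, -1.0])   # toy residual samples
print(alpha, laplacian_pdf(0.0, alpha))
```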


Soft input to the LDPCA decoder: conditional bit probabilities

Iterative decoding (band by band, bitplane by bitplane)

Reconstruction

Boundary reconstruction method:

[Figure: quantization interval between $z_i$ and $z_{i+1}$, with three cases (Case 1, Case 2, Case 3) showing the position of the side information and the resulting reconstructed value.]
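A common form of boundary reconstruction clamps the side information to the decoded quantization interval. The sketch below assumes that the three cases on the slide correspond to the SI lying below, inside and above the interval; the function and argument names are hypothetical.

```python
def reconstruct(side_info, z_low, z_high):
    """Clamp the SI value to the decoded quantization interval [z_low, z_high]."""
    if side_info < z_low:      # SI below the interval: use the lower boundary
        return z_low
    if side_info > z_high:     # SI above the interval: use the upper boundary
        return z_high
    return side_info           # SI inside the interval: keep it

below = reconstruct(12.0, 16.0, 32.0)
inside = reconstruct(20.0, 16.0, 32.0)
above = reconstruct(40.0, 16.0, 32.0)
print(below, inside, above)
```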


Poor RD performance for high-motion sequences and large GOP sizes: room for improvement




NLM-SIR framework

Original WZ frame $X_{WZ}$ (at the encoder)
Side information $Y_{WZ}$ (at the decoder)
Correlation noise $N$ between the original WZ frame and the side information

Observation model: $Y_{WZ} = X_{WZ} + N$
Reducing the noise $N$ gives better SI quality, so SI refinement can be treated as a denoising problem.

NLM-SIR framework

Iterative decoding (band by band)

New information about the original WZ frame is not exploited in existing codecs. This information becomes progressively available during decoding and was not available when the initial SI was estimated.

Proposed Codec Architecture

[Figure: the DISCOVER block diagram (WZ encoder and WZ decoder) extended at the decoder with two modules, Candidate Block Selection and NLM Refinement. They take the partially decoded WZ frame $P_{WZ}$ and produce the refined side information $Y'_{WZ}$.]


DC band decoding: the initial SI ($Y_{WZ}$) is always used to decode the DC band.


AC bands decoding


Other decoding iterations: NLM-SIR takes the partially decoded WZ frame as input and outputs the refined side information.


Partially decoded WZ frame $P_{WZ}$: bands that are already decoded use the decoded coefficients; the remaining bands are copied from the initial SI coefficients.
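The construction of the partially decoded frame in the DCT domain can be sketched as below. This is a sketch under assumptions: coefficients are stored per band, `decoded_bands` maps a band index to its reconstructed coefficients, and every band not yet decoded falls back to the initial SI. All names are hypothetical, and the inverse DCT that would follow is omitted.

```python
def partially_decoded_bands(decoded_bands, si_bands):
    """decoded_bands: dict mapping band index -> list of decoded coefficients.
    si_bands: list of bands holding the initial SI coefficients.
    Returns one list per band: decoded if available, else copied from the SI."""
    return [list(decoded_bands.get(k, si_bands[k]))
            for k in range(len(si_bands))]

# Toy data: 16 bands with two blocks each; only the DC band (index 0) is decoded.
si_bands = [[float(k), float(k)] for k in range(16)]
decoded = {0: [100.0, 101.0]}
p_bands = partially_decoded_bands(decoded, si_bands)
print(p_bands[0], p_bands[1])
```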



Candidate Block Selection

Selection of the SI blocks that are worth refining, i.e. the blocks for which the initial guess made by the side information creation process has basically failed.

(1) Noise computation (noise indicator):

$$E_n = \sum_{x=0}^{3} \sum_{y=0}^{3} \left( Y_n[x, y] - P_n[x, y] \right)^2$$

(2) Block selection for refinement:

Fine SI blocks: $E_n < Threshold$
Noisy SI blocks: $E_n \geq Threshold$
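The two steps of candidate block selection can be sketched as follows. Blocks are represented as 4x4 nested lists; the threshold value and the toy blocks are assumptions, since the slide gives no numbers.

```python
def block_noise(si_block, partial_block):
    """E_n: sum of squared differences between the SI block Y_n and the
    partially decoded block P_n over a 4x4 block."""
    return sum((si_block[x][y] - partial_block[x][y]) ** 2
               for x in range(4) for y in range(4))

def select_noisy_blocks(si_blocks, partial_blocks, threshold):
    """Indices n with E_n >= threshold, i.e. the blocks to be refined."""
    return [n for n, (y_n, p_n) in enumerate(zip(si_blocks, partial_blocks))
            if block_noise(y_n, p_n) >= threshold]

flat = [[10] * 4 for _ in range(4)]
shifted = [[12] * 4 for _ in range(4)]     # differs by 2 everywhere, E_n = 64
noisy = select_noisy_blocks([flat, shifted], [flat, flat], threshold=50)
print(noisy)
```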



Non-Local Means

Images possess repeating elements and patches.

Non-Local Means Refinement

NLM algorithm, applied to the partially decoded frame $P_{WZ}$ over the search window $S$:

$$Y_{wz}[x, y] = \sum_{[m, n] \in S} W_{(x,y)}[m, n] \cdot P[m, n]$$

Similarity measurement:

$$D_{(x,y)}[m, n] = \sum_{\mu, \nu = -d_m, \ldots, d_m} \left( P[x + \mu, y + \nu] - P[m + \mu, n + \nu] \right)^2$$

Weight assignment:

$$W_{(x,y)}[m, n] = \frac{1}{N[x, y]} \cdot \exp\left( -\frac{D_{(x,y)}[m, n]}{h^2} \right)$$

Normalization term: $N[x, y] = \sum_{[m, n] \in S} \exp\left( -D_{(x,y)}[m, n] / h^2 \right)$, so that the weights sum to 1.

Smoothing parameter $h$: controls how fast the weights decay as the patch distance grows.

Central weight ($[m, n] = [x, y]$): assigned the same value as the maximum of the other weights observed in the search window.
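The NLM formulas above, including the central weight rule, can be sketched for a single pixel as follows. This is a minimal sketch and not the author's implementation: the image is a nested list, `d` is the patch half-size ($d_m$), `s` is the half-size of the search window, and border handling is left out.

```python
import math

def patch_distance(img, x, y, m, n, d):
    """D_(x,y)[m,n]: squared difference of two (2d+1)x(2d+1) patches."""
    return sum((img[x + u][y + v] - img[m + u][n + v]) ** 2
               for u in range(-d, d + 1) for v in range(-d, d + 1))

def nlm_pixel(img, x, y, d, s, h):
    """Refined value at (x, y) using a (2s+1)x(2s+1) search window.
    Assumes every patch stays inside the image (no border handling)."""
    positions, raw = [], []
    for m in range(x - s, x + s + 1):
        for n in range(y - s, y + s + 1):
            if (m, n) != (x, y):
                dist = patch_distance(img, x, y, m, n, d)
                positions.append((m, n))
                raw.append(math.exp(-dist / h ** 2))
    # Central weight: the maximum of the other weights in the window.
    positions.append((x, y))
    raw.append(max(raw))
    norm = sum(raw)                        # normalization term N[x, y]
    if norm == 0.0:                        # all weights underflowed
        return img[x][y]
    return sum(w / norm * img[m][n] for w, (m, n) in zip(raw, positions))

flat = [[10.0] * 7 for _ in range(7)]
spiky = [row[:] for row in flat]
spiky[3][3] = 100.0                        # isolated outlier at the centre
print(nlm_pixel(flat, 3, 3, d=1, s=1, h=10.0))
print(nlm_pixel(spiky, 3, 3, d=1, s=1, h=1000.0))
```

With a large `h` the weights are nearly uniform, so the outlier is pulled strongly towards its neighbours; a smaller `h` makes the filter more selective.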

Searching for more similar patches: in addition to the partially decoded frame $F(t)$, the past decoded frame $F(t-1)$ and the future decoded frame $F(t+1)$ are searched.

Taking the temporally similar patches into account:

$$Y_{wz}[x, y] = \sum_{t} \sum_{[m, n] \in S} W_{(x,y)}[m, n, t] \cdot Ref[m, n, t]$$

Parameter Setting of NLM

Parameters to set over $F(t-1)$, $F(t)$ and $F(t+1)$: the neighborhood size (patch size), the search window size and the smoothing parameter, which has one setting for the first iteration and another for the subsequent iterations.

Motion compensated residual frame, used as an estimate of the correlation noise:

$$MCRF[x, y] = \frac{1}{2} \left( X_P[x + mv_x^P, y + mv_y^P] - X_F[x + mv_x^F, y + mv_y^F] \right)$$
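The motion compensated residual frame can be sketched as below. This is a simplified illustration: a single motion vector per reference is applied to the whole frame, whereas a codec would typically use one vector per block, and positions that fall outside the frame are set to zero. How the smoothing parameter is derived from the MCRF is not recoverable from the slide, so only the residual itself is computed.

```python
def mcrf(x_p, x_f, mv_p, mv_f):
    """Half the difference between the two motion-compensated references.
    x_p, x_f: past and future reference frames (nested lists).
    mv_p, mv_f: (dx, dy) motion vectors, one pair per frame in this sketch."""
    height, width = len(x_p), len(x_p[0])
    out = [[0.0] * width for _ in range(height)]
    for x in range(height):
        for y in range(width):
            xp, yp = x + mv_p[0], y + mv_p[1]
            xf, yf = x + mv_f[0], y + mv_f[1]
            inside = (0 <= xp < height and 0 <= yp < width and
                      0 <= xf < height and 0 <= yf < width)
            if inside:
                out[x][y] = 0.5 * (x_p[xp][yp] - x_f[xf][yf])
    return out

past = [[10.0] * 4 for _ in range(4)]
future = [[4.0] * 4 for _ in range(4)]
residual = mcrf(past, future, mv_p=(0, 0), mv_f=(0, 0))
print(residual[0])
```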


Other decoding iterations: update of the noise distribution model

[Figure: probability versus coefficient value, showing the original coefficient $X(u,v)$, the initial SI coefficient $Y(u,v)$ and the refined SI coefficient $Y'(u,v)$.]


Other decoding iterations: failed refinement detection

[Figure: decoded quantization bin between $q \cdot W_k$ and $(q+1) \cdot W_k$, with refined SI values and the unrefined SI marked relative to the bin.]
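One plausible reading of the failed refinement check is sketched below. It assumes that a refined SI coefficient is accepted only when it falls inside the already decoded quantization bin $[q \cdot W_k, (q+1) \cdot W_k)$ and that the unrefined SI is used otherwise. This interpretation, the function name and the numbers are assumptions for illustration.

```python
def check_refinement(refined, unrefined, q, step):
    """Keep the refined SI coefficient only if it lies in the decoded bin
    [q * step, (q + 1) * step); otherwise fall back to the unrefined SI."""
    if q * step <= refined < (q + 1) * step:
        return refined
    return unrefined

kept = check_refinement(refined=37.0, unrefined=30.0, q=2, step=16)      # bin [32, 48)
rejected = check_refinement(refined=50.0, unrefined=30.0, q=2, step=16)
print(kept, rejected)
```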


Progressive refinement

[Figure: SI quality (PSNR, dB) of the 13th frame of Foreman versus the number of decoded bands.]


NLM-SIR: the refined SI leads to bitrate savings and quality gains.




Test materials

Test sequences: Soccer, Foreman, Coastguard, Hall Monitor (ordered from high motion to low motion)
QCIF, 15 Hz, all frames
GOP sizes 2, 4 and 8
Only the luminance component is used

PSNR Temporal Evolution

Scenario: GOP = 8, Q8

[Figure: four panels of PSNR versus frame number, annotated with:
Avg: 1.50 dB, Max: 3.12 dB
Avg: 0.88 dB, Max: 1.91 dB
Avg: 0.42 dB, Max: 2.05 dB
Avg: 0.13 dB, Max: 0.67 dB]

Better decoded quality is achieved, especially in high-motion content zones.

Bitrate Temporal Evolution

Scenario: GOP = 8, Q8

[Figure: two panels of bitrate versus frame number, annotated with:
Avg: 5.58 kbits, Max: 13.90 kbits
Avg: 4.78 kbits, Max: 9.36 kbits]

Overall RD Performance (GOP = 8)

[Figure: RD curves for the test sequences, annotated with gains of 2.5 dB, 2.0 dB, 0.6 dB and 0.3 dB.]

Overall RD Performance (GOP = 4)

Decoding Time Complexity

Total decoding time (sec.), Q8, GOP = 8, whole sequence (150 frames); series listed in legend order:

Sequence      With NLM-SIR   Without NLM-SIR
Foreman       10872          15489
Soccer        10772          14755
Coastguard    11243          11340
Hall          2298           2520

These tests were performed on an Intel Core2 Quad processor at 2.40 GHz with 4.0 GB of RAM (Windows 7 operating system).

Using better SI results in fewer rate requests and thus fewer LDPCA decoder runs.



Total decoding time (sec.); series listed in legend order, sequences in the same order as in the previous chart:

Sequence      With NLM-SIR   Without NLM-SIR
Foreman       4956           5485
Soccer        5382           6457
Coastguard    3836           3521
Hall          975            899

NLM-SIR for DISPAC Codec

DISPAC codec (Distributed video coding with parallelized design for cloud computing)
Highly parallelized decoder
State-of-the-art RD performance: effectively integrates numerous advanced tools



Total decoding time (sec.), 4 cores + GPU; series listed in legend order:

Sequence      DISCOVER   DISPAC   DISPAC with NLM-SIR
Foreman       10872      315      319
Soccer        10772      353      350
Coastguard    11243      338      352
Hall          2298       200      202

The processing of NLM-SIR can be highly parallelized (CUDA).




Conclusion

An NLM-based side information refinement (NLM-SIR) framework for WZVC is proposed.
It is universally applicable to most DCT-domain WZVC schemes.
It provides significant RD gain over the existing WZVC framework, notably under the conditions where WZVC usually performs worse.
It introduces negligible overhead on the decoding time, and the processing module can be highly parallelized.

Future work

Spatially adaptive parameter setting of NLM-SIR during decoding
A more suitable and powerful transform-domain denoising algorithm could be considered as a substitute for NLM

Thank You


DISPAC codec with NLM-SIR

[Figure: block diagram of the DISPAC codec with NLM-SIR. In addition to the blocks of the DISCOVER diagram, the WZ decoder contains Block Mode Selection, Multi-SI Reconstruction, a Deblocking Filter, Motion Learning SI Refinement and Non-Local Means SI Refinement. The partially decoded frame $P_{WZ}$ feeds the refinement modules, which output the refined side information $Y'_{WZ}$ and $Y''_{WZ}$.]
