hm inter prediction 111022 r3

HEVC Inter prediction

Kwangwoon University, Image Processing Systems Laboratory

2011-10-22 (SAT)

Contents

• Overview of inter prediction

• Inter prediction in HEVC

– GOP coding structure

– Adaptive motion vector prediction(AMVP)

– Merge

– Asymmetric motion partition(AMP)

– Interpolation filter

OVERVIEW OF INTER PREDICTION

Overview of inter prediction

• The encoder forms a model of the current frame based on the samples of a previously transmitted frame

• The motion-compensated predicted frame is subtracted from the current frame to produce a residual ‘error’ frame

• Transform coding of the residual frame

[Figure: motion estimation against the previous frame yields a motion-compensated frame, which is subtracted from the current frame to give the residual frame]


• The goals of inter prediction

– ME creates a model of the current frame based on available data in one or more previously encoded frames to match the current frame as closely as possible
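The matching step described above can be sketched as a full-search block matching that minimizes SAD. This is a minimal illustration, not the HM search: the names `blockSad` and `fullSearch`, the flat `std::vector<int>` frame layout and the search range parameter are all assumptions made for the example.

```cpp
#include <cstdlib>
#include <vector>

struct Mv { int x; int y; };

// SAD between the size x size block at (bx, by) in cur and the block
// displaced by (dx, dy) in ref. Frames are stored row by row.
long blockSad(const std::vector<int>& cur, const std::vector<int>& ref, int width,
              int bx, int by, int size, int dx, int dy) {
    long sad = 0;
    for (int y = 0; y < size; ++y) {
        for (int x = 0; x < size; ++x) {
            sad += std::abs(cur[(by + y) * width + bx + x] -
                            ref[(by + y + dy) * width + bx + x + dx]);
        }
    }
    return sad;
}

// Exhaustive search over [-range, range] in both directions.
// Displacements that would read outside the reference frame are skipped.
Mv fullSearch(const std::vector<int>& cur, const std::vector<int>& ref,
              int width, int height, int bx, int by, int size, int range) {
    Mv best = {0, 0};
    long bestSad = blockSad(cur, ref, width, bx, by, size, 0, 0);
    for (int dy = -range; dy <= range; ++dy) {
        for (int dx = -range; dx <= range; ++dx) {
            if (bx + dx < 0 || by + dy < 0 ||
                bx + dx + size > width || by + dy + size > height) continue;
            const long sad = blockSad(cur, ref, width, bx, by, size, dx, dy);
            if (sad < bestSad) { bestSad = sad; best.x = dx; best.y = dy; }
        }
    }
    return best;
}
```

The residual block is then the sample-wise difference between the current block and the reference block at the returned displacement.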

[Figure: block matching between frame n-1 and frame n]


• Transmitted data

– Motion vector (PMV, MVD)

– Reference index (LIST_0/LIST_1)

– Prediction mode

– Residual data (quantized coefficients)

… …

… …

How?

GOP CODING STRUCTURE

GOP coding structure

• Temporal prediction structure

– All Intra (No temporal prediction is allowed)

– Low Delay (LD)

• The first picture shall be coded as IDR picture

• GPB (Generalized P and B) picture (on/off)

– Random access (RA)

• Hierarchical B structure shall be used for coding

• An IDR intra picture or CDR (clean random access) picture shall be inserted cyclically, about once per second, as a random access point

GOP coding structure – Low delay

[Figure: low-delay GOP structure along the time axis. Picture 0 is an IDR or intra picture coded at QPI; pictures 1 to 8 are GPB (Generalized P and B) pictures coded at QPBL1 = QPI+1, QPBL2 = QPI+2 or QPBL3 = QPI+3 depending on their level]

GOP coding structure – Random access

[Figure: random-access GOP structure along the time axis, showing POC and coding order for pictures 0 to 8. Picture types: IDR or intra picture, GPB (Generalized P and B) picture, referenced B picture, non-referenced B picture. QP per level: QPI, QPBL1 = QPI+1, QPBL2 = QPI+2, QPBL3 = QPI+3, QPBL4 = QPI+4]

GOP coding structure – Random access

Variables:
    m_iHrchDepth = log2(GOP_size) + 1;
    iNumPicRcvd  = GOP_size;

for( iDepth = 0; iDepth < m_iHrchDepth; iDepth++ )
{
    iTimeOffset = 1 << (m_iHrchDepth - 1 - iDepth);
    iStep       = iTimeOffset << 1;
    for( ; iTimeOffset <= iNumPicRcvd; )
    {
        compressSlice();
        iTimeOffset += iStep;
    }
}

[Figure: the same random-access GOP, with each picture marked by its hierarchy depth (Depth == 0 to Depth == 3)]

*uiPOCCurr = iPOCLast - (iNumPicRcvd - iTimeOffset);
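The depth loop and the POC formula above can be combined into a small stand-alone function that lists the POCs of one GOP in coding order. This is a sketch under the assumption that the GOP size is a power of two; `hierarchicalCodingOrder` and its parameter names are made up for the example and are not HM functions.

```cpp
#include <vector>

// Returns the POCs of one GOP in coding order for a hierarchical-B structure.
// gopSize is assumed to be a power of two; pocLast is the POC of the last
// picture of the GOP (iPOCLast in the formula above).
std::vector<int> hierarchicalCodingOrder(int gopSize, int pocLast) {
    int hrchDepth = 1;  // log2(gopSize) + 1
    for (int s = gopSize; s > 1; s >>= 1) ++hrchDepth;

    std::vector<int> order;
    for (int depth = 0; depth < hrchDepth; ++depth) {
        int timeOffset = 1 << (hrchDepth - 1 - depth);
        const int step = timeOffset << 1;
        for (; timeOffset <= gopSize; timeOffset += step) {
            // Same expression as uiPOCCurr, with iNumPicRcvd == gopSize.
            order.push_back(pocLast - (gopSize - timeOffset));
        }
    }
    return order;
}
```

For a GOP of 8 pictures ending at POC 8, depth 0 yields POC 8, depth 1 yields POC 4, depth 2 yields POC 2 and 6, and depth 3 yields the odd POCs.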

AMVP (ADAPTIVE MOTION VECTOR PREDICTION)

MV prediction of H.264/AVC

• Median of each component of MV

– No transmission overhead

• Slice-based use of temporal MV predictor

$$MV_x = \mathrm{MEDIAN}(A_x, B_x, C_x)$$

$$MV_y = \mathrm{MEDIAN}(A_y, B_y, C_y)$$

Fig. Spatial neighboring blocks A, B and C of the current block
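The component-wise median can be written in a few lines. The sketch below only illustrates the formula; the `Mv` struct and the function names are assumptions for the example, and availability handling of the neighbors is left out.

```cpp
#include <algorithm>

struct Mv { int x; int y; };

// Median of three integers without sorting.
int median3(int a, int b, int c) {
    return std::max(std::min(a, b), std::min(std::max(a, b), c));
}

// The x and y components are predicted independently.
Mv medianPredictor(const Mv& A, const Mv& B, const Mv& C) {
    Mv pred = { median3(A.x, B.x, C.x), median3(A.y, B.y, C.y) };
    return pred;
}
```

Because the decoder can repeat the same computation from already decoded neighbors, no predictor index needs to be sent, which is the "no transmission overhead" property listed above.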

MV prediction of HEVC

• Explicit signaling of MV predictor index

– Transmission overhead

• PU-based use of temporal MV predictor

Fig. Spatial AMVP candidates (A0, A1, B0, B1, B2 around the current block)

Fig. Temporal AMVP candidates (center and right-bottom positions of the co-located PU)

AMVP

• Decoder receives

– ref_idx

– mvd

– mvp_idx
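With these three syntax elements the decoder rebuilds the motion vector by adding the transmitted difference to the selected predictor. A minimal sketch, assuming the candidate list has already been derived; `reconstructMv` and the `Mv` struct are illustrative names, and `ref_idx` is only needed to pick the reference picture, so it does not appear here.

```cpp
#include <vector>

struct Mv { int x; int y; };

// mv = mvp[mvp_idx] + mvd, where mvpList is the AMVP candidate list that the
// decoder derives in the same way as the encoder.
Mv reconstructMv(const std::vector<Mv>& mvpList, int mvpIdx, const Mv& mvd) {
    const Mv& mvp = mvpList.at(mvpIdx);  // throws if the index is out of range
    Mv mv = { mvp.x + mvd.x, mvp.y + mvd.y };
    return mv;
}
```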


AMVP – Decoder side AMVP syntax

prediction_unit( x0, y0, log2CUSize ) {                                Descriptor
  if( skip_flag[ x0 ][ y0 ] ) {
    merge_idx[ x0 ][ y0 ]                                              ue(v) | ae(v)
  } else if( PredMode = = MODE_INTRA ) {
    …
  } else { /* MODE_INTER */
    if( entropy_coding_mode_flag || PartMode != PART_2Nx2N )
      merge_flag[ x0 ][ y0 ]                                           u(1) | ae(v)
    if( merge_flag[ x0 ][ y0 ] ) {
      merge_idx[ x0 ][ y0 ]                                            ue(v) | ae(v)
    } else {
      if( slice_type = = B ) {
        if( !entropy_coding_mode_flag ) {
          combined_inter_pred_ref_idx                                  ue(v)
          if( combined_inter_pred_ref_idx == MaxPredRef )
            inter_pred_flag[ x0 ][ y0 ]                                ue(v)
        } else
          inter_pred_flag[ x0 ][ y0 ]                                  ue(v) | ae(v)
      }
      if( inter_pred_flag[ x0 ][ y0 ] = = Pred_LC ) {
        if( num_ref_idx_lc_active_minus1 > 0 ) {
          if( !entropy_coding_mode_flag ) {
            if( combined_inter_pred_ref_idx == MaxPredRef )
              ref_idx_lc_minus4[ x0 ][ y0 ]                            ue(v)
          } else
            ref_idx_lc[ x0 ][ y0 ]                                     ae(v)
        }
        mvd_lc[ x0 ][ y0 ][ 0 ]                                        se(v) | ae(v)
        mvd_lc[ x0 ][ y0 ][ 1 ]                                        se(v) | ae(v)
        mvp_idx_lc[ x0 ][ y0 ]                                         ue(v) | ae(v)
      }

AMVP – Decoder side AMVP syntax

      else { /* Pred_L0 or Pred_BI */
        if( num_ref_idx_l0_active_minus1 > 0 ) {
          if( !entropy_coding_mode_flag ) {
            if( combined_inter_pred_ref_idx == MaxPredRef )
              ref_idx_l0_minusX[ x0 ][ y0 ]                            ue(v)
          } else
            ref_idx_l0_minusX[ x0 ][ y0 ]                              ue(v) | ae(v)
        }
        mvd_l0[ x0 ][ y0 ][ 0 ]                                        se(v) | ae(v)
        mvd_l0[ x0 ][ y0 ][ 1 ]                                        se(v) | ae(v)
        mvp_idx_l0[ x0 ][ y0 ]                                         ue(v) | ae(v)
      }
      if( inter_pred_flag[ x0 ][ y0 ] = = Pred_BI ) {
        if( num_ref_idx_l1_active_minus1 > 0 ) {
          if( !entropy_coding_mode_flag ) {
            if( combined_inter_pred_ref_idx == MaxPredRef )
              ref_idx_l1_minusX[ x0 ][ y0 ]                            ue(v)
          } else
            ref_idx_l1[ x0 ][ y0 ]                                     ue(v) | ae(v)
        }
        mvd_l1[ x0 ][ y0 ][ 0 ]                                        se(v) | ae(v)
        mvd_l1[ x0 ][ y0 ][ 1 ]                                        se(v) | ae(v)
        mvp_idx_l1[ x0 ][ y0 ]                                         ue(v) | ae(v)
      }
    }
  }
}

AMVP – Encoder side processing

1. Search for three candidates (spatial: 2, temporal: 1)

2. Remove redundant MVPs

3. Additional candidate list

– Zero vector candidates are created by combining zero vector and refIdx

4. Decision of the best MVP before motion estimation (starting point for ME)

– Distortion: SAD

– Rate: truncated unary code (MVP index)

– RDCost = Distortion + ((Bits*λ + 0.5) >> 16);

5. Decision of the best MVP candidate after motion estimation

– Best MVP index: smallest mvd = Best_MV – MV of mvp_idx[i]

mvp_idx   bin
0         0
1         10
2         110
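Steps 2, 3 and 5 above can be sketched as follows. This is an illustration under simplifying assumptions, not the HM code: `buildMvpList` and `bestMvpIdx` are hypothetical names, the list size is a free parameter, and the cost of an mvd is approximated by the sum of absolute components instead of the real bit count.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

struct Mv { int x; int y; };

bool sameMv(const Mv& a, const Mv& b) { return a.x == b.x && a.y == b.y; }

// Steps 2 and 3: drop duplicated predictors, then pad with zero vectors
// until the list holds listSize entries.
std::vector<Mv> buildMvpList(const std::vector<Mv>& found, std::size_t listSize) {
    std::vector<Mv> list;
    for (std::size_t i = 0; i < found.size(); ++i) {
        bool duplicate = false;
        for (std::size_t k = 0; k < list.size(); ++k) {
            duplicate = duplicate || sameMv(found[i], list[k]);
        }
        if (!duplicate && list.size() < listSize) list.push_back(found[i]);
    }
    const Mv zero = {0, 0};
    while (list.size() < listSize) list.push_back(zero);
    return list;
}

// Step 5: pick the predictor that leaves the smallest mvd for the MV found by
// motion estimation. |mvd.x| + |mvd.y| stands in for the mvd bit cost.
int bestMvpIdx(const std::vector<Mv>& list, const Mv& bestMv) {
    int best = -1;
    int bestCost = 0;
    for (std::size_t i = 0; i < list.size(); ++i) {
        const int cost = std::abs(bestMv.x - list[i].x) + std::abs(bestMv.y - list[i].y);
        if (best < 0 || cost < bestCost) {
            best = static_cast<int>(i);
            bestCost = cost;
        }
    }
    return best;
}
```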

AMVP – Spatial AMVP candidates

• Spatial AMVP candidates

– mvLxA: Left spatial candidates

• Derivation order: A0 ⇒ A1

• First available MV

– 1st: scan without scaling (vec1, vec2)

– 2nd: scan with scaling (vec3, vec4)

– mvLxB: Above spatial candidates

• Derivation order: B0 ⇒ B1 ⇒ B2

• First available MV

– 1st: scan without scaling (vec1, vec2)

– 2nd: scan with scaling, if scaling wasn’t used before (vec3, vec4)



AMVP – Spatial AMVP candidates

• Spatial AMVP candidates

– Four candidates can be derived at each neighboring PU

• vec1: same reference index, same list

• vec2: same reference index, different list

• vec3: different reference index, same list

• vec4: different reference index, different list
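The four candidate types and the two-pass scan (first vec1/vec2 without scaling, then vec3/vec4 with scaling) can be sketched like this. It is a simplified illustration: each neighbor carries a single motion entry, and `Neighbor`, `candidateType` and `scanNeighbors` are hypothetical names, not HM identifiers.

```cpp
#include <cstddef>
#include <vector>

// One motion entry of a neighboring PU (simplification: one entry per neighbor).
struct Neighbor { bool available; int list; int refIdx; };

// Returns 1..4 for vec1..vec4 as defined in the list above.
int candidateType(int curList, int curRefIdx, const Neighbor& nb) {
    const bool sameRef = (nb.refIdx == curRefIdx);
    const bool sameList = (nb.list == curList);
    if (sameRef) return sameList ? 1 : 2;
    return sameList ? 3 : 4;
}

// Scans the neighbors in derivation order. Pass 0 accepts only vec1/vec2
// (no scaling), pass 1 accepts only vec3/vec4 (scaling required).
// Returns the index of the chosen neighbor, or -1 if none is usable.
int scanNeighbors(const std::vector<Neighbor>& nbs, int curList, int curRefIdx,
                  bool& needsScaling) {
    for (int pass = 0; pass < 2; ++pass) {
        for (std::size_t i = 0; i < nbs.size(); ++i) {
            if (!nbs[i].available) continue;
            const bool scaledType = candidateType(curList, curRefIdx, nbs[i]) >= 3;
            if (scaledType == (pass == 1)) {
                needsScaling = scaledType;
                return static_cast<int>(i);
            }
        }
    }
    needsScaling = false;
    return -1;
}
```

Passing the neighbors as {A0, A1} gives the left candidate and {B0, B1, B2} the above candidate; the extra rule that the above scan only scales when scaling was not used before is not modeled here.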

[Figure: motion vectors of neighboring block b toward pictures i, j, k, l, m along the time axis, labeled 1 to 4 for the four candidate types relative to the current block]

AMVP – Temporal AMVP candidate

• Temporal AMVP candidate

– Derivation order:

1. Right-bottom position of co-located PU

2. Center position of co-located PU

[Figure: center and right-bottom positions of the co-located PU; current picture, co-located picture and reference picture with the motion vectors mvL0, mvL1 and mvL1Col of the co-located partition]

AMVP - MV Scaling

• Scaling of MV predictor has been modified (JCTVC-F142)

– HM3 rounds half towards plus infinity

– Proposed scheme rounds half towards zero

HM version   Modification
HM3          $(\mathit{DistScaleFactor} \times mv + 128) \gg 8$
HM4          $\mathrm{Sign}(\mathit{DistScaleFactor} \times mv) \times ((|\mathit{DistScaleFactor} \times mv| + 127) \gg 8)$

DistScaleFactor: scaling factor
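The difference between the two rounding rules is easiest to see on values that land exactly on a half. The sketch below implements both formulas for one MV component; the function names are made up for the example, and the right shift of a negative value is assumed to be arithmetic, which holds on common compilers.

```cpp
#include <cstdlib>

// HM3: adds 128 before the shift, so halves are rounded towards plus infinity.
// Assumes >> on a negative int is an arithmetic shift.
int scaleMvHm3(int distScaleFactor, int mv) {
    return (distScaleFactor * mv + 128) >> 8;
}

// HM4 (JCTVC-F142): works on the magnitude and restores the sign afterwards,
// so halves are rounded towards zero and the result is symmetric around zero.
int scaleMvHm4(int distScaleFactor, int mv) {
    const int prod = distScaleFactor * mv;
    const int sign = (prod >= 0) ? 1 : -1;
    return sign * ((std::abs(prod) + 127) >> 8);
}
```

With a scaling factor of 128 (a ratio of 0.5) and mv = 3, the exact result is 1.5: the HM3 rule gives 2 and the HM4 rule gives 1. For mv = -3 both rules give -1, which shows that only the HM4 rule treats positive and negative vectors symmetrically.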

Merge

• Decoder receives

– ref_idx

– mvd

– mvp_idx

– merge_flag

– merge_index

Fig. Merge candidates (spatial neighbors A, B, C, D, E of the current block; center and right-bottom positions of the co-located PU)

Merge – Decoder side Merge skip syntax

coding_unit( x0, y0, log2CUSize ) {                                    Descriptor
  if( entropy_coding_mode_flag && slice_type != I )
    skip_flag[ x0 ][ y0 ]                                              u(1) | ae(v)
  if( skip_flag[ x0 ][ y0 ] )
    prediction_unit( x0, y0, log2CUSize, log2CUSize, 0 , 0 )
  else {
    …
  }
}

prediction_unit( x0, y0, log2CUSize ) {                                Descriptor
  if( skip_flag[ x0 ][ y0 ] ) {
    merge_idx[ x0 ][ y0 ]                                              ue(v) | ae(v)
  } else if( PredMode = = MODE_INTRA ) {
    …
  } else { /* MODE_INTER */
    …
  }
}

• Merge skip

Merge – Decoder side Merge syntax

prediction_unit( x0, y0, log2CUSize ) {                                Descriptor
  if( skip_flag[ x0 ][ y0 ] ) {
    merge_idx[ x0 ][ y0 ]                                              ue(v) | ae(v)
  } else if( PredMode = = MODE_INTRA ) {
    …
  } else { /* MODE_INTER */
    if( entropy_coding_mode_flag || PartMode != PART_2Nx2N )
      merge_flag[ x0 ][ y0 ]                                           u(1) | ae(v)
    if( merge_flag[ x0 ][ y0 ] ) {
      merge_idx[ x0 ][ y0 ]                                            ue(v) | ae(v)
    } else {
      …
    }
  }
}

• General case - merge

Merge – Encoder side processing

1. Search for five candidates

– Output: Mv, RefIdx, Predflag for LIST_0/LIST_1

– S0, S1, S2, S3: Spatial candidates

– Col: Temporal candidate

2. Remove redundant candidates

3. Additional candidate list (JCTVC-F470)

– Combined bi-directional merge candidate (5 times)

– Scaled bi-directional merge candidate (1 time)

– Zero vector merge candidate

4. Decision of the best MRG candidate

Candidate order in the list: S0, S1, S2, S3, Col

merge_idx   bin
0           0
1           10
2           110
3           1110
4           1111
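The bin strings in the table follow a truncated unary pattern with a maximum value of 4: a value below the maximum is coded as that many ones followed by a zero, and the maximum drops the terminating zero. A minimal sketch, with `truncatedUnary` as an illustrative name:

```cpp
#include <cstddef>
#include <string>

// Truncated unary binarization: `value` ones, then a terminating zero unless
// value equals cMax (the largest codable value).
std::string truncatedUnary(int value, int cMax) {
    std::string bins(static_cast<std::size_t>(value), '1');
    if (value < cMax) bins += '0';
    return bins;
}
```

Calling `truncatedUnary(idx, 4)` for idx = 0 to 4 reproduces the five rows of the table.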

Merge – Spatial merge candidates

• Spatial merge candidates (4 candidates)

– Derivation Order: A, B, C, D, E

Fig. Spatial merge candidates (neighbors A, B, C, D, E of the current block)

Merge – Temporal merge candidate

• refIdx derivation for merge TMVP (JCTVC-E481)

– Decide three refIdx

• refIdxLeft: A

• refIdxAbove: B

• refIdxCorner: C or D or E

– Take the majority value among them

– If none of the three is available

• refIdx = 0

– Otherwise

• Set the minimum of the available refIdx

• Derivation of temporal merge candidate

– Same process as TMVP


Merge – Temporal merge candidate

• Example) Decision of reference frame

ex) second 4x8 PU in 8x8 CU

[Figure: spatial neighbors A, B, C, D, E of the current block; for the second 4x8 PU only A, B and E exist]

Neighbor   LIST_0 RefIdx   LIST_1 RefIdx
A          0               1
B          -1              1
C          NULL
D          NULL
E          -1              1

           refIdxLeft   refIdxAbove   refIdxCorner   Result
LIST_0     0            -1            -1             0
LIST_1     1            1             1              1
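The decision in this example can be reproduced with a small function that follows the rules as listed on the previous slide (majority, else minimum of the available values, else 0). This is one reading of the slide and may differ in detail from the working draft text; `mergeTemporalRefIdx` is a hypothetical name and a negative refIdx is taken to mean "not available".

```cpp
#include <algorithm>

// Derives the reference index used for the temporal merge candidate of one
// list from the left, above and corner neighbors. Negative means unavailable.
int mergeTemporalRefIdx(int refIdxLeft, int refIdxAbove, int refIdxCorner) {
    const int r[3] = { refIdxLeft, refIdxAbove, refIdxCorner };
    int minAvail = -1;
    for (int i = 0; i < 3; ++i) {
        if (r[i] < 0) continue;
        // A value shared by at least two neighbors is the majority.
        if (std::count(r, r + 3, r[i]) >= 2) return r[i];
        minAvail = (minAvail < 0) ? r[i] : std::min(minAvail, r[i]);
    }
    // No majority: minimum of the available values, or 0 if none is available.
    return (minAvail < 0) ? 0 : minAvail;
}
```

For the table above, LIST_0 has inputs (0, -1, -1) and yields 0, and LIST_1 has inputs (1, 1, 1) and yields 1.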

Merge – Additional cand. list

1. Combined bi-directional merge candidate (5 times)

[Figure: the uni-predictive vectors mvL0_A (List 0, Ref 0) and mvL1_B (List 1, Ref 0) around the current picture are combined into one bi-predictive candidate]

Before:

Merge   L0              L1
0       mvL0_A, ref 0   -
1       -               mvL1_B, ref 0
2
3
4

After:

Merge   L0              L1
0       mvL0_A, ref 0   -
1       -               mvL1_B, ref 0
2       mvL0_A, ref 0   mvL1_B, ref 0
3
4

2. Scaled bi-directional merge candidate (1 time)

[Figure: mvL0_A (ref 0) and its scaled version mvL0’_A (ref 0) around the current picture, together with mvL1_A (ref 1); pictures shown: List 0 Ref 0, List 0 Ref 1, List 1 Ref 0, List 1 Ref 1]

Before:

Merge   L0              L1
0       mvL0_A, ref 0   -
1       -               mvL1_A, ref 1
2
3
4

After:

Merge   L0              L1
0       mvL0_A, ref 0   -
1       -               mvL1_A, ref 1
2       mvL0_A, ref 0   mvL0’_A, ref 0
3
4


3. Zero vector merge

– Zero vector merge candidates are created by combining zero vector and refIdx

Before:

Merge   L0              L1
0       mvL0_A, ref 0   -
1       -               mvL1_A, ref 1
2       mvL0_A, ref 0   mvL1_A, ref 1
3
4

After:

Merge   L0              L1
0       mvL0_A, ref 0   -
1       -               mvL1_A, ref 1
2       mvL0_A, ref 0   mvL1_A, ref 1
3       (0,0), ref 0    (0,0), ref 0
4
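The combined bi-directional candidate of step 1 and the zero vector candidate of step 3 can be sketched with a small data structure. The types `Motion` and `MergeCand` and both function names are assumptions made for this example; the pairing order and the limit on the number of combinations used in JCTVC-F470 are not modeled.

```cpp
struct Mv { int x; int y; };

// Motion of one reference list; valid == false means the list is unused ("-").
struct Motion { bool valid; Mv mv; int refIdx; };

struct MergeCand { Motion l0; Motion l1; };

// Step 1: take the L0 motion of candidate a and the L1 motion of candidate b.
// Returns false when either part is missing, so no candidate is produced.
bool combineBi(const MergeCand& a, const MergeCand& b, MergeCand& out) {
    if (!a.l0.valid || !b.l1.valid) return false;
    out.l0 = a.l0;
    out.l1 = b.l1;
    return true;
}

// Step 3: zero vector for both lists with the given reference index.
MergeCand zeroCandidate(int refIdx) {
    const Motion zero = { true, {0, 0}, refIdx };
    const MergeCand cand = { zero, zero };
    return cand;
}
```

Combining entry 0 (L0 only) with entry 1 (L1 only) of the first table gives the bi-predictive entry 2; the reverse pairing produces nothing because entry 1 has no L0 motion.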

AMP (ASYMMETRIC MOTION PARTITION)

Asymmetric motion partition (AMP)

• Rectangular shape PU splitting of a block for inter prediction

• AMP is used from the size of 64x64 to 16x16 CU

• AMP improves the coding efficiency, since irregular image patterns can be represented more efficiently

2NxnU 2NxnD nLx2N nRx2N
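The four AMP modes split a 2Nx2N CU into two PUs at one quarter of the CU size. The slide names the modes but not the sizes, so the 1/4 : 3/4 split below is stated here as background knowledge about HEVC; `ampPartition` and `PuSize` are illustrative names.

```cpp
#include <string>

struct PuSize { int width; int height; };

// Fills the sizes of the first and second PU (top/bottom or left/right) for a
// square CU of cuSize samples. Returns false for an unknown mode name.
bool ampPartition(const std::string& mode, int cuSize, PuSize& first, PuSize& second) {
    const int q = cuSize / 4;  // the short side of the smaller PU
    if (mode == "2NxnU") {
        first.width = cuSize;      first.height = q;
        second.width = cuSize;     second.height = cuSize - q;
    } else if (mode == "2NxnD") {
        first.width = cuSize;      first.height = cuSize - q;
        second.width = cuSize;     second.height = q;
    } else if (mode == "nLx2N") {
        first.width = q;           first.height = cuSize;
        second.width = cuSize - q; second.height = cuSize;
    } else if (mode == "nRx2N") {
        first.width = cuSize - q;  first.height = cuSize;
        second.width = q;          second.height = cuSize;
    } else {
        return false;
    }
    return true;
}
```

For a 32x32 CU, 2NxnU gives a 32x8 PU on top of a 32x24 PU, and nRx2N gives a 24x32 PU to the left of an 8x32 PU.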

Asymmetric motion partition (AMP)

              Random access HE         Random access LC
              Y      U      V          Y      U      V
Class A       -0.9   -1.2   -0.9       -0.7   -0.7   -0.5
Class B       -0.9   -1.0   -1.0       -0.7   -0.7   -0.6
Class C       -0.9   -1.0   -1.1       -0.7   -0.9   -0.9
Class D       -0.8   -1.0   -0.9       -0.5   -0.7   -0.6
Class E
Overall       -0.9   -1.0   -1.0       -0.7   -0.7   -0.7
Enc Time[%]   144%                     151%
Dec Time[%]   99%                      99%

              Low delay (B) HE         Low delay (B) LC
              Y      U      V          Y      U      V
Class A
Class B       -1.1   -1.5   -1.5       -0.9   -0.8   -0.6
Class C       -1.0   -1.2   -1.3       -0.7   -0.6   -0.7
Class D       -1.1   -1.3   -1.5       -0.6   -0.5   -0.9
Class E       -2.3   -2.2   -2.4       -1.7   -1.1   -1.3
Overall       -1.3   -1.5   -1.6       -0.9   -0.7   -0.8
Enc Time[%]   144%                     150%
Dec Time[%]   99%                      99%

Table. Experimental result of AMP without encoding speed-up

              Random access HE         Random access LC
              Y      U      V          Y      U      V
Class A       -0.5   -0.8   -0.5       -0.4   -0.6   -0.2
Class B       -0.5   -0.8   -0.7       -0.4   -0.5   -0.5
Class C       -0.6   -0.8   -0.8       -0.5   -0.6   -0.7
Class D       -0.5   -0.9   -0.8       -0.4   -0.5   -0.6
Class E
Overall       -0.5   -0.8   -0.7       -0.4   -0.6   -0.5
Enc Time[%]   112%                     112%
Dec Time[%]   99%                      98%

              Low delay (B) HE         Low delay (B) LC
              Y      U      V          Y      U      V
Class A
Class B       -0.7   -1.1   -1.2       -0.5   -0.4   -0.3
Class C       -0.7   -1.0   -0.9       -0.4   -0.4   -0.7
Class D       -0.7   -1.2   -0.8       -0.5   -0.7   -0.2
Class E       -1.5   -1.9   -1.7       -1.0   -1.0   -0.9
Overall       -0.8   -1.2   -1.1       -0.6   -0.6   -0.5
Enc Time[%]   111%                     111%
Dec Time[%]   100%                     99%

Table. Experimental result of AMP with encoding speed-up

INTERPOLATION FILTER

Interpolation filter of H.264/AVC

• Quarter-pel (1/4) motion vector accuracy

– Cascaded filtering: 6-tap half-pel + bi-linear for luma

– Bi-linear for chroma (1/8th)

Integer-pel – no interpolation

Half-pel – 6-tap

Quarter-pel – 6-tap + bi-linear

Interpolation filter of HEVC

• Quarter-pel (1/4) motion vector accuracy

– 1-pass filter: 8-tap for both half-pel and quarter-pel positions

– 4-tap filter for chroma (1/8-pel)

Integer-pel – no interpolation

Half-pel – 8-tap

Quarter-pel – 8-tap

Interpolation filter

• Two modifications in HM4.0 and WD4.0

– Simplify the motion compensation process by removing rounding operations

– Ensure that all data after each of the vertical and horizontal filtering passes fits in 16-bit memory

– Advantage

• Simpler software

• Simpler text

• No difference in performance


• Integer samples

– Upper-case letters

• Fractional sample positions

– Lower-case letters

– For quarter sample luma interpolation

A-1,-1   A0,-1   a0,-1   b0,-1   c0,-1   A1,-1   A2,-1
A-1,0    A0,0    a0,0    b0,0    c0,0    A1,0    A2,0
d-1,0    d0,0    e0,0    f0,0    g0,0    d1,0
h-1,0    h0,0    i0,0    j0,0    k0,0    h1,0
n-1,0    n0,0    p0,0    q0,0    r0,0    n1,0
A-1,1    A0,1    a0,1    b0,1    c0,1    A1,1    A2,1
A-1,2    A0,2                            A1,2    A2,2


• Interpolation filter coefficients

– Luma

α      Filter(α)
1/4    { -1, 4, -10, 57, 19, -7, 3, -1 }
1/2    { -1, 4, -11, 40, 40, -11, 4, -1 }

– Chroma

α      Filter(α)
1/8    { -3, 60, 8, -1 }
1/4    { -4, 54, 16, -2 }
3/8    { -5, 46, 27, -4 }
1/2    { -4, 36, 36, -4 }


• Luma interpolation process (1D interpolation filter)

– For fractional positions a0,0, b0,0 and c0,0, horizontal 1D filter is used.

– For fractional positions d0,0, h0,0 and n0,0, vertical 1D filter is used.

– The input of the 1D interpolation function is the sample values at integer positions.

– The output is the interpolated value X at fractional position α.



• Example) 1/2 position b0,0

– 8-tap separable DCTIF coefficient of 1/2 position

{ -1, 4, -11, 40, 40, -11, 4, -1 }


$$b_{0,0} = \left( -A_{-3,0} + 4A_{-2,0} - 11A_{-1,0} + 40A_{0,0} + 40A_{1,0} - 11A_{2,0} + 4A_{3,0} - A_{4,0} + 32 \right) / 64$$
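The half-pel example above translates directly into code. The sketch below computes b0,0 for one sample, with the division by 64 written as a shift by 6 and a final clip to the valid sample range; `halfPelB` and its parameters are assumptions for illustration and do not reproduce the HM function.

```cpp
#include <algorithm>

// src points at A(0,0) in a row of integer samples; the filter reads
// src[-3] .. src[4]. maxVal is the largest valid sample value (255 for 8 bit).
int halfPelB(const int* src, int maxVal) {
    static const int coeff[8] = { -1, 4, -11, 40, 40, -11, 4, -1 };
    int sum = 0;
    for (int k = 0; k < 8; ++k) {
        sum += coeff[k] * src[k - 3];
    }
    const int val = (sum + 32) >> 6;  // rounding offset 32, then divide by 64
    return std::min(std::max(val, 0), maxVal);
}
```

Since the coefficients sum to 64, a flat area is reproduced unchanged, and a step from 0 to 255 between A(0,0) and A(1,0) gives the midpoint value 128.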


• Luma interpolation process (2D separable interpolation filter)

– For the remaining positions, the horizontal 1D filter is first applied to an extended block, and then the vertical 1D filter is applied.



• Example) 1/4 position e0,0

– 2D separable Interpolation

– 8×horizontal 1D filter + 1×vertical 1D filter



• 1D filtering

With rounding offset:

$$a_{0,0} = \left( -A_{-3,0} + 4A_{-2,0} - 10A_{-1,0} + 57A_{0,0} + 19A_{1,0} - 7A_{2,0} + 3A_{3,0} - A_{4,0} + \mathit{offset}_1 \right) \gg \mathit{shift}_1$$

Without rounding offset:

$$a_{0,0} = \left( -A_{-3,0} + 4A_{-2,0} - 10A_{-1,0} + 57A_{0,0} + 19A_{1,0} - 7A_{2,0} + 3A_{3,0} - A_{4,0} \right) \gg \mathit{shift}_1$$

• 2D filtering

– Intermediate value should be saved and processed

With rounding offset:

$$d1_{i,0} = -A_{i,-3} + 4A_{i,-2} - 10A_{i,-1} + 57A_{i,0} + 19A_{i,1} - 7A_{i,2} + 3A_{i,3} - A_{i,4}$$

$$e_{0,0} = \left( -d1_{-3,0} + 4\,d1_{-2,0} - 10\,d1_{-1,0} + 57\,d1_{0,0} + 19\,d1_{1,0} - 7\,d1_{2,0} + 3\,d1_{3,0} - d1_{4,0} + \mathit{offset}_2 \right) \gg \mathit{shift}_2$$

Without rounding offset:

$$e_{0,0} = \left( -a_{0,-3} + 4a_{0,-2} - 10a_{0,-1} + 57a_{0,0} + 19a_{0,1} - 7a_{0,2} + 3a_{0,3} - a_{0,4} \right) \gg \mathit{shift}_2$$
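The 2D case can be sketched as eight horizontal filter runs followed by one vertical run over the stored intermediates. To keep the example short it rounds only once at the end (both passes have a gain of 64, so the final shift is 12), which is a simplification and not the shift1/shift2 arrangement of HM; `filter8` and `quarterQuarterE` are hypothetical names.

```cpp
#include <algorithm>

static const int kQuarter[8] = { -1, 4, -10, 57, 19, -7, 3, -1 };

// 8-tap quarter-pel filter over p[-3 * stride] .. p[4 * stride], no shift.
int filter8(const int* p, int stride) {
    int sum = 0;
    for (int k = 0; k < 8; ++k) {
        sum += kQuarter[k] * p[(k - 3) * stride];
    }
    return sum;
}

// Computes e(0,0), the quarter/quarter position next to A(0,0).
// a00 points at A(0,0) inside a frame with the given stride; three samples
// to the left/above and four to the right/below must be readable.
int quarterQuarterE(const int* a00, int stride, int maxVal) {
    int inter[8];  // horizontally filtered a(0,-3) .. a(0,4), full precision
    for (int r = 0; r < 8; ++r) {
        inter[r] = filter8(a00 + (r - 3) * stride, 1);
    }
    const int sum = filter8(inter + 3, 1);  // vertical pass over the intermediates
    const int val = (sum + 2048) >> 12;     // single rounding, total gain 64 * 64
    return std::min(std::max(val, 0), maxVal);
}
```

The intermediate array is what the slide means by saving the intermediate values: the vertical pass needs the horizontally filtered samples of eight rows, not just the row that contains A(0,0).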

Interpolation filter – Example (modified version for seminar)

template<int N, bool isVertical, bool isFirst, bool isLast>
Void TComInterpolationFilter::filter(Short const *src, Int srcStride, Short *dst, Int dstStride,
                                     Int width, Int height, Short const *coeff)
{
  Int row, col;
  Int cStride = ( isVertical ) ? srcStride : 1;  // isVertical: true (vertical filtering), false (horizontal filtering)
  src -= ( N/2 - 1 ) * cStride;                  // N: 8 (luma), 4 (chroma)

  Int offset;
  Short maxVal;
  Int headRoom = IF_INTERNAL_PREC - (g_uiBitDepth + g_uiBitIncrement);  // IF_INTERNAL_PREC: 14, g_uiBitDepth + g_uiBitIncrement: 10 (HE), 8 (LC)
  Int shift = IF_FILTER_PREC;                    // IF_FILTER_PREC: 6

  // isFirst: whether this is the first filtering pass or not
  if ( isLast )
  { // last filtering
    shift += (isFirst) ? 0 : headRoom;           // isFirst: shift = 6, other case: shift = 6+4 (HE), 6+6 (LC)
    offset = 1 << (shift - 1);                   // isFirst: offset = 1<<5 = 32, other case: offset = 1<<9 (HE), 1<<11 (LC)
    offset += (isFirst) ? 0 : IF_INTERNAL_OFFS << IF_FILTER_PREC;  // !isFirst: offset = (1<<9) + (1<<11) (HE), (1<<11) + (1<<11) (LC)
    maxVal = g_uiIBDI_MAX;                       // maxVal: 1023 (HE), 255 (LC)
  }
  else
  { // other case
    shift -= (isFirst) ? headRoom : 0;           // isFirst: shift = 2 (HE), 0 (LC)
    offset = (isFirst) ? -IF_INTERNAL_OFFS << shift : 0;  // isFirst: -((1<<(IF_FILTER_PREC-1))<<2) = -(1<<7) (HE), -(1<<5) (LC)
    maxVal = 0;
  }

  for (row = 0; row < height; row++)
  {
    for (col = 0; col < width; col++)
    {
      // the tap loop is written out for the 8-tap (luma) case
      Int sum;
      sum  = src[ col + 0 * cStride] * coeff[0];
      sum += src[ col + 1 * cStride] * coeff[1];
      sum += src[ col + 2 * cStride] * coeff[2];
      sum += src[ col + 3 * cStride] * coeff[3];
      sum += src[ col + 4 * cStride] * coeff[4];
      sum += src[ col + 5 * cStride] * coeff[5];
      sum += src[ col + 6 * cStride] * coeff[6];
      sum += src[ col + 7 * cStride] * coeff[7];

      Short val = ( sum + offset ) >> shift;
      if ( isLast )
      { // clipping in last filtering
        val = ( val < 0 ) ? 0 : val;
        val = ( val > maxVal ) ? maxVal : val;
      }
      dst[col] = val;                            // store filtering output pixel
    }
    src += srcStride;
    dst += dstStride;
  }
}

Interpolation filter – Example. Half-pel

Example) 1/2 position b0,0

8-tap separable DCTIF coefficients of the 1/2 position: { -1, 4, -11, 40, 40, -11, 4, -1 }

isFirst = true; isLast = true; (uni-direction case)
shift = 6;
offset = 1<<(6-1) = 32;
maxVal = 1023 (HE), 255 (LC);
cStride = 1; (horizontal filtering)


Interpolation filter – Example. Quarter pel

Example) 1/4 position e0,0

8-tap separable DCTIF coefficients of the 1/4 position: { -1, 4, -10, 57, 19, -7, 3, -1 }

2D separable interpolation (1) Horizontal filtering

isFirst = true; isLast = false;
shift = 2 (HE), 0 (LC);
offset = -(1<<7) (HE), -(1<<5) (LC);
maxVal = 0;
cStride = 1; (horizontal filtering)


Interpolation filter – Example. Quarter pel

Example) 1/4 position e0,0

8-tap separable DCTIF coefficients of the 1/4 position: { -1, 4, -10, 57, 19, -7, 3, -1 }

2D separable interpolation (2) Vertical filtering

isFirst = false; isLast = true;
shift = 10 (HE), 12 (LC);
offset = (1<<9) + (1<<11) (HE), (1<<11) + (1<<11) (LC);
maxVal = 1023 (HE), 255 (LC);
cStride = srcStride; (vertical filtering)


THANK YOU
