hm inter prediction 111022 r3
HEVC Inter prediction
Kwangwoon University, Image Processing Systems Laboratory
2011-10-22 (SAT)
Contents
• Overview of inter prediction
• Inter prediction in HEVC
– GOP coding structure
– Advanced motion vector prediction (AMVP)
– Merge
– Asymmetric motion partition (AMP)
– Interpolation filter
OVERVIEW OF INTER PREDICTION
Overview of inter prediction
• The encoder forms a model of the current frame based on the samples of a previously transmitted frame
• The motion-compensated predicted frame is subtracted from the current frame to produce a residual ‘error’ frame
• Transform coding of the residual frame
Fig. Motion estimation against the previous frame gives the motion-compensated frame, which is subtracted (−) from the current frame to give the residual frame
Overview of inter prediction
• The goals of inter prediction
– ME creates a model of the current frame based on available data in one or more previously encoded frames to match the current frame as closely as possible
n-1 frame n frame
Overview of inter prediction
• Transmitted data
– Motion vector (PMV, MVD)
– Reference index (LIST_0/LIST_1)
– Prediction mode
– Residual data (quantized coefficients)
How?
GOP CODING STRUCTURE
GOP coding structure
• Temporal prediction structure
– All Intra (No temporal prediction is allowed)
– Low Delay (LD)
• The first picture shall be coded as IDR picture
• GPB (Generalized P and B) picture (on/off)
– Random access (RA)
• Hierarchical B structure shall be used for coding
• An IDR intra picture or a CDR (clean decoding refresh) picture shall be inserted cyclically, about once per second, as a random access point
GOP coding structure – Low delay
Fig. Low-delay coding structure. Picture 0 is an IDR or intra picture (QPI); pictures 1 to 8 follow in time order as GPB (generalized P and B) pictures.
QP per level: QPBL1 = QPI+1, QPBL2 = QPI+2, QPBL3 = QPI+3
GOP coding structure – Random access
Fig. Random-access coding structure over pictures 0 to 8. Picture 0 is an IDR or intra picture (QPI). The figure labels every picture with its POC and its coding order, and distinguishes GPB (generalized P and B) pictures, referenced B pictures and non-referenced B pictures.
QP per level: QPBL1 = QPI+1, QPBL2 = QPI+2, QPBL3 = QPI+3, QPBL4 = QPI+4
GOP coding structure – Random access
Variables:
  m_iHrchDepth = log2(GOP_size) + 1;
  iNumPicRcvd  = GOP_size;

for( iDepth = 0; iDepth < m_iHrchDepth; iDepth++ )
{
  iTimeOffset = 1 << (m_iHrchDepth - 1 - iDepth);  // first picture of this depth
  iStep       = iTimeOffset << 1;                  // distance between pictures of this depth
  for( ; iTimeOffset <= iNumPicRcvd; iTimeOffset += iStep )
  {
    compressSlice();
  }
}
Fig. The same random-access structure, with pictures grouped by hierarchy depth (Depth == 0 to Depth == 3)
* POC of the picture coded in each iteration: uiPOCCurr = iPOCLast - (iNumPicRcvd - iTimeOffset);
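The loop above can be traced with a small stand-alone sketch. It is not the HM code: the function name `hierarchicalCodingOrder` and its parameters are hypothetical, the GOP size is assumed to be a power of two, and `compressSlice()` is replaced by recording the POC computed with the formula on this slide.

```cpp
#include <vector>

// Returns the POCs of one GOP in coding order, following the depth loop on the slide.
// Assumptions: gopSize is a power of two; pocLast is the POC of the last picture of the GOP.
std::vector<int> hierarchicalCodingOrder(int gopSize, int pocLast) {
    int log2Gop = 0;
    while ((1 << log2Gop) < gopSize) ++log2Gop;
    const int hrchDepth = log2Gop + 1;   // m_iHrchDepth
    const int numPicRcvd = gopSize;      // iNumPicRcvd

    std::vector<int> order;
    for (int depth = 0; depth < hrchDepth; ++depth) {
        int timeOffset = 1 << (hrchDepth - 1 - depth);
        const int step = timeOffset << 1;
        for (; timeOffset <= numPicRcvd; timeOffset += step) {
            // uiPOCCurr = iPOCLast - (iNumPicRcvd - iTimeOffset)
            order.push_back(pocLast - (numPicRcvd - timeOffset));
        }
    }
    return order;
}
```

For a GOP of 8 ending at POC 8, this sketch yields one picture at depth 0, one at depth 1, two at depth 2 and four at depth 3.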
AMVP (ADVANCED MOTION VECTOR PREDICTION)
MV prediction of H.264/AVC
• Median of each component of MV
– No transmission overhead
• Slice-based use of temporal MV predictor
Fig. Spatial neighboring blocks A, B and C of the current block
$MV_x = \mathrm{MEDIAN}(A_x, B_x, C_x)$
$MV_y = \mathrm{MEDIAN}(A_y, B_y, C_y)$
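As a minimal sketch of the component-wise median above (the `Mv` struct and function names are hypothetical, and availability handling of the neighbors is left out):

```cpp
#include <algorithm>

struct Mv { int x; int y; };

// Median of three integers.
int median3(int a, int b, int c) {
    return std::max(std::min(a, b), std::min(std::max(a, b), c));
}

// Component-wise median of the neighbors A, B and C.
Mv medianPredictor(const Mv& a, const Mv& b, const Mv& c) {
    return Mv{ median3(a.x, b.x, c.x), median3(a.y, b.y, c.y) };
}
```

Because the decoder can repeat the same computation, no predictor index has to be transmitted, which is the "no transmission overhead" point above.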
MV prediction of HEVC
• Explicit signaling of MV predictor index
– Transmission overhead
• PU-based use of temporal MV predictor
Fig. Spatial AMVP candidates: A0, A1, B0, B1 and B2 around the current block
Fig. Temporal AMVP candidates: center and right-bottom positions of the co-located PU
AMVP
• Decoder receives
– ref_idx
– mvd
– mvp_idx
AMVP – Decoder side AMVP syntax
prediction_unit( x0, y0, log2CUSize ) {                         Descriptor
  if( skip_flag[ x0 ][ y0 ] ) {
    merge_idx[ x0 ][ y0 ]                                       ue(v) | ae(v)
  } else if( PredMode == MODE_INTRA ) {
    …
  } else { /* MODE_INTER */
    if( entropy_coding_mode_flag || PartMode != PART_2Nx2N )
      merge_flag[ x0 ][ y0 ]                                    u(1) | ae(v)
    if( merge_flag[ x0 ][ y0 ] ) {
      merge_idx[ x0 ][ y0 ]                                     ue(v) | ae(v)
    } else {
      if( slice_type == B ) {
        if( !entropy_coding_mode_flag ) {
          combined_inter_pred_ref_idx                           ue(v)
          if( combined_inter_pred_ref_idx == MaxPredRef )
            inter_pred_flag[ x0 ][ y0 ]                         ue(v)
        } else
          inter_pred_flag[ x0 ][ y0 ]                           ue(v) | ae(v)
      }
      if( inter_pred_flag[ x0 ][ y0 ] == Pred_LC ) {
        if( num_ref_idx_lc_active_minus1 > 0 ) {
          if( !entropy_coding_mode_flag ) {
            if( combined_inter_pred_ref_idx == MaxPredRef )
              ref_idx_lc_minus4[ x0 ][ y0 ]                     ue(v)
          } else
            ref_idx_lc[ x0 ][ y0 ]                              ae(v)
        }
        mvd_lc[ x0 ][ y0 ][ 0 ]                                 se(v) | ae(v)
        mvd_lc[ x0 ][ y0 ][ 1 ]                                 se(v) | ae(v)
        mvp_idx_lc[ x0 ][ y0 ]                                  ue(v) | ae(v)
      }
AMVP – Decoder side AMVP syntax
      else { /* Pred_L0 or Pred_BI */
        if( num_ref_idx_l0_active_minus1 > 0 ) {
          if( !entropy_coding_mode_flag ) {
            if( combined_inter_pred_ref_idx == MaxPredRef )
              ref_idx_l0_minusX[ x0 ][ y0 ]                     ue(v)
          } else
            ref_idx_l0_minusX[ x0 ][ y0 ]                       ue(v) | ae(v)
        }
        mvd_l0[ x0 ][ y0 ][ 0 ]                                 se(v) | ae(v)
        mvd_l0[ x0 ][ y0 ][ 1 ]                                 se(v) | ae(v)
        mvp_idx_l0[ x0 ][ y0 ]                                  ue(v) | ae(v)
      }
      if( inter_pred_flag[ x0 ][ y0 ] == Pred_BI ) {
        if( num_ref_idx_l1_active_minus1 > 0 ) {
          if( !entropy_coding_mode_flag ) {
            if( combined_inter_pred_ref_idx == MaxPredRef )
              ref_idx_l1_minusX[ x0 ][ y0 ]                     ue(v)
          } else
            ref_idx_l1[ x0 ][ y0 ]                              ue(v) | ae(v)
        }
        mvd_l1[ x0 ][ y0 ][ 0 ]                                 se(v) | ae(v)
        mvd_l1[ x0 ][ y0 ][ 1 ]                                 se(v) | ae(v)
        mvp_idx_l1[ x0 ][ y0 ]                                  ue(v) | ae(v)
      }
    }
  }
}
AMVP – Encoder side processing
1. Search for three candidates (spatial: 2, temporal: 1)
2. Remove redundant MVPs
3. Additional candidate list
– Zero vector candidates are created by combining a zero vector and refIdx
4. Decision of the best MVP before motion estimation (starting point for ME)
– Distortion: SAD
– Rate: truncated unary code (MVP index)
– RDCost = Distortion + ((Bits*λ + 0.5) >> 16);
5. Decision of the best MVP candidate after motion estimation
– Best MVP index: the index i giving the smallest mvd, where mvd = Best_MV – MV of mvp_idx[i]

mvp_idx   bin
0         0
1         10
2         110
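Step 5 can be illustrated as follows. This is only a sketch under stated assumptions: the names `Mv` and `selectMvpIndex` are hypothetical, and "smallest mvd" is approximated by the sum of absolute mvd components, whereas an encoder would more likely compare the actual bit cost of the mvd and the index.

```cpp
#include <cstdlib>
#include <vector>

struct Mv { int x; int y; };

// Returns the index of the candidate whose mvd = bestMv - candidate is smallest,
// measured here as |mvd.x| + |mvd.y| (an assumed stand-in for the mvd bit cost).
// Returns -1 for an empty candidate list. Ties keep the lower index.
int selectMvpIndex(const Mv& bestMv, const std::vector<Mv>& cands) {
    int bestIdx = -1;
    int bestCost = 0;
    for (std::size_t i = 0; i < cands.size(); ++i) {
        const int cost = std::abs(bestMv.x - cands[i].x) + std::abs(bestMv.y - cands[i].y);
        if (bestIdx < 0 || cost < bestCost) {
            bestIdx = static_cast<int>(i);
            bestCost = cost;
        }
    }
    return bestIdx;
}
```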
AMVP – Spatial AMVP candidates
• Spatial AMVP candidates
– mvLxA: Left spatial candidates
• Derivation order: A0 ⇒ A1
• First available MV
– 1st: scan without scaling (vec1, vec2)
– 2nd: scan with scaling (vec3, vec4)
– mvLxB: Above spatial candidates
• Derivation order: B0 ⇒ B1 ⇒ B2
• First available MV
– 1st: scan without scaling (vec1, vec2)
– 2nd: scan with scaling, if scaling wasn’t used before (vec3, vec4)
Fig. Spatial AMVP candidates
AMVP – Spatial AMVP candidates
• Spatial AMVP candidates
– Four candidates can be derived at each neighboring PU
• vec1: same reference index, same list
• vec2: same reference index, different list
• vec3: different reference index, same list
• vec4: different reference index, different list
Fig. The four candidate types (1 to 4) of neighboring block b relative to the current block, drawn on a time axis of pictures i, j, k, l and m
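The four cases can be written as a small classifier. This is a sketch that follows the slide's wording literally (comparison of reference index and list); the names are hypothetical, and a real codec may compare the POC of the referenced pictures instead.

```cpp
enum class CandType { Vec1 = 1, Vec2 = 2, Vec3 = 3, Vec4 = 4 };

// Classifies a neighboring MV relative to the target reference of the current PU.
CandType classifyCandidate(int curRefIdx, int curList, int nbRefIdx, int nbList) {
    const bool sameRef = (curRefIdx == nbRefIdx);
    const bool sameList = (curList == nbList);
    if (sameRef) return sameList ? CandType::Vec1 : CandType::Vec2;
    return sameList ? CandType::Vec3 : CandType::Vec4;
}

// vec1 and vec2 are found in the first scan (no scaling);
// vec3 and vec4 are found in the second scan and need MV scaling.
bool needsScaling(CandType t) {
    return t == CandType::Vec3 || t == CandType::Vec4;
}
```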
AMVP – Temporal AMVP candidate
• Temporal AMVP candidate
– Derivation order:
1. Right-bottom position of co-located PU
2. Center position of co-located PU
Fig. Temporal AMVP candidates: center and right-bottom positions of the co-located PU
Fig. Current picture, co-located picture and reference picture, showing mvL0 and mvL1 of the current block and mvL1Col of the co-located partition
AMVP - MV Scaling
• Scaling of MV predictor has been modified (JCTVC-F142)
– HM3 rounds half towards plus infinity
– Proposed scheme rounds half towards zero
HM version   Modification
HM3          $(DistScaleFactor \times mv + 128) \gg 8$
HM4          $\mathrm{Sign}(DistScaleFactor \times mv) \times ((|DistScaleFactor \times mv| + 127) \gg 8)$
DistScaleFactor: scaling factor
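The difference between the two rounding rules shows up only at exact halves. The sketch below uses hypothetical function names and assumes an arithmetic right shift for negative values (guaranteed from C++20, and the usual behavior of earlier compilers).

```cpp
#include <cstdlib>

// HM3 rule: a half is rounded towards plus infinity.
int scaleMvHm3(int distScaleFactor, int mv) {
    return (distScaleFactor * mv + 128) >> 8;
}

// HM4 rule: a half is rounded towards zero, symmetric for positive and negative MVs.
int scaleMvHm4(int distScaleFactor, int mv) {
    const int prod = distScaleFactor * mv;
    const int sign = (prod > 0) - (prod < 0);
    return sign * ((std::abs(prod) + 127) >> 8);
}
```

With a scale factor of 128 (that is, 0.5 in 8-bit fixed point), mv = 3 scales to 1.5: the HM3 rule gives 2 and the HM4 rule gives 1. For mv = -3 both rules give -1, so only the HM4 rule treats +1.5 and -1.5 symmetrically.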
MERGE
Merge
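• Decoder receives
– merge_flag
– merge_idx
• ref_idx, mvd and mvp_idx are not transmitted; the motion data are taken from the selected merge candidate
Fig. Merge candidates: spatial neighbors A, B, C, D and E of the current block, plus a temporal candidate from the center or right-bottom position of the co-located PU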
Merge – Decoder side Merge skip syntax
• Merge skip
coding_unit( x0, y0, log2CUSize ) {                             Descriptor
  if( entropy_coding_mode_flag && slice_type != I )
    skip_flag[ x0 ][ y0 ]                                       u(1) | ae(v)
  if( skip_flag[ x0 ][ y0 ] )
    prediction_unit( x0, y0, log2CUSize, log2CUSize, 0, 0 )
  else {
    …
  }
}
prediction_unit( x0, y0, log2CUSize ) {                         Descriptor
  if( skip_flag[ x0 ][ y0 ] ) {
    merge_idx[ x0 ][ y0 ]                                       ue(v) | ae(v)
  } else if( PredMode == MODE_INTRA ) {
    …
  } else { /* MODE_INTER */
    …
  }
}
Merge – Decoder side Merge syntax
• General case - merge
prediction_unit( x0, y0, log2CUSize ) {                         Descriptor
  if( skip_flag[ x0 ][ y0 ] ) {
    merge_idx[ x0 ][ y0 ]                                       ue(v) | ae(v)
  } else if( PredMode == MODE_INTRA ) {
    …
  } else { /* MODE_INTER */
    if( entropy_coding_mode_flag || PartMode != PART_2Nx2N )
      merge_flag[ x0 ][ y0 ]                                    u(1) | ae(v)
    if( merge_flag[ x0 ][ y0 ] ) {
      merge_idx[ x0 ][ y0 ]                                     ue(v) | ae(v)
    } else {
      …
    }
  }
}
Merge – Encoder side processing
1. Search for five candidates
– Output: Mv, RefIdx, Predflag for LIST_0/LIST_1
– S0, S1, S2, S3: Spatial candidates
– Col: Temporal candidate
2. Remove redundant candidates
3. Additional candidate list (JCTVC-F470)
– Combined bi-directional merge candidate (5 times)
– Scaled bi-directional merge candidate (1 time)
– Zero vector merge candidate
4. Decision of the best MRG candidate
Candidate order: S0, S1, S2, S3, Col

merge_idx   bin
0           0
1           10
2           110
3           1110
4           1111
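The bin strings in the table are a truncated unary code with a maximum index of 4. A minimal sketch (the function name is hypothetical, and it returns the bins as text rather than writing them to an entropy coder):

```cpp
#include <cstddef>
#include <string>

// Truncated unary binarization: idx ones followed by a terminating zero,
// except that the zero is dropped when idx equals maxIdx.
std::string truncatedUnary(int idx, int maxIdx) {
    std::string bins(static_cast<std::size_t>(idx), '1');
    if (idx < maxIdx) bins += '0';
    return bins;
}
```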
Merge – Spatial merge candidates
• Spatial merge candidates (4 candidates)
– Derivation Order: A, B, C, D, E
Fig. Spatial merge candidates
Merge – Temporal merge candidate
• refIdx derivation for merge TMVP (JCTVC-E481)
– Decide three refIdx
• refIdxLeft: A
• refIdxAbove: B
• refIdxCorner: C or D or E
– Take the majority of them
– If none of the three is available
• refIdx = 0
– Otherwise
• Use the minimum of the available refIdx values
• Derivation of temporal merge candidate
– Same process as TMVP
Merge – Temporal merge candidate
• Example) Decision of reference frame
Fig. Spatial neighbors A, B, C, D and E of the current block; for the current PU of this example, A, B and E are shown and C and D are NULL
ex) second 4x8 PU in 8x8 CU

Neighbor   LIST_0 RefIdx   LIST_1 RefIdx
A          0               1
B          -1              1
C          NULL            NULL
D          NULL            NULL
E          -1              1

               LIST_0   LIST_1
refIdxLeft     0        1
refIdxAbove    -1       1
refIdxCorner   -1       1
Result         0        1
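One possible reading of the rule and of the example above, for a single reference list, is sketched below. The function name is hypothetical and the tie-breaking is an assumption: a value shared by at least two available neighbors wins, otherwise the minimum available value is used, and 0 is used when nothing is available. The normative text may order these checks differently.

```cpp
#include <algorithm>
#include <array>

// refIdx values below 0 mean "not available".
int deriveMergeTmvpRefIdx(int refIdxLeft, int refIdxAbove, int refIdxCorner) {
    const std::array<int, 3> cand = { refIdxLeft, refIdxAbove, refIdxCorner };
    int minAvail = -1;
    for (int r : cand) {
        if (r < 0) continue;                                          // skip unavailable
        if (std::count(cand.begin(), cand.end(), r) >= 2) return r;   // majority
        if (minAvail < 0 || r < minAvail) minAvail = r;               // track minimum
    }
    return (minAvail < 0) ? 0 : minAvail;
}
```

Applied to the example, LIST_0 (0, -1, -1) gives 0 and LIST_1 (1, 1, 1) gives 1, matching the result row.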
Merge – Additional cand. list
1. Combined bi-directional merge candidate (5 times)
Fig. Around the current picture (Cur), the uni-predictive motion mvL0_A (List 0, Ref 0) and mvL1_B (List 1, Ref 0) is combined into one bi-predictive candidate

Before:
Merge   L0              L1
0       mvL0_A, ref 0   -
1       -               mvL1_B, ref 0
2
3
4

After:
Merge   L0              L1
0       mvL0_A, ref 0   -
1       -               mvL1_B, ref 0
2       mvL0_A, ref 0   mvL1_B, ref 0
3
4
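The table update can be sketched as below. All type and function names are hypothetical; the sketch only pairs the L0 motion of one candidate with the L1 motion of another, and leaves out any further checks a real encoder or decoder applies (for example rejecting a pair that points to the same picture with the same MV).

```cpp
struct Mv { int x; int y; };

struct MergeCand {
    bool useL0; Mv mvL0; int refIdxL0;
    bool useL1; Mv mvL1; int refIdxL1;
};

// Helpers building uni-predictive candidates.
MergeCand uniL0(Mv mv, int refIdx) { return MergeCand{ true, mv, refIdx, false, Mv{0, 0}, -1 }; }
MergeCand uniL1(Mv mv, int refIdx) { return MergeCand{ false, Mv{0, 0}, -1, true, mv, refIdx }; }

// Takes L0 motion from a and L1 motion from b.
// Returns a candidate with both flags false when the pair cannot be combined.
MergeCand combineBi(const MergeCand& a, const MergeCand& b) {
    if (!a.useL0 || !b.useL1) {
        return MergeCand{ false, Mv{0, 0}, -1, false, Mv{0, 0}, -1 };
    }
    return MergeCand{ true, a.mvL0, a.refIdxL0, true, b.mvL1, b.refIdxL1 };
}
```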
Merge – Additional cand. list
2. Scaled bi-directional merge candidate (1 time)
Fig. Around the current picture (Cur): mvL0_A (ref 0), its scaled version mvL0’_A (ref 0) and mvL1_A (ref 1), with reference pictures List 0 Ref 0, List 0 Ref 1, List 1 Ref 0 and List 1 Ref 1

Before:
Merge   L0              L1
0       mvL0_A, ref 0   -
1       -               mvL1_A, ref 1
2
3
4

After:
Merge   L0              L1
0       mvL0_A, ref 0   -
1       -               mvL1_A, ref 1
2       mvL0_A, ref 0   mvL0’_A, ref 0
3
4
Merge – Additional cand. list
3. Zero vector merge
– Zero vector merge candidates are created by combining zero vector and refIdx
Before:
Merge   L0              L1
0       mvL0_A, ref 0   -
1       -               mvL1_A, ref 1
2       mvL0_A, ref 0   mvL1_A, ref 1
3
4

After:
Merge   L0              L1
0       mvL0_A, ref 0   -
1       -               mvL1_A, ref 1
2       mvL0_A, ref 0   mvL1_A, ref 1
3       (0,0), ref 0    (0,0), ref 0
4
AMP (ASYMMETRIC MOTION PARTITION)
Asymmetric motion partition (AMP)
• Rectangular shape PU splitting of a block for inter prediction
• AMP is used for CU sizes from 64x64 down to 16x16
• AMP improves the coding efficiency for irregular image patterns
2NxnU 2NxnD nLx2N nRx2N
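The four AMP shapes split a 2Nx2N CU into a quarter and a three-quarter part. The sketch below computes the two PU sizes; the enum and function names are hypothetical, and `partIdx` 0 is assumed to be the upper or left PU.

```cpp
enum class AmpMode { Part2NxnU, Part2NxnD, PartnLx2N, PartnRx2N };

struct PuSize { int width; int height; };

// Size of PU partIdx (0: upper/left, 1: lower/right) in a cuSize x cuSize CU.
PuSize ampPuSize(AmpMode mode, int cuSize, int partIdx) {
    const int quarter = cuSize / 4;
    const int rest = cuSize - quarter;
    const bool first = (partIdx == 0);
    switch (mode) {
        case AmpMode::Part2NxnU: return PuSize{ cuSize, first ? quarter : rest };  // small PU on top
        case AmpMode::Part2NxnD: return PuSize{ cuSize, first ? rest : quarter };  // small PU at bottom
        case AmpMode::PartnLx2N: return PuSize{ first ? quarter : rest, cuSize };  // small PU on the left
        case AmpMode::PartnRx2N: return PuSize{ first ? rest : quarter, cuSize };  // small PU on the right
    }
    return PuSize{ cuSize, cuSize };  // not reached for valid modes
}
```

For a 32x32 CU, 2NxnU gives a 32x8 PU above a 32x24 PU.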
Asymmetric motion partition (AMP)
               Random access HE       Random access LC
               Y      U      V        Y      U      V
Class A        -0.9   -1.2   -0.9     -0.7   -0.7   -0.5
Class B        -0.9   -1.0   -1.0     -0.7   -0.7   -0.6
Class C        -0.9   -1.0   -1.1     -0.7   -0.9   -0.9
Class D        -0.8   -1.0   -0.9     -0.5   -0.7   -0.6
Class E        -      -      -        -      -      -
Overall        -0.9   -1.0   -1.0     -0.7   -0.7   -0.7
Enc Time[%]    144%                   151%
Dec Time[%]    99%                    99%

               Low delay (B) HE       Low delay (B) LC
               Y      U      V        Y      U      V
Class A        -      -      -        -      -      -
Class B        -1.1   -1.5   -1.5     -0.9   -0.8   -0.6
Class C        -1.0   -1.2   -1.3     -0.7   -0.6   -0.7
Class D        -1.1   -1.3   -1.5     -0.6   -0.5   -0.9
Class E        -2.3   -2.2   -2.4     -1.7   -1.1   -1.3
Overall        -1.3   -1.5   -1.6     -0.9   -0.7   -0.8
Enc Time[%]    144%                   150%
Dec Time[%]    99%                    99%
Table. Experimental result of AMP without encoding speed-up
               Random access HE       Random access LC
               Y      U      V        Y      U      V
Class A        -0.5   -0.8   -0.5     -0.4   -0.6   -0.2
Class B        -0.5   -0.8   -0.7     -0.4   -0.5   -0.5
Class C        -0.6   -0.8   -0.8     -0.5   -0.6   -0.7
Class D        -0.5   -0.9   -0.8     -0.4   -0.5   -0.6
Class E        -      -      -        -      -      -
Overall        -0.5   -0.8   -0.7     -0.4   -0.6   -0.5
Enc Time[%]    112%                   112%
Dec Time[%]    99%                    98%

               Low delay (B) HE       Low delay (B) LC
               Y      U      V        Y      U      V
Class A        -      -      -        -      -      -
Class B        -0.7   -1.1   -1.2     -0.5   -0.4   -0.3
Class C        -0.7   -1.0   -0.9     -0.4   -0.4   -0.7
Class D        -0.7   -1.2   -0.8     -0.5   -0.7   -0.2
Class E        -1.5   -1.9   -1.7     -1.0   -1.0   -0.9
Overall        -0.8   -1.2   -1.1     -0.6   -0.6   -0.5
Enc Time[%]    111%                   111%
Dec Time[%]    100%                   99%
Table. Experimental result of AMP with encoding speed-up
INTERPOLATION FILTER
Interpolation filter of H.264/AVC
• Quarter-sample (1/4) motion vector accuracy
– Cascaded filtering for luma: 6-tap filter for half-pel, then bi-linear for quarter-pel
– Bi-linear for chroma (1/8 accuracy)
Integer-pel – no interpolation
Half-pel – 6-tap
Quarter-pel – 6-tap + bi-linear
Interpolation filter of HEVC
• Quarter-sample (1/4) motion vector accuracy
– 1-pass filter for luma: 8-tap for both 1/2 and 1/4 pel
– 4-tap filter for chroma (1/8 accuracy)
Integer-pel – no interpolation
Half-pel – 8-tap
Quarter-pel – 8-tap
Interpolation filter
• Two modifications in HM4.0 and WD4.0
– The motion compensation process is simplified by removing rounding operations
– All data after each of the vertical and horizontal filtering passes is guaranteed to fit in 16-bit memory
– Advantages
• Simpler software
• Simpler text
• No difference in performance
Interpolation filter
• Integer samples
– Upper-case letters
• Fractional sample positions
– Lower-case letters
– For quarter sample luma interpolation
A-1,-1   A0,-1   a0,-1   b0,-1   c0,-1   A1,-1   A2,-1
A-1,0    A0,0    a0,0    b0,0    c0,0    A1,0    A2,0
d-1,0    d0,0    e0,0    f0,0    g0,0    d1,0
h-1,0    h0,0    i0,0    j0,0    k0,0    h1,0
n-1,0    n0,0    p0,0    q0,0    r0,0    n1,0
A-1,1    A0,1    a0,1    b0,1    c0,1    A1,1    A2,1
A-1,2    A0,2                            A1,2    A2,2
Interpolation filter
• Interpolation filter coefficients
– Luma (8-tap)
α     Filter(α)
1/4   { -1, 4, -10, 57, 19, -7, 3, -1 }
1/2   { -1, 4, -11, 40, 40, -11, 4, -1 }
– Chroma (4-tap)
α     Filter(α)
1/8   { -3, 60, 8, -1 }
1/4   { -4, 54, 16, -2 }
3/8   { -5, 46, 27, -4 }
1/2   { -4, 36, 36, -4 }
Interpolation filter
• Luma interpolation process (1D interpolation filter)
– For fractional positions a0,0, b0,0 and c0,0, horizontal 1D filter is used.
– For fractional positions d0,0, h0,0 and n0,0, vertical 1D filter is used.
– The input of 1D interpolation function is integer position values.
– The output is interpolated value X, which has fractional position α.
Interpolation filter
• Example) 1/2 position b0,0
– 8-tap separable DCTIF coefficient of 1/2 position
{ -1, 4, -11, 40, 40, -11, 4, -1 }
$b_{0,0} = (-A_{-3,0} + 4A_{-2,0} - 11A_{-1,0} + 40A_{0,0} + 40A_{1,0} - 11A_{2,0} + 4A_{3,0} - A_{4,0} + 32) / 64$
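The half-sample formula can be checked with a short sketch. The function name and the row-vector interface are hypothetical; the caller is assumed to provide three integer samples to the left of x and four to the right, and the division by 64 is written as a right shift by 6 followed by clipping.

```cpp
#include <algorithm>
#include <vector>

// Half-sample b between row[x] and row[x + 1], using the 8-tap filter above.
// Assumes 3 <= x and x + 4 < row.size(); maxVal is 255 for 8-bit samples.
int halfPelB(const std::vector<int>& row, int x, int maxVal) {
    static const int kHalf[8] = { -1, 4, -11, 40, 40, -11, 4, -1 };
    int sum = 0;
    for (int k = 0; k < 8; ++k) {
        sum += kHalf[k] * row[x - 3 + k];   // taps cover A(-3,0) .. A(4,0)
    }
    const int val = (sum + 32) >> 6;        // rounding offset 32, filter gain 64
    return std::min(std::max(val, 0), maxVal);
}
```

On a flat row the output equals the input value, because the coefficients sum to 64. On a step from 0 to 255 placed between the two center taps, the result is 128.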
Interpolation filter
• Luma interpolation process (2D separable interpolation filter)
– For remaining positions first horizontal 1D filter is applied for extended block, and then vertical 1D filter is used.
Interpolation filter
• Example) 1/4 position e0,0
– 2D separable Interpolation
– 8×horizontal 1D filter + 1×vertical 1D filter
Interpolation filter
• 1D filtering
$a_{0,0} = (-A_{-3,0} + 4A_{-2,0} - 10A_{-1,0} + 57A_{0,0} + 19A_{1,0} - 7A_{2,0} + 3A_{3,0} - A_{4,0} + \mathit{offset1}) \gg \mathit{shift1}$
$a_{0,0} = (-A_{-3,0} + 4A_{-2,0} - 10A_{-1,0} + 57A_{0,0} + 19A_{1,0} - 7A_{2,0} + 3A_{3,0} - A_{4,0}) \gg \mathit{shift1}$
• 2D filtering
– Intermediate value should be saved and processed
$d1_{i,0} = -A_{i,-3} + 4A_{i,-2} - 10A_{i,-1} + 57A_{i,0} + 19A_{i,1} - 7A_{i,2} + 3A_{i,3} - A_{i,4}$
$e_{0,0} = (-d1_{-3,0} + 4\,d1_{-2,0} - 10\,d1_{-1,0} + 57\,d1_{0,0} + 19\,d1_{1,0} - 7\,d1_{2,0} + 3\,d1_{3,0} - d1_{4,0} + \mathit{offset2}) \gg \mathit{shift2}$
$e_{0,0} = (-a_{0,-3} + 4a_{0,-2} - 10a_{0,-1} + 57a_{0,0} + 19a_{0,1} - 7a_{0,2} + 3a_{0,3} - a_{0,4}) \gg \mathit{shift2}$
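The separable 2D case can be sketched for 8-bit samples as eight horizontal passes followed by one vertical pass. This is a simplified illustration, not the HM routine: the names are hypothetical, the first pass keeps the unrounded sums (shift 0), the second pass shifts by 12 with rounding, and the internal offset used by HM is left out because it cancels between the two passes.

```cpp
#include <algorithm>
#include <vector>

using Image = std::vector<std::vector<int>>;  // img[y][x], 8-bit samples assumed

// Sample e at quarter position (1/4, 1/4) relative to integer sample (x, y).
// Assumes 3 <= x, y and that x + 4, y + 4 are inside the image.
int quarterQuarterE(const Image& img, int x, int y) {
    static const int kQuarter[8] = { -1, 4, -10, 57, 19, -7, 3, -1 };

    // First pass: horizontal filter on rows y-3 .. y+4.
    // For 8-bit input every sum lies within [-4845, 21165], so it fits in 16 bits.
    int inter[8];
    for (int j = 0; j < 8; ++j) {
        int sum = 0;
        for (int k = 0; k < 8; ++k) sum += kQuarter[k] * img[y - 3 + j][x - 3 + k];
        inter[j] = sum;
    }

    // Second pass: vertical filter on the intermediate column, total gain 64 * 64.
    int sum = 0;
    for (int j = 0; j < 8; ++j) sum += kQuarter[j] * inter[j];
    const int val = (sum + (1 << 11)) >> 12;
    return std::min(std::max(val, 0), 255);
}
```

Keeping the intermediate values unrounded is what "intermediate value should be saved and processed" refers to: rounding happens once, after the second pass.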
Interpolation filter – Example
(modified version for seminar)
template<int N, bool isVertical, bool isFirst, bool isLast>
Void TComInterpolationFilter::filter(Short const *src, Int srcStride, Short *dst, Int dstStride,
                                     Int width, Int height, Short const *coeff)
{
  Int row, col;
  Int cStride = ( isVertical ) ? srcStride : 1;  // vertical filtering steps by rows, horizontal filtering by samples
  src -= ( N/2 - 1 ) * cStride;                  // N: 8 (luma), 4 (chroma)

  Int offset;
  Short maxVal;
  Int headRoom = IF_INTERNAL_PREC - (g_uiBitDepth + g_uiBitIncrement);  // IF_INTERNAL_PREC: 14; g_uiBitDepth + g_uiBitIncrement: 10 (HE), 8 (LC)
  Int shift = IF_FILTER_PREC;                    // IF_FILTER_PREC: 6

  if ( isLast )
  { // last filtering pass
    shift += (isFirst) ? 0 : headRoom;           // isFirst: shift = 6; otherwise 6+4 = 10 (HE), 6+6 = 12 (LC)
    offset = 1 << (shift - 1);                   // isFirst: 1<<5 = 32; otherwise 1<<9 (HE), 1<<11 (LC)
    offset += (isFirst) ? 0 : IF_INTERNAL_OFFS << IF_FILTER_PREC;  // second pass: add back the offset removed in the first pass
    maxVal = g_uiIBDI_MAX;                       // maxVal: 1023 (HE), 255 (LC)
  }
  else
  { // not the last pass
    shift -= (isFirst) ? headRoom : 0;           // isFirst: shift = 2 (HE), 0 (LC)
    offset = (isFirst) ? -IF_INTERNAL_OFFS << shift : 0;  // first pass: remove IF_INTERNAL_OFFS from the intermediate value
    maxVal = 0;
  }

  for (row = 0; row < height; row++)
  {
    for (col = 0; col < width; col++)
    {
      Int sum;
      sum  = src[ col + 0 * cStride] * coeff[0];
      sum += src[ col + 1 * cStride] * coeff[1];
      sum += src[ col + 2 * cStride] * coeff[2];
      sum += src[ col + 3 * cStride] * coeff[3];
      sum += src[ col + 4 * cStride] * coeff[4];
      sum += src[ col + 5 * cStride] * coeff[5];
      sum += src[ col + 6 * cStride] * coeff[6];
      sum += src[ col + 7 * cStride] * coeff[7];

      Short val = ( sum + offset ) >> shift;
      if ( isLast )
      { // clipping in the last filtering pass
        val = ( val < 0 ) ? 0 : val;
        val = ( val > maxVal ) ? maxVal : val;
      }
      dst[col] = val;                            // store the filtered output sample
    }
    src += srcStride;
    dst += dstStride;
  }
}
Interpolation filter – Example. Half-pel
Example) 1/2 position b0,0
– 8-tap separable DCTIF coefficients of the 1/2 position: { -1, 4, -11, 40, 40, -11, 4, -1 }
– isFirst = true; isLast = true; (uni-direction case)
– shift = 6; offset = 1<<(6-1) = 32;
– maxVal = 1023 (HE), 255 (LC);
– cStride = 1; (horizontal filtering)
Interpolation filter – Example. Quarter pel
Example) 1/4 position e0,0
– Coefficients of the 1/4 position: { -1, 4, -10, 57, 19, -7, 3, -1 }
– 2D separable interpolation (1) Horizontal filtering
– isFirst = true; isLast = false;
– shift = 2 (HE), 0 (LC); offset = -IF_INTERNAL_OFFS << shift;
– maxVal = 0;
– cStride = 1; (horizontal filtering)
Interpolation filter – Example. Quarter pel
Example) 1/4 position e0,0
– Coefficients of the 1/4 position: { -1, 4, -10, 57, 19, -7, 3, -1 }
– 2D separable interpolation (2) Vertical filtering
– isFirst = false; isLast = true;
– shift = 10 (HE), 12 (LC);
– offset = (1 << (shift-1)) + (IF_INTERNAL_OFFS << IF_FILTER_PREC), where 1 << (shift-1) is 1<<9 (HE), 1<<11 (LC);
– maxVal = 1023 (HE), 255 (LC);
– cStride = srcStride; (vertical filtering)
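The shift and offset values of the three examples can be reproduced with a stand-alone version of the parameter setup in filter(). The struct and function names are hypothetical. The constants 14 and 6 come from the slide comments; the value used for IF_INTERNAL_OFFS, 1 << (IF_INTERNAL_PREC - 1), is an assumption, since the slides do not define it.

```cpp
struct StageParams { int shift; int offset; };

const int kInternalPrec = 14;                        // IF_INTERNAL_PREC (from the slide)
const int kFilterPrec = 6;                           // IF_FILTER_PREC (from the slide)
const int kInternalOffs = 1 << (kInternalPrec - 1);  // IF_INTERNAL_OFFS: assumed value

// Mirrors the isFirst / isLast branches of filter() for a given sample bit depth.
StageParams stageParams(bool isFirst, bool isLast, int bitDepth) {
    const int headRoom = kInternalPrec - bitDepth;
    int shift = kFilterPrec;
    int offset = 0;
    if (isLast) {
        shift += isFirst ? 0 : headRoom;
        offset = 1 << (shift - 1);                            // rounding term
        offset += isFirst ? 0 : (kInternalOffs << kFilterPrec);
    } else {
        shift -= isFirst ? headRoom : 0;
        offset = isFirst ? -(kInternalOffs << shift) : 0;
    }
    return StageParams{ shift, offset };
}
```

With a bit depth of 10 (HE) the single-pass case gives shift 6 and offset 32, the first pass of the 2D case gives shift 2, and the second pass gives shift 10; with a bit depth of 8 (LC) the two passes use shifts 0 and 12.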
THANK YOU