fast algorithm for dct
TRANSCRIPT
-
8/12/2019 Fast Algorithm for DCT
1/40
Graduate Institute of Photonics and Optoelectronics Engineering
College of Electrical Engineering and Computer Science
National Taiwan University
Final Term Paper (Tutorial)
Fast Algorithms for Discrete Cosine and Sine Transforms
ID R99941111
Chung-Wei Huang
Advisor: Jian-Jiun Ding, Ph.D.
#$% (00& 1'
June, 2011
ABSTRACT
For the DCT and DST to be practical, fast algorithms for their efficient implementation, in terms of reduced memory, implementation complexity and recursivity, are essential. The fast algorithms for one-dimensional DCTs and DSTs are the main subject of this term paper. In Chapter 2, the definitions and properties of the DCTs and DSTs and the relations between them are presented first, followed by the explicit forms of the orthonormal DCT and DST matrices for N = 2, 4 and 8. The fast 1-D rotation-based algorithms for the computation of DCTs and DSTs, based on the (recursive) sparse matrix factorization of the corresponding DCT and DST matrices and represented by generalized signal flow graphs, are discussed in Chapter 3. The matrix factorization reveals various interrelations between different versions of the DCT and DST. These selected fast algorithms are very convenient for constructing integer approximations of DCTs and DSTs.
CONTENTS
ABSTRACT
CONTENTS
Chapter 1 Introduction
Chapter 2 Fast DCT/DST Algorithms
2.1 The definition and relations of DCT and DST matrices
2.2 The explicit forms of DCT/DST matrices
Chapter 3 The fast rotation-based DCT/DST algorithms
3.1 The fast DCT-I and SCT algorithms
3.1.1 DCT-I computation based on the SCT algorithm for N = 2^m + 1
3.1.2 DCT-I recursive sparse matrix factorizations
3.1.3 The split-radix DCT-I algorithm
3.2 The fast DST-I and SST algorithms
3.2.1 DST-I computation based on the SST algorithm for N = 2^m - 1
3.2.2 DST-I recursive sparse matrix factorizations
3.2.3 The split-radix DST-I algorithm
3.3 The fast DCT-II/DST-II and DCT-III/DST-III algorithms
3.4 The fast DCT-IV/DST-IV algorithms
Chapter 1 Introduction
Discrete cosine transforms (DCTs) and discrete sine transforms (DSTs) are members of the class of sinusoidal unitary transforms. A sinusoidal unitary transform is an invertible linear transform whose kernel is defined by a set of complete, orthogonal or orthonormal discrete cosine and sine basis functions. The complete set of DCTs and DSTs, the so-called discrete trigonometric transforms, consists of eight versions of the DCT and the corresponding eight versions of the DST. Since all present applications involve only the even DCTs and DSTs, this term paper considers only the four types of even DCT and DST. What makes the DCT and DST useful in applications is the existence of fast algorithms that allow efficient computation compared to direct matrix multiplication. Over the years, many fast algorithms for the efficient computation of the one-dimensional DCT and DST have been developed. In general, they are based on indirect computation or direct computation, and they are generally classified as radix-2 or split-radix algorithms. However, one of the most important requirements is that fast DCT or DST algorithms possess excellent numerical stability.
This term paper discusses the fast rotation-based DCT/DST algorithms. Almost all are direct algorithms defined by the recursive sparse matrix factorization of the transform matrix. Fast and numerically stable DCT/DST algorithms with regular structure and only real arithmetic are presented. In particular, these rotation-based algorithms are very convenient for the construction of integer transforms. The generalized signal flow graphs corresponding to the sparse matrix decomposition of the transform matrix are also provided.
Chapter 2 Fast DCT/DST Algorithms
2.1 The definition and relations of DCT and DST matrices
Before deriving the forms of the orthonormal DCT and DST matrices we recall their definitions, basic mathematical properties and relations. N is assumed to be an integer power of 2. A subscript in the matrix notation denotes the order of the matrix, while a superscript denotes the type.
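Four orthonormal even DCTs in matrix form, denoted by $C^{I}_{N+1}$, $C^{II}_{N}$, $C^{III}_{N}$ and $C^{IV}_{N}$, are respectively defined as
\[
\left[C^{I}_{N+1}\right]_{kn} = \sqrt{\frac{2}{N}}\,\epsilon_k \epsilon_n \cos\frac{\pi n k}{N}, \quad k, n = 0, 1, \ldots, N, \qquad
\left(C^{I}_{N+1}\right)^{-1} = \left(C^{I}_{N+1}\right)^{T} = C^{I}_{N+1}, \tag{2.1}
\]
\[
\left[C^{II}_{N}\right]_{kn} = \sqrt{\frac{2}{N}}\,\epsilon_k \cos\frac{\pi (2n+1) k}{2N}, \quad k, n = 0, 1, \ldots, N-1, \qquad
\left(C^{II}_{N}\right)^{-1} = \left(C^{II}_{N}\right)^{T} = C^{III}_{N}, \tag{2.2}
\]
\[
\left[C^{III}_{N}\right]_{kn} = \sqrt{\frac{2}{N}}\,\epsilon_n \cos\frac{\pi (2k+1) n}{2N}, \quad k, n = 0, 1, \ldots, N-1, \qquad
\left(C^{III}_{N}\right)^{-1} = \left(C^{III}_{N}\right)^{T} = C^{II}_{N}, \tag{2.3}
\]
\[
\left[C^{IV}_{N}\right]_{kn} = \sqrt{\frac{2}{N}} \cos\frac{\pi (2n+1)(2k+1)}{4N}, \quad k, n = 0, 1, \ldots, N-1, \qquad
\left(C^{IV}_{N}\right)^{-1} = \left(C^{IV}_{N}\right)^{T} = C^{IV}_{N}, \tag{2.4}
\]
where
\[
\epsilon_p = \begin{cases} \dfrac{1}{\sqrt{2}}, & p = 0 \text{ or } p = N, \\ 1, & \text{otherwise,} \end{cases}
\]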
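and the corresponding four orthonormal even DSTs in matrix form, denoted by $S^{I}_{N-1}$, $S^{II}_{N}$, $S^{III}_{N}$ and $S^{IV}_{N}$, are respectively defined as
\[
\left[S^{I}_{N-1}\right]_{kn} = \sqrt{\frac{2}{N}} \sin\frac{\pi (n+1)(k+1)}{N}, \quad k, n = 0, 1, \ldots, N-2, \qquad
\left(S^{I}_{N-1}\right)^{-1} = \left(S^{I}_{N-1}\right)^{T} = S^{I}_{N-1}, \tag{2.5}
\]
\[
\left[S^{II}_{N}\right]_{kn} = \sqrt{\frac{2}{N}}\,\epsilon_k \sin\frac{\pi (2n+1)(k+1)}{2N}, \quad k, n = 0, 1, \ldots, N-1, \qquad
\left(S^{II}_{N}\right)^{-1} = \left(S^{II}_{N}\right)^{T} = S^{III}_{N}, \tag{2.6}
\]
\[
\left[S^{III}_{N}\right]_{kn} = \sqrt{\frac{2}{N}}\,\epsilon_n \sin\frac{\pi (2k+1)(n+1)}{2N}, \quad k, n = 0, 1, \ldots, N-1, \qquad
\left(S^{III}_{N}\right)^{-1} = \left(S^{III}_{N}\right)^{T} = S^{II}_{N}, \tag{2.7}
\]
\[
\left[S^{IV}_{N}\right]_{kn} = \sqrt{\frac{2}{N}} \sin\frac{\pi (2n+1)(2k+1)}{4N}, \quad k, n = 0, 1, \ldots, N-1, \qquad
\left(S^{IV}_{N}\right)^{-1} = \left(S^{IV}_{N}\right)^{T} = S^{IV}_{N}, \tag{2.8}
\]
where, for the DSTs,
\[
\epsilon_q = \begin{cases} \dfrac{1}{\sqrt{2}}, & q = N-1, \\ 1, & \text{otherwise.} \end{cases}
\]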
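As a numerical sanity check of definitions (2.1)-(2.8), the following minimal sketch builds each matrix entry by entry and measures how far the product of the matrix with its transpose is from the identity. The function names `dtt_matrix` and `max_orthonormality_error` are illustrative and not part of the original paper; only the Python standard library is assumed.

```python
import math

def dtt_matrix(family, kind, N):
    """Orthonormal DCT (family 'C') or DST (family 'S') matrix of type
    'I'..'IV', following (2.1)-(2.8). DCT-I has order N+1, DST-I has
    order N-1, all other types have order N."""
    r2 = 1 / math.sqrt(2)
    s = math.sqrt(2 / N)
    pi = math.pi
    if family == "C":
        eps = lambda p: r2 if p in (0, N) else 1.0
        size = N + 1 if kind == "I" else N
        entry = {
            "I":   lambda k, n: eps(k) * eps(n) * math.cos(pi * n * k / N),
            "II":  lambda k, n: eps(k) * math.cos(pi * (2*n + 1) * k / (2*N)),
            "III": lambda k, n: eps(n) * math.cos(pi * (2*k + 1) * n / (2*N)),
            "IV":  lambda k, n: math.cos(pi * (2*n + 1) * (2*k + 1) / (4*N)),
        }[kind]
    else:
        eps = lambda q: r2 if q == N - 1 else 1.0
        size = N - 1 if kind == "I" else N
        entry = {
            "I":   lambda k, n: math.sin(pi * (n + 1) * (k + 1) / N),
            "II":  lambda k, n: eps(k) * math.sin(pi * (2*n + 1) * (k + 1) / (2*N)),
            "III": lambda k, n: eps(n) * math.sin(pi * (2*k + 1) * (n + 1) / (2*N)),
            "IV":  lambda k, n: math.sin(pi * (2*n + 1) * (2*k + 1) / (4*N)),
        }[kind]
    return [[s * entry(k, n) for n in range(size)] for k in range(size)]

def max_orthonormality_error(A):
    """Largest absolute entry of A A^T - I."""
    m = len(A)
    return max(abs(sum(A[i][t] * A[j][t] for t in range(m)) - (i == j))
               for i in range(m) for j in range(m))
```

Calling `max_orthonormality_error(dtt_matrix("C", "II", 8))` should return a value at the level of floating-point rounding, which is what orthonormality of (2.2) predicts.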
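The DCT-I matrix given by (2.1) is defined for order N+1. It is a scaled version of the symmetric cosine transform (SCT) for $N = 2^m + 1$. The SCT matrix, denoted by $\tilde{C}^{I}_{N}$, is of order N and is defined as
\[
\left[\tilde{C}^{I}_{N}\right]_{kn} = \sqrt{\frac{2}{N-1}}\,\epsilon_k \epsilon_n \cos\frac{\pi n k}{N-1}, \quad k, n = 0, 1, \ldots, N-1, \qquad
\left(\tilde{C}^{I}_{N}\right)^{-1} = \left(\tilde{C}^{I}_{N}\right)^{T} = \tilde{C}^{I}_{N}, \tag{2.9}
\]
where
\[
\epsilon_p = \begin{cases} \dfrac{1}{\sqrt{2}}, & p = 0 \text{ or } p = N-1, \\ 1, & \text{otherwise.} \end{cases}
\]
Similarly, the DST-I given by (2.5) is defined for order N-1, and it is a scaled version of the symmetric sine transform (SST) for $N = 2^m - 1$. The SST matrix, denoted by $\tilde{S}^{I}_{N}$, is of order N and is defined as
\[
\left[\tilde{S}^{I}_{N}\right]_{kn} = \sqrt{\frac{2}{N+1}} \sin\frac{\pi (n+1)(k+1)}{N+1}, \quad k, n = 0, 1, \ldots, N-1, \qquad
\left(\tilde{S}^{I}_{N}\right)^{-1} = \left(\tilde{S}^{I}_{N}\right)^{T} = \tilde{S}^{I}_{N}. \tag{2.10}
\]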
The DCT-II has excellent energy compaction, and among the currently known unitary transforms it is the best approximation to the optimal KLT. The DCT and DST matrices given by (2.1)-(2.10) are real-valued and orthonormal. The normalization factors $\sqrt{2/N}$, $\sqrt{2/(N-1)}$ and $\sqrt{2/(N+1)}$ in the forward and inverse transforms can be merged as $2/N$, $2/(N-1)$ and $2/(N+1)$, respectively, and moved either to the forward or to the inverse transform. If these normalization factors are merged, the DCT and DST matrices are only orthogonal. The inverses of the DCT and DST matrices are simply obtained by transposing the original matrices. The matrices $C^{I}_{N+1}$, $C^{IV}_{N}$, $S^{I}_{N-1}$ and $S^{IV}_{N}$, as well as $\tilde{C}^{I}_{N}$ and $\tilde{S}^{I}_{N}$, are symmetric and therefore self-inverse. The symmetry of the transform matrix indicates that the fast algorithms for the forward and inverse transform computation are identical. The matrices $C^{II}_{N}$ and $C^{III}_{N}$ are inverses of each other. The same property holds for the matrices $S^{II}_{N}$ and $S^{III}_{N}$. It means that fast algorithms for the inverse transform computation are obtained from the algorithms for the forward transform computation performed in the opposite direction. To reduce the set of DCT and DST matrices considered for efficient implementation we can exploit the relations between the DSTs and their corresponding DCT matrices:
\[
S^{II}_{N} = J_N\, C^{II}_{N}\, D_N, \qquad S^{III}_{N} = D_N\, C^{III}_{N}\, J_N, \tag{2.11}
\]
whereas the matrix $S^{IV}_{N}$ is related to the $C^{IV}_{N}$ matrix by
\[
S^{IV}_{N} = J_N\, C^{IV}_{N}\, D_N, \tag{2.12}
\]
where $J_N$ is the cross-identity matrix and $D_N$ is the diagonal odd-sign changing matrix given by $D_N = \mathrm{diag}\{(-1)^k\}$, $k = 0, 1, \ldots, N-1$. Consequently, the efficient implementation of DST-II and DST-IV can be obtained from those of DCT-II and DCT-IV, respectively, by appropriate sign changes and reversal of order.
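The relations between the DST-II/DST-IV and DCT-II/DCT-IV matrices (multiplying on the left by the cross-identity matrix reverses the row order, multiplying on the right by the odd-sign changing matrix flips the sign of every odd column) can be checked entry by entry. The sketch below is only an illustration under the orthonormal definitions of this chapter; the helper names are made up for the example.

```python
import math

def c2(N, k, n):
    """Entry (k, n) of the orthonormal DCT-II matrix."""
    e = 1 / math.sqrt(2) if k == 0 else 1.0
    return math.sqrt(2 / N) * e * math.cos(math.pi * (2*n + 1) * k / (2*N))

def s2(N, k, n):
    """Entry (k, n) of the orthonormal DST-II matrix."""
    e = 1 / math.sqrt(2) if k == N - 1 else 1.0
    return math.sqrt(2 / N) * e * math.sin(math.pi * (2*n + 1) * (k + 1) / (2*N))

def c4(N, k, n):
    """Entry (k, n) of the orthonormal DCT-IV matrix."""
    return math.sqrt(2 / N) * math.cos(math.pi * (2*n + 1) * (2*k + 1) / (4*N))

def s4(N, k, n):
    """Entry (k, n) of the orthonormal DST-IV matrix."""
    return math.sqrt(2 / N) * math.sin(math.pi * (2*n + 1) * (2*k + 1) / (4*N))

def relation_error(N):
    """Max deviation of S = J C D for types II and IV.
    (J C D)[k][n] = C[N-1-k][n] * (-1)**n."""
    err = 0.0
    for k in range(N):
        for n in range(N):
            sign = (-1) ** n
            err = max(err,
                      abs(s2(N, k, n) - c2(N, N - 1 - k, n) * sign),
                      abs(s4(N, k, n) - c4(N, N - 1 - k, n) * sign))
    return err
```

In practice this means a DST-II or DST-IV routine can be obtained from a DCT-II or DCT-IV routine by negating the odd-indexed inputs and reading the outputs in reverse order.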
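The matrix $C^{IV}_{N}$ is related to the $C^{II}_{N}$ matrix by
\[
C^{IV}_{N} =
\begin{pmatrix}
\frac{1}{\sqrt{2}} & 0 & 0 & \cdots & 0 \\
-\frac{1}{\sqrt{2}} & 1 & 0 & \cdots & 0 \\
\frac{1}{\sqrt{2}} & -1 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
-\frac{1}{\sqrt{2}} & 1 & -1 & \cdots & 1
\end{pmatrix}
C^{II}_{N}
\begin{pmatrix}
2\cos\frac{\pi}{4N} & 0 & \cdots & 0 \\
0 & 2\cos\frac{3\pi}{4N} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 2\cos\frac{(2N-1)\pi}{4N}
\end{pmatrix}. \tag{2.13}
\]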
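The relation between DCT-IV and DCT-II rests on the product-to-sum identity 2 cos(a) cos(b) = cos(a+b) + cos(a-b): pre-scaling the n-th input by 2 cos((2n+1)pi/(4N)) and applying the DCT-II gives sums of neighbouring DCT-IV coefficients, which a lower-triangular matrix of alternating signs then untangles. A minimal numerical check, assuming the orthonormal definitions (2.2) and (2.4) and using illustrative function names, could look like this:

```python
import math

def dct2_matrix(N):
    """Orthonormal DCT-II matrix."""
    e = lambda k: 1 / math.sqrt(2) if k == 0 else 1.0
    return [[math.sqrt(2 / N) * e(k) * math.cos(math.pi * (2*n + 1) * k / (2*N))
             for n in range(N)] for k in range(N)]

def dct4_matrix(N):
    """Orthonormal DCT-IV matrix."""
    return [[math.sqrt(2 / N) * math.cos(math.pi * (2*n + 1) * (2*k + 1) / (4*N))
             for n in range(N)] for k in range(N)]

def dct4_from_dct2(N):
    """Build C^IV as L * C^II * diag(2 cos((2n+1) pi / (4N)))."""
    C2 = dct2_matrix(N)
    # Lower-triangular factor: first column (-1)^k / sqrt(2),
    # remaining entries (-1)^(k-j) for 1 <= j <= k.
    L = [[0.0] * N for _ in range(N)]
    for k in range(N):
        L[k][0] = (-1) ** k / math.sqrt(2)
        for j in range(1, k + 1):
            L[k][j] = (-1) ** (k - j)
    d = [2 * math.cos((2*n + 1) * math.pi / (4*N)) for n in range(N)]
    return [[sum(L[k][j] * C2[j][n] for j in range(N)) * d[n]
             for n in range(N)] for k in range(N)]

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))
```

The triangular factor corresponds to the recurrence c_k = y_k - c_{k-1} on the DCT-II outputs y_k, so applying it costs only N - 1 additions.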
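Finally, for the DCT and DST matrices the following relations hold:
\[
\begin{aligned}
C^{I}_{N+1} J_{N+1} &= D_{N+1} C^{I}_{N+1}, & S^{I}_{N-1} J_{N-1} &= D_{N-1} S^{I}_{N-1}, \\
C^{II}_{N} J_{N} &= D_{N} C^{II}_{N}, & S^{II}_{N} J_{N} &= D_{N} S^{II}_{N}, \\
J_{N} C^{III}_{N} &= C^{III}_{N} D_{N}, & J_{N} S^{III}_{N} &= S^{III}_{N} D_{N}, \\
(-1)^{N-1} C^{IV}_{N} J_{N} D_{N} &= J_{N} D_{N} C^{IV}_{N}, & (-1)^{N-1} S^{IV}_{N} J_{N} D_{N} &= J_{N} D_{N} S^{IV}_{N}.
\end{aligned} \tag{2.14}
\]
These relations allow us to limit our discussion in subsequent sections to $C^{I}_{N+1}$, $S^{I}_{N-1}$, $C^{II}_{N}$, $C^{IV}_{N}$, $\tilde{C}^{I}_{N}$ and $\tilde{S}^{I}_{N}$.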
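2.2 The explicit forms of DCT/DST matrices
We consider only the matrices $C^{I}_{N+1}$, $S^{I}_{N-1}$, $C^{II}_{N}$, $C^{IV}_{N}$, $\tilde{C}^{I}_{N}$ and $\tilde{S}^{I}_{N}$, and we derive their explicit orthonormal forms for N = 2, 4 and 8. The elements of the DCT-I matrix $C^{I}_{N+1}$ are defined by (2.1). For N = 2, 4 and 8, we have the following explicit forms:
\[
C^{I}_{3} =
\begin{pmatrix}
\frac{1}{2} & \frac{1}{\sqrt{2}} & \frac{1}{2} \\
\frac{1}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}} \\
\frac{1}{2} & -\frac{1}{\sqrt{2}} & \frac{1}{2}
\end{pmatrix},
\]
\[
C^{I}_{5} = \frac{1}{\sqrt{2}}
\begin{pmatrix}
\frac{1}{2} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & \frac{1}{2} \\
\frac{1}{\sqrt{2}} & \cos\frac{\pi}{4} & 0 & -\cos\frac{\pi}{4} & -\frac{1}{\sqrt{2}} \\
\frac{1}{\sqrt{2}} & 0 & -1 & 0 & \frac{1}{\sqrt{2}} \\
\frac{1}{\sqrt{2}} & -\cos\frac{\pi}{4} & 0 & \cos\frac{\pi}{4} & -\frac{1}{\sqrt{2}} \\
\frac{1}{2} & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & \frac{1}{2}
\end{pmatrix},
\]
and, writing $r = \frac{1}{\sqrt{2}} = \cos\frac{\pi}{4}$, $c = \cos\frac{\pi}{8}$ and $s = \sin\frac{\pi}{8}$,
\[
C^{I}_{9} = \frac{1}{2}
\begin{pmatrix}
\frac{1}{2} & r & r & r & r & r & r & r & \frac{1}{2} \\
r & c & r & s & 0 & -s & -r & -c & -r \\
r & r & 0 & -r & -1 & -r & 0 & r & r \\
r & s & -r & -c & 0 & c & r & -s & -r \\
r & 0 & -1 & 0 & 1 & 0 & -1 & 0 & r \\
r & -s & -r & c & 0 & -c & r & s & -r \\
r & -r & 0 & r & -1 & r & 0 & -r & r \\
r & -c & r & -s & 0 & s & -r & c & -r \\
\frac{1}{2} & -r & r & -r & r & -r & r & -r & \frac{1}{2}
\end{pmatrix}.
\]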
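The elements of the DST-I matrix $S^{I}_{N-1}$ are defined by (2.5). For N = 2, the matrix $S^{I}_{1} = (1)$ is trivial, and for N = 4 and 8 we have the following forms:
\[
S^{I}_{3} = \frac{1}{\sqrt{2}}
\begin{pmatrix}
\sin\frac{\pi}{4} & 1 & \sin\frac{\pi}{4} \\
1 & 0 & -1 \\
\sin\frac{\pi}{4} & -1 & \sin\frac{\pi}{4}
\end{pmatrix},
\]
and, writing $r = \sin\frac{\pi}{4}$, $c = \cos\frac{\pi}{8}$ and $s = \sin\frac{\pi}{8}$,
\[
S^{I}_{7} = \frac{1}{2}
\begin{pmatrix}
s & r & c & 1 & c & r & s \\
r & 1 & r & 0 & -r & -1 & -r \\
c & r & -s & -1 & -s & r & c \\
1 & 0 & -1 & 0 & 1 & 0 & -1 \\
c & -r & -s & 1 & -s & -r & c \\
r & -1 & r & 0 & -r & 1 & -r \\
s & -r & c & -1 & c & -r & s
\end{pmatrix}.
\]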
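The elements of the SCT matrix $\tilde{C}^{I}_{N}$ are defined by (2.9). For N = 2, 4 and 8, we have the following forms:
\[
\tilde{C}^{I}_{2} = \sqrt{2}
\begin{pmatrix}
\frac{1}{2} & \frac{1}{2} \\
\frac{1}{2} & -\frac{1}{2}
\end{pmatrix},
\]
\[
\tilde{C}^{I}_{4} = \sqrt{\frac{2}{3}}
\begin{pmatrix}
\frac{1}{2} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & \frac{1}{2} \\
\frac{1}{\sqrt{2}} & \cos\frac{\pi}{3} & -\cos\frac{\pi}{3} & -\frac{1}{\sqrt{2}} \\
\frac{1}{\sqrt{2}} & -\cos\frac{\pi}{3} & -\cos\frac{\pi}{3} & \frac{1}{\sqrt{2}} \\
\frac{1}{2} & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & -\frac{1}{2}
\end{pmatrix},
\]
and, writing $r = \frac{1}{\sqrt{2}}$ and $c_j = \cos\frac{j\pi}{7}$ for $j = 1, 2, 3$,
\[
\tilde{C}^{I}_{8} = \sqrt{\frac{2}{7}}
\begin{pmatrix}
\frac{1}{2} & r & r & r & r & r & r & \frac{1}{2} \\
r & c_1 & c_2 & c_3 & -c_3 & -c_2 & -c_1 & -r \\
r & c_2 & -c_3 & -c_1 & -c_1 & -c_3 & c_2 & r \\
r & c_3 & -c_1 & -c_2 & c_2 & c_1 & -c_3 & -r \\
r & -c_3 & -c_1 & c_2 & c_2 & -c_1 & -c_3 & r \\
r & -c_2 & -c_3 & c_1 & -c_1 & c_3 & c_2 & -r \\
r & -c_1 & c_2 & -c_3 & -c_3 & c_2 & -c_1 & r \\
\frac{1}{2} & -r & r & -r & r & -r & r & -\frac{1}{2}
\end{pmatrix}.
\]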
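The elements of the SST matrix $\tilde{S}^{I}_{N}$ are defined by (2.10). For N = 2, 4 and 8, we have the following forms:
\[
\tilde{S}^{I}_{2} = \sqrt{\frac{2}{3}}
\begin{pmatrix}
\sin\frac{\pi}{3} & \sin\frac{\pi}{3} \\
\sin\frac{\pi}{3} & -\sin\frac{\pi}{3}
\end{pmatrix},
\]
\[
\tilde{S}^{I}_{4} = \sqrt{\frac{2}{5}}
\begin{pmatrix}
\sin\frac{\pi}{5} & \sin\frac{2\pi}{5} & \sin\frac{2\pi}{5} & \sin\frac{\pi}{5} \\
\sin\frac{2\pi}{5} & \sin\frac{\pi}{5} & -\sin\frac{\pi}{5} & -\sin\frac{2\pi}{5} \\
\sin\frac{2\pi}{5} & -\sin\frac{\pi}{5} & -\sin\frac{\pi}{5} & \sin\frac{2\pi}{5} \\
\sin\frac{\pi}{5} & -\sin\frac{2\pi}{5} & \sin\frac{2\pi}{5} & -\sin\frac{\pi}{5}
\end{pmatrix},
\]
and, writing $s_1 = \sin\frac{\pi}{9}$, $s_2 = \sin\frac{2\pi}{9}$, $s_3 = \sin\frac{\pi}{3}$ and $s_4 = \sin\frac{4\pi}{9}$,
\[
\tilde{S}^{I}_{8} = \frac{\sqrt{2}}{3}
\begin{pmatrix}
s_1 & s_2 & s_3 & s_4 & s_4 & s_3 & s_2 & s_1 \\
s_2 & s_4 & s_3 & s_1 & -s_1 & -s_3 & -s_4 & -s_2 \\
s_3 & s_3 & 0 & -s_3 & -s_3 & 0 & s_3 & s_3 \\
s_4 & s_1 & -s_3 & -s_2 & s_2 & s_3 & -s_1 & -s_4 \\
s_4 & -s_1 & -s_3 & s_2 & s_2 & -s_3 & -s_1 & s_4 \\
s_3 & -s_3 & 0 & s_3 & -s_3 & 0 & s_3 & -s_3 \\
s_2 & -s_4 & s_3 & -s_1 & -s_1 & s_3 & -s_4 & s_2 \\
s_1 & -s_2 & s_3 & -s_4 & s_4 & -s_3 & s_2 & -s_1
\end{pmatrix}.
\]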
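The elements of the DCT-II matrix $C^{II}_{N}$ are defined by (2.2). For N = 2, 4 and 8, we have the following forms:
\[
C^{II}_{2} =
\begin{pmatrix}
\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\
\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}}
\end{pmatrix},
\]
\[
C^{II}_{4} = \frac{1}{\sqrt{2}}
\begin{pmatrix}
\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\
\cos\frac{\pi}{8} & \sin\frac{\pi}{8} & -\sin\frac{\pi}{8} & -\cos\frac{\pi}{8} \\
\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\
\sin\frac{\pi}{8} & -\cos\frac{\pi}{8} & \cos\frac{\pi}{8} & -\sin\frac{\pi}{8}
\end{pmatrix},
\]
and, writing $r = \frac{1}{\sqrt{2}}$, $c = \cos\frac{\pi}{8}$, $s = \sin\frac{\pi}{8}$, $c_1 = \cos\frac{\pi}{16}$, $s_1 = \sin\frac{\pi}{16}$, $c_3 = \cos\frac{3\pi}{16}$ and $s_3 = \sin\frac{3\pi}{16}$,
\[
C^{II}_{8} = \frac{1}{2}
\begin{pmatrix}
r & r & r & r & r & r & r & r \\
c_1 & c_3 & s_3 & s_1 & -s_1 & -s_3 & -c_3 & -c_1 \\
c & s & -s & -c & -c & -s & s & c \\
c_3 & -s_1 & -c_1 & -s_3 & s_3 & c_1 & s_1 & -c_3 \\
r & -r & -r & r & r & -r & -r & r \\
s_3 & -c_1 & s_1 & c_3 & -c_3 & -s_1 & c_1 & -s_3 \\
s & -c & c & -s & -s & c & -c & s \\
s_1 & -s_3 & c_3 & -c_1 & c_1 & -c_3 & s_3 & -s_1
\end{pmatrix}.
\]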
Finally, the elements of the DCT-IV matrix $C^{IV}_{N}$ are defined by (2.4). For N = 2, 4 and 8, we have the following forms:
\[
C^{IV}_{2} =
\begin{pmatrix}
\cos\frac{\pi}{8} & \sin\frac{\pi}{8} \\
\sin\frac{\pi}{8} & -\cos\frac{\pi}{8}
\end{pmatrix},
\]
\[
C^{IV}_{4} = \frac{1}{\sqrt{2}}
\begin{pmatrix}
\cos\frac{\pi}{16} & \cos\frac{3\pi}{16} & \sin\frac{3\pi}{16} & \sin\frac{\pi}{16} \\
\cos\frac{3\pi}{16} & -\sin\frac{\pi}{16} & -\cos\frac{\pi}{16} & -\sin\frac{3\pi}{16} \\
\sin\frac{3\pi}{16} & -\cos\frac{\pi}{16} & \sin\frac{\pi}{16} & \cos\frac{3\pi}{16} \\
\sin\frac{\pi}{16} & -\sin\frac{3\pi}{16} & \cos\frac{3\pi}{16} & -\cos\frac{\pi}{16}
\end{pmatrix},
\]
and, writing $c_j = \cos\frac{j\pi}{32}$ and $s_j = \sin\frac{j\pi}{32}$ for $j = 1, 3, 5, 7$,
\[
C^{IV}_{8} = \frac{1}{2}
\begin{pmatrix}
c_1 & c_3 & c_5 & c_7 & s_7 & s_5 & s_3 & s_1 \\
c_3 & s_7 & s_1 & -s_5 & -c_5 & -c_1 & -c_7 & -s_3 \\
c_5 & s_1 & -c_7 & -c_3 & -s_3 & s_7 & c_1 & s_5 \\
c_7 & -s_5 & -c_3 & s_1 & c_1 & s_3 & -c_5 & -s_7 \\
s_7 & -c_5 & -s_3 & c_1 & -s_1 & -c_3 & s_5 & c_7 \\
s_5 & -c_1 & s_7 & s_3 & -c_3 & c_7 & s_1 & -c_5 \\
s_3 & -c_7 & c_1 & -c_5 & s_5 & s_1 & -s_7 & c_3 \\
s_1 & -s_3 & s_5 & -s_7 & c_7 & -c_5 & c_3 & -c_1
\end{pmatrix}.
\]
Investigating the basis vectors, which are the rows of the matrices $C^{I}_{N+1}$, $S^{I}_{N-1}$, $C^{II}_{N}$, $\tilde{C}^{I}_{N}$ and $\tilde{S}^{I}_{N}$, we observe that they exhibit certain symmetries. The location of the symmetry center is determined by the length, or order, of the row, which is the discrete number of elements. When the order of the matrix is even, the symmetry center is located at the midpoint between the two adjacent center elements of the row vector.
Chapter 3 The fast rotation-based DCT/DST algorithms
In this chapter, we introduce the fast DCT/DST algorithms (radix-2 and split-radix) with regular structure. Most of them are completely recursive and use only permutations, butterfly operations and rotations. Compared to direct matrix-vector multiplication for a given N-length input data vector, the fast radix-2 DCT/DST algorithms reduce the computational complexity from approximately $2N^2$ arithmetic operations ($N^2$ multiplications and $N(N-1)$ additions) to $2N\log_2 N$ arithmetic operations ($\frac{N}{2}\log_2 N$ multiplications and $\frac{3N}{2}\log_2 N$ additions).
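To get a feel for the savings, the two operation counts can be tabulated for a few transform lengths. The sketch below simply evaluates N^2 multiplications plus N(N-1) additions for the direct product and 2N log2(N) total operations for a radix-2 algorithm; the function names are illustrative only.

```python
def direct_ops(N):
    """(multiplications, additions) of a direct N x N matrix-vector product."""
    return N * N, N * (N - 1)

def fast_ops_total(N):
    """Approximate total operation count 2 N log2(N) of a radix-2
    algorithm, for N a power of two."""
    m = N.bit_length() - 1  # log2(N) when N is a power of two
    return 2 * N * m

for N in (8, 64, 1024):
    mults, adds = direct_ops(N)
    print(N, mults + adds, fast_ops_total(N))
```

For N = 8 the direct product needs 64 + 56 = 120 operations against 48 for the fast algorithm, and the gap widens quickly as N grows.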
3.1 The fast DCT-I and SCT algorithms
3.1.1 DCT-I computation based on the SCT algorithm for $N = 2^m + 1$
Assuming $N = 2^m + 1$, the SCT algorithm can be adapted for a new fast recursive computation of the DCT-I as follows.
Using $\tilde{c}^{I}_{k}$ and $x_n$ to represent the k-th and n-th elements of the transformed vector and input vector, respectively, (2.9) for $M = N - 1$ can be expressed as
\[
\tilde{c}^{I}_{k} = \sqrt{\frac{2}{M}}\,\epsilon_k \sum_{n=0}^{M} \epsilon_n x_n \cos\frac{\pi n k}{M}, \quad k = 0, 1, \ldots, M.
\]
Ignoring the scaling factors, the essential part of the above sum can be expressed as
\[
\tilde{c}^{I}_{k} = \sum_{n=0}^{M-1} x_n \cos\frac{\pi n k}{M} + (-1)^k x_M = a_M(k) + (-1)^k x_M, \quad k = 0, 1, \ldots, M.
\]
Splitting the sum $a_M(k)$ into even-indexed and odd-indexed points and applying the symmetries of the transform kernels, the complete formulae are given by
\[
\begin{aligned}
a_M(k) &= a_{M/2}(k) + b_{M/2}(k), & k &= 0, 1, \ldots, \tfrac{M}{2}, \quad b_{M/2}\!\left(\tfrac{M}{2}\right) = 0, \\
a_M(M-k) &= a_{M/2}(k) - b_{M/2}(k), & k &= 0, 1, \ldots, \tfrac{M}{2} - 1,
\end{aligned} \tag{3.1}
\]
where
\[
a_{M/2}(k) = \sum_{n=0}^{M/2-1} x_{2n} \cos\frac{\pi n k}{M/2}, \qquad
b_{M/2}(k) = \sum_{n=0}^{M/2-1} x_{2n+1} \cos\frac{\pi (2n+1) k}{2(M/2)}. \tag{3.2}
\]
The (M+1)-point DCT-I is recursively decomposed into an (M/2+1)-point DCT-I and an M/2-point DCT-II. The corresponding generalized signal flow graph for the forward and inverse DCT-I computation for N = 3, 5 and 9 is shown in Fig. 3.1. The input data sequence $\{x_n\}$ is in bit-reversed order. Full lines in Fig. 3.1 represent unity transfer factors while broken lines represent the transfer factor -1. The two node symbols in the graph denote addition and multiplication after addition, respectively.
Fig. 3.1 The generalized signal flow graph for the forward and inverse DCT-I computation for N = 3, 5 and 9 based on the SCT algorithm; $a = \sqrt{2}/2$.
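The even/odd splitting of the cosine sum a_M(k) can be verified numerically. The sketch below, with illustrative function names and scaling factors ignored as in the text, computes a_M(k) once directly and once from the two half-length sums over the even-indexed and odd-indexed inputs.

```python
import math

def a_direct(x, M, k):
    """a_M(k) = sum_{n=0}^{M-1} x[n] cos(pi n k / M)."""
    return sum(x[n] * math.cos(math.pi * n * k / M) for n in range(M))

def a_split(x, M):
    """Return [a_M(0), ..., a_M(M)] from the half-length sums:
    a_half over even-indexed inputs, b_half over odd-indexed inputs."""
    h = M // 2
    a_half = [sum(x[2*n] * math.cos(math.pi * n * k / h) for n in range(h))
              for k in range(h + 1)]
    # b_half[h] vanishes (up to rounding) because cos((2n+1) pi / 2) = 0.
    b_half = [sum(x[2*n + 1] * math.cos(math.pi * (2*n + 1) * k / (2*h))
                  for n in range(h))
              for k in range(h + 1)]
    out = [0.0] * (M + 1)
    for k in range(h + 1):
        out[k] = a_half[k] + b_half[k]       # first half of the outputs
    for k in range(h):
        out[M - k] = a_half[k] - b_half[k]   # mirrored second half
    return out
```

Each half-length sum is evaluated only for k up to M/2, and one butterfly (a sum and a difference) produces two outputs, which is where the saving of the recursion comes from.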
On the other hand, the N-point SCT defined by (2.9) for $N = 2^m$, without the scaling factors, can be expressed as
\[
\tilde{c}^{I}_{k} = \sum_{n=0}^{N-1} x_n \cos\frac{\pi n k}{N-1} = a_N(k), \quad k = 0, 1, \ldots, N-1,
\]
and decomposed, by a method similar to that used in (3.1) and (3.2), into an N/2-point SCT and an N/2-point DCT-II as follows:
\[
\begin{aligned}
a_N(k) &= a_{N/2}(k) + b_{N/2}(k), & k &= 0, 1, \ldots, \tfrac{N}{2} - 1, \\
a_N(N-1-k) &= a_{N/2}(k) - b_{N/2}(k), & k &= 0, 1, \ldots, \tfrac{N}{2} - 1,
\end{aligned} \tag{3.3}
\]
where
\[
a_{N/2}(k) = \sum_{n=0}^{N/2-1} x_{2n} \cos\frac{2\pi n k}{N-1}, \qquad
b_{N/2}(k) = \sum_{n=0}^{N/2-1} x_{2n+1} \cos\frac{\pi (2n+1) k}{N-1}. \tag{3.4}
\]
Unfortunately, the decomposition of the SCT given by (3.3) and (3.4) is not recursive, implying that the SCT matrix $\tilde{C}^{I}_{N}$ does not have a recursive structure.
3.1.2 DCT-I recursive sparse matrix factorizations
The DCT-I matrix $C^{I}_{N+1}$ for $N = 2^m$ can be factorized into the following recursive sparse matrix form:
\[
C^{I}_{N+1} = P_{N+1}
\begin{pmatrix}
C^{I}_{N/2+1} & 0 \\
0 & J_{N/2}\, C^{III}_{N/2}\, J_{N/2}
\end{pmatrix}
\frac{1}{\sqrt{2}}
\begin{pmatrix}
I_{N/2} & 0 & J_{N/2} \\
0 & \sqrt{2} & 0 \\
J_{N/2} & 0 & -I_{N/2}
\end{pmatrix}, \tag{3.5}
\]
where $P_{N+1}$ is a permutation matrix; when it is applied to a data vector it corresponds to the reordering
\[
\tilde{x}_0 = x_0, \quad \tilde{x}_{n+1} = x_{2n+2}, \quad \tilde{x}_{N-n} = x_{2n+1}, \quad n = 0, 1, \ldots, \tfrac{N}{2} - 1. \tag{3.6}
\]
The (N+1)-point DCT-I is decomposed recursively into an (N/2+1)-point DCT-I and an N/2-point DCT-III. The generalized signal flow graph for the forward and inverse DCT-I computation for N = 2, 4 and 8 is shown in Fig. 3.2.
Fig. 3.2 The generalized signal flow graph for the forward and inverse DCT-I computation for N = 2, 4 and 8 based on the recursive sparse matrix factorization (3.5); $a = \sqrt{2}/2$.
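One stage of the DCT-I decomposition into a half-size DCT-I and a half-size DCT-III can be checked numerically. The sketch below does not build the permutation matrix; instead it places the even-indexed and odd-indexed outputs directly, which is an assumption made for readability. Function names are illustrative, and the matrices follow the orthonormal definitions (2.1) and (2.3).

```python
import math

def dct1(N):
    """Orthonormal DCT-I matrix of order N+1."""
    e = lambda p: 1 / math.sqrt(2) if p in (0, N) else 1.0
    return [[math.sqrt(2 / N) * e(k) * e(n) * math.cos(math.pi * n * k / N)
             for n in range(N + 1)] for k in range(N + 1)]

def dct3(N):
    """Orthonormal DCT-III matrix of order N."""
    e = lambda p: 1 / math.sqrt(2) if p == 0 else 1.0
    return [[math.sqrt(2 / N) * e(n) * math.cos(math.pi * (2*k + 1) * n / (2*N))
             for n in range(N)] for k in range(N)]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def dct1_one_stage(x):
    """(N+1)-point DCT-I from an (N/2+1)-point DCT-I and an N/2-point DCT-III."""
    N = len(x) - 1
    h = N // 2
    r = 1 / math.sqrt(2)
    # Butterfly stage: scaled sums (plus the untouched middle sample)
    # feed the DCT-I, scaled differences feed the DCT-III.
    u = [r * (x[n] + x[N - n]) for n in range(h)] + [x[h]]
    v = [r * (x[n] - x[N - n]) for n in range(h)]
    y = [0.0] * (N + 1)
    y[0::2] = matvec(dct1(h), u)   # even-indexed coefficients
    y[1::2] = matvec(dct3(h), v)   # odd-indexed coefficients
    return y
```

Replacing the two `matvec` calls by recursive calls (and a fast DCT-III) would turn this one-stage check into a full fast algorithm.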
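3.1.3 The split-radix DCT-I algorithm
The complete formulae of the split-radix fast DCT-I algorithm are given by
\[
c^{I}_{2k} = \sum_{n=0}^{N/2-1} \left(x_n + x_{N-n}\right) \cos\frac{\pi n k}{N/2} + x_{N/2} \cos \pi k, \quad k = 0, 1, \ldots, \tfrac{N}{2},
\]
\[
\begin{aligned}
c^{I}_{4k-1} &= a_k + b_k, & k &= 1, 2, \ldots, \tfrac{N}{4}, \\
c^{I}_{4k+1} &= a_k - b_k, & k &= 0, 1, \ldots, \tfrac{N}{4} - 1, \quad b_0 = 0,
\end{aligned} \tag{3.6}
\]
where
\[
a_k = \sum_{n=0}^{N/4-1} \left[\left(x_n - x_{N-n}\right) \cos\frac{\pi n}{N} + \left(x_{N/2-n} - x_{N/2+n}\right) \sin\frac{\pi n}{N}\right] \cos\frac{\pi n k}{N/4}
+ \left(x_{N/4} - x_{3N/4}\right) \cos\frac{(4k+1)\pi}{4},
\]
\[
b_k = \sum_{n=1}^{N/4-1} \left[\left(x_n - x_{N-n}\right) \sin\frac{\pi n}{N} - \left(x_{N/2-n} - x_{N/2+n}\right) \cos\frac{\pi n}{N}\right] \sin\frac{\pi n k}{N/4}. \tag{3.7}
\]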
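The split-radix step for the odd-indexed DCT-I coefficients follows from the angle-addition formula cos((4k +/- 1)t) = cos(4kt) cos(t) -/+ sin(4kt) sin(t), followed by folding the sums about n = N/4. The sketch below is a numerical check for the unnormalized DCT-I, c_k = sum over n = 0..N of x_n cos(pi n k / N); the function names are illustrative, and the helper sums a_k and b_k are written out as derived from that identity.

```python
import math

def dct1_unnormalized(x):
    """c_k = sum_{n=0}^{N} x[n] cos(pi n k / N), with N = len(x) - 1."""
    N = len(x) - 1
    return [sum(x[n] * math.cos(math.pi * n * k / N) for n in range(N + 1))
            for k in range(N + 1)]

def split_radix_odd(x):
    """Odd-indexed DCT-I coefficients from the quarter-length sums a_k, b_k."""
    N = len(x) - 1
    q, half = N // 4, N // 2
    cos, sin, pi = math.cos, math.sin, math.pi

    def a(k):  # cosine-type sum (a DCT-I of length N/4 + 1 in k)
        total = sum(((x[n] - x[N - n]) * cos(pi * n / N)
                     + (x[half - n] - x[half + n]) * sin(pi * n / N))
                    * cos(pi * n * k / q) for n in range(q))
        return total + (x[q] - x[3 * q]) * cos(pi * (4 * k + 1) / 4)

    def b(k):  # sine-type sum (a DST-I of length N/4 - 1 in k)
        return sum(((x[n] - x[N - n]) * sin(pi * n / N)
                    - (x[half - n] - x[half + n]) * cos(pi * n / N))
                   * sin(pi * n * k / q) for n in range(1, q))

    out = {}
    for k in range(1, q + 1):
        out[4 * k - 1] = a(k) + b(k)
    for k in range(q):
        out[4 * k + 1] = a(k) - b(k)
    return out
```

The dictionary returned by `split_radix_odd` maps each odd index to its coefficient, so it can be compared directly with the odd entries of `dct1_unnormalized(x)`.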
Thus, the first stage of the split-radix decomposition replaces the (N+1)-point DCT-I by one (N/2+1)-point DCT-I, one DCT-I of length (N/4+1) and one DST-I of length (N/4-1). The decomposition is used recursively. It results in the generalized signal flow graph with regular structure which is shown for N = 2, 4 and 8 in Fig. 3.3. The output data sequence $\{c^{I}_{k}\}$ is in bit-reversed order. According to the split-radix DCT-I algorithm, the matrix $C^{I}_{N+1}$ can be recursively factorized as follows:
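\[
C^{I}_{N+1} = P_{N+1}
\begin{pmatrix}
C^{I}_{N/2+1} & 0 \\
0 & K_{N/2}
\end{pmatrix}
\frac{1}{\sqrt{2}}
\begin{pmatrix}
I_{N/2} & 0 & J_{N/2} \\
0 & \sqrt{2} & 0 \\
J_{N/2} & 0 & -I_{N/2}
\end{pmatrix}, \tag{3.8}
\]
where $K_{N/2}$ is given by (3.9) as the product of a sparse block matrix of identity and cross-identity blocks, the block-diagonal matrix formed from $\tilde{S}^{I}_{N/4-1}$ and $\tilde{C}^{I}_{N/4+1}$, and the rotation matrix $R_{N/2}$, and the initial matrices $K_1$, $K_2$ and $C^{I}_{2}$ are given by (3.10).
[Equations (3.9) and (3.10): the explicit block entries are not recoverable from this transcript.]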
$P_{N+1}$ is a permutation matrix for reordering from bit-reversed to natural order. $\tilde{S}^{I}_{N/4-1}$ and $\tilde{C}^{I}_{N/4+1}$ are unnormalized DST-I and DCT-I matrices, respectively given by
\[
\tilde{S}^{I}_{N/4-1} = J_{N/4-1}\, B_{N/4-1}\, S^{I}_{N/4-1}\, J_{N/4-1}, \qquad
\tilde{C}^{I}_{N/4+1} = J_{N/4+1}\, B_{N/4+1}\, C^{I}_{N/4+1}\, J_{N/4+1},
\]
where $B_{N/4-1}$ and $B_{N/4+1}$ are bit-reversal permutation matrices. $R_{N/2}$ is a rotation matrix.
[Matrix $R_{N/2}$: a sparse matrix of plane rotations built from cosine/sine pairs; its explicit entries are not recoverable from this transcript.]
We note that the recursive sparse matrix factorization of the DCT-I matrix given by (3.8)-(3.10) is valid for N = 2, 4 and 8.
Fig. 3.3 The generalized signal flow graph for the forward and inverse DCT-I computation for N = 2, 4 and 8 based on the split-radix algorithm; $a = \sqrt{2}/2$.
3.2 The fast DST-I and SST algorithms
3.2.1 DST-I computation based on the SST algorithm for $N = 2^m - 1$
Assuming $N = 2^m - 1$, the SST algorithm can be adapted for a new fast recursive computation of the DST-I as follows.
Rewriting (2.10) for $M = N + 1$ we have
\[
\tilde{s}^{I}_{k} = \sqrt{\frac{2}{M}} \sum_{n=0}^{M-2} x_n \sin\frac{\pi (n+1)(k+1)}{M} = \sqrt{\frac{2}{M}}\, a_M(k), \quad k = 0, 1, \ldots, M-2.
\]
Ignoring the scaling factor, we can split the sum $a_M(k)$ into even-indexed and odd-indexed points and, using the symmetries of the transform kernels, the complete formulae are given by
\[
\begin{aligned}
a_M(k) &= b_{M/2}(k) + a_{M/2}(k), & k &= 0, 1, \ldots, \tfrac{M}{2} - 1, \quad a_{M/2}\!\left(\tfrac{M}{2} - 1\right) = 0, \\
a_M(M-2-k) &= b_{M/2}(k) - a_{M/2}(k), & k &= 0, 1, \ldots, \tfrac{M}{2} - 2,
\end{aligned} \tag{3.11}
\]
where
\[
a_{M/2}(k) = \sum_{n=0}^{M/2-2} x_{2n+1} \sin\frac{\pi (n+1)(k+1)}{M/2}, \qquad
b_{M/2}(k) = \sum_{n=0}^{M/2-1} x_{2n} \sin\frac{\pi (2n+1)(k+1)}{2(M/2)}. \tag{3.12}
\]
Hence, the (M-1)-point DST-I is recursively decomposed into an M/2-point DST-II and an (M/2-1)-point DST-I. The corresponding generalized signal flow graph for the forward and inverse DST-I computation for N = 3 and 7 is shown in Fig. 3.4. The input data sequence $\{x_n\}$ is in bit-reversed order. The output data sequence $\{s^{I}_{k}\}$ is in reverse order.
Fig. 3.4 The generalized signal flow graph for the forward and inverse DST-I computation for N = 3 and 7 based on the SST algorithm.
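The same even/odd splitting applies to the sine sum of the DST-I. As a minimal sketch (scaling factors ignored, function names illustrative), the sum a_M(k) over x_0, ..., x_{M-2} is computed directly and from a half-length DST-I-type sum over the odd-indexed inputs plus a DST-II-type sum over the even-indexed inputs.

```python
import math

def a_direct(x, M, k):
    """a_M(k) = sum_{n=0}^{M-2} x[n] sin(pi (n+1)(k+1) / M)."""
    return sum(x[n] * math.sin(math.pi * (n + 1) * (k + 1) / M)
               for n in range(M - 1))

def a_split(x, M):
    """Return [a_M(0), ..., a_M(M-2)] from the two half-length sums."""
    h = M // 2
    # DST-I-type sum over odd-indexed inputs; a_half[h-1] is zero up to rounding.
    a_half = [sum(x[2*n + 1] * math.sin(math.pi * (n + 1) * (k + 1) / h)
                  for n in range(h - 1))
              for k in range(h)]
    # DST-II-type sum over even-indexed inputs.
    b_half = [sum(x[2*n] * math.sin(math.pi * (2*n + 1) * (k + 1) / (2*h))
                  for n in range(h))
              for k in range(h)]
    out = [0.0] * (M - 1)
    for k in range(h):
        out[k] = b_half[k] + a_half[k]
    for k in range(h - 1):
        out[M - 2 - k] = b_half[k] - a_half[k]
    return out
```

Note that the roles are swapped relative to the cosine case: here the odd-indexed inputs feed the transform of the same type (DST-I) and the even-indexed inputs feed the DST-II.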
On the other hand, the N-point SST defined by (2.10) for $N = 2^m$, without the scaling factors, can be expressed as
\[
\tilde{s}^{I}_{k} = \sum_{n=0}^{N-1} x_n \sin\frac{\pi (n+1)(k+1)}{N+1} = a_N(k), \quad k = 0, 1, \ldots, N-1,
\]
and decomposed, by methods similar to those used in (3.11) and (3.12), into an N/2-point SST and an N/2-point DST-II as follows:
\[
\begin{aligned}
a_N(k) &= b_{N/2}(k) + a_{N/2}(k), & k &= 0, 1, \ldots, \tfrac{N}{2} - 1, \\
a_N(N-1-k) &= b_{N/2}(k) - a_{N/2}(k), & k &= 0, 1, \ldots, \tfrac{N}{2} - 1,
\end{aligned} \tag{3.13}
\]
where
\[
a_{N/2}(k) = \sum_{n=0}^{N/2-1} x_{2n+1} \sin\frac{2\pi (n+1)(k+1)}{N+1}, \qquad
b_{N/2}(k) = \sum_{n=0}^{N/2-1} x_{2n} \sin\frac{\pi (2n+1)(k+1)}{N+1}. \tag{3.14}
\]
Unfortunately, the decomposition of the SST given by (3.13) and (3.14) is not recursive, implying that the SST matrix $\tilde{S}^{I}_{N}$ does not have a recursive structure.
3.2.2 DST-I recursive sparse matrix factorizations
The DST-I matrix $S^{I}_{N-1}$ for $N = 2^m$ can be factorized into the following recursive sparse matrix form:
\[
S^{I}_{N-1} = B_{N-1}
\begin{pmatrix}
B_{N/2}\, S^{III}_{N/2} & 0 \\
0 & S^{I}_{N/2-1}\, J_{N/2-1}
\end{pmatrix}
\frac{1}{\sqrt{2}}
\begin{pmatrix}
I_{N/2-1} & 0 & J_{N/2-1} \\
0 & \sqrt{2} & 0 \\
J_{N/2-1} & 0 & -I_{N/2-1}
\end{pmatrix}, \tag{3.15}
\]
where $B_{N-1}$ is a permutation matrix permuting the transformed data sequence from bit-reversed order to natural order. The corresponding generalized signal flow graph for N = 4 and 8 is shown in Fig. 3.5.
Fig. 3.5 The generalized signal flow graph for the forward and inverse DST-I computation for N = 4 and 8 based on (3.15).
A slightly different alternative sparse matrix factorization of the DST-I matrix for $N = 2^m$ is defined as
\[
S^{I}_{N-1} = P_{N-1}
\begin{pmatrix}
S^{III}_{N/2} & 0 \\
0 & J_{N/2-1}\, S^{I}_{N/2-1}\, J_{N/2-1}
\end{pmatrix}
\frac{1}{\sqrt{2}}
\begin{pmatrix}
I_{N/2-1} & 0 & J_{N/2-1} \\
0 & \sqrt{2} & 0 \\
J_{N/2-1} & 0 & -I_{N/2-1}
\end{pmatrix}, \tag{3.16}
\]
where $P_{N-1}$ is a permutation matrix which, when applied to a data vector, corresponds to the reordering
\[
\tilde{x}_0 = x_0, \quad \tilde{x}_{n+1} = x_{2n+2}, \quad \tilde{x}_{N-2-n} = x_{2n+1}, \quad n = 0, 1, \ldots, \tfrac{N}{2} - 2. \tag{3.17}
\]
The generalized signal flow graph for the forward and inverse DST-I computation for N = 4 and 8 is shown in Fig. 3.6. In both factorizations (3.15) and (3.16), the (N-1)-point DST-I is decomposed recursively into an (N/2-1)-point DST-I and an N/2-point DST-III.
Fig. 3.6 The generalized signal flow graph for the forward and inverse DST-I computation for N = 4 and 8 based on (3.16).
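The decomposition of the (N-1)-point DST-I into an N/2-point DST-III and an (N/2-1)-point DST-I can be checked with a one-stage sketch. As an assumption made for readability, the outputs are placed directly at even and odd positions instead of forming the permutation matrix; function names are illustrative and the matrices follow (2.5) and (2.7).

```python
import math

def dst1(N):
    """Orthonormal DST-I matrix of order N-1."""
    return [[math.sqrt(2 / N) * math.sin(math.pi * (n + 1) * (k + 1) / N)
             for n in range(N - 1)] for k in range(N - 1)]

def dst3(N):
    """Orthonormal DST-III matrix of order N."""
    e = lambda q: 1 / math.sqrt(2) if q == N - 1 else 1.0
    return [[math.sqrt(2 / N) * e(n) * math.sin(math.pi * (2*k + 1) * (n + 1) / (2*N))
             for n in range(N)] for k in range(N)]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def dst1_one_stage(x):
    """(N-1)-point DST-I from an N/2-point DST-III and an (N/2-1)-point DST-I."""
    N = len(x) + 1
    h = N // 2
    r = 1 / math.sqrt(2)
    # Scaled sums plus the untouched middle sample feed the DST-III,
    # scaled differences feed the half-size DST-I.
    u = [r * (x[n] + x[N - 2 - n]) for n in range(h - 1)] + [x[h - 1]]
    v = [r * (x[n] - x[N - 2 - n]) for n in range(h - 1)]
    y = [0.0] * (N - 1)
    y[0::2] = matvec(dst3(h), u)   # coefficients with even index k
    y[1::2] = matvec(dst1(h), v)   # coefficients with odd index k
    return y
```

The middle input sample passes through the butterfly stage unchanged, which corresponds to the isolated entry in the centre of the butterfly matrix.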
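3.2.3 The split-radix DST-I algorithm
The complete formulae of the split-radix fast DST-I algorithm are given by
\[
s^{I}_{2k} = \sum_{n=1}^{N/2-1} \left(x_n - x_{N-n}\right) \sin\frac{\pi n k}{N/2}, \quad k = 1, \ldots, \tfrac{N}{2} - 1,
\]
\[
\begin{aligned}
s^{I}_{4k-1} &= a_k - b_k, & k &= 1, 2, \ldots, \tfrac{N}{4}, \\
s^{I}_{4k+1} &= a_k + b_k, & k &= 0, 1, \ldots, \tfrac{N}{4} - 1, \quad a_0 = 0,
\end{aligned} \tag{3.18}
\]
where
\[
a_k = \sum_{n=1}^{N/4-1} \left[\left(x_n + x_{N-n}\right) \cos\frac{\pi n}{N} - \left(x_{N/2-n} + x_{N/2+n}\right) \sin\frac{\pi n}{N}\right] \sin\frac{\pi n k}{N/4},
\]
\[
b_k = x_{N/2} + \sum_{n=1}^{N/4-1} \left[\left(x_n + x_{N-n}\right) \sin\frac{\pi n}{N} + \left(x_{N/2-n} + x_{N/2+n}\right) \cos\frac{\pi n}{N}\right] \cos\frac{\pi n k}{N/4}
+ \left(x_{N/4} + x_{3N/4}\right) \cos\frac{(4k+1)\pi}{4}. \tag{3.19}
\]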
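The split-radix step for the odd-indexed DST-I coefficients follows from sin((4k +/- 1)t) = sin(4kt) cos(t) +/- cos(4kt) sin(t), followed by folding the sums about n = N/4. The sketch below checks this numerically for the unnormalized DST-I written with 1-based indices, s_k = sum over n = 1..N-1 of x_n sin(pi n k / N). The function names and the dummy entry x[0] are conveniences of the example, and the helper sums are written out as derived from the identity.

```python
import math

def dst1_unnormalized(x):
    """s_k = sum_{n=1}^{N-1} x[n] sin(pi n k / N); x[0] is an unused dummy."""
    N = len(x)
    return {k: sum(x[n] * math.sin(math.pi * n * k / N) for n in range(1, N))
            for k in range(1, N)}

def split_radix_odd(x):
    """Odd-indexed DST-I coefficients from the quarter-length sums a_k, b_k."""
    N = len(x)
    q, half = N // 4, N // 2
    cos, sin, pi = math.cos, math.sin, math.pi

    def a(k):  # sine-type sum (a DST-I of length N/4 - 1 in k)
        return sum(((x[n] + x[N - n]) * cos(pi * n / N)
                    - (x[half - n] + x[half + n]) * sin(pi * n / N))
                   * sin(pi * n * k / q) for n in range(1, q))

    def b(k):  # cosine-type sum (a DCT-I of length N/4 + 1 in k)
        total = x[half]
        total += sum(((x[n] + x[N - n]) * sin(pi * n / N)
                      + (x[half - n] + x[half + n]) * cos(pi * n / N))
                     * cos(pi * n * k / q) for n in range(1, q))
        return total + (x[q] + x[3 * q]) * cos(pi * (4 * k + 1) / 4)

    out = {}
    for k in range(1, q + 1):
        out[4 * k - 1] = a(k) - b(k)
    for k in range(q):
        out[4 * k + 1] = a(k) + b(k)
    return out
```

Together with the half-length DST-I for the even-indexed coefficients, this accounts for the three sub-transforms of the first split-radix stage.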
Hence, the first stage of the split-radix decomposition replaces the (N-1)-point DST-I by one (N/2-1)-point DST-I, one DST-I of length (N/4-1) and one DCT-I of length (N/4+1). The decomposition is used recursively. It results in the generalized signal flow graph with regular structure which is shown for N = 4 and 8 in Fig. 3.7. The output data sequence $\{s^{I}_{k}\}$ is in bit-reversed order. According to the split-radix DST-I algorithm, the matrix $S^{I}_{N-1}$ can be recursively factorized as follows:
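\[
S^{I}_{N-1} = P_{N-1}
\begin{pmatrix}
K_{N/2} & 0 \\
0 & S^{I}_{N/2-1}\, J_{N/2-1}
\end{pmatrix}
\frac{1}{\sqrt{2}}
\begin{pmatrix}
I_{N/2-1} & 0 & J_{N/2-1} \\
0 & \sqrt{2} & 0 \\
J_{N/2-1} & 0 & -I_{N/2-1}
\end{pmatrix}, \tag{3.20}
\]
where $K_{N/2}$ is given by (3.21) as a product involving a sparse block matrix of identity and cross-identity blocks, the block-diagonal matrix formed from $\tilde{S}^{I}_{N/4-1}$ and $\tilde{C}^{I}_{N/4+1}$, a permutation matrix $Q_{N/2}$ and the rotation matrix $R_{N/2}$, and the initial matrices $K_2 = S^{III}_{2}$ and $Q_4$ are given by (3.22).
[Equations (3.21) and (3.22): the explicit block entries are not recoverable from this transcript.]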
$P_{N-1}$ is a permutation matrix for reordering from bit-reversed to natural order. $\tilde{S}^{I}_{N/4-1}$ and $\tilde{C}^{I}_{N/4+1}$ are unnormalized DST-I and DCT-I matrices, respectively given by
\[
\tilde{S}^{I}_{N/4-1} = J_{N/4-1}\, B_{N/4-1}\, S^{I}_{N/4-1}\, J_{N/4-1}, \qquad
\tilde{C}^{I}_{N/4+1} = J_{N/4+1}\, B_{N/4+1}\, C^{I}_{N/4+1}\, J_{N/4+1},
\]
where $B_{N/4-1}$ and $B_{N/4+1}$ are bit-reversal permutation matrices. $R_{N/2}$ is a rotation matrix.
[Matrix $R_{N/2}$: a sparse matrix of plane rotations built from cosine/sine pairs; its explicit entries are not recoverable from this transcript.]
We note that the recursive sparse matrix factorization of the DST-I matrix given by (3.20)-(3.22) is valid for N = 4 and 8.
Fig. 3.7 The generalized signal flow graph for the forward and inverse DST-I computation for N = 4 and 8 based on the split-radix algorithm.
3.3 The fast DCT-II/DST-II and DCT-III/DST-III algorithms
The first direct real-valued fast algorithm for the DCT-II computation is based on the recursive sparse matrix factorization of the DCT-II transform matrix defined as
\[
C^{II}_{N} = B_N
\begin{pmatrix}
\hat{C}^{II}_{N/2} & 0 \\
0 & \hat{C}^{IV}_{N/2}\, J_{N/2}
\end{pmatrix}
\begin{pmatrix}
I_{N/2} & J_{N/2} \\
J_{N/2} & -I_{N/2}
\end{pmatrix}, \tag{3.23}
\]
where $\hat{C}^{II}_{N/2}$ is the DCT-II matrix of half size with bit-reverse reordered rows, and $\hat{C}^{IV}_{N/2} J_{N/2}$ is the DCT-IV matrix of half size with bit-reverse reordered rows and its columns in reverse order. $B_N$ is a permutation matrix which permutes the transformed data sequence from bit-reversed order to natural order. The recursive sparse matrix factorization has
become the fundamental form in the subsequent development of direct real-valued fast DCT-II algorithms, and it has initiated an extensive search for an optimal factorization of the DCT-IV matrix. Essentially, the recursive sparse matrix factorization (3.23) is evident when we consider the DCT-II in the form of a sum as follows:
\[
c^{II}_{k} = \sum_{n=0}^{N-1} x_n \cos\frac{\pi (2n+1) k}{2N}, \quad k = 0, 1, \ldots, N-1.
\]
Splitting the sum into even-indexed and odd-indexed transform coefficients and using the symmetries of the transform kernels we get
\[
c^{II}_{2k} = \sum_{n=0}^{N/2-1} \left(x_n + x_{N-1-n}\right) \cos\frac{\pi (2n+1) k}{2(N/2)},
\]
\[
c^{II}_{2k+1} = \sum_{n=0}^{N/2-1} \left(x_n - x_{N-1-n}\right) \cos\frac{\pi (2n+1)(2k+1)}{4(N/2)}, \quad k = 0, 1, \ldots, \tfrac{N}{2} - 1.
\]
Thus, the N-point DCT-II is recursively decomposed into an N/2-point DCT-II and an N/2-point DCT-IV.
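The even/odd coefficient split of the DCT-II translates directly into a recursive routine: sums of mirrored inputs go to a half-length DCT-II, differences go to a half-length DCT-IV. The sketch below, for the unnormalized DCT-II, recurses on the DCT-II branch only and evaluates the DCT-IV branch directly, which is a simplification chosen for the example; the function names are illustrative.

```python
import math

def dct2_unnormalized(x):
    """c_k = sum_n x[n] cos(pi (2n+1) k / (2N))."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * (2*n + 1) * k / (2*N)) for n in range(N))
            for k in range(N)]

def dct4_unnormalized(x):
    """d_k = sum_n x[n] cos(pi (2n+1)(2k+1) / (4N))."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * (2*n + 1) * (2*k + 1) / (4*N))
                for n in range(N))
            for k in range(N)]

def dct2_recursive(x):
    """DCT-II of a power-of-two length via the even/odd coefficient split."""
    N = len(x)
    if N == 1:
        return [x[0]]
    h = N // 2
    u = [x[n] + x[N - 1 - n] for n in range(h)]   # feeds the half-size DCT-II
    v = [x[n] - x[N - 1 - n] for n in range(h)]   # feeds the half-size DCT-IV
    y = [0.0] * N
    y[0::2] = dct2_recursive(u)
    y[1::2] = dct4_unnormalized(v)
    return y
```

A fully fast algorithm additionally needs an efficient DCT-IV, which is why the search for good DCT-IV factorizations matters for the overall operation count.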
Fig. 3.8 The generalized signal flow graph for the DCT-II computation for N = 2, 4 and 8 based on (3.23).
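A slightly different recursive sparse matrix factorization of the DCT-II matrix and, for completeness, of the DST-II matrix are respectively defined as
\[
C^{II}_{N} = P_N
\begin{pmatrix}
C^{II}_{N/2} & 0 \\
0 & J_{N/2}\, C^{IV}_{N/2}\, J_{N/2}
\end{pmatrix}
\begin{pmatrix}
I_{N/2} & J_{N/2} \\
J_{N/2} & -I_{N/2}
\end{pmatrix}, \tag{3.24}
\]
\[
S^{II}_{N} = P_N
\begin{pmatrix}
S^{IV}_{N/2} & 0 \\
0 & J_{N/2}\, S^{II}_{N/2}\, J_{N/2}
\end{pmatrix}
\begin{pmatrix}
I_{N/2} & J_{N/2} \\
J_{N/2} & -I_{N/2}
\end{pmatrix}, \tag{3.25}
\]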
where $P_N$ is a permutation matrix which reorders the transformed vector such that the first half are the even-indexed coefficients in natural order, while the second half are the odd-indexed coefficients in reverse order. The generalized signal flow graph for the DCT-II computation with the proposed factorization of $C^{IV}_{N/2}$ for N = 2, 4 and 8 is shown in Fig. 3.9.
Fig. 3.9 The generalized signal flow graph for the DCT-II computation for N = 2, 4 and 8 based on (3.24).
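The orthogonal recursive sparse matrix factorization of the $C^{II}_{N}$ matrix for $N = 2^m$, $m > 1$, is of the form
\[
C^{II}_{N} = P_N^{T}
\begin{pmatrix}
C^{II}_{N/2} & 0 \\
0 & C^{IV}_{N/2}
\end{pmatrix}
\frac{\sqrt{2}}{2}
\begin{pmatrix}
I_{N/2} & J_{N/2} \\
I_{N/2} & -J_{N/2}
\end{pmatrix}
= P_N^{T}
\begin{pmatrix}
C^{II}_{N/2} & 0 \\
0 & C^{IV}_{N/2}
\end{pmatrix}
\begin{pmatrix}
I_{N/2} & 0 \\
0 & J_{N/2}
\end{pmatrix}
\frac{\sqrt{2}}{2}
\begin{pmatrix}
I_{N/2} & J_{N/2} \\
J_{N/2} & -I_{N/2}
\end{pmatrix}, \tag{3.26}
\]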
where $P_N$ is a permutation matrix; when it is applied to a data vector it corresponds to the reordering
\[
\tilde{x}_n = x_{2n}, \quad \tilde{x}_{N/2+n} = x_{2n+1}, \quad n = 0, 1, \ldots, \tfrac{N}{2} - 1. \tag{3.27}
\]
Note that $P_N^{T} = P_N^{-1}$. By combining the orthogonal recursive sparse matrix factorizations of $C^{II}_{N/2}$ and $C^{IV}_{N/2}$, we obtain the following factorization of the $C^{II}_{N}$ matrix:
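[Equation (3.28): a factorization of $C^{II}_{N}$ in terms of the permutations $P_N^{T}$ and $P_{N/2}^{T}$, the quarter-size matrices $C^{II}_{N/4}$ and $C^{IV}_{N/4}$, and the matrices $A_{N/2}$, $T^{(1)}_{N/2}$, $T^{(0)}_{N/2}$ and $T^{(0)}_{N}$; the explicit block layout is not recoverable from this transcript.]
where $A_{N/2}$, $T^{(1)}_{N/2}$, $T^{(0)}_{N/2}$ and $T^{(0)}_{N}$ are orthogonal matrices. $A_{N/2}$ is built from the blocks $I_{N/4}$, $I_{N/4-1}$, $D_{N/4}$ and $J_{N/4}$ with scaling factors $\sqrt{2}$ and $\frac{\sqrt{2}}{2}$, $T^{(1)}_{N/2}$ consists of plane rotations together with the blocks $I_{N/4}$ and $D_{N/4}$, and
\[
T^{(0)}_{N/2} = \frac{\sqrt{2}}{2}
\begin{pmatrix}
I_{N/4} & J_{N/4} \\
I_{N/4} & -J_{N/4}
\end{pmatrix}
=
\begin{pmatrix}
I_{N/4} & 0 \\
0 & J_{N/4}
\end{pmatrix}
\frac{\sqrt{2}}{2}
\begin{pmatrix}
I_{N/4} & J_{N/4} \\
J_{N/4} & -I_{N/4}
\end{pmatrix}.
\]
[Matrices $A_{N/2}$ and $T^{(1)}_{N/2}$: explicit entries are not recoverable from this transcript.]
Here $D_{N/4} = \mathrm{diag}\{(-1)^k\}$, $k = 0, 1, \ldots, \tfrac{N}{4} - 1$, is the diagonal odd-sign changing matrix.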
The corresponding generalized signal flow graph for the DCT-II computation for N = 2, 4 and 8 is shown in Fig. 3.10. For N = 8, each output coefficient should be normalized by the scaling factor $\sqrt{2}/4$ to get the true DCT-II coefficients.
Fig. 3.10 The generalized signal flow graph for the DCT-II computation for N = 2, 4 and 8 based on (3.28); $a = \sqrt{2}$.
Finally, it is important to present a practical fast algorithm from the class of fast 8-point DCT-II algorithms generated from the full matrix equation in a systematic way using graph transformations and equivalence relations. In the definition of the DCT-II the scaling constant $\sqrt{2}$ has been introduced, which results in $\sqrt{2}\,\epsilon_k = 1$ for $k = 0$, allowing the coefficient $c^{II}_{0}$ to be evaluated without any multiplication. The 8-point DCT-II computation requires 11 multiplications and 29 additions, thus achieving the theoretical lower bound on the number of multiplications for N = 8. The corresponding signal flow graph for the scaled DCT-II computation for N = 8 is shown in Fig. 3.11. For N = 8, each output coefficient should be normalized by the scaling factor $1/\sqrt{8}$ to get the true DCT-II coefficients.
Fig. 3.11 The signal flow graph for the scaled DCT-II computation for N = 8; $a = \sqrt{2}$.
In Section 3.1.3, we observe from the split-radix fast DCT-I algorithm that there exists a factorization of the N-point DCT-III matrix $C^{III}_{N}$ based on the (N/2-1)-point DST-I and the (N/2+1)-point DCT-I. Actually, such orthogonal recursive sparse matrix factorizations of the $C^{III}_{N}$ and $S^{III}_{N}$ transform matrices with scaling $\sqrt{2}$ are respectively defined as
3.4 The fast DCT-IV/DST-IV algorithms
In general, the efficient computation of DCT-IV
Conclusion
The fast 1-D DCT/DST algorithms for all even types of DCT and DST have been discussed in detail. Almost all are direct algorithms defined by the (recursive, if it exists) sparse matrix factorization of the transform matrix. The definitions, basic mathematical properties and relations between corresponding DCTs and DSTs have been briefly discussed. The explicit forms of the orthonormal DCT and DST matrices for N = 2, 4 and 8 have been presented. In particular, these rotation-based fast algorithms are very convenient for the construction of corresponding integer transforms. The generalized signal flow graphs corresponding to the sparse matrix factorization of the transform matrix for the fast DCT/DST computation have also been provided, and they are ready to be used in practical applications.
REFERENCE
[1] H. S. Hou, "A fast recursive algorithm for computing the discrete cosine transform," IEEE Trans. Acoust., Speech, Signal Processing,