Page 1: KOZ Scalable Audio

2005/11/10

KOZ Scalable Audio

An Introduction

Speaker: 陳繼大

Page 2: KOZ Scalable Audio

References

K. M. Short et al., "An Introduction to the KOZ Scalable Audio Compression Technology," AES 118th Convention, Barcelona, May 2005, Preprint 6446.

M. K. Johnson, "Controlled Chaos and Other Sound Synthesis Techniques," B.S. thesis, University of New Hampshire, May 2000.

D. J. Nelson and K. M. Short, "A channelized cross spectral method for improved frequency resolution," Proceedings of the IEEE-SP International Symposium on Time-Frequency and Time-Scale Analysis, IEEE Press, October 1998.

Page 3: KOZ Scalable Audio

References (cont.)

"KOZ scalable audio compression," ISO/IEC JTC 1/SC 29/WG 11, M12253.

Page 4: KOZ Scalable Audio

Outline

Introduction

Double-scroll oscillator

High Freq. Resolution Analysis

Unified Domain

KOZ Scalable Audio

Results

Conclusions

Page 5: KOZ Scalable Audio

Introduction

Traditional transform- and subband-based codecs encode audio by quantizing coefficients according to a psychoacoustic model.

Parametric coding is an alternative: it records the parameters of a signal model rather than coefficients.

KOZ scalable audio is a parametric coding method.

KOZ scalable audio uses a chaotic system as its model.

A chaotic system is a nonlinear system.

Page 6: KOZ Scalable Audio

Introduction (cont.)

Features of the KOZ codec:

  Flexibility over a wide range of bitrates

  Both small-step and large-step scalability

  High-resolution objects allow easy decoder-side post-processing

  Integrated Digital Rights Management

Page 7: KOZ Scalable Audio

Double-scroll oscillator

Chaotic system:

  Nonlinear dynamical system

  Deterministic mathematical object

  Sensitive dependence on initial conditions

  Predictable over a short period of time

  Unpredictable in terms of long-term behavior
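Sensitive dependence on initial conditions can be seen in a few lines. The sketch below uses the logistic map, a textbook chaotic system chosen here only for brevity; it is not the system used by KOZ, and the function name and starting values are illustrative assumptions.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x) and return every visited state."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit


# Two starting points that differ by only 1e-10.
orbit_a = logistic_orbit(0.2, 80)
orbit_b = logistic_orbit(0.2 + 1e-10, 80)
gap = [abs(a - b) for a, b in zip(orbit_a, orbit_b)]

# Early on the orbits agree closely (short-term predictability);
# later the gap grows until the orbits are unrelated.
print("gap after 5 steps:", gap[5])
print("largest gap over steps 40 to 80:", max(gap[40:]))
```

The rule is fully deterministic, yet a difference far below any practical measurement precision eventually dominates the output.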

Page 8: KOZ Scalable Audio

Double-scroll oscillator (cont.)

Cupolets:

  Periodic output waveforms of a chaotic system

  The control process requires only on the order of 16 bits of information

  Yet cupolets can be as simple as a sine wave or so complex that they have more than 200 harmonics in their spectrum

Page 9: KOZ Scalable Audio

Double-scroll oscillator (cont.)

A chaotic system will settle down onto a complicated structure called an attractor. It settles onto the same attractor no matter what initial conditions are used.

A chaotic system in its natural state is aperiodic.

Periodic orbits can be stabilized simply by perturbing the state of the system by a tiny amount at certain fixed locations.

Page 10: KOZ Scalable Audio

Double-scroll oscillator (cont.)

Double-scroll oscillator:

  One kind of chaotic system

  Described by nonlinear differential equations

Page 11: KOZ Scalable Audio

Double-scroll oscillator (cont.)

where

Parameters: C, L, G, m, B
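The equations on this slide did not survive extraction. As a reference point, the sketch below assumes the standard Chua-circuit form of the double-scroll oscillator, in which the parameters appear as C1, C2, L, G, m0, m1 and Bp. The numeric values are the classic ones from the chaos literature and are an assumption, not something taken from the slide; the function names, step size and initial state are likewise illustrative.

```python
def chua_diode(v, m0=-0.5, m1=-0.8, bp=1.0):
    """Piecewise-linear resistor: slope m1 for |v| <= bp, slope m0 outside."""
    return m0 * v + 0.5 * (m1 - m0) * (abs(v + bp) - abs(v - bp))


def derivative(state, c1=1.0 / 9.0, c2=1.0, l=1.0 / 7.0, g=0.7):
    """Assumed circuit equations:

    C1 dv1/dt = G (v2 - v1) - g(v1)
    C2 dv2/dt = G (v1 - v2) + iL
    L  diL/dt = -v2
    """
    v1, v2, il = state
    return (
        (g * (v2 - v1) - chua_diode(v1)) / c1,
        (g * (v1 - v2) + il) / c2,
        -v2 / l,
    )


def rk4_step(state, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))

    k1 = derivative(state)
    k2 = derivative(shifted(state, k1, dt / 2.0))
    k3 = derivative(shifted(state, k2, dt / 2.0))
    k4 = derivative(shifted(state, k3, dt))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state=(0.7, 0.0, 0.0), dt=0.01, steps=40000):
    """Return the list of (v1, v2, iL) states visited by the integrator."""
    trajectory = [state]
    for _ in range(steps):
        state = rk4_step(state, dt)
        trajectory.append(state)
    return trajectory


trajectory = simulate()
v1_values = [s[0] for s in trajectory]
print("v1 range:", min(v1_values), max(v1_values))
```

Plotting v1 against iL from `trajectory` should show the two lobes that give the double-scroll attractor its name.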

Page 12: KOZ Scalable Audio

Double-scroll oscillator (cont.)

Page 13: KOZ Scalable Audio

Double-scroll oscillator (cont.)

The double-scroll attractor can be controlled in such a way that the trajectories around it become periodic.

Control perturbation:

  A bit string, generally of 16 bits

  Applied at an intersection with the control line

  Periodic orbits are in one-to-one correspondence with the control string used, independent of the initial state of the system

Page 14: KOZ Scalable Audio

Double-scroll oscillator (cont.)

Page 15: KOZ Scalable Audio

High Freq. Resolution Analysis

Goal: detect tones and estimate their frequencies accurately.

IF-based (instantaneous frequency) methods:

  Differentiate the signal phase

  Fail completely if the signal environment consists of more than one sinusoid

CPS (Cross Power Spectral) method:

  Time-averaged IF method

  Phase differentiation is applied to a time-varying Fourier transform

  The Fourier transform is used to "channelize" the signal, isolating the tones
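The failure of plain phase differentiation with more than one sinusoid can be illustrated with a minimal sketch. It assumes complex (analytic) test signals and a simple sample-to-sample phase difference; the function names, frequencies and amplitudes are illustrative choices, not values from the talk.

```python
import cmath
import math


def instantaneous_frequency(z, fs):
    """Phase difference between consecutive complex samples, in Hz."""
    return [
        cmath.phase(z[n + 1] * z[n].conjugate()) * fs / (2.0 * math.pi)
        for n in range(len(z) - 1)
    ]


def tone(freq, fs, n, amp=1.0):
    """Complex exponential of the given frequency and amplitude."""
    return [amp * cmath.exp(2j * math.pi * freq * i / fs) for i in range(n)]


fs = 8000.0
one_tone = tone(1000.0, fs, 200)
two_tones = [a + b for a, b in zip(one_tone, tone(1400.0, fs, 200, amp=0.6))]

if_one = instantaneous_frequency(one_tone, fs)
if_two = instantaneous_frequency(two_tones, fs)

# A single tone gives a constant estimate; the mixture gives an estimate
# that swings with the beat between the tones instead of reporting either one.
print("one tone :", min(if_one), max(if_one))
print("two tones:", min(if_two), max(if_two))
```

This is the motivation for channelizing first: once a Fourier transform isolates each tone in its own channel, phase differentiation is applied to a signal that is again close to a single sinusoid.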

Page 16: KOZ Scalable Audio

High Freq. Resolution Analysis (cont.)

Improved (channelized) CPS estimator:

  CPS cannot detect and estimate tones that are not well separated

  Employ a second Fourier transform

TVFT (time-varying Fourier transform), where w is the analysis window:

$$F(\omega, T) = \int f(t + T)\, w(-t)\, e^{-i\omega t}\, dt$$

Page 17: KOZ Scalable Audio

High Freq. Resolution Analysis (cont.)

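Channelized CPS: if f(t) is a tone of frequency $\omega_0$, then

$$F(\omega, T + \epsilon/2)\, F^{*}(\omega, T - \epsilon/2) = e^{i\omega_0 \epsilon}\, |F(\omega, T)|^{2}$$

so the tone frequency is the phase derivative of the channel output:

$$\omega_0 = \frac{\partial}{\partial T} \arg\big(F(\omega_0, T)\big) = \lim_{\epsilon \to 0} \frac{1}{\epsilon} \arg\big(F(\omega_0, T + \epsilon/2)\, F^{*}(\omega_0, T - \epsilon/2)\big)$$

The estimate is the expected value (time average) of this quantity:

$$E\Big(\lim_{\epsilon \to 0} \frac{1}{\epsilon} \arg\big(F(\omega_0, T + \epsilon/2)\, F^{*}(\omega_0, T - \epsilon/2)\big)\Big)$$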
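The cross-spectral idea can be tried numerically. The following is a minimal sketch, not the estimator from the referenced paper: it assumes a discrete approximation in which the limit is replaced by a one-sample shift between two Hann-windowed DFT frames, and all names and signal values are illustrative.

```python
import cmath
import math


def windowed_dft(frame):
    """Hann-windowed DFT of one frame (non-negative frequency bins only)."""
    n = len(frame)
    win = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]
    return [
        sum(win[i] * frame[i] * cmath.exp(-2j * math.pi * k * i / n)
            for i in range(n))
        for k in range(n // 2 + 1)
    ]


def cross_spectral_frequency(x, fs, n_fft=256):
    """Return (bin centre frequency, cross-spectral frequency) of the peak."""
    frame_a = windowed_dft(x[0:n_fft])          # F(w, T)
    frame_b = windowed_dft(x[1:n_fft + 1])      # F(w, T + one sample)
    cross = [b * a.conjugate() for a, b in zip(frame_a, frame_b)]
    k = max(range(len(cross)), key=lambda i: abs(cross[i]))
    radians_per_sample = cmath.phase(cross[k])  # phase advance over the shift
    return k * fs / n_fft, radians_per_sample * fs / (2.0 * math.pi)


fs = 8000.0
true_freq = 1003.7
signal = [math.cos(2.0 * math.pi * true_freq * i / fs) for i in range(257)]
bin_freq, cps_freq = cross_spectral_frequency(signal, fs)

print("bin centre frequency:", bin_freq)
print("cross-spectral estimate:", cps_freq)
```

With a 256-point transform at 8 kHz the bins are 31.25 Hz apart, so the peak bin alone can only locate the tone coarsely; the phase of the cross-spectrum recovers the frequency to well within a bin.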

Page 18: KOZ Scalable Audio

High Freq. Resolution Analysis (cont.)

Page 19: KOZ Scalable Audio

Unified Domain

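Convert the multiple channels into a representation in the special unitary group.

Special unitary group SU(n):

  Group of n×n unitary matrices with determinant 1

  Subgroup of the unitary group

SU(2):

$$U(a, b) = \begin{pmatrix} a & b \\ -b^{*} & a^{*} \end{pmatrix}, \qquad a a^{*} + b b^{*} = 1$$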
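The SU(2) parametrization can be checked numerically. The slide does not say how the audio channels are mapped to a and b, so the sketch below makes an assumption purely for illustration: one complex spectral value per channel (left and right), normalized by their joint magnitude. The function names are hypothetical.

```python
import math


def su2_from_channels(left, right):
    """Build U(a, b) = [[a, b], [-conj(b), conj(a)]] from two complex values.

    Dividing by r = sqrt(|left|^2 + |right|^2) enforces a a* + b b* = 1.
    """
    r = math.sqrt(abs(left) ** 2 + abs(right) ** 2)
    a, b = left / r, right / r
    return r, [[a, b], [-b.conjugate(), a.conjugate()]]


def determinant(u):
    """Determinant of a 2x2 matrix."""
    return u[0][0] * u[1][1] - u[0][1] * u[1][0]


def times_conjugate_transpose(u):
    """Product of a 2x2 matrix with its own conjugate transpose."""
    return [
        [sum(u[i][k] * u[j][k].conjugate() for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


r, u = su2_from_channels(0.3 + 0.4j, -0.5 + 1.2j)
print("joint magnitude:", r)
print("det U:", determinant(u))
print("U times its conjugate transpose:", times_conjugate_transpose(u))
```

Under this assumed mapping the joint magnitude r is carried separately, and the unit-determinant unitary matrix holds the relative amplitude and phase of the two channels.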

Page 20: KOZ Scalable Audio

KOZ Scalable Audio

Page 21: KOZ Scalable Audio

KOZ Scalable Audio (cont.)

Prioritize the components: psychoacoustics is used to rank them in order of perceptual importance.

Classes of objects are then sorted in order of their "perceptual relevance".

Objects are segregated and written to the floating-point .CCA file format.

KOZ achieves scalability through this sorting.
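How sorting can yield scalability is sketched below under stated assumptions. The slide does not describe the .CCA layout or how relevance is scored, so the `AudioObject` fields, the scores, the sizes and the truncation rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AudioObject:
    name: str
    relevance: float  # hypothetical perceptual-relevance score
    bits: int         # hypothetical coded size in bits


def scalable_order(objects):
    """Most perceptually relevant objects first."""
    return sorted(objects, key=lambda o: o.relevance, reverse=True)


def truncate_to_budget(ordered, bit_budget):
    """Keep the leading objects that fit; stop at the first that does not."""
    kept, used = [], 0
    for obj in ordered:
        if used + obj.bits > bit_budget:
            break
        kept.append(obj)
        used += obj.bits
    return kept


objects = [
    AudioObject("weak high partial", 0.2, 300),
    AudioObject("strong tone", 0.9, 400),
    AudioObject("noise floor", 0.1, 500),
    AudioObject("transient", 0.7, 350),
]
ordered = scalable_order(objects)
low_rate = [o.name for o in truncate_to_budget(ordered, 800)]
high_rate = [o.name for o in truncate_to_budget(ordered, 1200)]

print("800-bit stream :", low_rate)
print("1200-bit stream:", high_rate)
```

Because the lower-rate stream is a prefix of the higher-rate one, a single sorted file can serve several bitrates by cutting it at different points.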

Page 22: KOZ Scalable Audio

KOZ Scalable Audio (cont.)

Page 23: KOZ Scalable Audio

KOZ Scalable Audio (cont.)

Page 24: KOZ Scalable Audio

Results

Page 25: KOZ Scalable Audio

Conclusions

KOZ scalable audio uses a chaotic system to model the audio signal.

CPS is applied to find tones and estimate their frequencies accurately.

Scalability is achieved by sorting classes of objects in order of their "perceptual relevance".