![Page 1: Acoustics Research Institute Austrian Academy of Science MPEG-7 Todays Multimedia Standard Peter Balazs Institut für Schallforschung](https://reader036.vdocuments.pub/reader036/viewer/2022062619/551618af550346a2308b5645/html5/thumbnails/1.jpg)
Acoustics Research Institute
Austrian Academy of Sciences
MPEG-7: Today's Multimedia Standard
Peter Balazs
http://www.kfs.oeaw.ac.at
Institut für Schallforschung der Österreichischen Akademie der Wissenschaften: A-1010 Wien; Liebiggasse 5. Tel. +43 1/4277-29500; Fax +43 1/4277-9296; email: [email protected]; http://www.kfs.oeaw.ac.at
OeAW-ISF
Peter Balazs
1999: started as programmer at the ISF
2001: finished mathematics (University of Vienna)
![Page 2: Acoustics Research Institute Austrian Academy of Science MPEG-7 Todays Multimedia Standard Peter Balazs Institut für Schallforschung](https://reader036.vdocuments.pub/reader036/viewer/2022062619/551618af550346a2308b5645/html5/thumbnails/2.jpg)
MPEG-7
• ISO/IEC standard: "Multimedia Content Description Interface"
• Multimedia data / metadata description system: low level to high level; content-based
• Open system: inheritance
• Description of methods: normative vs. informative
<AudioDescriptor xsi:type="SoundModelStatePathType">
  <SoundModelRef>IDDogBarks</SoundModelRef>
  <StateRef>IDState1</StateRef> <RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState2</StateRef> <RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState3</StateRef> <RelativeFrequency>0.045</RelativeFrequency>
  <StateRef>IDState4</StateRef> <RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState5</StateRef> <RelativeFrequency>0.442</RelativeFrequency>
  <StateRef>IDState6</StateRef> <RelativeFrequency>0.513</RelativeFrequency>
</AudioDescriptor>
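Because the description is plain XML, any XML parser can read it. The following is a minimal sketch in Python, not part of the standard: it pairs each StateRef with the RelativeFrequency that follows it. The function name `read_state_frequencies` is hypothetical, and the `xmlns:xsi` declaration is an assumption added so that the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

# Assumption: the usual XML Schema instance namespace, declared here only so
# that the xsi:type attribute of the fragment is well-formed in isolation.
XSI = "http://www.w3.org/2001/XMLSchema-instance"

DESCRIPTOR = f"""
<AudioDescriptor xmlns:xsi="{XSI}" xsi:type="SoundModelStatePathType">
  <SoundModelRef>IDDogBarks</SoundModelRef>
  <StateRef>IDState1</StateRef><RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState2</StateRef><RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState3</StateRef><RelativeFrequency>0.045</RelativeFrequency>
  <StateRef>IDState4</StateRef><RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState5</StateRef><RelativeFrequency>0.442</RelativeFrequency>
  <StateRef>IDState6</StateRef><RelativeFrequency>0.513</RelativeFrequency>
</AudioDescriptor>
"""


def read_state_frequencies(xml_text):
    """Return the referenced sound model and a {state: relative frequency} map."""
    root = ET.fromstring(xml_text)
    model = root.findtext("SoundModelRef")
    states = [element.text for element in root.findall("StateRef")]
    freqs = [float(element.text) for element in root.findall("RelativeFrequency")]
    return model, dict(zip(states, freqs))


model, histogram = read_state_frequencies(DESCRIPTOR)
dominant = max(histogram, key=histogram.get)  # state with the largest share
print(model, dominant, histogram[dominant])
```

The relative frequencies of the six states add up to 1, so the result can be read as a histogram over the states of the referenced sound model.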
![Page 4: Acoustics Research Institute Austrian Academy of Science MPEG-7 Todays Multimedia Standard Peter Balazs Institut für Schallforschung](https://reader036.vdocuments.pub/reader036/viewer/2022062619/551618af550346a2308b5645/html5/thumbnails/4.jpg)
MPEG-7
• History
Call for Proposals: October 1998
Evaluation: February 1999
First version of Working Draft (WD): December 1999
Committee Draft (CD): October 2000
Final Committee Draft (FCD): February 2001
Final Draft International Standard (FDIS): July 2001
International Standard (IS): September 2001
• Development
Amendment Audio: May 2002
Call for Proposals (Systems, version 2): July 2002
MPEG-21 international standard: April 2009
![Page 5: Acoustics Research Institute Austrian Academy of Science MPEG-7 Todays Multimedia Standard Peter Balazs Institut für Schallforschung](https://reader036.vdocuments.pub/reader036/viewer/2022062619/551618af550346a2308b5645/html5/thumbnails/5.jpg)
XML
XML = eXtensible Markup Language
• Metalanguage
• Hypertext
• Markup: markup = tag, <Befehl> ... </Befehl> (Befehl = command)
• Open standard
<?xml version="1.0"?>
<!-- XML test -->
<!DOCTYPE document [
  <!ELEMENT ADRESSE (Vorname, Nachname, Wohnort)>
  <!ELEMENT Vorname (#PCDATA)>
  ....
]>
<ADRESSE>
  <Vorname> Peter </Vorname>
  <Nachname> Balazs </Nachname>
  <Wohnort> Tulln </Wohnort>
</ADRESSE>
<ADRESSE> ........
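The content model in the DTD, ADRESSE (Vorname, Nachname, Wohnort), says that every ADRESSE element contains exactly these three children in this order. A minimal Python sketch of that rule follows. The standard library parser does not validate against a DTD, so the check is written by hand; the wrapper element `document` is an assumption taken from the DOCTYPE name, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Content model from the DTD line: <!ELEMENT ADRESSE (Vorname, Nachname, Wohnort)>
ADRESSE_MODEL = ["Vorname", "Nachname", "Wohnort"]

# Assumption: the records sit inside a root element named after the DOCTYPE.
DOCUMENT = """
<document>
  <ADRESSE>
    <Vorname> Peter </Vorname>
    <Nachname> Balazs </Nachname>
    <Wohnort> Tulln </Wohnort>
  </ADRESSE>
</document>
"""


def matches_model(element, model):
    """True if the child tags equal the declared sequence, in order."""
    return [child.tag for child in element] == model


def read_addresses(xml_text):
    """Parse all ADRESSE records, rejecting any that break the content model."""
    root = ET.fromstring(xml_text)
    records = []
    for adresse in root.findall("ADRESSE"):
        if not matches_model(adresse, ADRESSE_MODEL):
            raise ValueError("ADRESSE does not match (Vorname, Nachname, Wohnort)")
        records.append({child.tag: (child.text or "").strip() for child in adresse})
    return records


print(read_addresses(DOCUMENT))
```

A validating parser would perform this check from the DTD itself; the sketch only makes the meaning of the declaration visible.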
<Set ID="Viewer3" RunMode="Multiple">
  <Table ID="Settings">
    CursorOpts = 0 0 1 440
    SignalOpts = 1 1
  </Table>
  <Set ID="Profiles">
    <Table ID="Default">
      FrameOpts = 40 1 75 2 0 1
      GraphXY = 0 1e4 1 -80 50 1
      Method = 0 32 20 0 1 0 0 0 1 0 0
      Average = 0 0 99
    </Table>
  </Set>
</Set>
![Page 7: Acoustics Research Institute Austrian Academy of Science MPEG-7 Todays Multimedia Standard Peter Balazs Institut für Schallforschung](https://reader036.vdocuments.pub/reader036/viewer/2022062619/551618af550346a2308b5645/html5/thumbnails/7.jpg)
MPEG-7
• Descriptors (D): low level
• Description Schemes (DS): high level, container
• Description Definition Language (DDL): XML Schema, STX Schema
• System Tools: ASCII text vs. binary
![Page 8: Acoustics Research Institute Austrian Academy of Science MPEG-7 Todays Multimedia Standard Peter Balazs Institut für Schallforschung](https://reader036.vdocuments.pub/reader036/viewer/2022062619/551618af550346a2308b5645/html5/thumbnails/8.jpg)
MPEG-7
(Figure from [1])
![Page 9: Acoustics Research Institute Austrian Academy of Science MPEG-7 Todays Multimedia Standard Peter Balazs Institut für Schallforschung](https://reader036.vdocuments.pub/reader036/viewer/2022062619/551618af550346a2308b5645/html5/thumbnails/9.jpg)
MPEG-7 Audio: Low Level Descriptors
• Single sample
• Segments: DS, compare to STX
(Figure from [1])
![Page 10: Acoustics Research Institute Austrian Academy of Science MPEG-7 Todays Multimedia Standard Peter Balazs Institut für Schallforschung](https://reader036.vdocuments.pub/reader036/viewer/2022062619/551618af550346a2308b5645/html5/thumbnails/10.jpg)
MPEG-7 Audio: Low Level Descriptors
• Scalar
• Vector
• Single
• Series: series of vectors = table, matrix
• Scalable Series
(Figure from [2])
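The idea behind a scalable series can be sketched as follows: instead of storing every value, consecutive values are grouped and each group is summarised. This is an illustration under assumptions, not the normative definition from [2]; the function name `scale_series`, the choice of min, max and mean as summaries, and the sample data are all hypothetical.

```python
from statistics import mean


def scale_series(samples, ratio):
    """Summarise each run of `ratio` consecutive samples by min, max and mean."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    groups = [samples[i:i + ratio] for i in range(0, len(samples), ratio)]
    return {
        "ratio": ratio,
        "min": [min(group) for group in groups],
        "max": [max(group) for group in groups],
        "mean": [mean(group) for group in groups],
    }


# Hypothetical series of eight scalar values, reduced by a factor of four.
series = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
scaled = scale_series(series, 4)
print(scaled)
```

With a ratio of 1 the series is stored unchanged; larger ratios trade temporal resolution for a shorter description.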
![Page 18: Acoustics Research Institute Austrian Academy of Science MPEG-7 Todays Multimedia Standard Peter Balazs Institut für Schallforschung](https://reader036.vdocuments.pub/reader036/viewer/2022062619/551618af550346a2308b5645/html5/thumbnails/18.jpg)
MPEG-7 Audio: Low Level Descriptors
• Basic: AudioWaveform, AudioPower
• Basic Spectral: AudioSpectrumEnvelope, AudioSpectrumCentroid, AudioSpectrumSpread, AudioSpectrumFlatness
• Spectral Basis: AudioSpectrumBasis, AudioSpectrumProjection
• Signal Parameters: AudioHarmonicity, AudioFundamentalFrequency
• Timbral Temporal: LogAttackTime, TemporalCentroid
• Timbral Spectral: SpectralCentroid, HarmonicSpectralCentroid, HarmonicSpectralDeviation, HarmonicSpectralSpread, HarmonicSpectralVariation
(Figure from [2])
• Silence
(Figure from [1])
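To give a feeling for what some of these descriptors measure, here is a simplified Python sketch of five of them. The formulas are textbook versions and only approximate the standard; the normative definitions in [2] differ in detail (for example in frame handling, frequency scaling and band layout). All function names, the 2 % attack threshold and the test signals are assumptions.

```python
import cmath
import math


def audio_power(frame):
    """Mean of the squared samples in one frame."""
    return sum(x * x for x in frame) / len(frame)


def temporal_centroid(envelope, rate):
    """Energy-weighted mean time of the envelope, in seconds."""
    weighted = sum(n * e for n, e in enumerate(envelope))
    return weighted / (sum(envelope) * rate)


def log_attack_time(envelope, rate, start_fraction=0.02):
    """log10 of the time from the start of the attack to the envelope maximum."""
    peak = max(envelope)
    stop = envelope.index(peak)
    # Assumption: the attack starts where the envelope first reaches 2 % of its peak.
    start = next(n for n, e in enumerate(envelope) if e >= start_fraction * peak)
    return math.log10(max(stop - start, 1) / rate)


def power_spectrum(frame):
    """Power per DFT bin from 0 Hz to the Nyquist frequency (plain O(n^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * m / n) for m, x in enumerate(frame))) ** 2
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(power, rate, frame_length):
    """Power-weighted mean frequency in Hz (linear frequency axis)."""
    freqs = [k * rate / frame_length for k in range(len(power))]
    return sum(f * p for f, p in zip(freqs, power)) / sum(power)


def spectral_flatness(power, eps=1e-12):
    """Geometric mean over arithmetic mean: near 0 for a tone, near 1 for noise."""
    geometric = math.exp(sum(math.log(p + eps) for p in power) / len(power))
    arithmetic = sum(power) / len(power)
    return geometric / (arithmetic + eps)


RATE, N = 8000, 64  # hypothetical sampling rate (Hz) and frame length
tone = [math.sin(2 * math.pi * 1000 * m / RATE) for m in range(N)]  # 1 kHz sine
spectrum = power_spectrum(tone)
envelope = [0.0, 0.25, 0.5, 1.0, 0.5, 0.25]  # hypothetical envelope at 10 values/s

print(audio_power(tone), spectral_centroid(spectrum, RATE, N), spectral_flatness(spectrum))
print(temporal_centroid(envelope, 10), log_attack_time(envelope, 10))
```

For the pure tone the centroid sits at the tone frequency and the flatness is close to zero, which is why flatness is useful for telling tonal from noise-like signals.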
![Page 23: Acoustics Research Institute Austrian Academy of Science MPEG-7 Todays Multimedia Standard Peter Balazs Institut für Schallforschung](https://reader036.vdocuments.pub/reader036/viewer/2022062619/551618af550346a2308b5645/html5/thumbnails/23.jpg)
MPEG-7 Audio: High Level DSs
• AudioSignature: AudioSpectrumFlatness
• Musical Instrument Timbre Description Tools: HarmonicInstrumentTimbre (LAT + timbral spectral); PercussiveInstrumentTimbre (timbral temporal + SpectralCentroid)
• Melody Description Tools: MelodyContour DS, MelodySequence DS
• General Sound Recognition and Indexing Description Tools: SpectralBasis; SoundClassificationModel: SoundModels, classification scheme; SoundModelStatePath, SoundModelStateHistogram
• SpokenContent Description Tools: SpokenContentHeader: WordLexicon, PhonLexicon; SpokenContentLattice: WordLinks, PhonLinks.
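As an illustration of the melody tools, a contour description reduces a melody to the direction and rough size of each step between notes. The Python sketch below quantises each interval to five levels from -2 to 2. The thresholds of 50 and 250 cents are an assumption based on my reading of the contour definition and should be checked against [2]; the function names and the note sequence are hypothetical.

```python
import math


def interval_cents(f_prev, f_next):
    """Interval between two fundamental frequencies in cents (100 per semitone)."""
    return 1200.0 * math.log2(f_next / f_prev)


def contour_step(cents, small=50.0, large=250.0):
    """Quantise an interval to -2..2. Thresholds are assumptions, see lead-in."""
    size = abs(cents)
    if size < small:
        return 0
    level = 1 if size < large else 2
    return level if cents > 0 else -level


def melody_contour(frequencies):
    """Contour values for each pair of consecutive notes."""
    return [
        contour_step(interval_cents(a, b))
        for a, b in zip(frequencies, frequencies[1:])
    ]


def midi_to_hz(note):
    """Equal-tempered frequency of a MIDI note number (A4 = 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12)


# Hypothetical melody: C4, D4, D4, F4, E4, A3.
melody = [midi_to_hz(n) for n in (60, 62, 62, 65, 64, 57)]
contour = melody_contour(melody)
print(contour)
```

A contour of this kind is robust against transposition and small tuning errors, which is what makes it attractive for query-by-humming style searches.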
![Page 24: Acoustics Research Institute Austrian Academy of Science MPEG-7 Todays Multimedia Standard Peter Balazs Institut für Schallforschung](https://reader036.vdocuments.pub/reader036/viewer/2022062619/551618af550346a2308b5645/html5/thumbnails/24.jpg)
MPEG-7 Audio: Amendment
• New base types: optional attribute for channel
• Modification of Spoken Content Description Tools: "acoustics only" score possible for speech recognition; prosody, syllables
• Audio Signal Quality DS: BackgroundNoiseLevel, BalanceType, DCoffsetType, BandwidthType. TransmissionTechnologyType: shellac, vinyl, ....
• Additional tools: tempo description, compact variable-precision representation (BAM)
• Linguistic Description Tools: semantic structure of linguistic data
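The kind of measurement behind the signal quality entries can be sketched in a few lines of Python. The definitions below (DC offset as the mean sample value, balance as the level difference between two channels in dB) are common engineering conventions chosen for illustration; they are assumptions and not the normative definitions of the amendment [4]. Function names and signals are hypothetical.

```python
import math


def dc_offset(samples):
    """Mean sample value; zero for a signal centred on the time axis."""
    return sum(samples) / len(samples)


def level_db(samples):
    """Mean power of the channel in dB relative to full scale (amplitude 1.0)."""
    power = sum(x * x for x in samples) / len(samples)
    return 10.0 * math.log10(power) if power > 0 else float("-inf")


def balance_db(left, right):
    """Level difference between two channels; positive if the left is louder."""
    return level_db(left) - level_db(right)


# Hypothetical test signals: square waves with and without offset.
shifted = [0.6, -0.4] * 4    # alternates around +0.1
left = [0.5, -0.5] * 4
right = [0.25, -0.25] * 4    # half the amplitude of the left channel

print(dc_offset(shifted), balance_db(left, right))
```

Halving the amplitude of one channel gives a balance of about 6 dB, the familiar rule of thumb for a factor of two in amplitude.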
![Page 25: Acoustics Research Institute Austrian Academy of Science MPEG-7 Todays Multimedia Standard Peter Balazs Institut für Schallforschung](https://reader036.vdocuments.pub/reader036/viewer/2022062619/551618af550346a2308b5645/html5/thumbnails/25.jpg)
MPEG-7
References:
[1] José M. Martínez, MPEG-7 Overview (version 8), ISO/IEC JTC1/SC29/WG11 N4980, Klagenfurt, July 2002, http://mpeg.telecomitalialab.com/standards/mpeg-7/mpeg-7.htm
[2] ISO/IEC, Information Technology – Multimedia Content Description Interface – Part 4: Audio, Geneva, July 2001
[3] Oliver Pott, Günter Wielange, XML Praxis und Referenz, München 2001
[4] J. Bitzer, J. H. Martínez, Information Technology – Multimedia Content Description Interface – Part 4: Audio – Proposed Draft Amendment, Fairfax, May 2002
Links:
[5] MPEG Home Page, http://mpeg.telecomitalialab.com/
[6] Extensible Markup Language, http://www.w3.org/XML/
[7] STX, http://www.kfs.oeaw.ac.at/software.htm