[The Band SIG]
MPEG7 - Audio
손우람, December 1, 2007
Why MPEG-7?
[Figure: multimedia tasks (Acquisition, Authoring, Editing, Browsing, Navigation, Filtering, Management, Transmission, Retrieval, Streaming, Coding, Compression, Searching, Indexing), with the labels "MPEG-1, -2, -4" and "MPEG-7"]
MPEG standards
• Compression
– MPEG-1 (CD)
– MPEG-2 (DVD, DTV)
– MPEG-4 (Web, mobile)
• Content description
– MPEG-7
• Multimedia framework
– MPEG-21
• Others
– MPEG-A, B, C, D, E
[Figure: MPEG-7 indexing and search system diagram. Labels: User; Network; MPEG-7 Processing; MPEG-7 Search; Pervasive Usage Environment; "Sounds like ...", "Looks like ..."; Digital Media Repository; Semantics Query; Model Similarity Query; MPEG-7 Semantics; MPEG-7 Descriptors; MPEG-7 Model; Descriptions; MPEG-7 Search Engine (XML Metadata); MPEG-7 Schema; MPEG-7 Metadata Storage; IBM Content Manager (Library Server & Object Server)]
MPEG-7 Multimedia Indexing and Searching
• MPEG-7 Indexing & Searching:
– Semantics-based (people, places, events, objects, scenes)
– Content-based (color, texture, motion, melody, timbre)
– Metadata (title, author, dates)
• MPEG-7 Access & Delivery:
– Media personalization
– Adaptation & summarization
– Usage environment (user preferences, devices, context)
MPEG-7 MDS: Free Text Annotation Example
• The following example gives an MPEG-7 description of a car that is depicted in an image:
<Mpeg7>
  <Description xsi:type="SemanticDescriptionType">
    <Semantics>
      <Label>
        <Name>Car</Name>
      </Label>
      <Definition>
        <FreeTextAnnotation>Four wheel motorized vehicle</FreeTextAnnotation>
      </Definition>
      <MediaOccurrence>
        <MediaLocator>
          <MediaUri>image.jpg</MediaUri>
        </MediaLocator>
      </MediaOccurrence>
    </Semantics>
  </Description>
</Mpeg7>
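To show how a description like the one above might be read by a program, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. Two things are assumptions and not from the slide: the xmlns:xsi declaration on the root element (added because the parser rejects the xsi:type attribute without it) and the helper name summarize_semantics, which is hypothetical. A complete MPEG-7 document would also declare the MPEG-7 schema namespace, which is left out here as on the slide.

```python
import xml.etree.ElementTree as ET

# The slide's example, with an xmlns:xsi declaration added (assumption)
# so that the xsi:type attribute can be parsed.
MPEG7_XML = """<Mpeg7 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Description xsi:type="SemanticDescriptionType">
    <Semantics>
      <Label><Name>Car</Name></Label>
      <Definition>
        <FreeTextAnnotation>Four wheel motorized vehicle</FreeTextAnnotation>
      </Definition>
      <MediaOccurrence>
        <MediaLocator><MediaUri>image.jpg</MediaUri></MediaLocator>
      </MediaOccurrence>
    </Semantics>
  </Description>
</Mpeg7>"""


def summarize_semantics(xml_text):
    """Pull the label, free-text annotation and media URI out of the description."""
    root = ET.fromstring(xml_text)
    semantics = root.find("./Description/Semantics")
    return {
        "label": semantics.findtext("./Label/Name").strip(),
        "annotation": semantics.findtext("./Definition/FreeTextAnnotation").strip(),
        "media_uri": semantics.findtext(
            "./MediaOccurrence/MediaLocator/MediaUri"
        ).strip(),
    }


if __name__ == "__main__":
    print(summarize_semantics(MPEG7_XML))
```

A search engine over such metadata could index the label and annotation fields as text and keep the media URI as the pointer back to the content.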
Starting with audio…
Audio Fingerprint
Genre Classification
• Genre Classification
• …
Audio Visualization
Music Information Retrieval
• Content-based querying and retrieval
• Automatic classification
• Music recommendation and playlist generation
• Music summarization
• Musical feature extraction
– Harmony, chord and tonality
– Melody and motives
– Rhythm, beat, tempo and form
MPEG 7 Audio
• Low-Level Descriptors
• Description Schemes
• Description Definition Language (DDL)
• BiM (Binary Format for MPEG-7)
What is Descriptor(D)?
• Definition
– An audio feature vector, or the meaning (semantics) of such a component
• Examples
– Audio Power
– Audio Envelope
– Audio Spectrum Flatness
Description Schemes (DSs)
• Definition
– Simply put, a collection of Descriptors (Ds)
• Example
– Instrument Timbre
• LogAttackTime
• HarmonicSpectralCentroid
• …
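As an illustration of one Descriptor inside the Instrument Timbre scheme, the sketch below estimates a log attack time, assuming the usual definition LAT = log10(T1 - T0), where T0 is when the sound starts and T1 is when it reaches its maximum. The slide gives no formula, so the 2% start threshold, the use of the absolute sample value as the envelope, and the function name log_attack_time are all assumptions made for this example.

```python
import math


def log_attack_time(signal, fs, start_fraction=0.02):
    """Estimate log10 of the attack duration in seconds.

    T0: first sample whose magnitude reaches start_fraction of the peak
        (threshold value is an assumption).
    T1: first sample at which the peak magnitude occurs.
    """
    envelope = [abs(x) for x in signal]
    peak = max(envelope)
    if peak == 0:
        raise ValueError("signal is silent")
    i_start = next(i for i, v in enumerate(envelope) if v >= start_fraction * peak)
    i_peak = envelope.index(peak)
    duration = (i_peak - i_start) / fs
    if duration <= 0:
        raise ValueError("attack is shorter than one sample")
    return math.log10(duration)


if __name__ == "__main__":
    # 100 silent samples, a linear ramp up to 100, then a sustained part.
    tone = [0] * 100 + list(range(1, 101)) + [100] * 50
    print(log_attack_time(tone, fs=1000))
```

A real extractor would normally smooth the envelope first; the raw magnitude is used here only to keep the example short.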
Description Definition Language (DDL)
• A language for defining Ds and DSs
• Expressed in XML
• …??...
Scalable Series
[Figure: an Original Series reduced to a Scaled Series]
Index i (scaled series): 1 2 3 4 5 6 7 8
ratio: 2 3 1
numOfElements: 2 1 5
totalNumOfSamples: 12
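The numbers in this figure can be read as follows: the first 2 scaled elements each summarize 2 original samples, the next 1 element summarizes 3, and the last 5 elements summarize 1 each, which accounts for 2×2 + 1×3 + 5×1 = 12 original samples and 8 scaled elements. A minimal sketch of that bookkeeping, assuming averaging as the scaling operation (the slide does not say which operation is applied) and a hypothetical function name scale_series:

```python
def scale_series(samples, ratios, num_of_elements):
    """Reduce samples to a scaled series.

    For each (ratio, count) pair, emit `count` scaled elements, each the
    mean of `ratio` consecutive original samples (mean is an assumption).
    """
    expected = sum(r * c for r, c in zip(ratios, num_of_elements))
    if expected != len(samples):
        raise ValueError(
            f"ratios and counts cover {expected} samples, got {len(samples)}"
        )
    scaled = []
    pos = 0
    for ratio, count in zip(ratios, num_of_elements):
        for _ in range(count):
            group = samples[pos:pos + ratio]
            scaled.append(sum(group) / len(group))
            pos += ratio
    return scaled


if __name__ == "__main__":
    original = list(range(1, 13))  # totalNumOfSamples = 12
    print(scale_series(original, ratios=[2, 3, 1], num_of_elements=[2, 1, 5]))
```

Swapping the mean for min or max would give other summaries of the same groups without changing the ratio and numOfElements bookkeeping.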
Scalar vs. Vector
Low-Level Descriptors
• Basic Descriptors
• Basic Spectral Descriptors
• Signal Parameter Descriptors
• Timbral Temporal Descriptors
• Timbral Spectral Descriptors
• Spectral Basis Descriptors
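Basic Structure of Audio
• Time domain
[Figure: a signal divided into successive time frames L0, L1, L2 along an axis starting at 0; labels: Nw, Nhop (hop size), Lw]
• n: sample index
• s(n): signal
• Fs: sampling rate
• l: index of time frames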
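The framing shown in the figure can be sketched in a few lines of Python. This assumes Nw is the frame length in samples and Nhop is the distance in samples between the starts of consecutive frames; the function names split_into_frames and frame_start_time are hypothetical, and incomplete trailing frames are simply dropped.

```python
def split_into_frames(signal, n_w, n_hop):
    """Return frames of n_w samples whose starts are n_hop samples apart."""
    if n_w <= 0 or n_hop <= 0:
        raise ValueError("n_w and n_hop must be positive")
    frames = []
    start = 0
    while start + n_w <= len(signal):
        frames.append(signal[start:start + n_w])
        start += n_hop
    return frames


def frame_start_time(l, n_hop, fs):
    """Start time in seconds of frame index l at sampling rate fs."""
    return l * n_hop / fs


if __name__ == "__main__":
    s = list(range(10))                 # s(n) for n = 0..9
    for l, frame in enumerate(split_into_frames(s, n_w=4, n_hop=2)):
        print(l, frame_start_time(l, n_hop=2, fs=4), frame)
```

With n_hop smaller than n_w the frames overlap; with n_hop equal to n_w they tile the signal without overlap.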
Basic Descriptors
• Audio Waveform
• Audio Power
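Both basic descriptors can be computed frame by frame. The sketch below assumes the common reading of them: Audio Waveform as the minimum and maximum sample value in each frame, and Audio Power as the mean of the squared samples in each frame. Frames are taken as non-overlapping blocks of n_hop samples, and the function names are hypothetical.

```python
def _blocks(signal, n_hop):
    """Non-overlapping blocks of n_hop samples; a short trailing block is dropped."""
    return [
        signal[i:i + n_hop]
        for i in range(0, len(signal) - n_hop + 1, n_hop)
    ]


def audio_waveform(signal, n_hop):
    """(min, max) of the samples in each block: a compact waveform envelope."""
    return [(min(b), max(b)) for b in _blocks(signal, n_hop)]


def audio_power(signal, n_hop):
    """Mean of squared samples in each block."""
    return [sum(x * x for x in b) / len(b) for b in _blocks(signal, n_hop)]


if __name__ == "__main__":
    s = [0.0, 1.0, 0.0, -1.0, 0.5, 0.5, -0.5, -0.5]
    print(audio_waveform(s, n_hop=4))
    print(audio_power(s, n_hop=4))
```

The per-frame values produced here are the kind of sequence that a scalable series can then store at reduced resolution.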
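Next time…
• Take turns, each preparing one Descriptor
– About 20-40 minutes
– Think through an algorithm for extracting each Descriptor
– Try implementing it in code (template code will be provided)
• Seminar on a free topic of each person's choice
– About 10-20 minutes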