birdsong recognition 鳥類鳴聲辨識

1

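Birdsong Recognition (鳥類鳴聲辨識)

Chang-Hsing Lee (李建興), Professor, Department of Computer Science and Information Engineering, Chung Hua University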

2

Automatic Classification of Bird Species From Their Sounds Using Two-Dimensional Cepstral Coefficients

Chang-Hsing Lee, Chin-Chuan Han, and Ching-Chien Chuang

IEEE Trans. on Audio, Speech, and Language Processing, Vol. 16, No. 8, Nov. 2008, pp. 1541-1550.

3

System Framework

[Block diagram with a training path and a test path]

Training path blocks: Training syllable, Feature Extraction, PCA, LDA, Prototype Vectors Generation, Feature Database

Test path blocks: Test syllable, Feature Extraction, PCA Transformation, LDA Transformation, Classification, Classified Bird Species s_c

4

Feature Extraction

Two-dimensional Mel-frequency cepstral coefficient (TDMFCC)

[Figure: the MFCC matrix (MFCC index vs. time) is transformed by a DCT to obtain the TDMFCC]
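The idea on this slide can be sketched as follows: take the sequence of MFCC vectors of one syllable, apply a DCT along the time axis for each cepstral index, and keep the first few terms as a fixed-length vector. This is only a minimal illustration; the function names, the unnormalized DCT-II, and the numbers of retained terms (`n_time`, `n_ceps`) are assumptions, since the slide does not specify them.

```python
import math

def dct_ii(x):
    """Unnormalized DCT-II of a sequence x."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
            for k in range(n)]

def tdmfcc(mfcc_frames, n_time=3, n_ceps=4):
    """mfcc_frames: list of frames, each a list of MFCCs.

    Applies the DCT along time for each of the first n_ceps cepstral
    indices and keeps the first n_time terms of each (assumed sizes).
    """
    n_coef = len(mfcc_frames[0])
    features = []
    for q in range(min(n_ceps, n_coef)):
        track = [frame[q] for frame in mfcc_frames]  # q-th MFCC over time
        features.extend(dct_ii(track)[:n_time])
    return features

# Toy input: 5 frames x 4 MFCCs, constant over time (made-up values).
frames = [[1.0, 2.0, 3.0, 4.0] for _ in range(5)]
vec = tdmfcc(frames, n_time=3, n_ceps=4)
```

Because the toy input does not change over time, only the zeroth time-axis term of each cepstral track is non-zero, which is a quick sanity check on the transform direction.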

5

Feature Extraction (cont.)

Dynamic Two-dimensional MFCC (DTDMFCC)

$$a_i(j) = \frac{\sum_{n=1}^{n_0} n \left( E_{i+n}(j) - E_{i-n}(j) \right)}{2 \sum_{n=1}^{n_0} n^2}$$
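A regression (delta) coefficient of this kind can be sketched in a few lines. The sketch assumes the standard regression form over a window of ±n0 frames; the default `n0 = 2` and the edge handling (repeating the first and last frame) are assumptions, not taken from the slide.

```python
def delta(track, n0=2):
    """Regression coefficients of one coefficient track over +/- n0 frames.

    track: values E_i(j) for a fixed index j over frames i.
    Frames beyond either end are replaced by the nearest edge frame (assumed).
    """
    denom = 2 * sum(n * n for n in range(1, n0 + 1))
    last = len(track) - 1
    out = []
    for i in range(len(track)):
        num = 0.0
        for n in range(1, n0 + 1):
            ahead = track[min(i + n, last)]
            behind = track[max(i - n, 0)]
            num += n * (ahead - behind)
        out.append(num / denom)
    return out

ramp = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ramp_delta = delta(ramp)
flat_delta = delta([3.0] * 5)
```

For a track that rises by one unit per frame, the interior values of `ramp_delta` equal the slope, and a constant track gives zeros.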

6

Prototype Vector Generation

Gaussian mixture model (GMM) vs. Vector quantization (VQ)

Acoustic Model Selection – Bayesian information criterion (BIC)

Component Number Selection – self-splitting Gaussian mixture learning (SGML)
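As an illustration of model selection with BIC, the sketch below scores two candidate models and keeps the one with the lower score. It uses the common form BIC = -2 ln L + k ln N; whether the paper uses this exact form or sign convention is an assumption, and the log-likelihoods and parameter counts are made-up numbers.

```python
import math

def bic(log_likelihood, n_params, n_samples):
    """Common BIC form: lower is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_samples)

def select_model(candidates, n_samples):
    """candidates: dict mapping model name -> (log_likelihood, n_params)."""
    scores = {name: bic(ll, k, n_samples) for name, (ll, k) in candidates.items()}
    return min(scores, key=scores.get), scores

# Hypothetical fits for one species: the GMM fits slightly better
# but needs about twice as many parameters.
candidates = {"VQ": (-1250.0, 30), "GMM": (-1235.0, 63)}
best, scores = select_model(candidates, n_samples=200)
```

In this toy case the penalty term outweighs the small likelihood gain of the larger model, so the simpler model is selected.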

7

Experimental Results

28 bird species

Training set – 3143 syllables

  Yushan National Park, CD Sound of the Mountain IV: The Songs of Wild Birds

  Yushan National Park, CD Sound of the Mountain V: The Songs of Wild Birds

Test set – 646 syllables

  Downloaded from the website of National Fonghuanggu Bird Park

8

Experimental Results (cont.)

Comparison of classification results for different PCA thresholds
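The slide does not define the PCA threshold. A common reading, assumed here, is the fraction of total variance that the retained principal components must explain. Under that assumption, the number of components for a given threshold can be found as follows (the eigenvalues are made-up).

```python
def components_for_threshold(eigenvalues, threshold=0.97):
    """Smallest number of leading components whose cumulative
    variance ratio reaches the threshold (assumed definition)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return k
    return len(ordered)

eigs = [5.0, 3.0, 1.5, 0.3, 0.2]   # hypothetical covariance eigenvalues
k_097 = components_for_threshold(eigs, 0.97)
k_090 = components_for_threshold(eigs, 0.90)
```

Raising the threshold keeps more components, which is the trade-off the comparison on this slide explores.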

9

Experimental Results (cont.)

Summary of classification accuracy (CA), selected model (EVQ or GMM), and cluster number (Ns) for each bird species using SDTDMFCC when the PCA threshold = 0.97

Subject Code | Bird Name | CA (%) | Ns | Selected Model

1 | Crested Serpent Eagle | 100.00 | 2 | EVQ

2 | Bronzed Drongo | 86.49 | 5 | EVQ

3 | Gray-headed Pygmy Woodpecker | 0.00 | 1 | EVQ

4 | Blue Shortwing | 72.41 | 4 | EVQ

5 | Streak-breasted Scimitar Babbler | 54.55 | 3 | GMM

6 | Taiwan Firecrest | 100.00 | 3 | EVQ

7 | Taiwan Sibia | 100.00 | 6 | EVQ

8 | White-throated Laughing Thrush | 94.59 | 3 | EVQ

9 | White-breasted Water Hen | 100.00 | 4 | EVQ

10 | Beavan's Bullfinch | 100.00 | 3 | EVQ

11 | Gray-sided Laughing Thrush | 100.00 | 3 | EVQ

12 | Alpine Accentor | 71.70 | 1 | EVQ

13 | Green-backed Tit | 7.14 | 5 | EVQ

14 | Taiwan Yuhina | 100.00 | 3 | EVQ

10

Experimental Results (cont.)

Summary of classification accuracy (CA), selected model (EVQ or GMM), and cluster number (Ns) for each bird species using SDTDMFCC when the PCA threshold = 0.97 (cont.)

Subject Code | Bird Name | CA (%) | Ns | Selected Model

15 | Red-headed Tit | 100.00 | 2 | EVQ

16 | Collared Bush Robin | 94.44 | 9 | EVQ

17 | Taiwan Bulbul | 83.33 | 5 | EVQ

18 | Taiwan Hill Partridge | 88.89 | 6 | EVQ

19 | Verreaux's Bush Warbler | 100.00 | 4 | EVQ

20 | Oriental Cuckoo | 95.56 | 3 | GMM

21 | Taiwan Tit | 96.30 | 7 | EVQ

22 | Vivid Niltava | 100.00 | 5 | EVQ

23 | Coal Tit | 100.00 | 4 | EVQ

24 | Crested Goshawk | 100.00 | 3 | EVQ

25 | Gould's Fulvetta | 33.33 | 1 | EVQ

26 | Collared Pigmy Owlet | 100.00 | 1 | EVQ

27 | Swinhoe's Pheasant | 100.00 | 3 | EVQ

28 | Steere's Liocichla | 80.00 | 3 | EVQ

11

Continuous Birdsong Recognition Using Gaussian Mixture Modeling of Image Shape Features

Chang-Hsing Lee, Sheng-Bin Hsu, Jau-Ling Shih, and Chih-Hsun Chou

IEEE Trans. on Multimedia, Vol. 15, No. 2, Feb. 2013, pp. 454-463.

12

System Framework

13

Feature Extraction

• Angular Radial Transform (ART) feature

14

Feature Extraction (cont.)

Step 1: Spectrogram Generation

[Figure: a zoomed-in portion of the music waveform is divided into overlapping frames, and spectrum analysis is applied to each frame]

15

Feature Extraction (cont.)

Step 1: Spectrogram Generation (cont.)

[Figure: frame decomposition of the signal; axis label: frequency]
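The frame decomposition and per-frame spectrum analysis described in Step 1 can be sketched as below. The frame length, the 50% overlap, the Hamming window, and the direct DFT are all assumptions chosen to keep the example small; the slides do not give these settings.

```python
import cmath
import math

def split_frames(signal, frame_len, hop):
    """Overlapping frames; consecutive frames start `hop` samples apart."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]

def magnitude_spectrum(frame):
    """Hamming-windowed DFT magnitudes for bins 0 .. n/2."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [abs(sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]

def spectrogram(signal, frame_len=64, hop=32):
    """List of per-frame magnitude spectra (time x frequency)."""
    return [magnitude_spectrum(f) for f in split_frames(signal, frame_len, hop)]

# Toy signal: a 1 kHz tone sampled at 8 kHz (made-up test input).
sr = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sr) for t in range(256)]
spec = spectrogram(tone)
peak_bin = max(range(len(spec[0])), key=spec[0].__getitem__)
```

With a 64-sample frame at 8 kHz, each bin spans 125 Hz, so the 1 kHz tone peaks at bin 8 in every frame.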

16

Feature Extraction (cont.)

Step 1: Spectrogram Generation (cont.)

Waveform

Spectrogram

17

Feature Extraction (cont.)

Step 1: Spectrogram Generation (cont.)

Crested Goshawk (鳳頭蒼鷹)

Taiwan Firecrest (火冠戴菊鳥)

Taiwan Sibia (白耳畫眉)

Vivid Niltava (黃腹琉璃)

18

Feature Extraction (cont.)

Step 2: Recognition window segmentation

19

Feature Extraction (cont.)

Step 3: Sector image generation

20

Feature Extraction (cont.)

Step 3: Sector image generation (cont.)

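$$u' = u - 256, \quad v' = v - 256$$

$$u' = \rho \sin\theta, \quad v' = \rho \cos\theta, \quad \rho = 256 - f, \quad \theta = \frac{\pi t}{2 \cdot 256}$$

$$f = 256 - \rho = 256 - \sqrt{(u')^2 + (v')^2} = 256 - \sqrt{(u - 256)^2 + (v - 256)^2}$$

$$t = \frac{2 \cdot 256}{\pi}\,\theta = \frac{2 \cdot 256}{\pi} \tan^{-1}\frac{u'}{v'} = \frac{2 \cdot 256}{\pi} \tan^{-1}\frac{u - 256}{v - 256}$$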
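One plausible reading of the sector mapping on this slide is a polar warp: a 256 × 256 recognition window with coordinates (t, f) is wrapped into a quarter-circle sector centered at pixel (256, 256), with time mapped to angle and frequency to radius. The sketch below implements that reading only; the orientation, the signs, and the use of `atan2` in place of a plain arctangent are assumptions.

```python
import math

SIZE = 256  # assumed window size; the sector is centered at (SIZE, SIZE)

def window_to_pixel(t, f):
    """Window coordinates (t, f) -> sector-image coordinates (u, v)."""
    radius = SIZE - f                      # rho = 256 - f
    angle = math.pi * t / (2.0 * SIZE)     # theta = pi * t / (2 * 256)
    return SIZE + radius * math.sin(angle), SIZE + radius * math.cos(angle)

def pixel_to_window(u, v):
    """Sector-image coordinates (u, v) -> window coordinates (t, f)."""
    du, dv = u - SIZE, v - SIZE
    radius = math.hypot(du, dv)
    angle = math.atan2(du, dv)             # tan^-1(u'/v'), quadrant-safe
    return 2.0 * SIZE / math.pi * angle, SIZE - radius

t_back, f_back = pixel_to_window(*window_to_pixel(100.0, 50.0))
corner = pixel_to_window(256, 512)
```

In practice the inverse mapping is the useful direction: for each pixel of the sector image, look up the spectrogram value at the corresponding (t, f).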

21

Feature Extraction (cont.)

Step 4: ART feature extraction

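The ART coefficient of order n and m of the sector image I_S(ρ, θ):

$$F(n, m) = \langle V_{n,m}(\rho, \theta), I_S(\rho, \theta) \rangle = \int_0^{2\pi} \int_0^1 V_{n,m}^{*}(\rho, \theta)\, I_S(\rho, \theta)\, \rho \, d\rho \, d\theta$$

V_{n,m}(ρ, θ): the ART basis function of order n and m, which is separable along the angular and radial directions:

$$V_{n,m}(\rho, \theta) = A_m(\theta)\, R_n(\rho)$$

where

$$A_m(\theta) = \frac{1}{2\pi} e^{jm\theta}$$

$$R_n(\rho) = \begin{cases} 1, & n = 0 \\ 2\cos(\pi n \rho), & n \neq 0 \end{cases}$$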
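The ART basis function and coefficient can be approximated numerically as in the sketch below. It uses a simple midpoint rule on a polar grid and takes the image as a function of (ρ, θ); the grid sizes and that interface are assumptions made for illustration.

```python
import cmath
import math

def art_basis(n, m, rho, theta):
    """V_{n,m}(rho, theta) = A_m(theta) * R_n(rho)."""
    radial = 1.0 if n == 0 else 2.0 * math.cos(math.pi * n * rho)
    angular = cmath.exp(1j * m * theta) / (2.0 * math.pi)
    return radial * angular

def art_coefficient(image, n, m, n_rho=64, n_theta=128):
    """Midpoint-rule approximation of F(n, m) over the unit disk.

    image: callable (rho, theta) -> intensity (assumed interface).
    """
    d_rho = 1.0 / n_rho
    d_theta = 2.0 * math.pi / n_theta
    total = 0j
    for i in range(n_rho):
        rho = (i + 0.5) * d_rho
        for k in range(n_theta):
            theta = (k + 0.5) * d_theta
            basis = art_basis(n, m, rho, theta)
            total += basis.conjugate() * image(rho, theta) * rho
    return total * d_rho * d_theta

def uniform(rho, theta):
    """Test image: constant intensity over the whole disk."""
    return 1.0

f00 = art_coefficient(uniform, 0, 0)
f01 = art_coefficient(uniform, 0, 1)
```

For a uniform disk, F(0, 0) is 1/2 and every coefficient with m ≠ 0 vanishes, which makes a convenient check on the basis definition.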

22

Feature Extraction (cont.)

Step 4: ART feature extraction (cont.)

The 12 × 12 (N = 12 and M = 12) complex ART basis functions: (a) real parts of the ART basis functions; (b) imaginary parts of the ART basis functions

23

Feature Extraction (cont.)

Step 4: ART feature extraction (cont.)

24

Feature Extraction (cont.)

Step 4: ART feature extraction (cont.)

25

Experimental Results

Common and Latin names of the bird species in the birdsong database, and the number of birdsong segments in the training set (NTr) and test set (NTe) for birdsong segments of different durations (D)

Common Name | Latin Name | NTr (D = 3 seconds) | NTe (D = 3 seconds) | NTr (D = 5 seconds) | NTe (D = 5 seconds)

Crested Serpent Eagle | Spilornis cheela | 107 | 5 | 105 | 3

Bronzed Drongo | Dicrurus aeneus | 128 | 10 | 126 | 8

Gray-headed Pygmy Woodpecker | Dendrocopos canicapillus | 50 | 9 | 48 | 7

Blue Shortwing | Brachypteryx montana | 172 | 6 | 170 | 4

Streak-breasted Scimitar Babbler | Pomatorhinus ruficollis | 147 | 16 | 145 | 4

Taiwan Firecrest | Regulus goodfellowi | 92 | 10 | 90 | 8

Taiwan Sibia | Heterophasia auricularis | 97 | 5 | 95 | 3

White-throated Laughing Thrush | Garrulax albogularis | 61 | 8 | 59 | 6

White-breasted Water Hen | Amaurornis phoenicurus | 83 | 6 | 81 | 4

Beavan's Bullfinch | Pyrrhula erythaca | 104 | 3 | 102 | 1

Gray-sided Laughing Thrush | Garrulax caerulatus | 77 | 79 | 75 | 77

Alpine Accentor | Prunella collaris | 62 | 9 | 60 | 7

Green-backed Tit | Parus monticolus | 127 | 4 | 125 | 2

Taiwan Yuhina | Yuhina brunneiceps | 62 | 6 | 60 | 4

26

Experimental Results (cont.)

Common Name | Latin Name | NTr (D = 3 seconds) | NTe (D = 3 seconds) | NTr (D = 5 seconds) | NTe (D = 5 seconds)

Red-headed Tit | Aegithalos concinnus | 98 | 9 | 96 | 7

Collared Bush Robin | Erithacus johnstoniae | 147 | 5 | 145 | 3

Taiwan Bulbul | Pycnonotus taivanus Styan | 58 | 8 | 56 | 6

Taiwan Hill Partridge | Arborophila crudigularis | 141 | 10 | 139 | 8

Verreaux's Bush Warbler | Cettia acanthizoides | 72 | 8 | 70 | 6

Oriental Cuckoo | Cuculus saturatus | 124 | 10 | 122 | 8

Taiwan Tit | Parus holsti | 116 | 7 | 114 | 5

Vivid Niltava | Niltava vivida | 91 | 8 | 89 | 6

Coal Tit | Parus ater | 105 | 10 | 103 | 8

Crested Goshawk | Accipiter trivirgatus | 113 | 11 | 111 | 9

Gould's Fulvetta | Alcippe brunnea | 41 | 7 | 39 | 5

Collared Pigmy Owlet | Glaucidium brodiei | 59 | 16 | 57 | 9

Swinhoe's Pheasant | Lophura swinhoii | 92 | 5 | 90 | 3

Steere's Liocichla | Liocichla steerii | 57 | 6 | 55 | 4

Total number of birdsong segments | | 2683 | 296 | 2627 | 225

27

Experimental Results (cont.)

Comparison of classification accuracy for different numbers of GMM Gaussian components (G) and distinct PCA thresholds using 6 × 24 ART basis functions for the recognition of birdsong segments having distinct durations (D)
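To make the role of the number of Gaussian components (G) concrete, the sketch below scores a feature vector under one GMM per species and picks the species with the highest log-likelihood. Diagonal covariances, the species labels, and all parameter values are assumptions for illustration; in the actual system the models would be trained on the PCA-reduced features.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - mu) ** 2 / v)
               for xi, mu, v in zip(x, mean, var))

def gmm_log_likelihood(x, gmm):
    """gmm: list of (weight, mean, var) components; uses log-sum-exp."""
    terms = [math.log(w) + log_gauss_diag(x, mu, var) for w, mu, var in gmm]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def classify(x, models):
    """Return the name of the model with the highest log-likelihood."""
    return max(models, key=lambda name: gmm_log_likelihood(x, models[name]))

# Hypothetical models: G = 2 for the first species, G = 1 for the second.
models = {
    "species_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "species_b": [(1.0, [5.0, 5.0], [1.0, 1.0])],
}
label_near_a = classify([0.2, 0.1], models)
label_near_b = classify([4.8, 5.1], models)
```

More components let a model cover several syllable types of one species, at the cost of more parameters to estimate from a limited training set.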

28

Experimental Results (cont.)

Comparison of classification accuracy on distinct ART basis functions (N × M) for the classification of birdsong segments having different durations (D) with a fixed number of GMM components (G = 5)

29

Experimental Results (cont.)

Comparison of various feature descriptors in terms of classification accuracy (CA)

Descriptor | CA (%), D = 3 seconds | (G, PCA threshold), D = 3 seconds | CA (%), D = 5 seconds | (G, PCA threshold), D = 5 seconds

LPCC | 30.41 | (50, 0.98/0.99) | 40.00 | (30, 0.99)

MFCC | 46.62 | (35, 0.98/0.99) | 56.89 | (45, 0.95/0.96/0.97)

TDMFCC | 69.86 | (10, 0.96) | 77.13 | (5, 0.95)

DTDMFCC | 76.03 | (5, 0.99) | 83.86 | (10, 0.99)

SDTDMFCC | 73.63 | (10, 0.95) | 79.82 | (10, 0.95/0.96)

ART | 86.30 | (5, 0.97/0.98) | 94.62 | (5, 0.95/0.97)

30

Thanks!
