bundling features for large scale partial-duplicate web image search

Bundling Features for Large Scale Partial-Duplicate Web Image Search Zhong Wu, Qifa Ke, Michael Isard, and Jian Sun CVPR 2009 Citations: 163

Upload: win-yu

Post on 12-Jul-2015


TRANSCRIPT

Page 1: Bundling Features for Large Scale Partial-Duplicate Web Image Search

Bundling Features for Large Scale

Partial-Duplicate Web Image Search

Zhong Wu∗, Qifa Ke, Michael Isard,

and Jian Sun

CVPR 2009 Citations: 163

Page 2: Bundling Features for Large Scale Partial-Duplicate Web Image Search

Outline

Introduction

Bundled features

Image Retrieval using bundled feature

Experiments and results

Conclusion

2

Page 3: Bundling Features for Large Scale Partial-Duplicate Web Image Search

INTRODUCTION

3

Page 4: Bundling Features for Large Scale Partial-Duplicate Web Image Search

Target

• Given a query image, locate its near- and partial-duplicate images in a large corpus of web images.

4

Page 5: Bundling Features for Large Scale Partial-Duplicate Web Image Search

Novel Scheme

• Each group of bundled features is much more discriminative than a single feature.

• Within each group, simple and robust geometric constraints can be enforced efficiently.

5

Page 6: Bundling Features for Large Scale Partial-Duplicate Web Image Search

BUNDLED FEATURES

6

Page 7: Bundling Features for Large Scale Partial-Duplicate Web Image Search

Related Work

• SIFT (Scale-Invariant Feature Transform)

  ◦ Keypoint and descriptor from the region centered at the keypoint

• MSER (Maximally Stable Extremal Region)

  ◦ Affine-covariant stable region + SIFT descriptor from the region

7

Page 8: Bundling Features for Large Scale Partial-Duplicate Web Image Search

8

Page 9: Bundling Features for Large Scale Partial-Duplicate Web Image Search

Bundle Features

• SIFT features: S = {s_j}

• MSER detections: R = {r_i}

• Define bundled features B = {b_i}:

  b_i = {s_j | s_j ∝ r_i, s_j ∈ S}

  where s_j ∝ r_i means that the point feature s_j falls inside the region r_i

• We discard any MSER detection whose ellipse spans more than half the width or height of the image
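The bundling rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Ellipse` class, the `bundle_features` function, and the reading of "spans" as the ellipse's axis-aligned bounding box are all hypothetical choices, not taken from the slides.

```python
import math
from dataclasses import dataclass


@dataclass
class Ellipse:
    """Hypothetical MSER ellipse: center, semi-axes, rotation in radians."""
    cx: float
    cy: float
    a: float
    b: float
    theta: float = 0.0

    def contains(self, x, y):
        # Rotate the point into the ellipse's own frame, then apply the ellipse equation.
        dx, dy = x - self.cx, y - self.cy
        c, s = math.cos(self.theta), math.sin(self.theta)
        u = dx * c + dy * s
        v = -dx * s + dy * c
        return (u / self.a) ** 2 + (v / self.b) ** 2 <= 1.0

    def extent(self):
        # Width and height of the axis-aligned bounding box (assumed meaning of "spans").
        c, s = math.cos(self.theta), math.sin(self.theta)
        w = 2 * math.sqrt((self.a * c) ** 2 + (self.b * s) ** 2)
        h = 2 * math.sqrt((self.a * s) ** 2 + (self.b * c) ** 2)
        return w, h


def bundle_features(sift_points, regions, img_w, img_h):
    """Return one bundle per kept region: indices j of SIFT points inside it."""
    bundles = []
    for r in regions:
        w, h = r.extent()
        if w > img_w / 2 or h > img_h / 2:
            continue  # discard MSER detections that span more than half the image
        bundles.append([j for j, (x, y) in enumerate(sift_points) if r.contains(x, y)])
    return bundles


points = [(10, 10), (12, 11), (50, 50), (90, 90)]
regions = [Ellipse(11, 10, 5, 3), Ellipse(50, 50, 40, 40), Ellipse(90, 90, 2, 2)]
bundles = bundle_features(points, regions, img_w=100, img_h=100)
print(bundles)  # [[0, 1], [3]]
```

The second region is dropped because its 80-pixel extent exceeds half of the 100-pixel image, and the point at (50, 50) ends up in no bundle.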

9

Page 10: Bundling Features for Large Scale Partial-Duplicate Web Image Search

10

Page 11: Bundling Features for Large Scale Partial-Duplicate Web Image Search

IMAGE RETRIEVAL USING

BUNDLED FEATURE

11

Page 12: Bundling Features for Large Scale Partial-Duplicate Web Image Search

Feature quantization

• Hierarchical k-means

• One million visual words from 50K training images

12

Page 13: Bundling Features for Large Scale Partial-Duplicate Web Image Search

• K-D tree

• pointList = [(2,3), (5,4), (9,6), (4,7), (8,1), (7,2)]
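As a small illustration of the k-d tree on this slide, the sketch below builds a tree from the same point list by splitting at the median and cycling through the axes. The names `Node`, `build_kdtree`, and `point_list` are illustrative, and median splitting is an assumed construction; the slide itself only names the data structure and lists the points.

```python
from collections import namedtuple

Node = namedtuple("Node", ["location", "left", "right"])


def build_kdtree(points, depth=0):
    """Recursively build a k-d tree, splitting on x at even depths and y at odd depths."""
    if not points:
        return None
    k = len(points[0])                # dimensionality (2 for this example)
    axis = depth % k                  # cycle through the axes
    points = sorted(points, key=lambda p: p[axis])
    median = len(points) // 2         # the median point becomes this node
    return Node(
        location=points[median],
        left=build_kdtree(points[:median], depth + 1),
        right=build_kdtree(points[median + 1:], depth + 1),
    )


point_list = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(point_list)
print(tree.location)        # (7, 2)
print(tree.left.location)   # (5, 4)
print(tree.right.location)  # (9, 6)
```

In a retrieval setting a tree like this would hold descriptor-space points so that a nearest visual word can be found without scanning every candidate; the 2-D points here only make the splits easy to follow.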

13

Page 14: Bundling Features for Large Scale Partial-Duplicate Web Image Search

Matching bundled features

Let p = {p_i} and q = {q_j} be two bundled features with quantized visual words p_i, q_j ∈ W

• Define a matching score:

• M(q; p) = M_m(q; p) + λ M_g(q; p)

• where λ is a weighting parameter

14

Page 15: Bundling Features for Large Scale Partial-Duplicate Web Image Search

• Membership term:

  ◦ We simply use the number of common visual words between two bundled features to define the membership term M_m(q; p)

  ◦ M_m(q; p) = |{p_i}|, where {p_i} is restricted to the visual words of p that also occur in q
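A minimal sketch of the membership term, assuming each bundled feature is represented as a list of integer visual-word IDs (the function name `membership_term` and this representation are assumptions for illustration). Each word of p that also occurs in q is counted once per occurrence in p.

```python
def membership_term(p, q):
    """Count the visual words of p that also occur in q."""
    q_words = set(q)  # set lookup keeps the count linear in len(p)
    return sum(1 for word in p if word in q_words)


p = [3, 7, 9, 12]
q = [7, 9, 11]
score = membership_term(p, q)
print(score)  # 2
```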

15

Page 16: Bundling Features for Large Scale Partial-Duplicate Web Image Search

• Geometric term:

  ◦ Our geometric term performs a weak geometric verification between two bundled features p and q using relative ordering.

  ◦ [Equation image not captured in this transcript; its annotation reads "Indicator Function".]
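Because the slide's equation did not survive extraction, the following is only one plausible reading of "weak geometric verification using relative ordering", not the authors' exact formula. It assumes each bundle is a dict from visual word to (x, y) position with each word appearing at most once, penalizes every adjacent pair of matched words whose order in q is reversed, and takes the worse of the two axes. All function names and the value of `lam` are hypothetical.

```python
def order_inconsistency(p, q, axis):
    """Count adjacent pairs (in p's order along `axis`) whose order is reversed in q."""
    common = [w for w in p if w in q]
    p_sorted = sorted(common, key=lambda w: p[w][axis])
    q_sorted = sorted(common, key=lambda w: q[w][axis])
    q_rank = {w: rank for rank, w in enumerate(q_sorted)}
    # The indicator is 1 when consecutive words in p appear in the opposite order in q.
    return sum(1 for a, b in zip(p_sorted, p_sorted[1:]) if q_rank[a] > q_rank[b])


def geometric_term(p, q):
    """Negative penalty; the worse axis decides (an assumption, see the lead-in)."""
    return min(-order_inconsistency(p, q, 0), -order_inconsistency(p, q, 1))


def matching_score(p, q, lam):
    """M(q; p) = M_m(q; p) + lam * M_g(q; p), with lam chosen by the caller."""
    m_m = sum(1 for w in p if w in q)
    return m_m + lam * geometric_term(p, q)


p = {1: (0, 0), 2: (1, 1), 3: (2, 2), 4: (3, 3)}
q = {1: (0, 0), 2: (2, 1), 3: (1, 2), 5: (9, 9)}
print(geometric_term(p, q))         # -1: words 2 and 3 swap order along x
print(matching_score(p, q, lam=2))  # 3 + 2 * (-1) = 1
```

Comparing only rank order, rather than fitting a transformation, is what keeps a check like this cheap enough to run at indexing scale.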

16

Page 17: Bundling Features for Large Scale Partial-Duplicate Web Image Search

2

17

Page 18: Bundling Features for Large Scale Partial-Duplicate Web Image Search

Indexing and retrieval

• Avoids storing and comparing high-dimensional local descriptors

• Reduces the number of candidate images

18

Page 19: Bundling Features for Large Scale Partial-Duplicate Web Image Search

Indexing and retrieval

• Voting

19

Page 20: Bundling Features for Large Scale Partial-Duplicate Web Image Search

Indexing and retrieval

• tf (term frequency)

  ◦ A document contains 100 words, and the word ‘a’ appears 3 times

  ◦ tf = 3 / 100 = 0.03

• idf (inverse document frequency)

  ◦ 1,000 documents contain ‘a’, out of 10,000,000 documents in total

  ◦ idf = ln(10,000,000 / 1,000) ≈ 9.21

• tf-idf = 0.03 × 9.21 ≈ 0.28

20
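The arithmetic on this slide can be reproduced with a short sketch. The function names are illustrative, and the natural-log form of idf follows the slide's own example.

```python
import math


def tf(term_count, doc_length):
    """Term frequency: share of the document taken up by the term."""
    return term_count / doc_length


def idf(num_docs, docs_with_term):
    """Inverse document frequency with a natural logarithm, as in the slide."""
    return math.log(num_docs / docs_with_term)


tf_a = tf(3, 100)               # 0.03
idf_a = idf(10_000_000, 1_000)  # ln(10,000), about 9.21
tfidf_a = tf_a * idf_a          # about 0.28
print(round(tf_a, 2), round(idf_a, 2), round(tfidf_a, 2))  # 0.03 9.21 0.28
```

In image retrieval the "terms" are visual words and the "documents" are images, so a visual word that appears in few images gets a larger weight.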

Page 21: Bundling Features for Large Scale Partial-Duplicate Web Image Search

EXPERIMENTS AND RESULTS

21

Page 22: Bundling Features for Large Scale Partial-Duplicate Web Image Search

Dataset

• Basic dataset

  ◦ One million images most frequently clicked in a popular commercial image-search engine

  ◦ Subsets of 50K, 200K, and 500K images

• Ground truth

  ◦ 780 manually labeled partial-duplicate web images from 19 groups

• Evaluation dataset = basic dataset + ground truth

• Query

  ◦ 150 images from the ground truth

22

Page 23: Bundling Features for Large Scale Partial-Duplicate Web Image Search

Evaluation

• Baseline

  ◦ Bag-of-features approach with soft assignment [13]

[13] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Lost in quantization: Improving particular object retrieval in large scale image databases. In CVPR, 2008.

23

Page 24: Bundling Features for Large Scale Partial-Duplicate Web Image Search

• Comparison (HE)

  ◦ Enhance the approach with Hamming embedding [3] by adding a 24-bit Hamming code to filter out target features.

[3] H. Jegou, M. Douze, and C. Schmid. Hamming embedding and weak geometric consistency for large scale image search. In ECCV, 2008.

24
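A rough sketch of how a 24-bit Hamming code could be used as a filter: candidate features that share the query's visual word are kept only if their code is close to the query's code. The representation of codes as Python integers, the function names, and the threshold `max_dist` are assumptions for illustration; the slide gives no threshold.

```python
def hamming_distance(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def filter_by_hamming(query_code, candidates, max_dist):
    """Keep the ids of candidates whose code is within max_dist bits of the query."""
    return [
        feature_id
        for feature_id, code in candidates
        if hamming_distance(query_code, code) <= max_dist
    ]


query_code = 0b101010101010101010101010  # a 24-bit code
candidates = [
    ("f1", query_code ^ 0b111),   # differs in 3 bits
    ("f2", query_code ^ 0xFFFF),  # differs in 16 bits
    ("f3", query_code),           # identical code
]
kept = filter_by_hamming(query_code, candidates, max_dist=5)  # hypothetical threshold
print(kept)  # ['f1', 'f3']
```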

Page 25: Bundling Features for Large Scale Partial-Duplicate Web Image Search

Baseline 0.35 to Bundled (mem) 0.40: a 14% improvement

Baseline 0.35 to Bundled 0.49: a 40% improvement

Baseline 0.35 to Bundled+HE 0.52: a 49% improvement
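The percentages on this slide are relative improvements over the baseline score, which can be checked with a one-line formula. The helper name `relative_improvement` is illustrative.

```python
def relative_improvement(baseline, new):
    """Relative gain of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100


gains = {
    "Bundled (mem)": round(relative_improvement(0.35, 0.40)),
    "Bundled": round(relative_improvement(0.35, 0.49)),
    "Bundled+HE": round(relative_improvement(0.35, 0.52)),
}
print(gains)  # {'Bundled (mem)': 14, 'Bundled': 40, 'Bundled+HE': 49}
```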

25

Page 26: Bundling Features for Large Scale Partial-Duplicate Web Image Search

• Comparison (re-ranking)

  ◦ Full geometric verification: RANSAC on the top 300 candidate images

26

Page 27: Bundling Features for Large Scale Partial-Duplicate Web Image Search

Baseline+re-rank 0.50 to Bundled+re-rank 0.62: a 24% improvement

Baseline 0.35 to Bundled+re-rank 0.62: a 77% improvement

27

Page 28: Bundling Features for Large Scale Partial-Duplicate Web Image Search

• Trade-off

• Run time

  ◦ A single CPU on a 3.0 GHz Core Duo desktop with 16 GB of memory

28

Page 29: Bundling Features for Large Scale Partial-Duplicate Web Image Search

Sample results

Query Image

Baseline approach

Our approach

29

Page 30: Bundling Features for Large Scale Partial-Duplicate Web Image Search

30

Page 31: Bundling Features for Large Scale Partial-Duplicate Web Image Search

CONCLUSION

31

Page 32: Bundling Features for Large Scale Partial-Duplicate Web Image Search

Conclusion

• Properties of bundled features

  ◦ More discriminative than individual SIFT features

  ◦ Simple and robust geometric constraints

  ◦ Can partially match two groups of SIFT features

• Advantages

  ◦ Robustness to occlusion and to photometric and geometric changes

32

Page 33: Bundling Features for Large Scale Partial-Duplicate Web Image Search

END

Thanks for listening