bundling features for large scale partial-duplicate web image search
Post on 12-Jul-2015
Bundling Features for Large Scale
Partial-Duplicate Web Image Search
Zhong Wu∗, Qifa Ke, Michael Isard,
and Jian Sun
CVPR 2009 Citations: 163
Outline
Introduction
Bundled features
Image Retrieval using bundled features
Experiments and results
Conclusion
INTRODUCTION
Target
• Given a query image, the goal is to locate its near- and partial-duplicate images in a large corpus of web images.
Novel Scheme
• Each group of bundled features becomes much more discriminative than a single feature.
• Within each group, simple and robust geometric constraints can be efficiently enforced.
BUNDLED FEATURES
Related Work
• SIFT (Scale-Invariant Feature Transform)
  ◦ Keypoint and descriptor computed from the region centered at the keypoint
• MSER (Maximally Stable Extremal Region)
  ◦ Affine-covariant stable region, with a SIFT descriptor computed from the region
Bundled Features
• SIFT features: S = {s_j}
• MSER detections: R = {r_i}
• Define the bundled features B = {b_i}:
  b_i = {s_j | s_j ∝ r_i, s_j ∈ S}
  where s_j ∝ r_i means that the point feature s_j falls inside the region r_i
• We discard any MSER detection whose ellipse spans more than half the width or height of the image
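The bundling rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each MSER ellipse is represented by a hypothetical tuple `(cx, cy, a, b, theta)` (center, semi-axes, rotation in radians), "spans more than half the width or height" is read as a test on the ellipse's axis-aligned extent, and empty bundles are dropped. None of these choices is confirmed by the slides.

```python
import math

def inside_ellipse(point, ellipse):
    """Return True if point (x, y) falls inside the ellipse (cx, cy, a, b, theta)."""
    x, y = point
    cx, cy, a, b, theta = ellipse
    dx, dy = x - cx, y - cy
    # Rotate the offset into the ellipse's own axes.
    u = dx * math.cos(theta) + dy * math.sin(theta)
    v = -dx * math.sin(theta) + dy * math.cos(theta)
    return (u / a) ** 2 + (v / b) ** 2 <= 1.0

def bundle_features(sift_points, mser_ellipses, image_w, image_h):
    """Group SIFT point indices by the MSER ellipse that contains them."""
    bundles = []
    for ellipse in mser_ellipses:
        _, _, a, b, theta = ellipse
        # Axis-aligned half-extents of the rotated ellipse (assumed size test).
        half_w = math.sqrt((a * math.cos(theta)) ** 2 + (b * math.sin(theta)) ** 2)
        half_h = math.sqrt((a * math.sin(theta)) ** 2 + (b * math.cos(theta)) ** 2)
        if 2 * half_w > image_w / 2 or 2 * half_h > image_h / 2:
            continue  # discard overly large regions
        members = [j for j, pt in enumerate(sift_points) if inside_ellipse(pt, ellipse)]
        if members:  # assumption: regions containing no SIFT point are skipped
            bundles.append(members)
    return bundles

# Toy data: two nearby points, one far point, one small and one oversized region.
points = [(10, 10), (12, 11), (50, 50)]
ellipses = [(11, 10, 3, 2, 0.0), (50, 50, 40, 40, 0.0)]
bundles = bundle_features(points, ellipses, image_w=100, image_h=100)
```

A SIFT point may land in several bundles when MSER regions overlap; the sketch allows that, since the set definition of b_i does not forbid it.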
IMAGE RETRIEVAL USING
BUNDLED FEATURES
Feature quantization
• Hierarchical k-means
• One million visual words from 50K training images
• K-D tree
• pointList = [(2,3), (5,4), (9,6), (4,7), (8,1), (7,2)]
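A minimal sketch of how a k-d tree could be built from the example point list, assuming the standard median-split construction that cycles through the axes. The names `Node` and `build_kdtree` are illustrative and not taken from the slides.

```python
from collections import namedtuple

# A tree node: the splitting point plus its left and right subtrees.
Node = namedtuple("Node", ["location", "left", "right"])

def build_kdtree(points, depth=0):
    """Recursively split the points at the median along alternating axes."""
    if not points:
        return None
    k = len(points[0])          # dimensionality (2 for the example)
    axis = depth % k            # x at even depths, y at odd depths
    ordered = sorted(points, key=lambda p: p[axis])
    median = len(ordered) // 2
    return Node(
        location=ordered[median],
        left=build_kdtree(ordered[:median], depth + 1),
        right=build_kdtree(ordered[median + 1:], depth + 1),
    )

point_list = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(point_list)
```

With this construction the root splits on x at (7,2), its left child splits on y at (5,4), and its right child splits on y at (9,6).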
Matching bundled features
Let p = {p_i} and q = {q_j} be two bundled features with quantized visual words p_i, q_j ∈ W
• Define a matching score:
  M(q; p) = M_m(q; p) + λ M_g(q; p)
• where M_m is the membership term, M_g is the geometric term, and λ is a weighting parameter
• Membership term:
• We simply use the number of common visual words between two bundled features to define the membership term M_m(q; p)
• M_m(q; p) = |{p_i}|, where {p_i} is restricted to the features of p whose visual word also occurs in q
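As a small illustration of the membership term, the sketch below counts the features of p whose visual word also appears in q. Representing a bundle as a plain list of integer visual-word IDs is an assumption made for the example, as is the function name.

```python
def membership_score(p_words, q_words):
    """Count the features in p whose visual word also occurs in q."""
    q_set = set(q_words)
    return sum(1 for word in p_words if word in q_set)

# Two toy bundles sharing the visual words 7 and 12.
score = membership_score([3, 7, 9, 12], [7, 12, 40])
```

If a visual word repeats inside p, this counts each occurrence; counting distinct common words instead would be an equally plausible reading of the slide.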
• Geometric term:
• Our geometric term performs a weak geometric verification between two bundled features p and q using relative ordering
• [Equation not preserved in the transcript; it is defined with an indicator function]
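Since the equation itself is not available here, the following is only one plausible reading of "weak geometric verification using relative ordering": sort the matched features by their coordinate in p, count adjacent pairs whose order is reversed in q (the indicator function), and penalize the larger of the x and y counts. The exact form, the function names, and the default `lam=1.0` are assumptions, not the authors' definition.

```python
def order_inconsistency(p_coords, q_coords):
    """Count adjacent pairs (in p's order) whose order is reversed in q."""
    order = sorted(range(len(p_coords)), key=lambda i: p_coords[i])
    q_in_p_order = [q_coords[i] for i in order]
    # Indicator: 1 when a feature precedes its neighbour in p but follows it in q.
    return sum(1 for a, b in zip(q_in_p_order, q_in_p_order[1:]) if a > b)

def geometric_score(p_pts, q_pts):
    """Assumed geometric term: negative of the worse of the x and y inconsistencies."""
    inc_x = order_inconsistency([x for x, _ in p_pts], [x for x, _ in q_pts])
    inc_y = order_inconsistency([y for _, y in p_pts], [y for _, y in q_pts])
    return -max(inc_x, inc_y)

def matching_score(p_pts, q_pts, lam=1.0):
    """M = M_m + lam * M_g, where p_pts[i] and q_pts[i] are already-matched features."""
    membership = len(p_pts)  # number of matched features
    return membership + lam * geometric_score(p_pts, q_pts)

p_pts = [(0, 0), (1, 1), (2, 2)]
consistent = matching_score(p_pts, [(5, 5), (6, 6), (7, 7)])
swapped = matching_score(p_pts, [(6, 5), (5, 6), (7, 7)])
```

Here the inputs are the locations of features that already share a visual word, so the membership term reduces to the list length. Swapping two features along x lowers the score by λ.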
Indexing and retrieval
• Avoids storing and comparing high-dimensional local descriptors
• Reduces the number of candidate images
Indexing and retrieval
• Voting
Indexing and retrieval
• tf (term frequency)
  ◦ A document contains 100 words, and the word ‘a’ appears 3 times
  ◦ tf = 3 / 100 = 0.03
• idf (inverse document frequency)
  ◦ 1,000 documents contain ‘a’, out of 10,000,000 documents in total
  ◦ idf = ln(10,000,000 / 1,000) = 9.21
• tf-idf = 0.03 × 9.21 = 0.28
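The arithmetic in this example can be checked with a short script. The helper names `tf` and `idf` are illustrative; the natural logarithm is used because the slide writes `ln`.

```python
import math

def tf(term_count, doc_length):
    """Term frequency: occurrences of the term divided by the document length."""
    return term_count / doc_length

def idf(num_docs, docs_with_term):
    """Inverse document frequency with the natural logarithm."""
    return math.log(num_docs / docs_with_term)

tf_a = tf(3, 100)                  # 'a' appears 3 times among 100 words
idf_a = idf(10_000_000, 1_000)     # 'a' occurs in 1,000 of 10,000,000 documents
tfidf_a = tf_a * idf_a

print(f"tf={tf_a:.2f} idf={idf_a:.2f} tf-idf={tfidf_a:.2f}")
```

In image retrieval the "terms" are visual words and the "documents" are images, so the same weighting down-weights visual words that occur in many images.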
EXPERIMENTS AND RESULTS
Dataset
• Basic dataset
  ◦ One million images most frequently clicked in a popular commercial image-search engine
  ◦ Smaller subsets: 50K, 200K, and 500K images
• Ground truth
  ◦ 780 manually labeled partial-duplicate web images from 19 groups
• Evaluation dataset = basic dataset + ground truth
• Query
  ◦ 150 images from the ground truth
Evaluation
• Baseline
  ◦ Bag-of-features approach with soft assignment [13]
[13] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Lost in quantization: Improving particular object retrieval in large scale image databases. In CVPR, 2008.
• Comparison (HE)
  ◦ Enhance the baseline with Hamming embedding [3] by adding a 24-bit Hamming code to filter out target features
[3] H. Jegou, M. Douze, and C. Schmid. Hamming embedding and weak geometric consistency for large scale image search. In ECCV, 2008.
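To make the filtering idea concrete, here is a hedged sketch of comparing 24-bit binary codes by Hamming distance. The codes are stored as Python integers, and the threshold `max_distance=6` is a hypothetical value chosen for the example; the slides do not state the threshold used.

```python
def hamming_distance(code_a, code_b):
    """Number of differing bits between two integer-encoded binary codes."""
    return bin(code_a ^ code_b).count("1")

def filter_by_hamming(query_code, candidates, max_distance=6):
    """Keep (feature_id, code) pairs whose code is close to the query code.

    max_distance is an assumed threshold, not a value from the slides.
    """
    return [
        (feature_id, code)
        for feature_id, code in candidates
        if hamming_distance(query_code, code) <= max_distance
    ]

# Candidates that share the query's visual word but carry different 24-bit codes.
query = 0xABCDEF
candidates = [("f1", 0xABCDEF), ("f2", 0xABCDEE), ("f3", 0x543210)]
kept = filter_by_hamming(query, candidates)
```

Here `f3` is the bitwise complement of the query, so all 24 bits differ and it is rejected even though it falls in the same visual word.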
• Baseline 0.35 to Bundled (mem) 0.40: a 14% improvement
• Baseline 0.35 to Bundled 0.49: a 40% improvement
• Baseline 0.35 to Bundled + HE 0.52: a 49% improvement
• Comparison (re-ranking)
  ◦ Full geometric verification with RANSAC for the top 300 candidate images
• Baseline + re-rank 0.50 to Bundled + re-rank 0.62: a 24% improvement
• Baseline 0.35 to Bundled + re-rank 0.62: a 77% improvement
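The quoted percentages are relative improvements over the starting score. The short check below recomputes them from the scores reported on the slides; the helper name is illustrative.

```python
def relative_improvement(old, new):
    """Percentage gain of `new` over `old`."""
    return (new - old) / old * 100.0

# (starting method, score, improved method, score) as reported on the slides.
comparisons = [
    ("Baseline", 0.35, "Bundled (mem)", 0.40),
    ("Baseline", 0.35, "Bundled", 0.49),
    ("Baseline", 0.35, "Bundled + HE", 0.52),
    ("Baseline + re-rank", 0.50, "Bundled + re-rank", 0.62),
    ("Baseline", 0.35, "Bundled + re-rank", 0.62),
]

for old_name, old, new_name, new in comparisons:
    gain = relative_improvement(old, new)
    print(f"{old_name} {old:.2f} -> {new_name} {new:.2f}: {gain:.0f}%")
```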
• Trade-off
• Run time
  ◦ Measured on a single CPU of a 3.0 GHz Core Duo desktop with 16 GB of memory
Sample results
Query Image
Baseline approach
Our approach
CONCLUSION
Conclusion
• Properties of bundled features
  ◦ More discriminative than individual SIFT features
  ◦ Allow simple and robust geometric constraints
  ◦ Can partially match two groups of SIFT features
• Advantage
  ◦ Robustness to occlusion and to photometric and geometric changes
END
Thank you for listening