TRANSCRIPT
- Slide 1
- Multi-Local Feature Manifolds for Object Detection. Oscar Danielsson, Stefan Carlsson, Josephine Sullivan. DICTA08
- Slide 2
- The Problem: Object categories are often modeled by collections (bag-of-features) or constellations (pictorial structures) of local features. Many simple, shape-based objects don't have any discriminative local appearance features.
- Slide 3
- The Multi-Local Feature: A specific spatial constellation of oriented edgels (or other local content). Captures global shape properties. Weak detector of shape-based object categories. Described by a coordinate vector: (x_1, ..., x_12)
- Slide 4
- Modeling Intra-Class Variation
- Slide 5
- 1. Generate coordinate vectors by clicking corresponding edgels in a (small) number of training images. 2. Align the coordinate vectors with respect to a similarity transform.
- Slide 6
- Modeling Intra-Class Variation 3. Extend coordinate vectors into their convex hull
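Step 2 above (alignment with respect to a similarity transform) can be sketched as follows. The slides do not say how the alignment is computed, so this is only a minimal sketch under assumptions: each edgel is treated as a 2D point stored as a complex number (edgel orientation is ignored), every example is aligned to one chosen reference example, and the alignment is the closed-form least-squares fit. The function name `align_to_reference` is hypothetical.

```python
def align_to_reference(points, reference):
    """Least-squares similarity alignment of 2D points onto a reference.

    Points are complex numbers (x + 1j*y). A similarity transform is
    z -> a*z + b, where the complex number `a` encodes rotation and
    scale and `b` encodes translation. (Assumed method, not taken from
    the slides.)
    """
    n = len(points)
    if n != len(reference) or n < 2:
        raise ValueError("need two equally long lists with at least 2 points")

    # Remove translation by centering both point sets.
    mean_p = sum(points) / n
    mean_r = sum(reference) / n
    centered_p = [z - mean_p for z in points]
    centered_r = [z - mean_r for z in reference]

    denom = sum(abs(z) ** 2 for z in centered_p)
    if denom == 0:
        raise ValueError("points must not all coincide")

    # Closed-form least-squares solution for rotation and scale.
    a = sum(r * p.conjugate() for p, r in zip(centered_p, centered_r)) / denom
    b = mean_r - a * mean_p
    return [a * z + b for z in points]
```

After alignment, the real and imaginary parts of the points can be concatenated into one coordinate vector per training example, which is the form used in step 3.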
- Slide 7
- Detection
- Slide 8
- For each occurrence x_1 of c_1
    For each consistent occurrence x_2 of c_2
      Sample from p(x_4, x_3 | x_2, x_1) to hypothesize image locations of c_3 and c_4
      Sample image edgels
      Compute normalized distance to convex hull of training features
      If distance is below threshold, an instance was detected
    End For
  End For
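The "distance to convex hull" test in the loop above can be illustrated with a small sketch. The slides do not state how the distance is computed or normalized, so the code below is an assumption: it approximates the plain (unnormalized) Euclidean distance with Frank-Wolfe iterations and exact line search, in pure Python. The names `distance_to_convex_hull` and `is_detection` and the `threshold` parameter are hypothetical.

```python
def distance_to_convex_hull(training, query, iterations=500):
    """Approximate Euclidean distance from `query` to the convex hull of
    the `training` vectors (Frank-Wolfe; an assumed method)."""
    def sub(u, v):
        return [a - b for a, b in zip(u, v)]

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    # Start at the training vector closest to the query.
    current = list(min(training, key=lambda t: dot(sub(t, query), sub(t, query))))
    for _ in range(iterations):
        residual = sub(current, query)  # gradient of 0.5 * ||current - query||^2
        # Hull vertex that minimizes the linearized objective.
        vertex = min(training, key=lambda t: dot(residual, t))
        direction = sub(vertex, current)
        denom = dot(direction, direction)
        gap = -dot(residual, direction)
        if denom == 0 or gap <= 1e-12:
            break  # no vertex improves the current point
        step = min(1.0, gap / denom)  # exact line search, clipped to [0, 1]
        current = [c + step * d for c, d in zip(current, direction)]
    final = sub(current, query)
    return dot(final, final) ** 0.5


def is_detection(training, candidate, threshold):
    """Accept a candidate coordinate vector if it lies close to the hull."""
    return distance_to_convex_hull(training, candidate) < threshold
```

For example, with the four corners of the unit square as training vectors, a query at (0.5, 0.5) lies inside the hull and a query at (2, 0.5) lies at distance 1 from it. Frank-Wolfe converges slowly in general, so an exact quadratic-programming solver may be preferable in practice.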
- Slide 16
- Experiments: Detection performance was evaluated on a standard database (ETHZ Shape Classes). We want to investigate two questions: Is a multi-local feature a good weak detector? How many local features should be used?
- Slide 17
- Mugs - Training [Figure: training examples with the marked edgels labeled 1 to 14] 25 training images were downloaded from Google Images. 14 edgels constituting a multi-local feature were marked in each training image.
- Slide 18
- Mugs - Results
- Slide 19
- Performance decreases when adding more than 9 local features. [Plot annotation: 0.4, 60.6 %]
- Slide 20
- Bottles - Training [Figure: training examples with the marked edgels labeled 1 to 12] 25 training images were downloaded from Google Images. 12 edgels constituting a multi-local feature were marked in each training image.
- Slide 21
- Bottles - Results
- Slide 22
- [Plot annotation: 0.4, 72.7 %]
- Slide 23
- Apple logos - Training: 20 training images were downloaded from Google Images. 12 edgels constituting a multi-local feature were marked in each training image.
- Slide 24
- Apple logos - Results
- Slide 25
- Performance decreases when adding more than 11 local features. [Plot annotation: 0.4, 77.3 %]
- Slide 26
- Conclusions: A multi-local feature is a good weak detector of shape-based object categories. The best performance is achieved with multi-local features that have a moderate number of local features. Convex combinations of valid exemplars are in general also valid exemplars (we can extend a few training examples into their convex hull).
- Slide 27
- Future Work: Automatic learning of multi-local features. Building combinations of multi-local feature detectors into an object detection system.
- Slide 28
- Related Work: Pictorial Structures, e.g. Felzenszwalb, Huttenlocher. "Pictorial Structures for Object Recognition", IJCV No. 1, January 2005. Constellation Models, e.g. Fergus, Perona, Zisserman. "Object class recognition by unsupervised scale-invariant learning", CVPR03. Differences: different detection methods; use rich local features.
- Slide 29
- Thanks!
- Slide 30
- Representation The multi-local feature manifold consists of all convex combinations of the training examples
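To make the representation concrete, the sketch below draws one random member of such a manifold: a convex combination of training coordinate vectors, with weights that are non-negative and sum to one. It is only an illustration under assumptions; the function name `random_convex_combination` is hypothetical, and the uniform (flat Dirichlet) weighting is a choice made here, not something stated in the slides.

```python
import random


def random_convex_combination(examples, rng=random):
    """Return (point, weights), where point = sum_i weights[i] * examples[i],
    all weights are non-negative, and the weights sum to one."""
    # Normalized exponential draws give uniformly distributed simplex weights.
    raw = [rng.expovariate(1.0) for _ in examples]
    total = sum(raw)
    weights = [r / total for r in raw]
    dim = len(examples[0])
    point = [sum(w * ex[i] for w, ex in zip(weights, examples)) for i in range(dim)]
    return point, weights
```

Every point produced this way lies inside the convex hull of the training examples, so by the claim in the conclusions it should in general be a valid exemplar of the category.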