One-shot Learning in Semantic Embedding and Data Augmentation


Page 1: One-shot Learning in Semantic Embedding and Data Augmentationvalser.org/webinar/slide/slides/20191204/one_shot_learning_valse_w… · One-shot Learning Object categorization Fei-Fei

[email protected] http://yanweifu.github.io

One-shot Learning in Semantic Embedding and Data Augmentation

Yanwei Fu (付彦伟)

School of Data Science, Fudan University


One-shot Learning Object categorization

Fei-Fei et al. A Bayesian Approach to Unsupervised One-Shot Learning of Object Categories. ICCV 2003

Fei-Fei, et al. One-Shot Learning of Object Categories. IEEE TPAMI 2006

One-shot Learning:

“learning object categories from just a few images, by incorporating ‘generic’ knowledge which may be obtained from previously learnt models of unrelated categories”.


One-shot Learning by Semantic Embedding

Fu, Y.; Hospedales, T.; Xiang, T.; Gong, S. “Attribute Learning for Understanding Unstructured Social Activity”, ECCV 2012

Fu, Y.; Hospedales, T.; Xiang, T.; Gong, S. “Learning Multi-modal Latent Attributes”, IEEE TPAMI 2014

Fu et al. Semi-supervised Vocabulary-informed Learning. CVPR 2016 (oral)

Fu et al. Vocabulary-informed Zero-shot and Open-set Learning. IEEE TPAMI, to appear


Attribute Learning Pipeline

Figure: example attributes (stripes, tails) and classes (zebra, horse, mule, lion).

Lampert, C. H. Learning to detect unseen object classes by between-class attribute transfer. CVPR 2009
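To make the attribute pipeline concrete, here is a minimal sketch under stated assumptions: each class has a binary attribute signature, and attribute classifiers have already produced per-attribute scores for an image. The signature values, the extra attributes beyond stripes and tail, and the function name are illustrative. The nearest-signature rule is a simplification and is not the probabilistic DAP model of Lampert et al.

```python
# Hypothetical binary class-attribute signatures, in the order
# [stripes, tail, hooves, long_ears]; the values are illustrative only.
SIGNATURES = {
    "zebra": [1, 1, 1, 0],
    "horse": [0, 1, 1, 0],
    "mule":  [0, 1, 1, 1],
    "lion":  [0, 1, 0, 0],
}


def classify_by_attributes(attr_scores, signatures):
    """Return the class whose attribute signature is closest to the scores.

    attr_scores are per-attribute predictions in [0, 1], e.g. the outputs
    of attribute classifiers trained on source classes.
    """
    def sq_dist(signature):
        return sum((s - a) ** 2 for s, a in zip(signature, attr_scores))

    return min(signatures, key=lambda name: sq_dist(signatures[name]))


# An image with strong "stripes", "tail" and "hooves" responses.
predicted = classify_by_attributes([0.9, 0.8, 0.7, 0.1], SIGNATURES)
print(predicted)
```

Because a class is described only by its signature, a class with no training images can still be predicted as long as its attributes are known.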


Semantic Attributes in Zero/One-shot Learning

Fu, Y.; Hospedales, T.; Xiang, T.; Gong, S. “Attribute Learning for Understanding Unstructured Social Activity”, ECCV 2012

Fu, Y.; Hospedales, T.; Xiang, T.; Gong, S. “Learning Multi-modal Latent Attributes”, IEEE TPAMI 2014


Learning Multi-modal Latent Attributes

Fu, Y.; Hospedales, T.; Xiang, T.; Gong, S. “Attribute Learning for Understanding Unstructured Social Activity”, ECCV 2012

Fu, Y.; Hospedales, T.; Xiang, T.; Gong, S. “Learning Multi-modal Latent Attributes”, IEEE TPAMI 2014


Experimental Settings

Dataset & Settings:

• USAA dataset (4 source classes, 4 target classes, multiple rounds of class splits);

• Animals with Attributes (AwA) dataset (40 source classes, 10 target classes);

Comparisons

• Direct: KNN/SVM of features to classes;

• DAP: Direct Attribute Prediction [Lampert et al. CVPR 2009];

• SVM-UD: an SVM generalization of DAP;

• SCA: Topic models in [Wang et al CVPR 2009];

• ST: Synthetic Transfer in [Yu et al ECCV 2010];


Unstructured Social Activity Dataset (USAA)

Classes: birthday party, graduation, music performance, non-music performance, parade, wedding ceremony, wedding dance, wedding reception.


One-shot Learning Results

For more results, please check our papers.


Vocabulary-informed Learning

Fu et al. Semi-supervised Vocabulary-informed Learning. CVPR 2016 (oral)

Fu et al. Vocabulary-informed Zero-shot and Open-set Learning. IEEE TPAMI, to appear


Supervised Learning

Figure: semantic labels (airplane, car, unicycle, tricycle) paired with the visual feature space.


One-shot Learning

Figure: semantic labels (airplane, unicycle, tricycle, car) paired with the visual feature space.


Zero/One-shot Learning by Semantic Embedding (Problem Definition)

Zero/one-shot learning: for classes such as truck and bicycle, we have zero or one visually labeled instance showing what they look like.

Figure: semantic labels and the visual feature space.


Learning

Figure: semantic labels (airplane, unicycle, tricycle, car, truck, bicycle) and the visual feature space.


Inference

Figure: semantic labels (airplane, unicycle, tricycle, car, truck, bicycle).
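The inference step can be sketched as a nearest-prototype search in the semantic space. This is a minimal illustration, assuming a linear visual-to-semantic map `W` that has already been learned on the source classes. The prototype vectors, `W`, and the function names are made-up placeholders, not values or APIs from the talk.

```python
import math


def project(W, x):
    """Map a visual feature x into the semantic space with a linear map W."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def infer_label(W, x, prototypes):
    """Return the label whose semantic prototype is most similar to W x."""
    z = project(W, x)
    return max(prototypes, key=lambda name: cosine(z, prototypes[name]))


# Toy 2-D semantic prototypes (illustrative stand-ins for attribute or
# word-vector embeddings of the class names).
PROTOTYPES = {
    "car": [1.0, 0.0],
    "truck": [0.9, 0.3],
    "bicycle": [0.0, 1.0],
    "unicycle": [0.2, 0.9],
}
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # assumed to be learned beforehand

label = infer_label(W, [0.85, 0.35, 0.5], PROTOTYPES)
print(label)
```

Since only the prototype of a class is needed at inference time, classes such as truck and bicycle can be recognized with zero or one visual example.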

Key Question: How do we define semantic space?


Semantic Label Vector Spaces

Spaces | Type | Advantages | Disadvantages

Semantic Attributes | Supervised | Good interpretability of each dimension | Manual annotation; limited vocabulary

Semantic Word Vectors (e.g. word2vec) | Unsupervised | Good vector representation for a vocabulary of millions of words | Limited interpretability of each dimension


Vocabulary-Informed Recognition

Figure: an image with candidate labels (unicycle, tricycle).

Fu et al. Semi-supervised Vocabulary-informed learning, CVPR 2016 (Oral)


Estimating Density of Classes in the Space

Fu et al. Vocabulary-informed Zero-shot and Open-set Learning. IEEE TPAMI to appear

Margin distribution of prototypes in the semantic space

The knowledge of margin distribution of instances, rather than a single margin across all instances, is crucial for improving the generalization performance of a classifier.

Instance margin: the distance between one instance and the separating hyperplane. The distribution of the minimal values of the margin distance is characterized by a Weibull distribution.

The probability that g(x) is included in the boundary estimated by g(x_i)

Margin Distribution of Prototypes:

Coverage Distribution of Prototypes.

Extreme Value Theorem
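As a rough illustration of how a Weibull model turns a distance into an inclusion probability, the sketch below assumes the two-parameter Weibull survival form commonly used in extreme-value-based open-set models. The slide's own formula did not survive extraction, so this form, the parameter names `scale` and `shape`, and the function name are assumptions. In practice the parameters would be fitted to the smallest margin distances around each prototype.

```python
import math


def coverage_probability(distance, scale, shape):
    """Weibull-style probability that a point lies inside a prototype's boundary.

    distance: distance between the embedded point g(x) and the prototype g(x_i)
    scale, shape: Weibull parameters, assumed to be fitted beforehand to the
    smallest margin distances around the prototype
    """
    if distance < 0 or scale <= 0 or shape <= 0:
        raise ValueError("distance must be >= 0; scale and shape must be > 0")
    return math.exp(-((distance / scale) ** shape))


# The probability decays from 1 as the point moves away from the prototype.
for d in (0.0, 0.5, 1.0, 2.0):
    print(d, round(coverage_probability(d, scale=1.0, shape=2.0), 4))
```

A point can then be rejected as belonging to an unknown class when its coverage probability is low for every prototype, which is what makes this kind of model usable for open-set recognition.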


Experimental Dataset and Tasks

Dataset:

• AwA dataset;

• ImageNet 2012/2010 dataset.

By learning a semantic embedding, we can address the following tasks:

• SUPERVISED recognition

• ZERO-SHOT recognition

• GENERAL-ZERO-SHOT recognition

• ONE-SHOT recognition

• OPEN-SET recognition


Experimental Settings of Few-shot Learning

• Learning Classifiers from Few Source Training Instances

• Source classes: One-shot Recognition

• Target classes: Zero-shot Recognition

• Key insights: leveraging the knowledge from semantic space (vocabulary-informed)

• Few-shot Target Training instances

• Few-shot setting, consistent with general definition


Results on Few-shot Learning

Few-shots on source dataset


Results on Few-shot Learning


One-shot learning aims to learn information about object categories from one, or only a few, training images.

Data-Augmentation Meta-Learning

Meta Augmentation Learning

One-shot Learning by Data Augmentation


Multi-level Semantic Feature Augmentation for One-shot Learning

Zitian Chen, Yanwei Fu, Yinda Zhang, Yu-Gang Jiang, Xiangyang Xue, and Leonid Sigal. IEEE Transactions on Image Processing (TIP) 2019


Motivation

• A straightforward way to tackle one-shot learning is data augmentation

• We want to utilize the semantic space

• Related concepts in the semantic space help to learn

Figure: image feature space and semantic feature space. Labels shown: antelopes, killer whale, beaver, mountain goat, whale, orca, sea lion, muskrat, woodchuck, badger, hartebeest, pronghorn. Do the related concepts help?


Method

Figure: image feature space and semantic feature space, connected by the mappings f(x) and g(x). Labels shown: antelopes, killer whale, beaver, mountain goat, whale, orca, sea lion, muskrat, woodchuck, badger, hartebeest, pronghorn.


Single-level

• But we want to utilize visual concepts at different levels.


Multi-level

• Using both high-level and low-level features helps to encode

• Decoding the semantic feature into features at different levels diversifies the augmented features
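The encode, perturb, decode idea can be sketched as follows. This is a toy illustration under stated assumptions: the encoder and the per-level decoders are plain linear maps, and the perturbation in semantic space is Gaussian noise. The matrices, level names, and function names are hypothetical, and the real model uses learned networks rather than fixed matrices.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def augment_features(feature, encoder, decoders, n_aug=3, sigma=0.1, seed=0):
    """Encode a feature, perturb it in semantic space, decode at each level.

    encoder:  matrix mapping the image feature to the semantic space
    decoders: {level_name: matrix} mapping semantic vectors back to the
              feature space of that level
    Returns {level_name: [n_aug augmented feature vectors]}.
    """
    rng = random.Random(seed)
    semantic = matvec(encoder, feature)
    augmented = {level: [] for level in decoders}
    for _ in range(n_aug):
        # Assumed perturbation: independent Gaussian noise per dimension.
        noisy = [s + rng.gauss(0.0, sigma) for s in semantic]
        for level, decoder in decoders.items():
            augmented[level].append(matvec(decoder, noisy))
    return augmented


ENCODER = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]  # 3-D feature -> 2-D semantic
DECODERS = {
    "low": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]],  # 2-D -> 4-D
    "high": [[1.0, 0.0], [0.0, 1.0]],                          # 2-D -> 2-D
}
result = augment_features([1.0, 2.0, 3.0], ENCODER, DECODERS)
print({level: len(feats) for level, feats in result.items()})
```

Each level receives its own set of augmented features, which can then be added to the one-shot training set for the corresponding layer.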


Visualization


Image Deformation Meta-Networks for One-Shot Learning

Zitian Chen, Yanwei Fu, Yu-Xiong Wang, Lin Ma, Wei Liu, Martial Hebert


The Basic Idea of Jigsaw Augmentation Method

Image Block Augmentation for One-Shot Learning. Zitian Chen, Yanwei Fu, Kaiyu Chen, Yu-Gang Jiang. AAAI 2019


Visual contents from other images may be helpful to synthesize new images


Figure panels: ghosted, stitched, montaged, partially occluded.

Humans can learn novel visual concepts even when images undergo various deformations


Deformed Images

Visual contents from other images might be helpful


Approach


Motivation

1. Visual contents from other images may be helpful to synthesize new images.

2. Humans can learn novel visual concepts even when images undergo various deformations.

Approach

We design a deformation sub-network that learns to deform images by fusing a pair of images: a probe image that keeps the visual content and a gallery image that diversifies the deformations.
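One simple way to picture the fusion of a probe and a gallery image is a patch-wise linear blend. The sketch below is only an illustration under assumptions: images are grayscale nested lists of equal size, and the grid of blend weights is given by hand. In the approach described above the weights would come from the deformation sub-network; the grid size and the function name are hypothetical.

```python
def fuse_patches(probe, gallery, weights):
    """Blend two equally sized images patch by patch.

    probe, gallery: H x W nested lists of pixel values
    weights: R x C grid of values in [0, 1]; each cell gives the share of
             the probe image kept in the corresponding patch
    """
    height, width = len(probe), len(probe[0])
    rows, cols = len(weights), len(weights[0])
    fused = []
    for i in range(height):
        r = i * rows // height  # patch row containing pixel row i
        fused_row = []
        for j in range(width):
            c = j * cols // width  # patch column containing pixel column j
            a = weights[r][c]
            fused_row.append(a * probe[i][j] + (1.0 - a) * gallery[i][j])
        fused.append(fused_row)
    return fused


probe_img = [[1.0] * 4 for _ in range(4)]
gallery_img = [[0.0] * 4 for _ in range(4)]
deformed = fuse_patches(probe_img, gallery_img, [[1.0, 0.5], [0.0, 0.25]])
for row in deformed:
    print(row)
```

Keeping the probe's class label for the fused image is what turns this blend into label-preserving data augmentation.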


Figure: network architecture. Components shown: probe image, gallery image (found by visual similarity), ANET, BNET, concat, deformation sub-network, embedding sub-network.


Figure: top-1 accuracies (%) on miniImagenet for 1-shot and 5-shot, Baseline vs. Ours (axis from 50 to 75; bar values not recoverable).


Figure: Gaussian vs. Ours, with panels for the real probe image, the deformed image, and the real image.


NeurIPS 2019


Figure: hawk and falcon (source: https://birdeden.com/distinguishing-between-hawks-falcons)


Fine-grained Visual Recognition

• Much harder than normal classification.

• Difficult to collect data.

• Can’t use crowdsourcing.

• Needs expert annotators.

• Demands one-shot learning.


Can we generate more data?

• How about state-of-the-art GANs?

• Challenge: GAN training itself needs a lot of data.


Our Idea: Fine-tune GANs trained on ImageNet.

Figure: BigGAN with latent input Z, trained on one million general images; can it generate a specific image?

Transfer generative knowledge from one million general images to a domain-specific image.


Fine-tune BigGAN with a single image

Original Generated


Technical Point: Fine-tune Batch Norm Only

Figure panels: Original, Fine-Tune All, Fine-Tune BatchNorm.


Our idea: Meta-Augmentation Learning

Image Fusion Net F

Fusing weight: w

Original: I

Generated: G(I)

Fused: wI + (1 − w)G(I)

Use meta-learning to learn the best mixing strategy to help one-shot classifiers.

Learning to reinforce with the original image
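The fusion formula wI + (1 − w)G(I) is a convex combination of the original and the generated image. A minimal sketch, assuming images are flat lists of pixel values and w is a single scalar supplied by hand (in the method above, the weight would be predicted by the image fusion network and learned by meta-learning); the function name is hypothetical.

```python
def fuse(original, generated, w):
    """Pixel-wise fusion w * I + (1 - w) * G(I) for flat pixel lists."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    if len(original) != len(generated):
        raise ValueError("images must have the same number of pixels")
    return [w * i + (1.0 - w) * g for i, g in zip(original, generated)]


fused = fuse([1.0, 0.0], [0.0, 1.0], 0.75)
print(fused)
```

With w close to 1 the fused image stays near the original, and with w close to 0 it follows the generated image, so the weight controls how much the synthetic sample is reinforced by the real one.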


Examples


Our method yields consistent improvements.


Embodied One-Shot Video Recognition: Learning from Actions of a Virtual Embodied Agent

Yuqian Fu, Chengrong Wang, Yanwei Fu, Yu-Xiong Wang, Cong Bai, Xiangyang Xue, Yu-Gang Jiang

ACM Multimedia 2019


One-Shot Learning Setting Revisited

“shooting basketball”

“running”

- Quite similar video clips may appear in both source and target classes.

Source Domain

Target Domain


Embodied One-Shot Video Recognition


Learning from Actions of a Virtual Embodied Agent

Virtual Environment

Virtual Embodied Agent

Virtual Action Videos

https://www.unrealengine.com/marketplace/en-US/store

- Learning from actions of virtual embodied agents to address the limitations.


Figure: example actions (break dancing, throwing, waving hand) shown as real target data and virtual source data.

http://www.sdspeople.fudan.edu.cn/fuyanwei/dataset/UnrealAction/

UnrealAction Dataset

- 14 action classes.

- Each class has 100 virtual videos and 10 real videos.


Figure: recognition settings compared in panels a to d: classical one-shot recognition, embodied one-shot recognition, domain adaptation, and transfer recognition.

Embodied One-Shot Video Recognition


Video Segment Augmentation Method

Figure: a probe video (action label c) and a gallery video segment produce an augmented video (action label c).

- Inspired by subliminal advertising experiments.

- Augmenting videos by replacing short segments.


Method


Video Segment Augmentation Method

Figure: method overview. A CNN segment-level feature extractor f_θ embeds the probe segments P_1, P_2, ..., P_m of V_probe and the gallery segments G_1, G_2, ..., G_k, ..., G_n of G_pool, giving F_θ(V_probe) and F_θ(G_pool). A distance d(·, ·) between them yields a semantic correlation scores matrix. For a gallery segment G_k, the scores y_{k,1}, y_{k,2}, y_{k,3}, ..., y_{k,m} are smoothed with a sliding window [λ1, λ2, λ1] into y'_{k,1}, y'_{k,2}, y'_{k,3}, ...
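The segment replacement step can be sketched as follows. This is a simplified illustration under assumptions: the semantic correlation scores between one gallery segment and each probe segment are already computed, the window [λ1, λ2, λ1] is applied with zero padding at the two ends, and the probe segment with the highest smoothed score is the one replaced. The function names, default weights, and selection rule are hypothetical and are not taken from the paper.

```python
def smooth_scores(scores, lam1, lam2):
    """Slide a [lam1, lam2, lam1] window over per-segment scores.

    Missing neighbours at the two ends are treated as 0 (an assumption).
    """
    m = len(scores)
    smoothed = []
    for i in range(m):
        left = scores[i - 1] if i > 0 else 0.0
        right = scores[i + 1] if i < m - 1 else 0.0
        smoothed.append(lam1 * left + lam2 * scores[i] + lam1 * right)
    return smoothed


def augment_video(probe_segments, gallery_segment, scores, lam1=0.25, lam2=0.5):
    """Replace the probe segment that best matches the gallery segment.

    Returns the augmented segment list and the index that was replaced.
    """
    if not scores or len(scores) != len(probe_segments):
        raise ValueError("need exactly one score per probe segment")
    smoothed = smooth_scores(scores, lam1, lam2)
    best = max(range(len(smoothed)), key=smoothed.__getitem__)
    augmented = list(probe_segments)
    augmented[best] = gallery_segment
    return augmented, best


video, replaced = augment_video(
    ["p1", "p2", "p3", "p4"], "g", scores=[0.1, 0.8, 0.6, 0.2]
)
print(video, replaced)
```

The augmented video keeps the probe's action label, since only a short segment is swapped.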


Framework

Figure: training and testing framework. Elements shown: D_base, D_novel, stage 1, stage 2, fine-tuning, video segment augmentation, G_pool, sampled n-way k-shot episodes with 1 query video, feature extractor, ProtoNet.
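The ProtoNet classification step in an n-way k-shot episode can be sketched as below. This is a generic illustration of prototypical classification, assuming the feature extractor has already turned each video into a feature vector; the toy features and function names are made up, and the class names are borrowed from the UnrealAction examples.

```python
def class_prototypes(support):
    """Average the k support features of each class into one prototype.

    support: {label: [feature vectors]}, i.e. an n-way k-shot support set.
    """
    protos = {}
    for label, feats in support.items():
        dim = len(feats[0])
        protos[label] = [sum(f[d] for f in feats) / len(feats) for d in range(dim)]
    return protos


def classify_query(query, protos):
    """Assign the query to the class with the nearest prototype."""
    def sq_dist(proto):
        return sum((q - p) ** 2 for q, p in zip(query, proto))

    return min(protos, key=lambda label: sq_dist(protos[label]))


# Toy 2-way 2-shot episode with 2-D features (illustrative values only).
SUPPORT = {
    "throwing": [[0.0, 0.0], [0.2, 0.0]],
    "waving hand": [[1.0, 1.0], [1.0, 0.8]],
}
protos = class_prototypes(SUPPORT)
prediction = classify_query([0.9, 0.8], protos)
print(prediction)
```

Augmented videos can simply be added to the support lists before the prototypes are averaged, which is how augmentation enters a one-shot episode in this kind of setup.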


Thanks very much!

[email protected]