Heterogeneous Domain Adaptation using Manifold Alignment

Chang Wang, Sridhar Mahadevan

Post on 19-Dec-2015


Page 1

Heterogeneous Domain Adaptation using Manifold Alignment

Chang Wang, Sridhar Mahadevan

Page 2

Layout

Problem Introduction
Problem Definition
Who cares
Previous work & challenges
Contribution
A glance at methods

Page 3

Problem Introduction

Example problem

Input: three collections of documents in
  English (sufficient labels)
  Italian (sufficient labels)
  Arabic (few labels)

Target: assign labels to the Arabic documents.

One approach: find a common feature space for the 3 domains.
  Shared labels (sports, military).
  No shared documents (no instance correspondence).
  No word translations are available.

Page 4

Problem Introduction

English docs
doc   word1   word2   …   label
1     0       2           sports
2     2       0           military
3     1       0           military
4     0       1           sports

Italian docs
doc   parola1   parola2   …   etichetta
1     2         0             sports
2     0         2             military
3     0         1             military
4     1         0             sports

Arabic docs
doc   كلمة1   كلمة2   …   ملصق
1     0       2           sports
2     2       0           military
3     1       0           ?
4     0       1           ?

Shared label set: {sports, military}. No corresponding instances or words.

Question: Can we construct a common feature space so that we can use English docs and Italian docs to help classify Arabic docs?

Page 5

Problem Introduction

English docs
doc   word1   word2   …   label
1     0       2           sports
2     2       0           military
3     1       0           military
4     0       1           sports

Italian docs
doc   parola1   parola2   …   etichetta
1     2         0             sports
2     0         2             military
3     0         1             military
4     1         0             sports

Arabic docs
doc   كلمة1   كلمة2   …   ملصق
1     0       2           sports
2     2       0           military
3     1       0           ?
4     0       1           ?

Common feature space
doc   feature1   feature2   …   label
1     0          2              sports
2     2          0              military
3     1          0              military
4     0          1              sports
5     0          2              sports
6     2          0              military
7     1          0              military
8     0          1              sports
9     0          2              sports
10    2          0              military
11    1          0              ?
12    0          1              ?

Page 6

Problem Introduction

Given: K input datasets from different domains, with different features, all sharing the same label set.

Source domains have sufficient labeled instances.

The target domain has few labeled instances.

Question: Can we construct a common feature space, so that all instances from the different domains can be mapped to the same feature space and a learning task can be performed there?

Page 7

Problem Definition

[Figure: the datasets Source 1 ($m_1 \times p_1$), ..., Source k ($m_k \times p_k$) and Target ($m_t \times p_t$) are mapped into a common feature space with $\sum_{i=1}^{k} m_i$ rows and $d$ columns, on which learning is performed.]

$m_i$: # instances (domain i)
$p_i$: # features (domain i)
$d$: dimension of common feature space

Page 8

Problem Definition

Input: K datasets from different domains
  $X_k$: dataset k
  $x_k^i$: instance i in dataset k, defined by $p_k$ features

Goal: construct a $d$-dimensional common feature space for learning

Output: K mapping functions $f_1, \dots, f_K$, where $f_k$ is a $p_k \times d$ matrix
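As a rough illustration of what the output looks like, the sketch below treats each mapping function as a matrix that linearly projects a domain's instances into the shared space. The domain names, sizes, and random matrices are placeholders chosen for the example, and applying the mapping as a plain matrix product is an assumption; the actual method learns the matrices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: p_k features per domain, d common dimensions.
p = {"english": 5, "italian": 7, "arabic": 4}
m = {"english": 4, "italian": 4, "arabic": 4}
d = 2

# One dataset per domain: m_k instances (rows) by p_k features (columns).
X = {k: rng.integers(0, 3, size=(m[k], p[k])).astype(float) for k in p}

# One mapping matrix per domain, of shape (p_k, d).
# Random placeholders here; the method would learn these.
f = {k: rng.normal(size=(p[k], d)) for k in p}

# Project every domain into the d-dimensional space and stack the results.
Z = np.vstack([X[k] @ f[k] for k in p])

print(Z.shape)  # one row per instance across all domains, d columns
```

Once all instances live in the same `d` columns, any standard classifier can be trained on the labeled rows and applied to the unlabeled target rows.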

Page 9

Who may benefit?

Search engines: classify docs, rank docs, find document topics
Businesses: customer clustering
Biologists: protein matching

Page 10

Challenges

The target domain has few labels.
No instance correspondence.
The source and target domains have different feature spaces.

Page 11

Previous work

Most work assumes that the source domain and the target domain have the same features.

Manifold regularization: does not leverage source domain information.

Transfer learning based on manifold alignment: uses both labeled and unlabeled instances to learn the mapping, but requires a small amount of instance correspondence.

Page 12

Contribution

Transfer learning perspective
  Works across different feature spaces
  Copes with multiple input domains
  Can be combined with existing domain adaptation methods

Manifold alignment perspective
  Needs no instance correspondence
  Uses labels to learn the alignment

Page 13

A glance at methods

Find a set of mapping functions (one matrix per domain).

3 criteria:
  Instances from the same class (across domains) are mapped to similar locations.
  Instances from different classes (across domains) are mapped to separate locations.
  Topology within each original domain is preserved.

Page 14

A glance at methods

English docs
doc   word1   word2   …   label
1     0       2           sports
2     2       0           military
3     1       0           military
4     0       1           sports

Italian docs
doc   parola1   parola2   …   etichetta
1     2         0             sports
2     0         2             military
3     0         1             military
4     1         0             sports

Arabic docs
doc   كلمة1   كلمة2   …   ملصق
1     0       2           sports
2     2       0           military
3     1       0           ?
4     0       1           ?

Common feature space
doc   feature1   feature2   …   label
1     0          2              sports
2     2          0              military
3     1          0              military
4     0          1              sports
5     0          2              sports
6     2          0              military
7     1          0              military
8     0          1              sports
9     0          2              sports
10    2          0              military
11    1          0              ?
12    0          1              ?

Minimise distance(1,5)=0

Maximise distance(1,6)=

Minimise distance(10,11)=1
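The three distances can be checked directly from the rows of the common feature space table. The sketch below assumes the slide means Euclidean distance (the slide does not name the metric) and that rows 1-4, 5-8, and 9-12 are the English, Italian, and Arabic documents respectively; the helper name `distance` is only for illustration.

```python
import numpy as np

# Rows 1..12 of the "Common feature space" table (feature1, feature2).
Z = np.array([
    [0, 2], [2, 0], [1, 0], [0, 1],   # docs 1-4  (presumably English)
    [0, 2], [2, 0], [1, 0], [0, 1],   # docs 5-8  (presumably Italian)
    [0, 2], [2, 0], [1, 0], [0, 1],   # docs 9-12 (presumably Arabic)
], dtype=float)

def distance(i, j):
    """Euclidean distance between docs i and j (1-based, as on the slide)."""
    return float(np.linalg.norm(Z[i - 1] - Z[j - 1]))

print(distance(1, 5))    # same label, different domains
print(distance(1, 6))    # different labels, different domains
print(distance(10, 11))  # neighbours within one domain
```

Under the Euclidean assumption, distance(1,5) and distance(10,11) match the values on the slide, and distance(1,6) comes out as 2√2.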

Page 15

A glance at methods

Encode the 3 criteria in one cost function to minimize:
  distance between the mapped instances, for any pair with the same label (kept small)
  distance between the mapped instances, for any pair with different labels (kept large)
  distance between the mapped instances * similarity of the original instances, for any pair in one original domain