How to use SVM for data classification

How to Use SVM for Classification Problems, Yiwei Chen, 2016.10

Upload: yiwei-chen

Post on 15-Apr-2017


TRANSCRIPT

Page 1: How to use SVM for data classification

How to Use SVM for Classification Problems

Yiwei Chen
2016.10

Page 2: How to use SVM for data classification

import numpy as np
from sklearn import datasets
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Load the iris data and hold out 10% as a stratified test set
dataset = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.1, stratify=dataset.target)

# Scale every feature to the range 0 ~ 1, using the training data only
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X_train)

# Search a grid of C and gamma values with 5-fold cross-validation
param_grid = {
    "C": np.logspace(-5, 15, num=6, base=2),
    "gamma": np.logspace(-13, 3, num=5, base=2)
}
grid = GridSearchCV(
    estimator=SVC(kernel="rbf", max_iter=10000000),
    param_grid=param_grid, cv=5)
grid.fit(X_scaled, y_train)

Page 3: How to use SVM for data classification

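# Retrain on the whole training set with the best parameters from the grid search
clf = SVC(kernel="rbf", C=grid.best_params_["C"],
          gamma=grid.best_params_["gamma"], max_iter=10000000)
clf.fit(X_scaled, y_train)

# Scale a new sample with the scaler fitted on the training data, then predict
novel_X = np.array([[5.9, 3.2, 3.9, 1.5]])
novel_X_scaled = scaler.transform(novel_X)
print(novel_X_scaled)
print(clf.predict(novel_X_scaled))

# Evaluate on the held-out test set
X_test_scaled = scaler.transform(X_test)
print(clf.predict(X_test_scaled))
print(clf.score(X_test_scaled, y_test))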

Page 4: How to use SVM for data classification

If you can follow the previous two pages, you can skip the rest of this deck

Page 5: How to use SVM for data classification

There are many ways to learn

Page 6: How to use SVM for data classification

The goals of learning differ, too

not sweet

sweet

Page 7: How to use SVM for data classification

Learn Mother Nature's hidden rules from experience

Page 8: How to use SVM for data classification

This deck focuses on

supervised classification

Page 9: How to use SVM for data classification

Mother Nature

sweet, not sweet, not sweet, sweet, ??

Page 10: How to use SVM for data classification

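[Figure: training. The labeled examples (sweet, not sweet, not sweet, sweet) are used to train a model that answers "sweet or not sweet?" for an unlabeled example (??).]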

Page 11: How to use SVM for data classification

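[Figure: prediction. The model trained on the labeled examples (sweet, not sweet, not sweet, sweet) is asked "sweet or not sweet?" about the unlabeled example (??) and predicts "sweet".]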

Page 12: How to use SVM for data classification

Supervised Classification

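● You have training data: some objects/events + their classes

● You train a model; later, when a new object comes in, the model predicts its class

There can be two classes (sweet / not sweet, binary classification) or more (Taiwanese / Japanese / Korean, multi-class classification)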

Page 13: How to use SVM for data classification

Support Vector Machine (SVM)

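● You have training data: vectors + their classes

● You train a model, which is a function; later, when a new vector comes in, the model predicts its class

There can be two classes (sweet / not sweet, binary classification) or more (Taiwanese / Japanese / Korean, multi-class classification)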

Page 14: How to use SVM for data classification

Training data (vector, class):

(1.2, 0, 0, 1, …, 57)  O
(8.7, 1, 0, 0, …, -3)  X
(2.4, 1, 0, 0, …, 22)  O
(0.3, 0, 1, 0, …, 33)  X
⋮

train → model ƒ: vector → class

Page 15: How to use SVM for data classification

Training data (vector, class):

(1.2, 0, 0, 1, …, 57)  O
(8.7, 1, 0, 0, …, -3)  X
(2.4, 1, 0, 0, …, 22)  O
(0.3, 0, 1, 0, …, 33)  X
⋮

model ƒ: vector → class

predict: the new vector (1.2, 0, 1, …, 8) → X or O

Page 16: How to use SVM for data classification

Feature engineering

● Convert every object into a vector in the same way

● Size: 8cm or 80mm? (use one unit throughout)

● red/yellow/green: (1,0,0)/(0,1,0)/(0,0,1)
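A minimal sketch of these two points, assuming some hypothetical raw records of the form (size, unit, color). The records, the `to_vector` helper, and the choice of centimetres as the common unit are illustrative assumptions, not part of the original slides.

```python
import numpy as np

# Hypothetical raw records: (size, unit, color). Not from the slides.
raw_items = [
    (8.0, "cm", "red"),
    (80.0, "mm", "green"),
    (6.5, "cm", "yellow"),
]

COLORS = ["red", "yellow", "green"]  # fixed order, so red -> (1,0,0), etc.
TO_CM = {"cm": 1.0, "mm": 0.1}       # convert every size to centimetres


def to_vector(size, unit, color):
    """Turn one item into [size_in_cm, is_red, is_yellow, is_green]."""
    one_hot = [1.0 if color == c else 0.0 for c in COLORS]
    return [size * TO_CM[unit]] + one_hot


# Every item goes through the same conversion, for training and prediction alike.
X = np.array([to_vector(*item) for item in raw_items])
print(X)
```

With this conversion, an 8 cm item and an 80 mm item end up with the same size value, which is the point of the "8cm or 80mm?" question.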

Page 17: How to use SVM for data classification

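There are many ways to solve supervised classification problems

● SVM
● Decision trees
● Neural networks
● Deep learning
● …

They can solve supervised classification problems

That does not mean supervised classification is all they can solve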

Page 18: How to use SVM for data classification

Agenda

● Supervised classification
● Support Vector Machine
● Software environment
● Use Support Vector Machines

Page 19: How to use SVM for data classification

Training data (vector, class):

(1.2, 0, 0, 1, …, 57)  O
(8.7, 1, 0, 0, …, 22)  X
(2.4, 1, 0, 0, …, -3)  O
(0.3, 0, 1, 0, …, 33)  X
⋮

train → model ƒ: vector → class

predict: the new vector (1.2, 0, 1, …, 8) → X or O

Page 20: How to use SVM for data classification

Support Vector Machine ??

Example: two-dimensional vectors, two classes

[Figure: training points plotted against Feature 1 and Feature 2; train → Model (function)]

Page 21: How to use SVM for data classification

Support Vector Machine ??

Example: two-dimensional vectors, two classes

[Figure: predict. The model assigns a class to each new point marked "?"]

Page 22: How to use SVM for data classification

Maximum Margin

Page 23: How to use SVM for data classification

Characteristics of SVM

● Distance related
● Maximum margin: the wider the separation, the better

Page 24: How to use SVM for data classification

Characteristics of SVM

● Distance related
● Maximum margin: the wider the separation, the better
● Parameterized

○ The boundary may be curved

○ Misclassification is allowed, but it is penalized

Page 25: How to use SVM for data classification

Training with different parameters gives different results ...
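One way to see this effect is to train the same RBF-kernel SVC several times and compare the outcomes. The sketch below assumes the iris data used elsewhere in the deck; the three (C, gamma) pairs are arbitrary illustrative values, not recommendations.

```python
from sklearn import datasets
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

dataset = datasets.load_iris()
X = MinMaxScaler().fit_transform(dataset.data)
y = dataset.target

# Illustrative (C, gamma) pairs, chosen arbitrarily for this sketch.
settings = [(0.01, 0.01), (1.0, 1.0), (1000.0, 100.0)]

results = {}
for C, gamma in settings:
    clf = SVC(kernel="rbf", C=C, gamma=gamma).fit(X, y)
    results[(C, gamma)] = {
        # Accuracy on the data the model was trained on
        "training_accuracy": clf.score(X, y),
        # Total number of support vectors over all classes
        "n_support_vectors": int(clf.n_support_.sum()),
    }

for params, summary in results.items():
    print(params, summary)
```

Training accuracy alone does not say which setting is best; it only shows that the fitted model changes with the parameters.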

Page 26: How to use SVM for data classification

Agenda

● Supervised classification
● Support Vector Machine
● Software environment
● Use Support Vector Machines

Page 27: How to use SVM for data classification

If you use Python

scikit-learn (sklearn): SVM, decision trees, ...
numpy: arrays, ...
scipy: variance, ...
python

Page 28: How to use SVM for data classification

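Anaconda: every wish granted at once

● An open-source scientific platform that runs on Python

○ Linux / OSX / Windows

● Installs everything you can think of

● Fast. No brainpower required.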

● https://www.continuum.io/anaconda-overview

Page 29: How to use SVM for data classification

Agenda

● Supervised classification
● Support Vector Machine
● Software environment
● Use Support Vector Machines

Page 30: How to use SVM for data classification

Training data (vector, class):

(1.2, 0, 0, 1, …, 57)  O
(8.7, 1, 0, 0, …, 22)  X
(2.4, 1, 0, 0, …, -3)  O
(0.3, 0, 1, 0, …, 33)  X
⋮

train → model ƒ: vector → class

predict: the new vector (1.2, 0, 1, …, 8) → X or O

Page 31: How to use SVM for data classification

General workflow

Decide on the evaluation metric + baseline predictor

Train → predict → go live

Page 32: How to use SVM for data classification

Evaluation

● Accuracy
○ Training accuracy
○ Testing accuracy

● precision, recall, Type I / Type II error, AUC, …

Before doing any training, decide how you will evaluate the results!
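As a small worked example of accuracy, precision, and recall, the sketch below counts the four outcome types by hand. The labels are hypothetical (1 = sweet, 0 = not sweet) and do not come from the slides.

```python
# Hypothetical true labels and predictions for eight items: 1 = sweet, 0 = not sweet.
y_true = [1, 0, 0, 1, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 1]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives (Type I errors)
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives (Type II errors)

accuracy = (tp + tn) / len(y_true)  # fraction of all predictions that are right
precision = tp / (tp + fp)          # of the items predicted sweet, how many are sweet
recall = tp / (tp + fn)             # of the sweet items, how many were found

print(tp, tn, fp, fn)               # 3 2 2 1
print(accuracy, precision, recall)  # 0.625 0.6 0.75
```

The three numbers differ on the same predictions, which is why the metric should be fixed before training starts.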

Page 33: How to use SVM for data classification

Baseline predictor

● Simple and easy: guess with your eyes closed

● Used for comparison (would you know if your model did worse than the baseline?)

[Figure: train → ALL]
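One common baseline predictor, assumed here for illustration, always answers with the most frequent class in the training data. The labels below are hypothetical.

```python
from collections import Counter

# Hypothetical training and test labels. Not from the slides.
y_train = ["sweet", "not sweet", "not sweet", "sweet", "not sweet"]
y_test = ["not sweet", "sweet", "not sweet", "not sweet"]

# "Training" the baseline: remember the most frequent training class.
majority_class, _ = Counter(y_train).most_common(1)[0]

# "Predicting": give the same answer for every test item.
baseline_predictions = [majority_class] * len(y_test)
baseline_accuracy = sum(
    p == t for p, t in zip(baseline_predictions, y_test)) / len(y_test)

print(majority_class, baseline_accuracy)  # not sweet 0.75
```

Any trained model should beat this number on the same test data. scikit-learn offers an equivalent in `sklearn.dummy.DummyClassifier(strategy="most_frequent")`.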

Page 34: How to use SVM for data classification

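The SVM workflow

Decide on the evaluation metric + baseline predictor

Training: prepare data → scale features → search for the best parameters → train model

Prediction: prepare data → scale features → predict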

Page 35: How to use SVM for data classification

dataset = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.1, stratify=dataset.target)

scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X_train)

param_grid = {
    "C": np.logspace(-5, 15, num=6, base=2),
    "gamma": np.logspace(-13, 3, num=5, base=2)
}
grid = GridSearchCV(
    estimator=SVC(kernel="rbf", max_iter=10000000),
    param_grid=param_grid, cv=5)
grid.fit(X_scaled, y_train)

clf = SVC(kernel="rbf", C=grid.best_params_["C"],
          gamma=grid.best_params_["gamma"], max_iter=10000000)
clf.fit(X_scaled, y_train)

Page 36: How to use SVM for data classification

novel_X = np.array([[5.9, 3.2, 3.9, 1.5]])
novel_X_scaled = scaler.transform(novel_X)
print(novel_X_scaled)
print(clf.predict(novel_X_scaled))

X_test_scaled = scaler.transform(X_test)
print(clf.predict(X_test_scaled))
print(clf.score(X_test_scaled, y_test))

Page 37: How to use SVM for data classification

1. Data preparation

● Transform object → vector
● Whole training data at once

○ X in numpy.array (2-D) or scipy.sparse.csr_matrix
○ y in numpy.array

(1.2, 0, 57)  O
(8.7, 1, 22)  X
(2.4, 1, -3)  O

X = np.array([[2.4, 1, -3], [8.7, 1, 22], [1.2, 0, 57]])
y = np.array([1, 0, 1])
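A short sketch of the two accepted layouts for X, using the three vectors from this slide. The shape check is an illustrative addition, and encoding O as 1 and X as 0 is an assumption read off the slide's y array.

```python
import numpy as np
from scipy.sparse import csr_matrix

# One row per training item, one column per feature.
X = np.array([[2.4, 1, -3], [8.7, 1, 22], [1.2, 0, 57]])
y = np.array([1, 0, 1])  # assumed encoding: 1 = O, 0 = X

# X must be 2-D, and there must be exactly one label per row.
assert X.ndim == 2 and X.shape[0] == y.shape[0]

# The same data as a sparse matrix; useful when most entries are zero.
X_sparse = csr_matrix(X)
print(X.shape, y.shape, X_sparse.nnz)  # (3, 3) (3,) 8
```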

Page 38: How to use SVM for data classification

2. Feature Scaling

(1.2, 0, 0, …)  O   → scale →   (0.09, 0, 0, …)  O
(8.7, 1, 0, …)  X   → scale →   (0.84, 1, 0, …)  X
(2.4, 1, 0, …)  O   → scale →   (0.21, 1, 0, …)  O
(0.3, 0, 1, …)  X   → scale →   (0, 0, 1, …)  X
⋮

Feature 1: range 0.3 ~ 10.3, scaled by (n−0.3) ×0.1 to 0 ~ 1
Feature 2: range 0 ~ 1, scaled by (n+0) ×1 to 0 ~ 1
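The scaling rule (n − 0.3) × 0.1 is min-max scaling: subtract the feature's minimum and divide by its range. The sketch below reproduces the slide's numbers by hand and checks them against `MinMaxScaler`. The fifth row, with the value 10.3, is an assumed data point added so that the first feature spans the stated range 0.3 ~ 10.3.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# First two features of the slide's vectors, plus one assumed row holding
# the maximum 10.3 so that feature 1 covers the range 0.3 ~ 10.3.
X = np.array([[1.2, 0.0], [8.7, 1.0], [2.4, 1.0], [0.3, 0.0], [10.3, 1.0]])

# Min-max scaling by hand: (n - min) / (max - min), column by column.
col_min = X.min(axis=0)
col_max = X.max(axis=0)
X_manual = (X - col_min) / (col_max - col_min)

# The same computation with scikit-learn.
X_sklearn = MinMaxScaler().fit_transform(X)

print(np.round(X_manual, 2))
print(np.allclose(X_manual, X_sklearn))
```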

Page 39: How to use SVM for data classification

2. Feature Scaling

(1.2, 0, 0, …)  O   → scale →   (0.09, 0, 0, …)  O
(8.7, 1, 0, …)  X   → scale →   (0.84, 1, 0, …)  X
(2.4, 1, 0, …)  O   → scale →   (0.21, 1, 0, …)  O
(0.3, 0, 1, …)  X   → scale →   (0, 0, 1, …)  X
⋮

scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)

Page 40: How to use SVM for data classification

3. Search for the best parameter

param_grid = {
    "C": np.logspace(-5, 15, num=6, base=2),
    "gamma": np.logspace(-13, 3, num=5, base=2)
}

grid = GridSearchCV(
    estimator=SVC(kernel="rbf", max_iter=10000000),
    param_grid=param_grid, cv=5)

grid.fit(X_scaled, y_train)

Page 41: How to use SVM for data classification

3. Search for best (??) C and γ

Page 42: How to use SVM for data classification

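3. What is “best”?

[Figure: a model is trained on the labeled examples (sweet, not sweet, not sweet, sweet); the true label of the new example (??) is something you don't know yet]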

Page 43: How to use SVM for data classification

3. Search for the best - validation

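[Figure: the labeled examples (sweet, not sweet, not sweet, sweet) are split. One part trains the model; the other part is treated as new, unseen data and used to validate it.]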

Page 44: How to use SVM for data classification

3. Search for the best - cross-validation

Cross-validation (CV): each fold validates in turn

train     train     validate
train     validate  train
validate  train     train

Given C=12, γ=34, the validation accuracy=0.56
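The fold rotation can be written out as an explicit loop. This is a sketch on the iris data with one fixed parameter pair; C=12 and gamma=34 echo the slide's example values, and the accuracy it prints will not match the slide's illustrative 0.56. The shuffling and `random_state` are assumptions made for reproducibility. For brevity the features are scaled once up front, as in the deck's code.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

dataset = datasets.load_iris()
X = MinMaxScaler().fit_transform(dataset.data)
y = dataset.target

C, gamma = 12, 34  # one fixed parameter pair to evaluate

fold_accuracies = []
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, validate_idx in folds.split(X, y):
    # Train on four folds ...
    clf = SVC(kernel="rbf", C=C, gamma=gamma)
    clf.fit(X[train_idx], y[train_idx])
    # ... and validate on the one fold the model has not seen.
    fold_accuracies.append(clf.score(X[validate_idx], y[validate_idx]))

# The validation accuracy for this (C, gamma) is the average over the folds.
cv_accuracy = float(np.mean(fold_accuracies))
print(fold_accuracies, cv_accuracy)
```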

Page 45: How to use SVM for data classification

3. Search for the best parameter - Grid

[Figure: grid of candidate parameter values, with C on one axis and γ on the other]
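A grid search amounts to repeating cross-validation for every (C, gamma) combination and keeping the pair with the highest validation accuracy. The loop below is a hand-written sketch of that idea on the iris data, not the internals of `GridSearchCV`; the `cv_scores` dictionary is an illustrative name.

```python
import itertools

import numpy as np
from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

dataset = datasets.load_iris()
X = MinMaxScaler().fit_transform(dataset.data)
y = dataset.target

# The same grid as in the deck's code: powers of two.
C_values = np.logspace(-5, 15, num=6, base=2)      # 2^-5, 2^-1, ..., 2^15
gamma_values = np.logspace(-13, 3, num=5, base=2)  # 2^-13, 2^-9, ..., 2^3

# Run 5-fold cross-validation once per grid point.
cv_scores = {}
for C, gamma in itertools.product(C_values, gamma_values):
    clf = SVC(kernel="rbf", C=C, gamma=gamma, max_iter=10000000)
    cv_scores[(C, gamma)] = cross_val_score(clf, X, y, cv=5).mean()

# The "best" parameters are those with the highest mean validation accuracy.
best_C, best_gamma = max(cv_scores, key=cv_scores.get)
print(best_C, best_gamma, cv_scores[(best_C, best_gamma)])
```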

Page 46: How to use SVM for data classification

3. Search for the best parameter

param_grid = {
    "C": np.logspace(-5, 15, num=6, base=2),
    "gamma": np.logspace(-13, 3, num=5, base=2)
}

grid = GridSearchCV(
    estimator=SVC(kernel="rbf", max_iter=10000000),
    param_grid=param_grid, cv=5)

grid.fit(X_scaled, y_train)

Page 47: How to use SVM for data classification

4. Train Model

Use the best parameters found by CV to train

clf = SVC(kernel="rbf", C=grid.best_params_["C"],
          gamma=grid.best_params_["gamma"], max_iter=10000000)
clf.fit(X_scaled, y_train)
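As a side note, `GridSearchCV` refits a model with the best parameters on the full training data by default (`refit=True`) and exposes it as `best_estimator_`. The sketch below, which uses the whole iris data set for brevity, compares that model with the explicitly retrained one; the `same_predictions` check is an illustrative addition.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

dataset = datasets.load_iris()
X_scaled = MinMaxScaler().fit_transform(dataset.data)
y = dataset.target

param_grid = {
    "C": np.logspace(-5, 15, num=6, base=2),
    "gamma": np.logspace(-13, 3, num=5, base=2),
}
grid = GridSearchCV(
    estimator=SVC(kernel="rbf", max_iter=10000000),
    param_grid=param_grid, cv=5)
grid.fit(X_scaled, y)  # refit=True by default

# Explicit retraining with the best parameters, as on the slide.
clf = SVC(kernel="rbf", C=grid.best_params_["C"],
          gamma=grid.best_params_["gamma"], max_iter=10000000)
clf.fit(X_scaled, y)

# Both models were trained on the same data with the same parameters.
same_predictions = np.array_equal(
    clf.predict(X_scaled), grid.best_estimator_.predict(X_scaled))
print(grid.best_params_, same_predictions)
```

Retraining explicitly, as the slide does, makes the step visible; using `grid.best_estimator_` saves one fit.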

Page 48: How to use SVM for data classification

Predict novel data

● Scaling
● Predict

novel_X = np.array([[5.9, 3.2, 3.9, 1.5]])
novel_X_scaled = scaler.transform(novel_X)

print(clf.predict(novel_X_scaled))

Page 49: How to use SVM for data classification

Scale Training Data

(1.2, 0, 0, …)  O   → scale →   (0.09, 0, 0, …)  O
(8.7, 1, 0, …)  X   → scale →   (0.84, 1, 0, …)  X
(2.4, 1, 0, …)  O   → scale →   (0.21, 1, 0, …)  O
(0.3, 0, 1, …)  X   → scale →   (0, 0, 1, …)  X
⋮

Feature 1: range 0.3 ~ 10.3, scaled by (n−0.3) ×0.1 to 0 ~ 1
Feature 2: range 0 ~ 1, scaled by (n+0) ×1 to 0 ~ 1

Page 50: How to use SVM for data classification

Scale Testing Data

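(2.3, 0, 0, …)  O    → scale →   (0.20, 0, 0, …)  O
(-0.7, 1, 1, …)  X   → scale →   (-0.1, 1, 1, …)  X
(1.3, 1, 1, …)  O    → scale →   (0.10, 1, 1, …)  O
(100, 0, 0, …)  X    → scale →   (9.97, 0, 0, …)  X
⋮

Feature 1: scaled by the same (n−0.3) ×0.1 as the training data
Feature 2: scaled by the same (n+0) ×1 as the training data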
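The key point is that testing data reuse the scaling learned from the training data, so scaled values may fall outside 0 ~ 1. The sketch below checks the slide's first-feature numbers with `MinMaxScaler`. The training column includes an assumed value of 10.3 so that its range is 0.3 ~ 10.3, as stated for the training data.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# First feature only. The 10.3 is an assumed training value that gives the
# training range 0.3 ~ 10.3 quoted on the training-data slide.
train_feature = np.array([[1.2], [8.7], [2.4], [0.3], [10.3]])
test_feature = np.array([[2.3], [-0.7], [1.3], [100.0]])

scaler = MinMaxScaler()
scaler.fit(train_feature)  # learns min=0.3 and max=10.3 from the training data

# transform() applies (n - 0.3) * 0.1; it never re-fits on the test data.
test_scaled = scaler.transform(test_feature)
print(test_scaled.ravel())  # approximately [0.2, -0.1, 0.1, 9.97]
```

Calling `fit_transform` on the testing data instead would squeeze 100 down to 1 and make the training and testing features incomparable.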

Page 51: How to use SVM for data classification

dataset = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.1, stratify=dataset.target)

scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X_train)

param_grid = {
    "C": np.logspace(-5, 15, num=6, base=2),
    "gamma": np.logspace(-13, 3, num=5, base=2)
}
grid = GridSearchCV(
    estimator=SVC(kernel="rbf", max_iter=10000000),
    param_grid=param_grid, cv=5)
grid.fit(X_scaled, y_train)

clf = SVC(kernel="rbf", C=grid.best_params_["C"],
          gamma=grid.best_params_["gamma"], max_iter=10000000)
clf.fit(X_scaled, y_train)

Page 52: How to use SVM for data classification

novel_X = np.array([[5.9, 3.2, 3.9, 1.5]])
novel_X_scaled = scaler.transform(novel_X)
print(novel_X_scaled)
print(clf.predict(novel_X_scaled))

X_test_scaled = scaler.transform(X_test)
print(clf.predict(X_test_scaled))
print(clf.score(X_test_scaled, y_test))

Page 53: How to use SVM for data classification

Agenda

● Supervised classification
● Support Vector Machine
● Software environment
● Use Support Vector Machines

Takeaway…

Page 54: How to use SVM for data classification

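[Figure: training. The labeled examples (sweet, not sweet, not sweet, sweet) are used to train a model that answers "sweet or not sweet?" for an unlabeled example (??).]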

Page 55: How to use SVM for data classification

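[Figure: prediction. The model trained on the labeled examples (sweet, not sweet, not sweet, sweet) is asked "sweet or not sweet?" about the unlabeled example (??) and predicts "sweet".]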

Page 56: How to use SVM for data classification

The SVM workflow

Evaluation criteria + Baseline predictor

Training: prepare data → scale features → search best param: CV on grid → train model

Prediction: prepare data → scale features → predict

Page 57: How to use SVM for data classification

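Once you know how to use the microwave properly...

● Data collection (gathering ingredients)
● Model evaluation monitoring (are the customers satisfied?)
● Feature engineering (preparing ingredients)
● Model update from novel data (keeping up with the times)
● Training / prediction in large scale (ingredients in bulk)
● A robust pipeline that integrates these altogether (opening a restaurant)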
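On a much smaller scale than a production system, scikit-learn's `Pipeline` can at least bundle the scaling and SVM steps into one object. This is a sketch of one possible arrangement, not something shown in the deck; the step names `"scale"` and `"svm"` and the `random_state` are arbitrary choices.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

dataset = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.1,
    stratify=dataset.target, random_state=0)

# Scaling and the SVM travel together, so new data is always scaled
# with the ranges learned during training.
pipeline = Pipeline([
    ("scale", MinMaxScaler()),
    ("svm", SVC(kernel="rbf", max_iter=10000000)),
])

# Parameters of a pipeline step are addressed as "<step name>__<parameter>".
param_grid = {
    "svm__C": np.logspace(-5, 15, num=6, base=2),
    "svm__gamma": np.logspace(-13, 3, num=5, base=2),
}
grid = GridSearchCV(pipeline, param_grid=param_grid, cv=5)
grid.fit(X_train, y_train)

# Unscaled inputs go straight in; the pipeline scales them internally.
novel_X = np.array([[5.9, 3.2, 3.9, 1.5]])
novel_prediction = grid.predict(novel_X)
test_accuracy = grid.score(X_test, y_test)
print(novel_prediction, test_accuracy)
```

A side effect of this arrangement is that the scaler is re-fitted inside each cross-validation fold, so the validation folds do not influence the scaling ranges.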

Page 58: How to use SVM for data classification

Happy Training!

Page 59: How to use SVM for data classification

More materials

Page 60: How to use SVM for data classification

“Support” Vectors?

Page 61: How to use SVM for data classification

Maximum Margin

Page 62: How to use SVM for data classification

Why scaling?
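One likely answer, given that SVM is distance related: a feature with a large numeric range dominates the distance between vectors unless the features are scaled. The sketch below illustrates this with three hypothetical items described as [size in mm, is_red]; the items and the assumed size range of 50 ~ 100 mm are not from the slides.

```python
import numpy as np

# Hypothetical items: [size in mm, is_red].
a = np.array([80.0, 1.0])
b = np.array([82.0, 0.0])  # almost the same size as a, different color
c = np.array([60.0, 1.0])  # much smaller than a, same color


def distance(u, v):
    """Euclidean distance between two vectors."""
    return float(np.linalg.norm(u - v))


def scale(v, size_min=50.0, size_max=100.0):
    """Min-max scale the size feature, assuming sizes range over 50 ~ 100 mm."""
    return np.array([(v[0] - size_min) / (size_max - size_min), v[1]])


# Unscaled: size differences swamp the color feature.
raw_ab = distance(a, b)  # about 2.24
raw_ac = distance(a, c)  # 20.0

# Scaled: both features now contribute on a comparable 0 ~ 1 scale.
scaled_ab = distance(scale(a), scale(b))  # about 1.0
scaled_ac = distance(scale(a), scale(c))  # about 0.4

print(raw_ab, raw_ac, scaled_ab, scaled_ac)
```

Before scaling, a is closest to b; after scaling, a is closest to c. Which item counts as "near" depends on the scaling, and so does the boundary an SVM learns.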