Poisoning Attacks against Support Vector Machines (ICML reading group 2012)

Page 1: PoisoningAttackSVM (ICMLreading2012)

Poisoning Attacks against Support Vector Machines

ICML Reading Group, 2012/07/28
Hidekazu Oiwa (@kisa12012), oiwa (at) r.dl.itc.u-tokyo.ac.jp

Page 3: PoisoningAttackSVM (ICMLreading2012)

Outline

• Overview of the work

• What are poisoning attacks?

• Problem setting

• Proposed algorithm

• Poisoning Attacks against SVMs

• Extension to kernel SVMs

• Experiments

• Experiments on artificial data

• Experiments on handwritten digit recognition


Page 4: PoisoningAttackSVM (ICMLreading2012)

Research overview


Page 5: PoisoningAttackSVM (ICMLreading2012)

Background

• (Large-scale) machine learning is booming
• Malicious behavior: the actions of malicious agents
• They act so as to confuse classifiers and anomaly detectors
• e.g., spam filtering, malware analysis

• Goal: algorithms that are robust to malicious behavior
• To get there…
• we first need to analyze the nature of malicious behavior


Page 6: PoisoningAttackSVM (ICMLreading2012)

Taxonomy of malicious behavior [Barreno+ ML10]

• Causative Attack

• Directly manipulates or rewrites the training data held by the designer

• Exploratory Attack

• Directly manipulates or rewrites the classifier held by the designer

• Algorithms that cope with these attacks have already been proposed

• Poisoning Attack

• Injects new malicious data into the designer's training data

• A more realistic attack method than the others,

• since it does not require direct access to the designer's database

• The only prior work is on anomaly detection [Kloft+ AISTATS10]+


Page 7: PoisoningAttackSVM (ICMLreading2012)

Poisoning Attack

SVM

Training data



Page 9: PoisoningAttackSVM (ICMLreading2012)

Poisoning Attack

SVM

Training data

Performance degradation

Page 10: PoisoningAttackSVM (ICMLreading2012)

Problem setting

• Designer: holds a training set D_tr = {x_i, y_i}_{i=1}^n and a validation set D_val = {x_k, y_k}_{k=1}^m
• The SVM is trained on the training set, malicious data included
• Attacker: injects a malicious data point (x_c, y_c)
• y_c is fixed in advance
• The attacker generates the x_c that pushes the classification performance on the validation set down the most
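A minimal sketch of the attacker's objective under this setup, assuming scikit-learn: train an SVM on D_tr with the candidate point (x_c, y_c) injected, then measure the hinge loss on D_val. The function name `validation_hinge_loss` is ours, not the paper's.

```python
import numpy as np
from sklearn.svm import SVC

def validation_hinge_loss(X_tr, y_tr, x_c, y_c, X_val, y_val, C=1.0):
    """Attacker's objective: hinge loss on D_val of an SVM trained
    on D_tr with the malicious point (x_c, y_c) injected."""
    X_poisoned = np.vstack([X_tr, x_c])
    y_poisoned = np.append(y_tr, y_c)
    clf = SVC(kernel="linear", C=C).fit(X_poisoned, y_poisoned)
    f_val = clf.decision_function(X_val)                # f_{x_c}(x_k)
    return np.maximum(0.0, 1.0 - y_val * f_val).sum()   # sum_k [1 - y_k f(x_k)]_+
```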


Page 11: PoisoningAttackSVM (ICMLreading2012)

Overview of this work

• Analysis of poisoning attacks against SVMs
• A proposed algorithm for crafting malicious data
• Incremental SVM
• Kernel extension
• Experiments on artificial data
• Experiments on handwritten digit data


Page 12: PoisoningAttackSVM (ICMLreading2012)

Proposed algorithm


Page 13: PoisoningAttackSVM (ICMLreading2012)

Optimization problem

• Maximize the loss on the validation set
• f_{x_c}(·): the SVM trained with the malicious data included
• A non-convex optimization problem
• Solution: gradient ascent
• Alternate between updating the SVM and updating the malicious point
• Converges to a local optimum if the step size is set appropriately

$$\max_{x_c} \; L(x_c) = \sum_{k}\big[1 - y_k f_{x_c}(x_k)\big]_+ = \sum_{k}\big(-g_k(x_c)\big)_+$$

$$x_c' = x_c + t \cdot u, \qquad u \propto \nabla L(x_c)$$
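A minimal sketch of one ascent step, using a finite-difference estimate of ∇L(x_c) as a stand-in for the analytic gradient the paper derives later (Eq. (10)); `loss` is any callable implementing L(x_c), e.g. the `validation_hinge_loss` sketch above with the data sets bound in.

```python
import numpy as np

def ascent_step(loss, x_c, t=0.1, eps=1e-3):
    """One update x_c' = x_c + t * u, with u a unit vector aligned with
    a finite-difference estimate of the gradient of L at x_c."""
    grad = np.zeros_like(x_c)
    for l in range(x_c.size):
        e = np.zeros_like(x_c)
        e[l] = eps
        grad[l] = (loss(x_c + e) - loss(x_c - e)) / (2 * eps)
    u = grad / (np.linalg.norm(grad) + 1e-12)   # unit ascent direction
    return x_c + t * u
```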


Page 14: PoisoningAttackSVM (ICMLreading2012)

Algorithm overview

Poisoning Attacks against SVMs

Algorithm 1 Poisoning attack against SVM

Input: D_tr, the training data; D_val, the validation data; y_c, the class label of the attack point; x_c^(0), the initial attack point; t, the step size.
Output: x_c, the final attack point.

1: {α_i, b} ← learn an SVM on D_tr.
2: p ← 0.
3: repeat
4:   Re-compute the SVM solution on D_tr ∪ {(x_c^(p), y_c)} using incremental SVM (e.g., Cauwenberghs & Poggio, 2001). This step requires {α_i, b}.
5:   Compute ∂L/∂u on D_val according to Eq. (10).
6:   Set u to a unit vector aligned with ∂L/∂u.
7:   p ← p + 1 and x_c^(p) ← x_c^(p−1) + t·u
8: until L(x_c^(p)) − L(x_c^(p−1)) < ε
9: return x_c = x_c^(p)

be used as a starting point. However, if this point is too close to the boundary of the attacking class, the iteratively adjusted attack point may become a reserve point, which halts further progress.

The computation of the gradient of the validation error crucially depends on the assumption that the structure of the sets S, E and R does not change during the update. In general, it is difficult to determine the largest step t along an arbitrary direction u which preserves this structure. The classical line search strategy used in gradient ascent methods is not suitable for our case, since the update to the optimal solution for large steps may be prohibitively expensive. Hence, the step t is fixed to a small constant value in our algorithm. After each update of the attack point x_c^(p), the optimal solution is efficiently recomputed from the solution on D_tr, using the incremental SVM machinery (e.g., Cauwenberghs & Poggio, 2001).

The algorithm terminates when the change in the validation error is smaller than a predefined threshold. For kernels including the linear kernel, the surface of the validation error is unbounded, hence the algorithm is halted when the attack vector deviates too much from the training data; i.e., we bound the size of our attack points.

3. Experiments

The experimental evaluation presented in the following sections demonstrates the behavior of our proposed method on an artificial two-dimensional dataset and evaluates its effectiveness on the classical MNIST handwritten digit recognition dataset.

3.1. Artificial data

We first consider a two-dimensional data generation model in which each class follows a Gaussian distribution with mean and covariance matrices given by μ− = [−1.5, 0], μ+ = [1.5, 0], Σ− = Σ+ = 0.6I. The points from the negative distribution are assigned the label −1 (shown as red in the subsequent figures) and otherwise +1 (shown as blue). The training and the validation sets, D_tr and D_val (consisting of 25 and 500 points per class, respectively), are randomly drawn from this distribution.

In the experiment presented below, the red class is the attacking class. To this end, a random point of the blue class is selected and its label is flipped to serve as the starting point for our method. Our gradient ascent method is then used to refine this attack until its termination condition is satisfied. The attack's trajectory is traced as the black line in Fig. 1 for both the linear kernel (upper two plots) and the RBF kernel (lower two plots). The background in each plot represents the error surface explicitly computed for all points within the box x ∈ [−5, 5]². The leftmost plots in each pair show the hinge loss computed on a validation set while the rightmost plots in each pair show the classification error for the area of interest. For the linear kernel, the range of attack points is limited to the box x ∈ [−4, 4]² shown as a dashed line.

For both kernels, these plots show that our gradient ascent algorithm finds a reasonably good local maximum of the non-convex error surface. For the linear kernel, it terminates at the corner of the bounded region, since the error surface is unbounded. For the RBF kernel, it also finds a good local maximum of the hinge loss which, incidentally, is the maximum classification error within this area of interest.

3.2. Real data

We now quantitatively validate the effectiveness of the proposed attack strategy on a well-known MNIST handwritten digit classification task (LeCun et al., 1995). Similarly to Globerson & Roweis (2006), we focus on two-class sub-problems of discriminating between two distinct digits.¹ In particular, we consider the following two-class problems: 7 vs. 1; 9 vs. 8; 4 vs. 0. The visual nature of the handwritten digit data provides us with a semantic meaning for an attack.

Each digit in the MNIST data set is properly normalized and represented as a grayscale image of 28 × 28 pixels. In particular, each pixel is ordered in a raster-

¹The data set is also publicly available in Matlab format at http://cs.nyu.edu/~roweis/data.html.

SVM update

Compute the gradient

Update the malicious point

The initial point is created by flipping the label of an existing data point

from [Biggio+ 12]
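Putting the steps together, here is a skeleton of Algorithm 1 under two simplifying assumptions: the SVM is retrained from scratch inside `loss` rather than updated incrementally, and the gradient is estimated numerically instead of via Eq. (10). The bounding box mirrors the halting rule for unbounded kernels; all names and defaults are ours.

```python
import numpy as np

def poisoning_attack(loss, x0, t=0.1, eps_stop=1e-4, max_iter=500, box=4.0):
    """Gradient-ascent poisoning: repeat small steps on the validation
    loss until the improvement drops below eps_stop, keeping the attack
    point inside [-box, box]^d (cf. the bounded region for linear kernels)."""
    x_c = x0.copy()
    prev = loss(x_c)
    for _ in range(max_iter):
        grad = np.zeros_like(x_c)
        for l in range(x_c.size):            # finite-difference gradient
            e = np.zeros_like(x_c)
            e[l] = 1e-3
            grad[l] = (loss(x_c + e) - loss(x_c - e)) / 2e-3
        u = grad / (np.linalg.norm(grad) + 1e-12)
        x_c = np.clip(x_c + t * u, -box, box)
        cur = loss(x_c)
        if cur - prev < eps_stop:            # termination criterion
            break
        prev = cur
    return x_c
```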


Page 15: PoisoningAttackSVM (ICMLreading2012)

SVM update

• Incremental SVM [Cauwenberghs+ NIPS00]

• Learns the SVM while adding data points one at a time
• Iteratively solves the optimization problem within the range where every data point's role stays unchanged
• Reserve point / support vector / error vector

[Figure 1: soft-margin classification SVM training. The coefficients are obtained by minimizing a convex quadratic objective function under constraints, with Lagrange multipliers α_i (and offset b) and a symmetric positive definite kernel matrix. The first-order (Kuhn-Tucker) conditions partition the training data into three categories: margin support vectors strictly on the margin (0 < α_i < C, g_i = 0), error support vectors exceeding the margin (α_i = C, g_i < 0, not necessarily misclassified), and the remaining set of (ignored) vectors within the margin (α_i = 0, g_i > 0). In each adiabatic incremental step, the margin-vector coefficients change value to keep all elements in equilibrium, i.e., keep their Kuhn-Tucker conditions satisfied, while the coefficient of a "candidate" vector outside the set is incremented from zero; the resulting system has a symmetric but not positive-definite Jacobian.]

from [Cauwenberghs+ NIPS00]

• If convergence is impossible without breaking the conditions, a data point's role is changed

• In each optimization step, only the parameters of the support vectors are updated

• If, every time a data point is added, we optimize until all parameters converge, this converges to the optimal SVM solution
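As a minimal illustration of the bookkeeping, this sketch partitions training points into the three roles by their dual coefficients; the thresholding scheme and names are our assumptions.

```python
import numpy as np

def partition_points(alpha, C, tol=1e-8):
    """Split training points into the three incremental-SVM roles:
    S: margin support vectors (0 < alpha_i < C, g_i = 0)
    E: error vectors          (alpha_i = C,     g_i < 0)
    R: reserve points         (alpha_i = 0,     g_i > 0)"""
    S = np.flatnonzero((alpha > tol) & (alpha < C - tol))
    E = np.flatnonzero(alpha >= C - tol)
    R = np.flatnonzero(alpha <= tol)
    return S, E, R
```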


Page 16: PoisoningAttackSVM (ICMLreading2012)

Computing the gradient of the optimization problem

• Uses the idea behind incremental SVM
• Assume that the role of each data point does not change during the update
• Then only the support vectors need to be considered
• The update formula depends on the kernel function
• An exact computation would require deriving the step size that does not break the conditions
• This work instead updates with a constant step size, cutting that computation short

Poisoning Attacks against SVMs

Eq. (2) with respect to u using the product rule:

$$\frac{\partial g_k}{\partial u} = Q_{ks}\frac{\partial \alpha}{\partial u} + \frac{\partial Q_{kc}}{\partial u}\alpha_c + y_k\frac{\partial b}{\partial u}, \quad (3)$$

where

$$\frac{\partial \alpha}{\partial u} = \begin{bmatrix} \frac{\partial\alpha_1}{\partial u_1} & \cdots & \frac{\partial\alpha_1}{\partial u_d} \\ \vdots & \ddots & \vdots \\ \frac{\partial\alpha_s}{\partial u_1} & \cdots & \frac{\partial\alpha_s}{\partial u_d} \end{bmatrix}, \;\text{ simil. } \frac{\partial Q_{kc}}{\partial u}, \frac{\partial b}{\partial u}.$$

The expressions for the gradient can be further refined using the fact that the step taken in direction u should maintain the optimal SVM solution. This can be expressed as an adiabatic update condition using the technique introduced in (Cauwenberghs & Poggio, 2001). Observe that for the i-th point in the training set, the KKT conditions for the optimal solution of the SVM training problem can be expressed as:

$$g_i = \sum_{j\in D_{tr}} Q_{ij}\alpha_j + y_i b - 1 \;\begin{cases} > 0; & i \in R \\ = 0; & i \in S \\ < 0; & i \in E \end{cases} \quad (4)$$

$$h = \sum_{j\in D_{tr}} y_j \alpha_j = 0. \quad (5)$$

The equality in conditions (4) and (5) implies that an infinitesimal change in the attack point x_c causes a smooth change in the optimal solution of the SVM, under the restriction that the composition of the sets S, E and R remains intact. This equilibrium allows us to predict the response of the SVM solution to the variation of x_c, as shown below.

By differentiation of the x_c-dependent terms in Eqs. (4)–(5) with respect to each component u_l (1 ≤ l ≤ d), we obtain, for any i ∈ S,

$$\frac{\partial g}{\partial u_l} = Q_{ss}\frac{\partial \alpha}{\partial u_l} + \frac{\partial Q_{sc}}{\partial u_l}\alpha_c + y_s\frac{\partial b}{\partial u_l} = 0$$

$$\frac{\partial h}{\partial u_l} = y_s^\top \frac{\partial \alpha}{\partial u_l} = 0, \quad (6)$$

which can be rewritten as

$$\begin{bmatrix} \frac{\partial b}{\partial u_l} \\[2pt] \frac{\partial \alpha}{\partial u_l} \end{bmatrix} = -\begin{bmatrix} 0 & y_s^\top \\ y_s & Q_{ss} \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ \frac{\partial Q_{sc}}{\partial u_l} \end{bmatrix} \alpha_c. \quad (7)$$

The first matrix can be inverted using the Sherman-Morrison-Woodbury formula (Lutkepohl, 1996):

$$\begin{bmatrix} 0 & y_s^\top \\ y_s & Q_{ss} \end{bmatrix}^{-1} = \frac{1}{\zeta}\begin{bmatrix} -1 & \upsilon^\top \\ \upsilon & \zeta Q_{ss}^{-1} - \upsilon\upsilon^\top \end{bmatrix} \quad (8)$$

where υ = Q_ss^{−1} y_s and ζ = y_s^⊤ Q_ss^{−1} y_s. Substituting (8) into (7) and observing that all components of the inverted matrix are independent of x_c, we obtain:

$$\frac{\partial \alpha}{\partial u} = -\frac{\alpha_c}{\zeta}\left(\zeta Q_{ss}^{-1} - \upsilon\upsilon^\top\right)\cdot\frac{\partial Q_{sc}}{\partial u}$$

$$\frac{\partial b}{\partial u} = -\frac{\alpha_c}{\zeta}\,\upsilon^\top\cdot\frac{\partial Q_{sc}}{\partial u}. \quad (9)$$

Substituting (9) into (3) and further into (1), we obtain the desired gradient used for optimizing our attack:

$$\frac{\partial L}{\partial u} = \sum_{k=1}^m \left\{ M_k \frac{\partial Q_{sc}}{\partial u} + \frac{\partial Q_{kc}}{\partial u} \right\} \alpha_c, \quad (10)$$

where

$$M_k = -\frac{1}{\zeta}\left(Q_{ks}\left(\zeta Q_{ss}^{-1} - \upsilon\upsilon^\top\right) + y_k \upsilon^\top\right).$$
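In matrix form, Eqs. (8)-(10) reduce to a few dense linear-algebra operations. The sketch below assembles ∂L/∂u that way under our own naming and shape assumptions: s margin support vectors, m validation points, d input dimensions; it is a reading of the formulas, not the authors' code.

```python
import numpy as np

def attack_gradient(Q_ss, Q_ks, y_s, y_k, dQ_sc_du, dQ_kc_du, alpha_c):
    """Assemble dL/du from Eq. (10). Q_ss (s,s) and Q_ks (m,s) are kernel
    blocks over margin support vectors s and validation points k;
    dQ_sc_du (s,d) and dQ_kc_du (m,d) are kernel gradients w.r.t. the
    attack point; alpha_c is the attack point's dual coefficient."""
    Q_ss_inv = np.linalg.inv(Q_ss)
    ups = Q_ss_inv @ y_s                    # upsilon = Q_ss^{-1} y_s
    zeta = float(y_s @ Q_ss_inv @ y_s)      # zeta = y_s^T Q_ss^{-1} y_s
    # M_k = -(1/zeta) (Q_ks (zeta Q_ss^{-1} - ups ups^T) + y_k ups^T)
    M = -(Q_ks @ (zeta * Q_ss_inv - np.outer(ups, ups))
          + np.outer(y_k, ups)) / zeta
    # dL/du = sum_k { M_k dQ_sc/du + dQ_kc/du } alpha_c
    return ((M @ dQ_sc_du).sum(axis=0) + dQ_kc_du.sum(axis=0)) * alpha_c
```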

2.2. Kernelization

From Eq. (10), we see that the gradient of the objective function at iteration p may depend on the attack point x_c^(p) = x_c^(p−1) + tu only through the gradients of the matrix Q. In particular, this depends on the chosen kernel. We report below the expressions of these gradients for three common kernels.

• Linear kernel:

$$\frac{\partial K_{ic}}{\partial u} = \frac{\partial (x_i \cdot x_c^{(p)})}{\partial u} = t\,x_i$$

• Polynomial kernel:

$$\frac{\partial K_{ic}}{\partial u} = \frac{\partial (x_i \cdot x_c^{(p)} + R)^d}{\partial u} = d\,(x_i \cdot x_c^{(p)} + R)^{d-1}\,t\,x_i$$

• RBF kernel:

$$\frac{\partial K_{ic}}{\partial u} = \frac{\partial\, e^{-\frac{\gamma}{2}\|x_i - x_c\|^2}}{\partial u} = K(x_i, x_c^{(p)})\,\gamma\,t\,(x_i - x_c^{(p)})$$

The dependence on x_c^(p) (and, thus, on u) in the gradients of non-linear kernels can be avoided by substituting x_c^(p) with x_c^(p−1), provided that t is sufficiently small. This approximation enables a straightforward extension of our method to arbitrary kernels.

2.3. Poisoning Attack Algorithm

The algorithmic details of the method described in Section 2.1 are presented in Algorithm 1.

In this algorithm, the attack vector x_c^(0) is initialized by cloning an arbitrary point from the attacked class and flipping its label. In principle, any point sufficiently deep within the attacking class's margin can

from [Biggio+ 12]
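The per-kernel gradients above translate directly into code. A sketch, assuming x_c is the current attack point (shape (d,)), X_i a matrix with one relevant point per row, and t the step size; parameter defaults are ours.

```python
import numpy as np

def grad_K_linear(X_i, x_c, t):
    """Linear kernel x_i . x_c: gradient is t * x_i."""
    return t * X_i

def grad_K_poly(X_i, x_c, t, R=1.0, d=2):
    """Polynomial kernel (x_i . x_c + R)^d: d (x_i . x_c + R)^{d-1} t x_i."""
    return d * ((X_i @ x_c + R) ** (d - 1))[:, None] * t * X_i

def grad_K_rbf(X_i, x_c, t, gamma=0.5):
    """RBF kernel exp(-gamma/2 ||x_i - x_c||^2):
    K(x_i, x_c) * gamma * t * (x_i - x_c)."""
    K = np.exp(-0.5 * gamma * ((X_i - x_c) ** 2).sum(axis=1))
    return K[:, None] * gamma * t * (X_i - x_c)
```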


Page 17: PoisoningAttackSVM (ICMLreading2012)

Experiments


Page 18: PoisoningAttackSVM (ICMLreading2012)

Artificial data experiment

Poisoning Attacks against SVMs

[Figure 1 panels: "mean Σ_i ξ_i (hinge loss)" and "classification error", shown for the linear kernel (top row) and the RBF kernel (bottom row) over the box x ∈ [−5, 5]², with color scales for the loss and error values.]

Figure 1. Behavior of the gradient-based attack strategy on the Gaussian data sets, for the linear (top row) and the RBF kernel (bottom row) with γ = 0.5. The regularization parameter C was set to 1 in both cases. The solid black line represents the gradual shift of the attack point x_c^(p) toward a local maximum. The hinge loss and the classification error are shown in colors, to appreciate that the hinge loss provides a good approximation of the classification error. The value of such functions for each point x ∈ [−5, 5]² is computed by learning an SVM on D_tr ∪ {x, y = −1} and evaluating its performance on D_val. The SVM solution on the clean data D_tr, and the training data itself, are reported for completeness, highlighting the support vectors (with black circles), the decision hyperplane and the margin bounds (with black lines).

scan and its value is directly considered as a feature. The overall number of features is d = 28 × 28 = 784. We normalized each feature (pixel value) x ∈ [0, 1]^d by dividing its value by 255.

In this experiment only the linear kernel is considered, and the regularization parameter of the SVM is fixed to C = 1. We randomly sample a training and a validation data of 100 and 500 samples, respectively, and retain the complete testing data given by MNIST for D_ts. Although it varies for each digit, the size of the testing data is about 2000 samples per class (digit).

The results of the experiment are presented in Fig. 2. The leftmost plots of each row show the example of the attacked class taken as starting points in our algorithm. The middle plots show the final attack point. The rightmost plots display the increase in the validation and testing errors as the attack progresses.

The visual appearance of the attack point reveals that the attack blurs the initial prototype toward the appearance of examples of the attacking class. Comparing the initial and final attack points, we see this effect: the bottom segment of the 7 straightens to resemble a 1, the lower segment of the 9 becomes more round thus mimicking an 8, and round noise is added to the outer boundary of the 4 to make it similar to a 0.

The increase in error over the course of attack is especially striking, as shown in the rightmost plots. In general, the validation error overestimates the classification error due to a smaller sample size. Nonetheless, in the exemplary runs reported in this experiment, a single attack data point caused the classification error to rise from the initial error rates of 2–5% to 15–20%. Since our initial attack point is obtained by flipping the label of a point in the attacked class, the errors in the first iteration of the rightmost plots of Fig. 2 are caused by single random label flips. This confirms that our attack can achieve significantly higher error rates than random label flips, and underscores the vulnerability of the SVM to poisoning attacks.

The latter point is further illustrated in a multiple point, multiple run experiment presented in Fig. 3. For this experiment, the attack was extended by in-

Linear kernel (top row)

RBF kernel (bottom row)

from [Biggio+ 12]
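The artificial-data setup described above is easy to reproduce; a sketch with NumPy, following the stated means, covariance, and sample sizes:

```python
import numpy as np

def make_gaussian_data(n_per_class, rng):
    """Two-class 2D Gaussians as in the artificial-data experiment:
    mu- = [-1.5, 0], mu+ = [1.5, 0], Sigma = 0.6 I; negative class labeled -1."""
    cov = 0.6 * np.eye(2)
    X_neg = rng.multivariate_normal([-1.5, 0.0], cov, n_per_class)
    X_pos = rng.multivariate_normal([1.5, 0.0], cov, n_per_class)
    X = np.vstack([X_neg, X_pos])
    y = np.r_[-np.ones(n_per_class), np.ones(n_per_class)]
    return X, y

rng = np.random.default_rng(0)
X_tr, y_tr = make_gaussian_data(25, rng)     # 25 points per class
X_val, y_val = make_gaussian_data(500, rng)  # 500 points per class
```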

Page 19: PoisoningAttackSVM (ICMLreading2012)

Handwritten digit recognition: experimental setup

Data: MNIST (7 vs. 1; 9 vs. 8; 4 vs. 0)
SVM: linear kernel, C = 1
Training set: 100 samples
Validation set: 500 samples
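A sketch of this setup using scikit-learn's OpenML copy of MNIST as a stand-in for the Matlab data referenced in the paper; the split sizes follow the slide, while the loader and seed are our assumptions.

```python
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.svm import SVC

# One two-class MNIST sub-problem (here 7 vs. 1), pixels scaled to [0, 1].
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
mask = (y == "7") | (y == "1")
X, y = X[mask] / 255.0, np.where(y[mask] == "7", 1, -1)

rng = np.random.default_rng(0)
idx = rng.permutation(len(X))
tr, val = idx[:100], idx[100:600]            # 100 training / 500 validation
clf = SVC(kernel="linear", C=1.0).fit(X[tr], y[tr])
print("clean validation error:", (clf.predict(X[val]) != y[val]).mean())
```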


Page 20: PoisoningAttackSVM (ICMLreading2012)

Handwritten digit recognition results (7 vs. 1)

Poisoning Attacks against SVMs

[Figure 2 panels: "Before attack" and "After attack" digit images (28 × 28 pixels) and classification error (validation and testing) over 0–400 iterations, for 7 vs 1, 9 vs 8, and 4 vs 0.]

Figure 2. Modifications to the initial (mislabeled) attack point performed by the proposed attack strategy, for the three considered two-class problems from the MNIST data set. The increase in validation and testing errors across different iterations is also reported.

jecting additional points into the same class and averaging results over multiple runs on randomly chosen training and validation sets of the same size (100 and 500 samples, respectively). One can clearly see a steady growth of the attack effectiveness with the increasing percentage of the attack points in the training set. The variance of the error is quite high, which can be explained by relatively small sizes of the training and validation data sets.

4. Conclusions and Future Work

The poisoning attack presented in this paper is the first step toward the security analysis of SVM against training data attacks. Although our gradient ascent method is arguably a crude algorithmic procedure, it attains a surprisingly large impact on the SVM's empirical classification accuracy. The presented attack method also reveals the possibility for assessing the impact of transformations carried out in the input space on the functions defined in the Reproducing Kernel Hilbert Spaces by means of differential operators. Compared to previous work on evasion of learning algorithms (e.g., Bruckner & Scheffer, 2009; Kloft & Laskov, 2010), such influence may facilitate the practical realization of various evasion strategies. These implications need to be further investigated.

Several potential improvements to the presented method remain to be explored in future work. The first would be to address our optimization method's restriction to small changes in order to maintain the SVM's structural constraints. We solved this by tak-

Attack label: 1 (figure from [Biggio+ 12])


Page 21: PoisoningAttackSVM (ICMLreading2012)

Handwritten digit recognition results (9 vs. 8)

[Figure 2 excerpt as on Page 20; middle row: Before attack (9 vs 8), After attack (9 vs 8), and classification error over iterations.]

Attack label: 8 (figure from [Biggio+ 12])


Page 22: PoisoningAttackSVM (ICMLreading2012)

Handwritten digit recognition results (4 vs. 0)

[Figure 2 excerpt as on Page 20; bottom row: Before attack (4 vs 0), After attack (4 vs 0), and classification error over iterations.]

Attack label: 0 (figure from [Biggio+ 12])


Page 23: PoisoningAttackSVM (ICMLreading2012)

What the experimental results show

• The malicious point evolves into data that takes on the characteristics of its label class
• e.g., the bottom of the 7 comes to resemble the horizontal stroke of a 1
• A single data point raised the error rate to 15-20%
• from 2-5% at the initial point
• the error rates above were reached through the subsequent refinement of the point
• which demonstrates the effectiveness of the algorithm
• though the attack would presumably do worse if the number of training data were increased…


Page 24: PoisoningAttackSVM (ICMLreading2012)

Multi-point experiment

• How performance degrades as malicious points are injected one at a time
• If the initial point is placed close to the decision boundary, the malicious point falls into being a reserve point, and the updates stop there

Poisoning Attacks against SVMs

[Figure 3 panels: classification error for 7 vs 1, 9 vs 8, and 4 vs 0 as a function of the % of attack points in training data, each with validation-error and testing-error curves over 0–8% contamination.]

Figure 3. Results of the multi-point, multi-run experiments on the MNIST data set. In each plot, we show the classification errors due to poisoning as a function of the percentage of training contamination for both the validation (red solid line) and testing sets (black dashed line). The topmost plot is for the 7 vs. 1 classifier, the middle is for the 9 vs. 8 classifier, and the bottommost is for the 4 vs. 0 classifier.

ing many tiny gradient steps. It would be interesting to investigate a more accurate and efficient computation of the largest possible step that does not alter the structure of the optimal solution.

Another direction for research is the simultaneous optimization of multi-point attacks, which we successfully approached with sequential single-point attacks. The first question is how to optimally perturb a subset of the training data; that is, instead of individually optimizing each attack point, one could derive simultaneous steps for every attack point to better optimize their overall effect. The second question is how to choose the best subset of points to use as a starting point for the attack. Generally, the latter is a subset selection problem but heuristics may allow for improved approximations. Regardless, we demonstrate that even non-optimal multi-point attack strategies significantly degrade the SVM's performance.

An important practical limitation of the proposed method is the assumption that the attacker controls the labels of the injected points. Such assumptions may not hold when the labels are only assigned by trusted sources such as humans. For instance, a spam filter uses its users' labeling of messages as its ground truth. Thus, although an attacker can send arbitrary messages, he cannot guarantee that they will have the labels necessary for his attack. This imposes an additional requirement that the attack data must satisfy certain side constraints to fool the labeling oracle. Further work is needed to understand these potential side constraints and to incorporate them into attacks.

The final extension would be to incorporate the real-world inverse feature-mapping problem; that is, the problem of finding real-world attack data that can achieve the desired result in the learner's input space. For data like handwritten digits, there is a direct mapping between the real-world image data and the input features used for learning. In many other problems (e.g., spam filtering) the mapping is more complex and may involve various non-smooth operations and normalizations. Solving these inverse mapping problems for attacks against learning remains open.


from [Biggio+ 12]
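A sketch of the sequential multi-point strategy described above: reuse a single-point attack routine (such as the Algorithm 1 skeleton earlier) and inject the optimized points one at a time. `optimize_single` is a hypothetical stand-in for that routine.

```python
import numpy as np

def multi_point_attack(optimize_single, X_tr, y_tr, n_points, attack_label, rng):
    """Sequential multi-point poisoning: inject attack points one at a
    time, each optimized against the current (already poisoned) training
    set. The attacked class is the one opposite the attack label."""
    X_cur, y_cur = X_tr.copy(), y_tr.copy()
    for _ in range(n_points):
        # start from a random point of the attacked class, label flipped
        seed = X_cur[rng.choice(np.flatnonzero(y_cur == -attack_label))]
        x_c = optimize_single(X_cur, y_cur, seed.copy(), attack_label)
        X_cur = np.vstack([X_cur, x_c])
        y_cur = np.append(y_cur, attack_label)
    return X_cur, y_cur
```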


Page 25: PoisoningAttackSVM (ICMLreading2012)

Summary

• Poison the SVM!
• Inject new data that makes the SVM's accuracy plummet
• Proposed a method for crafting performance-degrading data
• The optimization problem is non-convex, but it is solved with a gradient method
• Also applicable to kernel SVMs
• Experiments on a handwritten digit recognition task
• Succeeded in dropping the accuracy by roughly 20% with just a single data point


Page 26: PoisoningAttackSVM (ICMLreading2012)

Future Work

• More efficient, more robust, and faster optimization methods

• Evaluating each kernel's resistance to poisoning attacks

• The case where multiple malicious points can be injected simultaneously

• The case where the attacker cannot fix the labels of the data

• e.g., when labels are assigned by the designer's manual annotation

• To steer the labeling, constraints on the input data are required

• In practice, artificially generating such input data is hard

• If the input vector is a bag-of-words, the attack must return a result that can be converted back into plausible-looking text
