November 24 (Wed) - 26 (Fri), 2004
2004 Open Lecture at ISM: Recent topics in machine learning, Boosting
Open Lecture, ISM Essentials of Statistical Mathematics: "Recent Topics in Machine Learning"
Boost Learning
Shinto Eguchi
(The Institute of Statistical Mathematics; Statistical Science, The Graduate University for Advanced Studies)
Course contents
Boost learning:
An overview of AdaBoost, a method of statistical pattern recognition, followed by a discussion of its strengths and weaknesses. Applications to gene expression data, remote-sensing data, and other examples are introduced.
Boost Learning (I)
10:00-12:30 November 25 Thu
Boost learning algorithms:
AdaBoost
EtaBoost (robust learning)
GroupBoost (group learning)
AsymAdaBoost (asymmetric learning)
Boost Learning (II)
13:30-16:00 November 26 Fri
Statistical discussion
Optimal classifier by AdaBoost
Probabilistic framework
Bayes rule, Fisher's LDF, logistic regression
BridgeBoost (meta learning)
LocalBoost (local learning)
Acknowledgments
Much of the material presented in this course includes joint work with the following collaborators, to whom I am grateful:
Noboru Murata (Waseda University, Science and Engineering), Ryuei Nishii (Kyushu University, Mathematics), Takafumi Kanamori (Tokyo Institute of Technology, Mathematical and Computing Sciences), Takashi Takenouchi (The Institute of Statistical Mathematics), Masanori Kawakita (The Graduate University for Advanced Studies, Statistical Science), John B. Copas (Dept. of Statistics, University of Warwick)
The strength of weak learnability
Strong learnability: given access to a source of examples of the unknown concept, the learner can with high probability output a hypothesis that is correct on all but an arbitrarily small fraction of the instances.
Weak learnability: the concept class is weakly learnable if the learner can produce a hypothesis that performs only slightly better than random guessing.
Schapire, R. (1990)
Web pages on boosting
Boosting Research Site: http://www.boosting.org/
Robert Schapire's home page: http://www.cs.princeton.edu/~schapire/
Yoav Freund's home page: http://www1.cs.columbia.edu/~freund/
John Lafferty's home page: http://www-2.cs.cmu.edu/~lafferty/
Statistical pattern recognition
Recognition for: character, image, speaker, signal, face, language, …
Prediction for: weather, earthquake, disaster, finance, interest rates, company bankruptcy, credit, default, infection, disease, adverse effect
Classification for: species, parentage, genomic type, gene expression, protein expression, system failure, machine trouble
Multi-class classification
Feature vector x; class label y
Discriminant function
Classification rule: assign x to the class whose discriminant function is largest
Binary classification
Classification rule: learn from a training dataset, then make a classification
Label y ∈ {-1, +1}
0-normalization
Statistical learning theory
Boost learning
Boost by filter (Schapire, 1990)
Bagging, arcing (bootstrap) (Breiman, Friedman, Hastie)
AdaBoost (Schapire, Freund, Bartlett, Lee)
Support vector machines
Maximize margin; kernel space
(Vapnik, Schölkopf)
Class of weak machines
Stump class
Linear class
ANN class, SVM class, kNN class
Point: colorful (distinctive) character rather than universal character
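The stump class above (one-dimensional threshold classifiers) is simple enough to sketch directly; below is a minimal Python illustration, with hypothetical function names, of fitting the weighted-error-minimizing stump over a small dataset:

```python
def stump_predict(x, j, b, s):
    """Decision stump: predict s if x[j] > b, else -s (s in {+1, -1})."""
    return s if x[j] > b else -s

def fit_stump(X, y, w):
    """Exhaustive search for the weighted-error-minimizing stump.

    X: list of feature vectors; y: labels in {+1, -1}; w: nonnegative weights.
    Returns ((j, b, s), err) minimizing sum_i w[i] * I(stump(x_i) != y_i).
    """
    n, p = len(X), len(X[0])
    best, best_err = None, float("inf")
    for j in range(p):                       # every coordinate
        for b in sorted({x[j] for x in X}):  # every observed cut-off
            for s in (+1, -1):               # both orientations
                err = sum(w[i] for i in range(n)
                          if stump_predict(X[i], j, b, s) != y[i])
                if err < best_err:
                    best_err, best = err, (j, b, s)
    return best, best_err
```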
AdaBoost
1. Initial settings: w_1(i) = 1/n (i = 1, …, n), F_0(x) = 0.
2. For t = 1, …, T:
(a) f_t = argmin_{f ∈ F} ε_t(f), where ε_t(f) = Σ_i I(f(x_i) ≠ y_i) w_t(i) / Σ_{i'} w_t(i').
(b) α_t = (1/2) log{(1 - ε_t(f_t)) / ε_t(f_t)}.
(c) w_{t+1}(i) = w_t(i) exp{-α_t y_i f_t(x_i)}.
3. Output sign(F_T(x)), where F_T(x) = Σ_{t=1}^T α_t f_t(x).
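A minimal Python sketch of the three steps above; the finite `machines` pool standing in for the class F, and the early stop on a degenerate weighted error, are assumptions of this sketch, not part of the slide:

```python
import math

def adaboost(X, y, machines, T):
    """AdaBoost over a finite pool of candidate classifiers f: x -> {+1, -1}.

    Returns the list of (alpha_t, f_t) defining F_T(x) = sum_t alpha_t f_t(x).
    """
    n = len(X)
    w = [1.0 / n] * n                      # 1. initial weights w_1(i) = 1/n
    ensemble = []
    for _ in range(T):
        total = sum(w)
        # (a) pick f_t minimizing the normalized weighted error eps_t(f)
        def eps(f):
            return sum(w[i] for i in range(n) if f(X[i]) != y[i]) / total
        f_t = min(machines, key=eps)
        e = eps(f_t)
        if e == 0 or e >= 0.5:             # degenerate case: stop early
            if e == 0:
                ensemble.append((1.0, f_t))
            break
        a = 0.5 * math.log((1 - e) / e)    # (b) alpha_t
        # (c) w_{t+1}(i) = w_t(i) exp(-alpha_t y_i f_t(x_i))
        w = [w[i] * math.exp(-a * y[i] * f_t(X[i])) for i in range(n)]
        ensemble.append((a, f_t))
    return ensemble

def predict(ensemble, x):
    """3. Final machine: sign(F_T(x))."""
    return 1 if sum(a * f(x) for a, f in ensemble) >= 0 else -1
```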
Learning algorithm
Data → weights w_1(1), …, w_1(n) → machine f^(1)(x) → weights w_2(1), …, w_2(n) → machine f^(2)(x) → … → weights w_T(1), …, w_T(n) → machine f^(T)(x).
Final machine: sign(F_T(x)), where F_T(x) = Σ_{t=1}^T α_t f^(t)(x).
Simulation (complete separation)
Feature space: [-1, 1] × [-1, 1], with the true decision boundary shown.
Set of weak machines
Linear classification machines, generated at random.
Learning process (I)
Six snapshots of the decision regions on [-1, 1] × [-1, 1]:
Iter = 1, train err = 0.21; Iter = 13, train err = 0.18; Iter = 17, train err = 0.10; Iter = 23, train err = 0.10; Iter = 31, train err = 0.095; Iter = 47, train err = 0.08
Learning process (II)
Iter = 55, train err = 0.061; Iter = 99, train err = 0.032; Iter = 155, train err = 0.016
Final stage
Contour of F(x); sign(F(x))
Learning curve
Training error over Iter = 1, …, 277
Characteristics
ε_{t+1}(f_t) = 1/2 (least favorable): after the update, the weighted error of the machine just selected is exactly 1/2.
Update: w_{t+1}(i) = w_t(i) × e^{-α_t} if f_t(x_i) = y_i (multiplication factor e^{-α_t}); w_{t+1}(i) = w_t(i) × e^{α_t} if f_t(x_i) ≠ y_i (multiplication factor e^{α_t}).
Weighted error rates: ε_{t+1}(f_{t+1}) ≤ ε_{t+1}(f_t) = 1/2, since f_{t+1} minimizes the updated weighted error.
Derivation of ε_{t+1}(f_t) = 1/2:
ε_{t+1}(f_t) = Σ_{i=1}^n I(f_t(x_i) ≠ y_i) w_{t+1}(i) / Σ_{i'=1}^n w_{t+1}(i'),
with w_{t+1}(i) = exp{-α_t y_i f_t(x_i)} w_t(i), so the numerator is
e^{α_t} Σ_i I(f_t(x_i) ≠ y_i) w_t(i)
and the denominator is
e^{α_t} Σ_i I(f_t(x_i) ≠ y_i) w_t(i) + e^{-α_t} Σ_i I(f_t(x_i) = y_i) w_t(i).
Writing ε_t = ε_t(f_t) and substituting α_t = (1/2) log{(1 - ε_t)/ε_t}:
ε_{t+1}(f_t) = √{(1 - ε_t)/ε_t} ε_t / [√{(1 - ε_t)/ε_t} ε_t + √{ε_t/(1 - ε_t)} (1 - ε_t)]
= √{ε_t(1 - ε_t)} / 2√{ε_t(1 - ε_t)} = 1/2.
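The least-favorable property ε_{t+1}(f_t) = 1/2 can also be checked numerically; the predictions, labels, and weights below are arbitrary illustrative values:

```python
import math

def weighted_error(preds, y, w):
    """eps(f) = sum_i I(f(x_i) != y_i) w(i) / sum_i w(i)."""
    return sum(wi for p, yi, wi in zip(preds, y, w) if p != yi) / sum(w)

# an arbitrary machine's predictions, labels, and current weights w_t
preds = [1, 1, -1, -1, 1]
y     = [1, -1, -1, 1, 1]
w     = [0.1, 0.2, 0.3, 0.15, 0.25]

eps_t = weighted_error(preds, y, w)
alpha = 0.5 * math.log((1 - eps_t) / eps_t)
# AdaBoost update: w_{t+1}(i) = w_t(i) exp(-alpha y_i f_t(x_i))
w_next = [wi * math.exp(-alpha * yi * p) for p, yi, wi in zip(preds, y, w)]
eps_next = weighted_error(preds, y, w_next)   # the "least favorable" 1/2
```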
Exponential loss
L_exp(F) = (1/n) Σ_{i=1}^n exp{-y_i F(x_i)}
Update by F(x) → F(x) + α f(x):
L_exp(F + αf) = (1/n) Σ_{i=1}^n exp{-y_i F(x_i)} exp{-α y_i f(x_i)}
= (1/n) Σ_{i=1}^n exp{-y_i F(x_i)} {e^{-α} I(f(x_i) = y_i) + e^{α} I(f(x_i) ≠ y_i)}
= L_exp(F) {ε(f) e^{α} + (1 - ε(f)) e^{-α}},
where ε(f) is the weighted error under weights w(i) ∝ exp{-y_i F(x_i)}.
Sequential minimization
L_exp(F + αf) = L_exp(F) {ε(f) e^{α} + (1 - ε(f)) e^{-α}},
where ε(f) = Σ_{i=1}^n I(f(x_i) ≠ y_i) w(i), w(i) ∝ exp{-y_i F(x_i)}.
Minimizing over α gives α_opt = (1/2) log{(1 - ε(f)) / ε(f)}, so that
ε(f) e^{α_opt} + (1 - ε(f)) e^{-α_opt} = 2√{ε(f)(1 - ε(f))}.
By the arithmetic-geometric mean inequality, ε(f) e^{α} + (1 - ε(f)) e^{-α} ≥ 2√{ε(f)(1 - ε(f))}, where equality holds iff ε(f) e^{α} = (1 - ε(f)) e^{-α}.
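The claim that α_opt = (1/2) log{(1 - ε)/ε} minimizes the factor ε e^{α} + (1 - ε) e^{-α}, with minimum value 2√{ε(1 - ε)}, can be verified numerically; the function name is illustrative:

```python
import math

def updated_loss_factor(eps, alpha):
    """L_exp(F + alpha f) / L_exp(F) = eps e^{alpha} + (1 - eps) e^{-alpha}."""
    return eps * math.exp(alpha) + (1 - eps) * math.exp(-alpha)

eps = 0.2
alpha_opt = 0.5 * math.log((1 - eps) / eps)   # the minimizer
min_value = 2 * math.sqrt(eps * (1 - eps))    # the value at the optimum
```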
AdaBoost = minimum exp-loss
(a) f_t = argmin_{f ∈ F} ε_t(f)
(b) α_t = (1/2) log{(1 - ε_t(f_t)) / ε_t(f_t)}
(c) w_{t+1}(i) = w_t(i) exp{-α_t y_i f_t(x_i)}
Thus each step performs
min_{α ∈ R} L_exp(F_{t-1} + αf_t) = L_exp(F_{t-1}) 2√{ε_t(f_t)(1 - ε_t(f_t))},
attained at α_opt = (1/2) log{(1 - ε_t(f_t)) / ε_t(f_t)}.
Simulation (complete random)
Feature space [-1, 1] × [-1, 1]; the class labels are assigned completely at random.
Overlearning of AdaBoost
Iter = 51, train err = 0.21; Iter = 151, train err = 0.06; Iter = 301, train err = 0.0
Drawbacks of AdaBoost
1. Unbalanced learning
2. Over-learning, even on noisy datasets
Remedies:
EtaBoost: robustify against mislabelled examples
GroupBoost: relax the p >> n problem
BridgeBoost: combine different datasets
AsymAdaBoost: balance the false negatives/positives
LocalBoost: extract spatial information
AsymBoost
A small modification of AdaBoost: step 2(b) is replaced by
(b)' α_t** = argmin_{α ∈ R} L_asym(F_{t-1} + α f_t**).
The selection of k: the default choice is
k = Σ_{i=1}^n I(y_i = +1) / Σ_{i=1}^n I(y_i = -1).
Weighted errors by k
Result of AsymBoost
Eta-loss function: a regularized exponential loss.
EtaBoost
1. Initial settings: w_1(i) = 1/n (i = 1, …, n), F_0(x) = 0.
2. For m = 1, …, T:
(a) f_m = argmin_f ε_m(f), where ε_m(f) = Σ_{i=1}^n I(f(x_i) ≠ y_i) w_m(i).
(b) choose the coefficient α_m* by minimizing the eta-loss.
(c) w_{m+1}(i) = w_m(i) exp{-α_m* y_i f_m(x_i)}.
3. Output sign(F_T(x)), where F_T(x) = Σ_{t=1}^T α_t f_t(x).
A toy example
AdaBoost vs EtaBoost
Simulation (complete random): overlearning of AdaBoost
Iter = 51, train err = 0.21; Iter = 301, train err = 0.0
EtaBoost
Iter = 51, train err = 0.25; Iter = 51, train err = 0.15; Iter = 351, train err = 0.18
Mis-labeled examples
Comparison
AdaBoost EtaBoost
GroupBoost
Relax the over-learning of AdaBoost by group learning.
Idea: in AdaBoost step 2(a), the best machine is singly selected and the other good machines are cast off.
Is there a wise way of grouping the G best machines?
Grouping machines
Let f_j(x; b_j) = argmin_{f ∈ F_j} ε_t(f), where F_j = {sgn(a x_j + b): b ∈ R, a ∈ {-1, +1}}.
Select a set f_(1)(x; b_(1)), …, f_(G)(x; b_(G)) such that ε_t(f_(1)(b_(1))) ≤ … ≤ ε_t(f_(G)(b_(G))),
and combine f_t(x) = (1/G) Σ_{g=1}^G α_{t,(g)} f_(g)(x; b_(g)).
GroupBoost
(a) ε_t(f) = Σ_{i=1}^N I(f(x_i) ≠ y_i) w_t(i); select the G best machines f_(1)(x; b_(1)), …, f_(G)(x; b_(G)).
(b) α_{t,(g)} = (1/2) log{(1 - ε_t(f_(g)(b_(g)))) / ε_t(f_(g)(b_(g)))} (g = 1, …, G), and set f̄_t(x) = (1/G) Σ_{g=1}^G α_{t,(g)} f_(g)(x; b_(g)).
(c) w_{t+1}(i) = w_t(i) exp{-y_i f̄_t(x_i)} / Z_{t+1}.
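A minimal Python sketch of one grouping step, assuming (as an illustration) that the G best machines are the G smallest weighted errors in a finite pool, each keeping its own AdaBoost coefficient; the clipping of degenerate errors is an added assumption:

```python
import math

def group_step(X, y, w, machines, G):
    """One GroupBoost step: instead of keeping only the single best
    machine, keep the G machines with smallest weighted error and
    combine them with their individual coefficients."""
    n = len(y)
    def eps(f):
        return sum(w[i] for i in range(n) if f(X[i]) != y[i]) / sum(w)
    ranked = sorted(machines, key=eps)[:G]     # the G best machines
    group = []
    for f in ranked:
        e = min(max(eps(f), 1e-12), 1 - 1e-12)  # clip for stability
        group.append((0.5 * math.log((1 - e) / e), f))
    def f_t(x):                                 # averaged group machine
        return sum(a * f(x) for a, f in group) / G
    return f_t
```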
Grouping jumps for the next step
(Figure: starting from F_0, the grouped updates α_1 f̄_1, α_2 f̄_2, α_3 f̄_3, α_4 f̄_4 jump further than the single machines f_1, f_2, f_3, f_4.)
Learning architecture
Data → weights w_1(1), …, w_1(n) → G machines f^(1,1)(x), …, f^(1,G)(x) grouped into f^(1)(x) → weights w_2(1), …, w_2(n) → f^(2,1)(x), …, f^(2,G)(x) grouped into f^(2)(x) → … → f^(T,1)(x), …, f^(T,G)(x) grouped into f^(T)(x).
Grouping G machines: f^(t)(x) = (1/G){α_{t,(1)} f^(t,1)(x) + … + α_{t,(G)} f^(t,G)(x)}; final machine Σ_{t=1}^T f^(t)(x).
AdaBoost and GroupBoost
AdaBoost: L_exp(F_{t-1} + α_t f_t) ≤ L_exp(F_{t-1}).
GroupAdaBoost: L_exp(F_{t-1} + f̄_t) ≤ L_exp(F_{t-1} + α_{t,(1)} f_(1)) ≤ L_exp(F_{t-1}).
Update the weights: AdaBoost by exp(-α_t f_t); GroupAdaBoost by exp(-(1/G) Σ_{g=1}^G α_{t,(g)} f_(g)).
From microarray
Contest program from bioinformatics (BIP2003): http://contest.genome.ad.jp/
Microarray data: number of genes p = 1,000 to 100,000; number of individuals n = 10 to 100
Output
http://genome-www.stanford.edu/cellcycle/
Goal
Disease and gene expressions
y is a label for clinical information
x = (x_1, …, x_p) = feature vector
x_i = log of the gene expression observed for the i-th individual
(BIP2003) Problem 2: n = 27 + 11 = 38 << p = 7109
Microarray data
cDNA microarray
Prediction from gene expressions
Feature vector x = (x_1, …, x_p): dimension = number of genes p; components = quantities of gene expression
Class label y ∈ {-1, +1}: disease, adverse effect
Classification machine f: x → y, based on the training dataset {(x_i, y_i): i = 1, …, n}
Leukemic diseases, Golub et al.: http://www.broad.mit.edu/cgi-bin/cancer/publications/
AdaBoost
1. Initial settings: w_1(i) = 1/n (i = 1, …, n), F_0(x) = 0.
2. For t = 1, …, T:
(a) f_t = argmin_f ε_t(f), where ε_t(f) = Σ_{i=1}^n I(f(x_i) ≠ y_i) w_t(i).
(b) α_t = (1/2) log{(1 - ε_t(f_t)) / ε_t(f_t)}.
(c) w_{t+1}(i) = w_t(i) exp{-α_t y_i f_t(x_i)}.
3. Output sign(F_T(x)), where F_T(x) = Σ_{t=1}^T α_t f_t(x).
One-gene classifier
Let x_{1j}, …, x_{nj} be the expressions of the j-th gene x_j.
One-gene classifier: f_j(x) = +1 if x_j > b_j, and -1 if x_j ≤ b_j, with cut-off
b_j = argmin_b Σ_i I(sgn(x_{ij} - b) ≠ y_i).
(Figure: the error number at each candidate cut-off; the minimum, 4, gives b_j.)
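The cut-off search can be sketched in Python; taking the candidate cut-offs to be the observed expression values, and the function name itself, are assumptions of this sketch:

```python
def one_gene_classifier(xj, y):
    """Choose the cut-off b minimizing the number of errors of
    f(x) = +1 if x_j > b else -1, over the observed expression values.

    xj: expressions of one gene across individuals; y: labels in {+1, -1}.
    Returns (best cut-off, error count at that cut-off).
    """
    best_b, best_err = None, len(y) + 1
    for b in sorted(set(xj)):
        err = sum(1 for x, yi in zip(xj, y) if (1 if x > b else -1) != yi)
        if err < best_err:
            best_b, best_err = b, err
    return best_b, best_err
```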
The second training
Update the weights: misclassified examples are weighted up by e^{α} = 2, correctly classified examples down by e^{-α} = 0.5, where
α = 0.5 log(nb. of correct ans. / nb. of false ans.) = 0.5 log(16/4) = log 2.
The cut-off is then re-selected under the weights: b_j = argmin_b Σ_i w(i) I(sgn(x_{ij} - b) ≠ y_i).
(Figure: the weighted error number at each candidate cut-off; the minimum, 4.5, gives the new b_j.)
Web microarray data (p >> n)
Dataset    p     n    y = +1   y = -1
ALLAML    7129   72     37       35
Colon     2000   62     40       22
Estrogen  7129   49     25       24
http://microarray.princeton.edu/oncology/
http://mgm.duke.edu/genome/dna micro/work/
10-fold cross-validation
Split D = {(x_i, y_i): i = 1, …, N} into 10 blocks; in turn, block k (k = 1, …, 10) is held out for validation and the remaining 9 blocks are used for training, giving error ε^(k).
Average: ε = (1/10) Σ_{k=1}^{10} ε^(k).
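The block structure above can be sketched as an index partition in Python (the function name is illustrative):

```python
def k_fold_indices(n, k=10):
    """Partition indices 0..n-1 into k consecutive validation blocks,
    pairing each block with the remaining indices as the training set."""
    sizes = [n // k + (1 if r < n % k else 0) for r in range(k)]
    folds, start = [], 0
    for size in sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        folds.append((train, val))
        start += size
    return folds
```

Each `(train, val)` pair would then feed one run of the learner, and the 10 validation errors are averaged.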
ALLAML
Training error 10-fold CV error
Colon
Training error 10-fold CV error
Estrogen
Training error 10-fold CV error
Gene score
F_T(x) = Σ_{t=1}^T f_t(x), with f_t(x) = (1/G) Σ_{g=1}^G α_{t,(g)} f_(g)(x; b_(g)).
Confidence factor for the j-th gene:
S(j) = (1/T) Σ_{t=1}^T Σ_{g=1}^G I(the (t, g)-th machine is built on gene j) α_{t,(g)}.
S(j) suggests the degree of association with the class label.
Genes associated with disease
Compare our result with the set of genes suggested by Golub et al. and West et al.
In the ALLAML case, AdaBoost can detect only 15 genes, while GroupAdaBoost detects 30 genes compatible with their results.
Boost Learning (II)
13:30-16:00 November 26 Fri
Statistical discussion
Optimal classifier by AdaBoost
Probabilistic framework
Bayes rule, Fisher's LDF, logistic regression
BridgeBoost: the p >> n problem
LocalBoost: contextual information
Problem: p >> n
A fundamental issue in bioinformatics:
p is the dimension of the biomarker (SNPs, proteome, microarray, …);
n is the number of individuals (informed consent, institutional protocol, … bioethics).
An approach by combining
Let B be a biomarker space and let I_1, …, I_K be K experimental facilities, each with a dataset D_k = {z_i^(k): i = 1, …, n_k} and n_k << p (k = 1, …, K).
Rapid expansion of genomic data: the combined sample ∪_{k=1}^K D_k is larger, though Σ_{k=1}^K n_k << p may still hold.
BridgeBoost
Datasets D_1, D_2, …, D_K, e.g. from CAMDA (Critical Assessment of Microarray Data Analysis) or DDBJ (DNA Data Bank of Japan, NIG).
Separate learning yields f(D_1), f(D_2), …, f(D_K); BridgeBoost yields f(D_1 | D), f(D_2 | D), …, f(D_K | D), each machine informed by all the data D, combined into one result.
CAMDA 2003
4 datasets for lung cancer:
Harvard   PNAS, 2001        Affymetrix
Michigan  Nature Med, 2002  Affymetrix
Stanford  PNAS, 2001        cDNA
Ontario   Cancer Res, 2001  cDNA
http://www.camda.duke.edu/camda03/datasets/
Some problems
1. Heterogeneity in feature space: cDNA vs. Affymetrix; differences in covariates and medical diagnosis; uncertainty in microarray experiments
2. Heterogeneous class-labeling
3. Heterogeneous generalization powers
4. Publication bias: a vast number of unpublished studies
Machine learning
Learnability: boosting weak learners?
AdaBoost: Freund & Schapire (1997).
Weak classifiers {f_1(x), …, f_p(x)} are combined stagewise into a strong classifier F(x) = α_1 f^(1)(x) + … + α_T f^(T)(x).
Learning algorithm
Data D → weights w_1(1), …, w_1(n) → machine f^(1)(x) → weights w_2(1), …, w_2(n) → f^(2)(x) → … → weights w_T(1), …, w_T(n) → f^(T)(x).
Final machine: sign(F_T(x)), where F_T(x) = Σ_{t=1}^T α_t f^(t)(x).
Different datasets
D = ∪_{k=1}^K D_k, where D_k = {(x_i^(k), y_i^(k)): i = 1, …, n_k},
x_i^(k) ∈ R^p the expression vector of the same genes, y_i^(k) the label of the same clinical item.
Normalization onto [0, 1]^p:
x̃_{ij}^(k) = (x_{ij}^(k) - min_i x_{ij}^(k)) / (max_i x_{ij}^(k) - min_i x_{ij}^(k))  (j = 1, …, p).
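The per-gene min-max normalization can be sketched in Python (the function name is illustrative):

```python
def minmax_normalize(X):
    """Per-gene min-max normalization onto [0, 1]:
    x~_ij = (x_ij - min_i x_ij) / (max_i x_ij - min_i x_ij).
    X: list of expression vectors (rows = individuals, columns = genes)."""
    p = len(X[0])
    cols = [[row[j] for row in X] for j in range(p)]
    lo = [min(c) for c in cols]   # per-gene minimum over individuals
    hi = [max(c) for c in cols]   # per-gene maximum over individuals
    return [[(row[j] - lo[j]) / (hi[j] - lo[j]) for j in range(p)]
            for row in X]
```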
Weighted errors
The k-th weighted error: ε_t^(k)(f) = Σ_{i=1}^{n_k} I(f(x_i^(k)) ≠ y_i^(k)) w_t^(k)(i) / Σ_{i'} w_t^(k)(i').
The combined weighted error: ε_t(f) = Σ_{k=1}^K π_t^(k) ε_t^(k)(f),
where π_t^(k) = Σ_{i=1}^{n_k} w_t^(k)(i) / Σ_{h=1}^K Σ_{i=1}^{n_h} w_t^(h)(i).
BridgeBoost
(a) f_t^(k) = argmin_f ε_t^(k)(f)  (k = 1, …, K)
(b) α_t^(k) = (1/2) log{(1 - ε_t^(k)(f_t^(k))) / ε_t^(k)(f_t^(k))}
(c) f̄_t(x) = (1/K) Σ_{k=1}^K α_t^(k) f_t^(k)(x)
(d) w_{t+1}^(k)(i) = w_t^(k)(i) exp{-f̄_t(x_i^(k)) y_i^(k)}
with weight vectors {w_t^(k)(i): i = 1, …, n_k}, k = 1, …, K.
Learning flow
Stage t: each dataset D_k has weights {w_t^(k)(i)} and produces a machine f_t^(k)(x) with coefficient α_t^(k); these are combined as f̄_t(x) = (1/K){α_t^(1) f_t^(1)(x) + … + α_t^(K) f_t^(K)(x)}.
Stage t+1: the combined machine updates every dataset's weights to {w_{t+1}^(k)(i)}.
Mean exponential loss
Exponential loss on the k-th dataset: L^(k)(F) = (1/n_k) Σ_{i=1}^{n_k} exp{-y_i^(k) F(x_i^(k))}.
Mean exponential loss: L(F) = Σ_{k=1}^K L^(k)(F), with
f_t^(k) = argmin_f ε_t^(k)(f), α_t^(k) = (1/2) log{(1 - ε_t^(k)(f_t^(k))) / ε_t^(k)(f_t^(k))}.
By the convexity of the exponential loss,
L(F_{t-1} + f̄_t) ≤ (1/K) Σ_{k=1}^K L(F_{t-1} + α_t^(k) f_t^(k)) ≤ L(F_{t-1}).
Meta-learning
L^(k)(f_t^(h) + F_{t-1}) with h ≠ k is cross-validatory: f_t^(h) is learned on D_h and evaluated on D_k.
Separate learning uses only L^(k)(f_t^(k) + F_{t-1}); meta-learning uses Σ_{h=1}^K L^(k)(f_t^(h) + F_{t-1}).
Simulation
3 datasets {D_1, D_2, D_3} with p = 100, n_1 = n_2 = n_3 = 50.
D_1, D_2 (data 1, data 2): test error 0 (ideal); D_3 (data 3, the collapsed dataset): test error 0.5 (ideal).
Training error and test error are compared.
Comparison
Separate AdaBoost vs. BridgeBoost: training error and test error for each.
Test errors
Minimum test errors across the panels: 15%, 4%, 43%, 3%, 4% (Collapsed AdaBoost, Separate AdaBoost, BridgeBoost).
Conclusion
Separate learning: f(D_1), f(D_2), …, f(D_K), each from its own dataset D_k.
Meta-learning (BridgeBoost): f(D_1 | D), f(D_2 | D), …, f(D_K | D), combined into one result.
Unsolved problems in BridgeBoost
1. Which datasets should be joined or deleted in BridgeBoost?
2. How to predict the class label for a given new x?
3. How to use the information on the unmatched genes when combining datasets?
4. Heterogeneity is OK, but what about publication bias?
Markov random field
Markov random field (MRF)
ICM algorithm (Besag, 1986)
Neighbor sites
Space AdaBoost
1. Estimate the posterior p(k | x) using only non-contextual information.
2. Extract the contextual information based on the estimated posterior p(k | x).
3. Make a hierarchical set of weak machines F_r = {p_r(k | i)}, and start the algorithm from the non-contextual result.
Analysis comparison
True image and true labels, compared with the results for r² = 0, 1, 2, 4, 17.
LocalBoost
Let S be the set of possible sites; the training dataset is observed over S: {(s_i, x_i, y_i): i = 1, …, n}.
Local exponential loss:
L_local(F, h)(s) = (1/n) Σ_{i=1}^n K_h(s, s_i) exp{-y_i F(x_i, s)}.
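The local loss can be sketched in Python; the Gaussian kernel is an assumption (the lecture leaves K_h general), and the names are illustrative:

```python
import math

def local_exp_loss(s, data, F, h):
    """Local exponential loss at site s:
    L_local(F, h)(s) = (1/n) sum_i K_h(s, s_i) exp{-y_i F(x_i, s)}.

    data: list of (s_i, x_i, y_i) over the site set S; F: score function
    F(x, s); K_h: here a Gaussian kernel with bandwidth h (an assumption).
    """
    n = len(data)
    def K(s, si):
        return math.exp(-((s - si) ** 2) / (2 * h * h))
    return sum(K(s, si) * math.exp(-yi * F(xi, s))
               for si, xi, yi in data) / n
```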
Algorithm
(a) f_t = argmin_{f ∈ F} ε_t(f, s_t), where s_t is uniformly sampled from {s(i): i = 1, …, n} and
ε_t(f, s) = Σ_{i=1}^n K_h(s, s_i) I(f(x_i) ≠ y_i) w_t(i).
(b) α_t(s_t) = (1/2) log{(1 - ε_t(f_t, s_t)) / ε_t(f_t, s_t)}.
(c) w_{t+1}(i) = w_t(i) exp{-α_t(s_t) y_i f_t(x_i)}.
Note: the selected machine f_t has local information around s_t.
Prediction rule: h(x, s) = sgn(F(x, s)), where F(x, s) = Σ_{t=1}^T K_h(s, s_t) α_t(s_t) f_t(x).
Locally weighted combination: the second weighting K_h(s, s_t) strongly combines only the classification machines f_t whose sites s_t are near s.
LocalBoost vs. AdaBoost
Statistical discussion
Bayes rule
Neyman-Pearson Lemma
Model-misspecification
ROC curve
Error rate
Two types of errors (misclassification probabilities): False Negative and False Positive, where u is a cut-off point.
(Figure: True Negative, True Positive, False Positive, and False Negative regions.)
Neyman-Pearson Lemma (1)
Null hypothesis vs. alternative
Log-likelihood ratio
Neyman-Pearson lemma
Neyman-Pearson Lemma (2)
Bayes rule
Loss functions by NP lemma
where
A class of loss functions
Exponential
Log loss
Bayes rule equivalence
Theorem 1
Equality holds if and only if
Error rate
where H(s) is the Heaviside function,
is of the class
In general
Important examples
Credit scoring
Medical screening
Minimum Exponential Loss
Empirical exp loss
Expected exp loss
Theorem.
Variational discussion
ROC curve
Gini index; AUC (Area Under the Curve)
ROC analysis: TP (true positive rate) plotted against FP (false positive rate).
Logistic type discriminant
For a given training dataset
For a given function
suggests the decision rule
the empirical loss is
Log loss
Conditional expected likelihood
where
![Page 107: 2004 年 11 月 24 日(水)~ 26 日(金)](https://reader033.vdocuments.pub/reader033/viewer/2022061514/568137d7550346895d9f773e/html5/thumbnails/107.jpg)
107
Estimation equation
IRLS (Iteratively reweighted least squares)
where
empirical
expected
logistic
glm(formula, family = binomial, data, weights =W, ・・・ )
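The glm call above fits the logistic model by IRLS in R. A minimal pure-Python sketch of the same Newton/IRLS iteration, for one predictor plus an intercept and 0/1 labels (illustrative only; a real analysis would use glm or an equivalent library):

```python
import math

def irls_logistic(xs, ys, iters=25):
    """Fit logit P(y=1|x) = b0 + b1*x by Newton's method, which for
    the binomial likelihood coincides with IRLS: the weights
    w_i = p_i * (1 - p_i) are recomputed at every iteration."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)          # IRLS weight
            g0 += y - p                # score (gradient) components
            g1 += (y - p) * x
            h00 += w                   # Fisher information components
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det   # solve the 2x2 Newton step
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Small non-separated example; b1 > 0 since larger x favors y = 1.
b0, b1 = irls_logistic([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], [0, 0, 1, 0, 1, 1])
```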
![Page 108: 2004 年 11 月 24 日(水)~ 26 日(金)](https://reader033.vdocuments.pub/reader033/viewer/2022061514/568137d7550346895d9f773e/html5/thumbnails/108.jpg)
108
Fisher consistency
Theorem 3. If the distribution of (x, y) follows the assumed model, then β̂_U is asymptotically consistent for β.
![Page 109: 2004 年 11 月 24 日(水)~ 26 日(金)](https://reader033.vdocuments.pub/reader033/viewer/2022061514/568137d7550346895d9f773e/html5/thumbnails/109.jpg)
109
Asymptotic efficiency
Asymptotic variance:
var_A(β̂_U) = (1/n) J(β)^{-1} V(β) J(β)^{-1}
A Cramer-Rao type inequality gives
var_A(β̂_U) ≥ (1/n) [ E{ p(x)(1 - p(x)) f(x) f(x)^T } ]^{-1}
Equality holds if and only if U is the log loss, or equivalently the method is logistic regression.
![Page 110: 2004 年 11 月 24 日(水)~ 26 日(金)](https://reader033.vdocuments.pub/reader033/viewer/2022061514/568137d7550346895d9f773e/html5/thumbnails/110.jpg)
110
Expected loss under parametrics
Risk(β̂_U, U) = E[ L_U(β̂_U) ]
Under the parametric assumption, the expected D-loss satisfies
Risk(β̂_U, U) = Risk(β̂_log, U) + (1/n) Δ(U, U*) + o(1/n)
![Page 111: 2004 年 11 月 24 日(水)~ 26 日(金)](https://reader033.vdocuments.pub/reader033/viewer/2022061514/568137d7550346895d9f773e/html5/thumbnails/111.jpg)
111
Expected loss under misspecified model
Under a near-parametric setting
F(x) = β^T f(x) + O(n^{-1/2})
Then, for
β*_U = argmin_{β'} L_U(β')
the expected loss satisfies
Risk(β̂_U, U) = D_U(β*_U) + (1/2) tr{ var_A(β̂_U) Hesse(L_U)(β*_U) } + o(1/n)
![Page 112: 2004 年 11 月 24 日(水)~ 26 日(金)](https://reader033.vdocuments.pub/reader033/viewer/2022061514/568137d7550346895d9f773e/html5/thumbnails/112.jpg)
112
Sampling scheme
Mixture sampling
Separate sampling (case-control study, retrospective study)
Conditional sampling (cohort study, prospective study)
![Page 113: 2004 年 11 月 24 日(水)~ 26 日(金)](https://reader033.vdocuments.pub/reader033/viewer/2022061514/568137d7550346895d9f773e/html5/thumbnails/113.jpg)
113
Adams, N.M. and Hand, D.J. (1999). Comparing classifiers when the misclassification costs are uncertain. Pattern Recognition 32, 1139-1147.
Adams, N.M. and Hand, D.J. (2000). Improving the practice of classifier performance assessment. Neural Computation 12, 305-311.
Begg, C.B., Satogopan, J.M. and Berwick, M. (1998). A new strategy for evaluating the impact of epidemiologic risk factors for cancer with applications to melanoma. J. Amer. Statist. Assoc. 93, 415-426.
Berwick, M., Begg, C.B., Fine, J.A., Roush, G.C. and Barnhill, R.L. (1996). Screening for cutaneous melanoma by self skin examination. J. National Cancer Inst. 88, 17-23.
Bishop, C. (1995). Neural Networks for Pattern Recognition. Clarendon Press, Oxford.
![Page 114: 2004 年 11 月 24 日(水)~ 26 日(金)](https://reader033.vdocuments.pub/reader033/viewer/2022061514/568137d7550346895d9f773e/html5/thumbnails/114.jpg)
114
Domingo, C. and Watanabe, O. (2000). MadaBoost: a modification of AdaBoost. In Proc. of the 13th Conference on Computational Learning Theory.
Efron, B. (1975). The efficiency of logistic regression compared to normal discriminant analysis. J. Amer. Statist. Assoc. 70, 892-898.
Friedman, J., Hastie, T. and Tibshirani, R. (2000). Additive logistic regression: a statistical view of boosting. Ann. Statist. 28, 337-407.
Fisher, R.A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics 7, 179-188.
Hand, D.J. and Henley, W.E. (1997). Statistical classification methods in consumer credit scoring: a review. J. Roy. Statist. Soc. A 160, 523-541.
Nishii, R. and Eguchi, S. (2004). Supervised image classification by contextual AdaBoost based on posteriors in neighborhoods. To be submitted.
Hastie, T., Tibshirani, R. and Friedman, J. (2001). The Elements of Statistical Learning. Springer, New York.
![Page 115: 2004 年 11 月 24 日(水)~ 26 日(金)](https://reader033.vdocuments.pub/reader033/viewer/2022061514/568137d7550346895d9f773e/html5/thumbnails/115.jpg)
115
Lebanon, G. and Lafferty, J. (2001). Boosting and maximum likelihood for exponential models. NIPS 14.
McLachlan, G.J. (1992). Discriminant Analysis and Statistical Pattern Recognition. Wiley, New York.
Pepe, M.S. and Thompson, M.L. (2000). Combining diagnostic test results to increase accuracy. Biostatistics 1, 123-140.
Rätsch, G., Onoda, T. and Müller, K.-R. (2001). Soft margins for AdaBoost. Machine Learning 42(3), 287-320.
Schapire, R. (1990). The strength of the weak learnability. Machine Learning 5, 197-227.
Schapire, R., Freund, Y., Bartlett, P. and Lee, W. (1998). Boosting the margin: a new explanation for the effectiveness of voting methods. Ann. Statist. 26, 1651-1686.
Vapnik, V.N. (1999). The Nature of Statistical Learning Theory. Springer, New York.
![Page 116: 2004 年 11 月 24 日(水)~ 26 日(金)](https://reader033.vdocuments.pub/reader033/viewer/2022061514/568137d7550346895d9f773e/html5/thumbnails/116.jpg)
116
Eguchi, S. and Copas, J. (2001). Recent developments in discriminant analysis from an information geometric point of view. J. Korean Statist. Soc. 30, 247-264. (Special issue for the 30th anniversary of the Korean Statistical Society.)
Eguchi, S. and Copas, J. (2002). A class of logistic-type discriminant functions. Biometrika 89, 1-22.
Eguchi, S. (2002). U-boosting method for classification and information geometry. Invited talk at the International Statistical Workshop, Statistical Research Center for Complex Systems, Seoul National University.
Kanamori, T., Takenouchi, T., Eguchi, S. and Murata, N. (2004). The most robust loss function for boosting. Lecture Notes in Computer Science 3316, 496-501, Springer.
Murata, N., Takenouchi, T., Kanamori, T. and Eguchi, S. (2004). Information geometry of U-Boost and Bregman divergence. Neural Computation 16, 1437-1481.
![Page 117: 2004 年 11 月 24 日(水)~ 26 日(金)](https://reader033.vdocuments.pub/reader033/viewer/2022061514/568137d7550346895d9f773e/html5/thumbnails/117.jpg)
117
Takenouchi, T. and Eguchi, S. (2004). Robustifying AdaBoost by adding the naive error rate. Neural Computation 16, 767-787.
Murata, N., Takenouchi, T., Kanamori, T. and Eguchi, S. Geometry of U-Boost algorithms. Hayashibara Forum "Chance and Necessity: Mathematics, Information, Economics", session "Physics of Information".
Eguchi, S. (2004). Information geometry and statistical pattern recognition. Expository article, Sugaku.
Eguchi, S. (2004). Information geometry of statistical pattern recognition: the U-Boost learning algorithm. Suri Kagaku (Mathematical Sciences) No. 489, 53-59.
Eguchi, S. (2002). On methods of statistical discrimination: from logistic discrimination to AdaBoost. Invited lecture, 24th Symposium of the Japanese Society of Applied Statistics, "New Developments in Multivariate Analysis".
![Page 118: 2004 年 11 月 24 日(水)~ 26 日(金)](https://reader033.vdocuments.pub/reader033/viewer/2022061514/568137d7550346895d9f773e/html5/thumbnails/118.jpg)
118
Future paradigm
The concept of learning will continue to offer highly productive ideas for computational algorithms in the future.
Is it an imitation of the biological brain?
Meta learning?
Human life?
And Statistics?