Pavel Mezentsev - Machine Learning on MapReduce

Upload: pavel-mezentsev

Post on 22-May-2015

Category:

Technology


DESCRIPTION

Slides from Pavel Mezentsev's talk at the Hadoop User Group Meetup, held in Moscow on 8 November 2012.

TRANSCRIPT

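2. Apache Mahout, MapReduce
3. ? , [Wikipedia]
4.
5.
6.
7.
8. Matlab, Octave, R, Weka, ...
9. ?
10. Map, Reduce
11. Apache Mahout. 2008. 300. Mahout in Action
12. K-means, Minhash, ...
13. MapReduce: "MapReduce for Machine Learning on Multicore" (2008)
14. Naive Bayes. Features $x = (x^{(1)}, \ldots, x^{(n)})$, label $y \in \{0, 1\}$. Bayes' rule: $P(y \mid x^{(1)}, \ldots, x^{(n)}) = \frac{P(x^{(1)}, \ldots, x^{(n)} \mid y)\, P(y)}{P(x^{(1)}, \ldots, x^{(n)})}$. With independent features: $P(y \mid x^{(1)}, \ldots, x^{(n)}) \propto P(y) \prod_i P(x^{(i)} \mid y)$
15. MapReduce 1. $P(y) = \frac{\sum_j 1(y_j = y)}{\sum_{j \in all} 1}$. Map: $(y_j, \Sigma_{sub})$, $(total, \Sigma_{sub})$. Reduce: $(y_j, \Sigma)$, $(total, \Sigma)$
16. MapReduce 2. $P(x^{(i)} \mid y) = \frac{\Sigma(x^{(i)}, y)}{\Sigma(y)}$. Map: $((x^{(i)}, y_j); \Sigma_{sub})$. Reduce: $((x^{(i)}, y_j); \Sigma)$
17. K-means
18. MapReduce. Map: $(x, y_{nearest}) \to (y_j, (\Sigma_{sub}\, x, N))$. Map: $(y_j, x)$. Reduce:
19. $w^T x = 0$
20. $P_w(x) = \frac{1}{1 + \exp(-w^T x)}$
21. $L_w(x) = \prod_i P_w(x_i)^{y_i} (1 - P_w(x_i))^{1 - y_i}$. $l(w) = \sum_{i=1}^{m} y_i \log p_w(x_i) + (1 - y_i) \log(1 - p_w(x_i))$. $w = \arg\max_w l(w)$
22. Newton-Raphson method. Scalar case: $w = w - \frac{l'(w)}{l''(w)}$. Vector case: $w = w - H^{-1} \nabla_w l(w)$, where $\nabla_w l(w) = \left( \frac{\partial l(w)}{\partial w_1}, \ldots, \frac{\partial l(w)}{\partial w_n} \right)^T$ and $H$ is the $n \times n$ matrix with entries $H_{kj} = \frac{\partial^2 l(w)}{\partial w_k \partial w_j}$
23. MapReduce. $\frac{\partial l(w)}{\partial w_k} = \sum_{i=1}^{m} (y_i - p_w(x_i))\, x_i^{(k)}$. Map: $grad_{sub}[k] = \sum_{i \in sub} (y_i - p_w(x_i))\, x_i^{(k)}$, emit $(k, grad_{sub}[k])$. Reduce: $grad[k] = \sum grad_{sub}[k]$, emit $(k, grad[k])$
24. MapReduce. $\frac{\partial^2 l(w)}{\partial w_k \partial w_j} = \sum_{i=1}^{m} p_w(x_i)(p_w(x_i) - 1)\, x_i^{(j)} x_i^{(k)}$. Map: $H_{sub}[k, j] = \sum_{sub} \ldots$, emit $((k, j); H_{sub}[k, j])$. Reduce: $H[k, j] = \sum H_{sub}[k, j]$, emit $((k, j); H[k, j])$
25. MapReduce complexity, sequential versus $P$ parallel workers ($m$ training examples, $n$ features):
    Naive Bayes: $O(mn + nc)$ versus $O(\frac{mn}{P} + nc \log P)$
    K-means: $O(mnc)$ versus $O(\frac{mnc}{P} + mn \log P)$
    Logistic regression: $O(mn^2 + n^3)$ versus $O(\frac{mn^2}{P} + \frac{n^3}{P} + n^2 \log P)$
26. Mahout
27. Mahout
    > mahout trainclassifier -i data -o model
    > mahout testclassifier -m model -d data
28. k
    > mahout kmeans -i data -c clusters -o output -k clusters_num
29.
    > mahout trainlogistic -h
    > mahout runlogistic -h
30. Mahout 30 1 . . 300 . 3
31. [email protected]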
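The two counting jobs on slides 15 and 16 read as parameter estimation for a naive Bayes classifier: each mapper emits partial sums per key, and the reducer adds them up. The sketch below imitates that in a single Python process. It is only an illustration under assumptions: the key layout, the function names (`map_counts`, `reduce_counts`, `estimate_naive_bayes`) and the toy data are hypothetical and do not come from the slides or from Mahout.

```python
from collections import Counter
from functools import reduce


def map_counts(partition):
    """Map: emit partial sums for one data split.

    A split is assumed to be a list of (features, label) pairs.
    """
    counts = Counter()
    for features, label in partition:
        counts[("total",)] += 1                 # (total, sum_sub)
        counts[("label", label)] += 1           # (y_j, sum_sub)
        for i, value in enumerate(features):
            counts[("feature", i, value, label)] += 1  # ((x^(i), y_j); sum_sub)
    return counts


def reduce_counts(left, right):
    """Reduce: add partial sums key by key."""
    return left + right


def estimate_naive_bayes(partitions):
    """Combine the partial sums into P(y) and P(x^(i) | y)."""
    totals = reduce(reduce_counts, map(map_counts, partitions), Counter())
    n_examples = totals[("total",)]
    prior = {key[1]: count / n_examples
             for key, count in totals.items() if key[0] == "label"}
    likelihood = {key[1:]: count / totals[("label", key[3])]
                  for key, count in totals.items() if key[0] == "feature"}
    return prior, likelihood


# Hypothetical toy data: two splits, two binary features, binary label.
nb_partitions = [
    [((1, 0), 1), ((1, 1), 1)],
    [((0, 0), 0), ((1, 0), 0)],
]
nb_prior, nb_likelihood = estimate_naive_bayes(nb_partitions)
print(nb_prior)
print(nb_likelihood[(0, 1, 1)])  # estimate of P(x^(0) = 1 | y = 1)
```

Because the reducer only adds counters, the result does not depend on how the examples are split, which is what makes the estimate fit the MapReduce model.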
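Slide 18 keeps only the map output $(y_j, (\Sigma_{sub}\, x, N))$; the reduce side did not survive extraction. A plausible reading is one K-means iteration: each mapper assigns its points to the nearest centroid and emits a partial vector sum and a count, and the reducer divides the merged sum by the merged count. The sketch below follows that reading. The reduce step, all function names, and the toy data are assumptions, not content of the slides.

```python
def nearest_centroid(point, centroids):
    """Index of the centroid with the smallest squared Euclidean distance."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
    )


def map_assign(partition, centroids):
    """Map: for one split, emit {j: (partial vector sum, count)}."""
    partial = {}
    for point in partition:
        j = nearest_centroid(point, centroids)
        vec_sum, count = partial.get(j, ([0.0] * len(point), 0))
        partial[j] = ([a + b for a, b in zip(vec_sum, point)], count + 1)
    return partial


def reduce_update(partials, centroids):
    """Reduce (assumed): merge partial sums, then divide by the count."""
    merged = {}
    for partial in partials:
        for j, (vec_sum, count) in partial.items():
            m_sum, m_count = merged.get(j, ([0.0] * len(vec_sum), 0))
            merged[j] = ([a + b for a, b in zip(m_sum, vec_sum)], m_count + count)
    updated = list(centroids)  # centroids with no points stay where they are
    for j, (vec_sum, count) in merged.items():
        updated[j] = tuple(v / count for v in vec_sum)
    return updated


# Hypothetical toy data: two splits of 2-D points, two starting centroids.
km_partitions = [[(0.0, 0.0), (10.0, 10.0)], [(0.0, 2.0), (10.0, 12.0)]]
km_centroids = [(1.0, 1.0), (9.0, 9.0)]
km_centroids = reduce_update(
    [map_assign(part, km_centroids) for part in km_partitions], km_centroids
)
print(km_centroids)
```

A full run would repeat this map and reduce pair until the centroids stop moving; the sketch shows a single iteration only.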
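Slides 22 to 24 combine into one Newton-Raphson step for logistic regression: mappers compute partial gradients and partial Hessians over their splits, the reducer sums them, and the driver solves $H\,s = \nabla_w l(w)$ and updates $w \leftarrow w - s$. The following single-process sketch assumes that reading. The function names, the small Gauss-Jordan solver, and the toy data (with a constant first feature acting as a bias term) are hypothetical and are not taken from the slides or from Mahout.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def map_grad_hess(partition, w):
    """Map: partial gradient and Hessian of l(w) over one split of (x, y) pairs."""
    n = len(w)
    grad = [0.0] * n
    hess = [[0.0] * n for _ in range(n)]
    for x, y in partition:
        p = sigmoid(sum(wk * xk for wk, xk in zip(w, x)))
        for k in range(n):
            grad[k] += (y - p) * x[k]
            for j in range(n):
                hess[k][j] += p * (p - 1.0) * x[k] * x[j]
    return grad, hess


def reduce_grad_hess(parts):
    """Reduce: sum the partial gradients and Hessians element by element."""
    n = len(parts[0][0])
    grad = [sum(g[k] for g, _ in parts) for k in range(n)]
    hess = [[sum(h[k][j] for _, h in parts) for j in range(n)] for k in range(n)]
    return grad, hess


def solve(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def newton_step(partitions, w):
    """One update w <- w - H^{-1} grad, with grad and H summed across splits."""
    grad, hess = reduce_grad_hess([map_grad_hess(p, w) for p in partitions])
    step = solve(hess, grad)
    return [wk - sk for wk, sk in zip(w, step)]


# Hypothetical toy data: x = (1, feature); the classes overlap, so the maximum exists.
lr_partitions = [
    [((1.0, 0.0), 0), ((1.0, 1.0), 0), ((1.0, 2.0), 0)],
    [((1.0, 1.0), 1), ((1.0, 2.0), 1), ((1.0, 3.0), 1)],
]
lr_w = [0.0, 0.0]
for _ in range(10):
    lr_w = newton_step(lr_partitions, lr_w)
lr_grad, _ = reduce_grad_hess([map_grad_hess(p, lr_w) for p in lr_partitions])
print(lr_w, lr_grad)
```

The Hessian is only $n \times n$, so in this sketch it is solved on the driver after the reduce; only the sums over the $m$ examples are distributed.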