Large-Scale Distributed Online Machine Learning with Jubatus

Large-Scale Distributed Online Machine Learning with Jubatus. 2011/12/08 @ Large-Scale Data Processing Study Group. Preferred Infrastructure, Inc., Yuya Unno (@unnonouno)

Upload: preferred-infrastructure-preferred-networks

Post on 04-Dec-2014


1. Large-Scale Distributed Online Machine Learning with Jubatus. 2011/12/08 @ Large-Scale Data Processing Study Group. Preferred Infrastructure, Yuya Unno (@unnonouno)
2. (@unnonouno)
   • Preferred Infrastructure (PFI)
   • 20
   • Sedue
   • Jubatus
3. Big Data!
   • PCEC
4. STEP 1. STEP 2. STEP 3.
   • 30 30
5. 1) 2) 3)
   • (MapReduce)
   • (CEP)
6. Jubatus
   • Jointly developed by NTT and Preferred Infrastructure
   • Released as OSS on 10/27
   • http://jubat.us/
9. (0, 1, 0, 2.5, -1, ...) (1, 0.5, 0.1, -2, 3, ...) (0, 1, 0, 1.5, 2, ...)
   • SVM, LogReg, PA, CW, AROW, Naive Bayes, CNB, DT, RF, ANN, K-means, Spectral Clustering, MMC, LSI, LDA, GM, HMM, MRF, CRF, ...
11. Jubatus
   • x, y
   • ... or ...
   • Twitter, Tweet
12. Jubatus, IT [table of numeric values; layout not recoverable]
13. 1:, 2:, 3: [numeric values; layout not recoverable]
15. Jubatus
18. Jubatus
   • Perceptron (1958)
   • Passive Aggressive (PA) (2003)
   • Confidence Weighted Learning (CW) (2008)
   • AROW (2009)
   • Normal HERD (NHERD) (2010)
20. Jubatus
21. Jubatus
22. UPDATE, ANALYZE, MIX (cf. MAP / REDUCE)
23. UPDATE
24. ANALYZE
25. MIX
26. (sum) (count)
   • UPDATE
     • sum += x
     • count += 1
   • ANALYZE
     • return (sum / count)
   • MIX
     • sum = sum1 + sum2
     • count = count1 + count2
27. 100%, MIX
28. [Mann2009, McDonald2008], etc.
29. Jubatus
   • http://jubat.us
   • UPDATE/ANALYZE/MIX
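Slide 18 names the perceptron as the earliest of the online linear learners it lists. As a rough illustration of what "online" means there, the following is a minimal sketch of the textbook perceptron update, processing one example at a time. It is not taken from the Jubatus code base; the function names (`score`, `predict`, `update`), the sparse dictionary representation, and the toy stream are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]  # sparse feature vector: name -> value


def score(w: Features, x: Features) -> float:
    # Inner product of the weight vector and a sparse input.
    return sum(w.get(name, 0.0) * value for name, value in x.items())


def predict(w: Features, x: Features) -> int:
    # Binary label in {+1, -1}; a score of exactly 0 is mapped to +1.
    return 1 if score(w, x) >= 0 else -1


def update(w: Features, x: Features, y: int) -> bool:
    # One online step: leave w alone if the example is classified with a
    # positive margin, otherwise add y * x to w. Returns True on a change.
    if y * score(w, x) > 0:
        return False
    for name, value in x.items():
        w[name] = w.get(name, 0.0) + y * value
    return True


# Hypothetical toy stream of (features, label) pairs, seen one at a time.
stream: List[Tuple[Features, int]] = [
    ({"sports": 1.0, "ball": 1.0}, 1),
    ({"it": 1.0, "server": 1.0}, -1),
    ({"ball": 1.0}, 1),
    ({"server": 1.0}, -1),
]

w: Features = {}
mistakes = sum(update(w, x, y) for x, y in stream)
print(mistakes, w)
```

Each example is used once and then discarded, so memory use depends only on the number of distinct features, not on the length of the stream. PA, CW, AROW, and NHERD keep the same one-example-at-a-time loop and differ in how the update is computed.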
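The running-average example on slide 26 maps directly onto three small operations. The sketch below follows the slide's pseudocode (sum and count, with UPDATE, ANALYZE, and MIX); the class name `MeanModel`, the field name `total` (used in place of `sum` to avoid shadowing the Python built-in), and the two-server setup are assumptions for illustration, not Jubatus's actual API.

```python
from dataclasses import dataclass


@dataclass
class MeanModel:
    """Local model for a running average: a sum and a count."""
    total: float = 0.0  # "sum" on the slide
    count: int = 0

    def update(self, x: float) -> None:
        # UPDATE: fold one observation into the local model.
        self.total += x
        self.count += 1

    def analyze(self) -> float:
        # ANALYZE: answer a query using the local model only.
        if self.count == 0:
            raise ValueError("no data observed yet")
        return self.total / self.count


def mix(a: MeanModel, b: MeanModel) -> MeanModel:
    # MIX: merge two local models by adding their sums and counts.
    return MeanModel(a.total + b.total, a.count + b.count)


# Two hypothetical servers each see a different part of the stream.
server1, server2 = MeanModel(), MeanModel()
for x in [1.0, 2.0, 3.0]:
    server1.update(x)
for x in [10.0, 20.0]:
    server2.update(x)

mixed = mix(server1, server2)
print(mixed.analyze())  # 7.2, the mean of all five values
```

Because MIX only adds sums and counts, the mixed model gives the same answer as a single server that had seen every value, and the servers never need to exchange raw data.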