tut_pfi_2012

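Technology for Large-Scale Data Analysis and Its Practical Application. Preferred Infrastructure, Inc., Okanohara. Special lecture at Toyohashi University of Technology (2012/06/12)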

Upload: preferred-infrastructure-preferred-networks

Post on 04-Dec-2014

DESCRIPTION

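These are the slides used for a special lecture at Toyohashi University of Technology on 2012/6/12. They differ slightly from the material shown at the lecture.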

TRANSCRIPT

  • 1. 2012/06/12 Preferred Infrastructure
  • 2. Preferred Infrastructure; Jubatus
  • 3. Preferred Infrastructure; Jubatus
  • 4. Preferred Infrastructure (PFI): March 2006; 26 + 7; 2-40-1; NHK, NII, BP; Jubatus
  • 5. PFI: Bazil (SPAM, Twitter, EC); Sedue (Web); Jubatus (SNS, POS)
  • 6. Sedue: BP; asahi.com; NHK; XAPPYEC/Web; Webcat Plus
  • 7. PFI: javascript, flash, rails, haskell
  • 8. PFI; IT
  • 9. Preferred Infrastructure; Jubatus
  • 10.
  • 11. BigData! 45%; 2020; 40; 35 ZB [Digital Universe 2010]. 3V: Volume, Variety, Velocity. PC, EC
  • 12. BigData; BigData; Jim Gray: (1) (2) (3) (4)
  • 13. Google, Amazon, Facebook
  • 14. 3 STEP: STEP 1. STEP 2. STEP 3.; Step1, Step2; 30, 30
  • 15.
  • 16. (1/2) NY: proactive maintenance; 1500; 300; realtime/semi-realtime/static; MTBF (mean time between failures). Machine Learning for the New York City Power Grid, J. IEEE Trans. PAMI, to appear; Con Edison
  • 17. (2/2) Reactive Maintenance; DB
  • 18. Preferred Infrastructure; Jubatus
  • 19.
  • 20.
  • 21. (0, 1, 0, 2.5, -1, ...), (1, 0.5, 0.1, -2, 3, ...), (0, 1, 0, 1.5, 2, ...); SVM, LogReg, PA, CW, AROW, Naive Bayes, CNB, DT, RF, ANN, K-means, Spectral Clustering, MMC, LSI, LDA, GM, HMM, MRF, CRF, ...
  • 22.
  • 23. 2 1 IT 1 1 1 1 0.7 150 0
  • 24. xy / y y DB
  • 25.
  • 26. 1: EC
  • 27. 2:
  • 28. 3:
  • 29. 4: SNS
  • 30. Google, MS, Yahoo!
  • 31. MRF, CRF, HMM; MAP
  • 32. Residual Splash Belief Propagation [J. E. Gonzalez AISTATS 2009]; GraphLab [Y. Low et al. UAI 2010]
  • 33. [S. Singh LCCC 2010] 50; 250; 5x; NY Times; 20100; 250
  • 34. $F(\theta)$; SVM: $F(\theta) = \sum_i L_{hinge}(\theta) + C\|\theta\|^2$; MapReduce; Iterative Parameter Mixture
  • 35. Parameter Mixture: 1. K shards; 2. shard; 3. shard: $\theta = (\sum_i \theta_i)/K$
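The three steps on slide 35 (split into K shards, train on each shard, average the results) can be sketched as follows. This is a minimal illustration only: the perceptron base learner, the function names, and the `epochs` parameter are assumptions made for the example and do not come from the slides.

```python
import numpy as np

def train_perceptron(X, y, epochs=5):
    """Train a plain perceptron on one shard (assumed base learner)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * np.dot(w, xi) <= 0:   # misclassified or on the boundary
                w += yi * xi
    return w

def parameter_mixture(X, y, k, epochs=5):
    """Split the data into k shards, train independently, average once."""
    shards = zip(np.array_split(X, k), np.array_split(y, k))
    thetas = [train_perceptron(Xs, ys, epochs) for Xs, ys in shards]
    return sum(thetas) / k                # theta = (sum_i theta_i) / K
```

Each shard is trained with no communication at all, so the only synchronization cost is the single averaging step at the end.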
  • 36. 2: Distributed Gradient; shard
  • 37. 3: Asynchronous Update; lock
  • 38. 4: Iterative Parameter Mixture [Mann et al. 09] [McDonald et al. 10]. Parameter Mixture: 1. shard; 2. shard; 3.; 4. shard; 1 epoch; Shard
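Slide 38 appears to describe the iterative variant: every shard trains for one epoch, the parameters are averaged, and the average is sent back to the shards before the next epoch. A minimal sketch under that reading, again assuming a perceptron as the per-shard learner (the function names are hypothetical):

```python
import numpy as np

def perceptron_epoch(w, X, y):
    """One pass of perceptron updates over a shard, starting from w."""
    w = w.copy()
    for xi, yi in zip(X, y):
        if yi * np.dot(w, xi) <= 0:
            w += yi * xi
    return w

def iterative_parameter_mixture(X, y, k, epochs=10):
    """Average the shard parameters after every epoch and redistribute them."""
    X_shards = np.array_split(X, k)
    y_shards = np.array_split(y, k)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        # Every shard starts the epoch from the same mixed parameters.
        local = [perceptron_epoch(w, Xs, ys)
                 for Xs, ys in zip(X_shards, y_shards)]
        w = np.mean(local, axis=0)        # mix, then send back to all shards
    return w
```

Compared with one-shot averaging, this costs one synchronization per epoch but lets each shard benefit from what the others have learned.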
  • 39. 1 [K. Hall LCCC 2010]: 37000; 200; 240 worker; MapReduce; Iterative Parameter Mixture; 70
  • 40. 2 [K. Hall LCCC 2010]: Single-node; 16; 900; 600 worker; MapReduce; Iterative Parameter Mixture
  • 41. Preferred Infrastructure; Jubatus
  • 42. Jubatus
  • 43. Jubatus: 1) 2) 3); (MapReduce/Hadoop); (CEP)
  • 44. Jubatus: NTT SIC* and Preferred Infrastructure; OSS, October 2011; http://jubat.us/ (* NTT SIC: NTT Software Innovation Center)
  • 45. 1: twitter, 6000 QPS
  • 46. 1: twitter, 6000 QPS
  • 47. 2:
  • 48.
  • 49. Jubatus
  • 50. Jubatus: Perceptron (1958); Passive Aggressive (PA) (2003); Confidence Weighted Learning (CW) (2008); AROW (2009); Normal HERD (NHERD) (2010)
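To make one entry of the list on slide 50 concrete, here is a minimal sketch of a single Passive Aggressive update for binary labels (the basic, unregularized variant). It illustrates the published algorithm, not Jubatus's own code; `pa_update` is a hypothetical name.

```python
import numpy as np

def pa_update(w, x, y):
    """One Passive Aggressive step for a label y in {+1, -1}."""
    loss = max(0.0, 1.0 - y * np.dot(w, x))   # hinge loss on this example
    sq_norm = np.dot(x, x)
    if loss > 0.0 and sq_norm > 0.0:
        tau = loss / sq_norm                  # smallest step that removes the loss
        w = w + tau * y * x
    return w                                  # unchanged if the margin is already met
```

The update is "passive" when the example is already classified with margin 1 and "aggressive" otherwise, moving just far enough to satisfy the margin.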
  • 51.
  • 52. Jubatus; mix
  • 53. Jubatus
  • 54. Jubatus; MapReduce Shuffle
  • 55. Jubatus (c.f. MapReduce: Map, Reduce): UPDATE; ANALYZE; MIX
  • 56. UPDATE
  • 57. ANALYZE
  • 58. MIX
  • 59. (sum), (count). UPDATE: sum_i += x; count_i += 1. ANALYZE: return (sum / count). MIX: sum = sum_1 + sum_2; count = count_1 + count_2
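The mean example on slide 59 maps directly onto a small class. The sketch below follows the slide's three operations; the class name `MeanModel` and the choice to have `mix` return a new merged model are assumptions made for illustration.

```python
class MeanModel:
    """Per-node state for computing a mean: a running sum and a count."""

    def __init__(self, total=0.0, count=0):
        self.sum = total
        self.count = count

    def update(self, x):
        # UPDATE: sum_i += x; count_i += 1 (touches local state only)
        self.sum += x
        self.count += 1

    def analyze(self):
        # ANALYZE: return (sum / count); raises ZeroDivisionError if empty
        return self.sum / self.count


def mix(a, b):
    # MIX: sum = sum_1 + sum_2; count = count_1 + count_2
    return MeanModel(a.sum + b.sum, a.count + b.count)
```

For example, if one node has seen 1, 2, 3 and another has seen 10, each node's local mean is 2.0 and 10.0, while the mixed model reports 16 / 4 = 4.0.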
  • 60. 100%; MIX
  • 61. Jubatus: Perceptron, PA, CW, AROW; PA; Inverted File Index, LSH
  • 62. Jubatus. Model: lm1, lm2, lm3; local model (lm); shared model (sm). Update: lm, x, lm, lm. Analyze: x, lm, sm, y. Mix: lm, lm, sm, sm, sm
  • 63. Model: lm: $w_{local} \in R^m$; sm: $w_{share} \in R^m$. Update: $w_{local}$; $x \in R^m$, $y \in \{+1, -1\}$; $w_{local} := w_{local} + yx$. Analyze: $(w_{local} + w_{share})^T x \ge 0$: +1, otherwise -1. Mix: $w_{local}$, $w_{share}$. Regret
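A sketch of how the linear-classifier model on slide 63 might fit the Update / Analyze / Mix pattern. The update and prediction rules follow the slide. Two things are assumptions, because the slide text for them did not survive: updating only on mistakes (the perceptron rule), and a mix step that averages the local weights across nodes, folds the average into the shared weights, and resets the local weights. `Node` and `mix` are hypothetical names.

```python
import numpy as np

class Node:
    def __init__(self, dim):
        self.w_local = np.zeros(dim)   # lm: changed only by local updates
        self.w_share = np.zeros(dim)   # sm: changed only by mix

    def analyze(self, x):
        # Predict +1 if (w_local + w_share)^T x >= 0, otherwise -1.
        x = np.asarray(x, dtype=float)
        return 1 if np.dot(self.w_local + self.w_share, x) >= 0 else -1

    def update(self, x, y):
        # Assumed perceptron rule: w_local := w_local + y x on a mistake.
        x = np.asarray(x, dtype=float)
        if self.analyze(x) != y:
            self.w_local += y * x


def mix(nodes):
    # Assumed mix: average local weights, add to shared, reset local.
    avg = np.mean([n.w_local for n in nodes], axis=0)
    for n in nodes:
        n.w_share = n.w_share + avg
        n.w_local = np.zeros_like(avg)
```

With this arrangement, updates never block on the network: a node learns into `w_local` immediately and only sees other nodes' progress after the next mix.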
  • 64. Model: lm: CHT, $x_{local} \in R^m$; sm: bit signature (LSH, minhash), $b_{share} \in R^k$. Update: x; signature of x. Analyze: $b_{share}$. Mix: $x_{local}$; bit signature
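Slide 64 names bit signatures (LSH, minhash) as the shared model. One common way to build such a signature is random-hyperplane LSH, sketched below; whether Jubatus uses exactly this scheme is not stated on the slide, and all function names here are hypothetical.

```python
import numpy as np

def make_projection(dim, n_bits, seed=0):
    """Random hyperplanes; every node must share the same ones (same seed)."""
    return np.random.default_rng(seed).normal(size=(n_bits, dim))

def signature(R, x):
    """k-bit signature: one bit per hyperplane, set when x lies on its positive side."""
    return R @ np.asarray(x, dtype=float) >= 0

def hamming(a, b):
    """Number of differing bits; small distance suggests a small angle."""
    return int(np.count_nonzero(a != b))

def nearest(query_sig, table):
    """Return the key in table (name -> signature) closest to the query."""
    return min(table, key=lambda name: hamming(query_sig, table[name]))
```

Only the k-bit signatures need to be exchanged between nodes, which is far cheaper than shipping the original m-dimensional vectors.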
  • 65. Jubatus
  • 66. Jubatus: C++; Python; Ruby; Java; IDL; Haskell, msgpack idl
  • 67. Jubatus
  • 68.
  • 69. (1/2)
  • 70. (2/2) OpenXC Project; Arduino, Android, OSS; JSON
  • 71. Copyright 2006-2012 Preferred Infrastructure All Rights Reserved.