The Intelligence Platform That Jubatus Aims For
DESCRIPTION
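Talk given in the session "Big Data Analysis and Communication Behavior Analysis for Realizing Intelligent Environments" at the IEICE Society Conference 2013.
TRANSCRIPT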
- 1. Jubatus Preferred Infrastructure
- 2. Jubatus: jointly developed by NTT SIC* and Preferred Infrastructure; released as OSS in 2011/10; http://jubat.us/ (* NTT Software Innovation Center)
- 3. Agenda (five items; the third and fifth mention Jubatus)
- 4. Preferred Infrastructure (PFI): IR; 2006/3; Sedue; Bazil; Jubatus
- 5. 2622/; Ex- Sony, IBM, Yahoo!, Sun, mixi, GREE; IPA 5; ICPC 7, ICFP; TopCoder RedCoder 3 (25); Hadoop; Hadoop, Haskell; 2; 2013
- 6. Agenda
- 7. Web, Twitter; Web; M2M
- 8. Volume, Variety, Velocity; Complex Event Processing, Hadoop, NoSQL, M2M
- 9. SQL, DWH, BI, CEP, M/R, CQL, (Machine Learning)
- 10.
- 11. DB
- 12. IBM, 2012: 24%; 47%; 6%. Source: IBM Institute for Business Value, "Analytics: The real-world use of big data", 2013
- 13. 3: 1.; 2. DB, DB; 3.
- 14. Jubatus; (MapReduce/Hadoop); (CEP); 1., 2., 3.
- 15. WEKA (1993-); SVM light (1998-); Mahout (2006-); Jubatus (2011); Structured Perceptron [Collins, EMNLP 2002]; Passive Aggressive / MIRA (2004); online-learning library [, 2008]
- 16. Google: GFS/MapReduce (Hadoop) [Google 2004], + MapReduce; Chubby (Zookeeper) [Google 2006]; DB: BigTable (HBase) [Google 2006], KVS; Dynamo [Amazon 2007], KVS; MegaStore [Google 2011], KVS; OLAP: Hive [Facebook 2009], SQL, Hadoop; Dremel (Apache Drill) [Google 2010], OLAP; PowerDrill [Google 2012], OLAP; OSS; Spanner [Google 2012]
- 17. Agenda
- 18. Reference: Raia Hadsell, Sumit Chopra, Yann LeCun, "Dimensionality Reduction by Learning an Invariant Mapping", CVPR, 2006
- 19. Jubatus functions: classification (Perceptron / PA / CW / AROW / NHERD); regression (PA-based regression); nearest-neighbor search (LSH / MinHash / Euclid LSH); anomaly detection (LOF); graph analysis (PageRank)
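To make one of the listed online classifiers concrete, here is a minimal sketch of the basic Passive Aggressive (PA) update for binary labels. It is not Jubatus code: the function names `dot` and `pa_update`, the sparse-dict feature representation, and the toy feature names are assumptions made for illustration only.

```python
from typing import Dict

Vector = Dict[str, float]  # sparse feature vector (hypothetical representation)


def dot(w: Vector, x: Vector) -> float:
    """Inner product of a weight vector and a sparse example."""
    return sum(w.get(k, 0.0) * v for k, v in x.items())


def pa_update(w: Vector, x: Vector, y: int) -> float:
    """Apply one basic PA step in place; y must be +1 or -1.

    Returns the hinge loss measured before the update.
    """
    loss = max(0.0, 1.0 - y * dot(w, x))
    sq_norm = sum(v * v for v in x.values())
    if loss > 0.0 and sq_norm > 0.0:
        tau = loss / sq_norm  # smallest step that brings the margin to 1
        for k, v in x.items():
            w[k] = w.get(k, 0.0) + tau * y * v
    return loss


# Examples arrive one at a time; the model is updated immediately.
w: Vector = {}
stream = [
    ({"spam": 1.0, "offer": 1.0}, +1),
    ({"meeting": 1.0, "agenda": 1.0}, -1),
]
for x, y in stream:
    pa_update(w, x, y)
```

Each example is seen once and then discarded, which is the property that distinguishes this family of algorithms from batch learners.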
- 20. Predict y from x; learn from examples {(x, y)}. Example: Twitter, Tweet
- 21. Jeopardy!
- 22. Jubatus, Twitter: NTT Data, Twitter Japan; Firehose, Tweet; Jubatus, API. http://blog.jp.twitter.com/2012/09/twitter.html http://www.nttdata.com/jp/ja/news/release/2012/092700.html
- 23. Jubatus, NEDO, IT. NEDO: IT
- 24. Jubatus
- 25. Jubatus; MRI
- 26.
- 27. Overview: Data Source; Fluentd; Realtime Analysis Server; Jubatus; On-Disk Instance; On-Memory Instance; Web Server + Visualization Tool Kit
- 28.
- 29.
- 30. Sedue for BigData; 2013/08/15 12:08:30.200
- 31. Agenda
- 32. Model: y = a x + b. Training: given an example (y = +1, x = +2), a x + b = 2a + b; adjust a and b so that y > 0. Prediction: for an input (x = -5), compute y = a x + b = -5a + b. (Diagram: Model, x, y)
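The one-dimensional example on this slide can be traced in a few lines. The slide does not say which update rule is used, so the sketch below assumes the classic perceptron rule; `score` and `perceptron_step` are hypothetical names.

```python
def score(a: float, b: float, x: float) -> float:
    """Linear model from the slide: y = a x + b."""
    return a * x + b


def perceptron_step(a: float, b: float, x: float, y: int):
    """Adjust (a, b) only when the sign of a x + b disagrees with y.

    Assumed update rule (perceptron); the slide only says that a and b
    are adjusted so that the score has the right sign.
    """
    if y * score(a, b, x) <= 0:
        a += y * x
        b += y
    return a, b


# Training example from the slide: y = +1, x = +2.
a, b = 0.0, 0.0
a, b = perceptron_step(a, b, x=2.0, y=+1)

# Prediction for the slide's input x = -5: the sign of -5a + b.
prediction = score(a, b, -5.0)
```

With these starting values a single step gives a = 2 and b = 1, so the training example scores 2a + b = 5 > 0, and the input x = -5 scores -5a + b = -9.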
- 33. $w_1, w_2, \ldots, w_n$
- 34. LSH / MinHash: bit signatures by ID. 1: 011010010; 2: 110001100; 3: 110010111; 4: 000100101; 5: 110101011; 6: 000010110
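The point of such signatures is that similarity search reduces to comparing short bit strings. The sketch below uses the six signatures from the slide and finds the nearest neighbor by Hamming distance. How the signatures were generated is not recoverable from the slide, so only the comparison step is shown; `hamming` and `nearest` are hypothetical helper names.

```python
# Bit signatures copied from the slide, keyed by ID.
signatures = {
    1: "011010010",
    2: "110001100",
    3: "110010111",
    4: "000100101",
    5: "110101011",
    6: "000010110",
}


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")
    return sum(ca != cb for ca, cb in zip(a, b))


def nearest(query_id: int):
    """Return (id, distance) of the closest other signature."""
    query = signatures[query_id]
    return min(
        ((other, hamming(query, sig))
         for other, sig in signatures.items() if other != query_id),
        key=lambda pair: pair[1],
    )


closest_id, distance = nearest(1)
```

For ID 1 the closest signature is ID 6, at Hamming distance 3.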
- 35. (Diagram: Update steps at time = 1, 2, 3)
- 36. Jubatus: three operations: UPDATE, MIX, ANALYZE
- 37. UPDATE: 1 or 2; Initial model -> Local model 1; Initial model -> Local model 2
- 38. MIX: Local model 1 - Initial model = Model diff 1; Local model 2 - Initial model = Model diff 2; Model diff 1 + Model diff 2 = Merged diff; Initial model + Merged diff = Mixed model (on both servers)
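The diff-based MIX step on this slide can be written out directly. This is a minimal sketch that follows the arithmetic as it appears on the slide (diffs are summed); models are plain lists of floats, and `subtract`, `add`, and `mix` are hypothetical names, not Jubatus APIs. Whether the merged diff is additionally scaled is not recoverable from this slide.

```python
from typing import List

Model = List[float]


def subtract(a: Model, b: Model) -> Model:
    return [x - y for x, y in zip(a, b)]


def add(a: Model, b: Model) -> Model:
    return [x + y for x, y in zip(a, b)]


def mix(initial: Model, local_models: List[Model]) -> Model:
    """Initial model + sum of (local model - initial model)."""
    merged_diff = [0.0] * len(initial)
    for local in local_models:
        merged_diff = add(merged_diff, subtract(local, initial))
    return add(initial, merged_diff)


initial = [1.0, 0.0, -1.0]
local_1 = [1.5, 0.0, -1.0]   # server 1 changed only the first weight
local_2 = [1.0, 0.25, -2.0]  # server 2 changed the other two
mixed = mix(initial, [local_1, local_2])
```

Only the diffs need to travel between servers, so untouched weights cost nothing to synchronize; every server then continues from the same mixed model.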
- 39. ANALYZE: Mixed model; Mixed model
- 40. Jubatus MIX (linear models): $w_1, w_2, \ldots, w_n$ are mixed into $w = \frac{1}{n}(w_1 + \cdots + w_n)$, and every server receives $w$
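The averaging formula on this slide is short enough to check by hand. The sketch below is an illustration only; `mix_average` is a hypothetical name and the weight vectors are made-up toy values.

```python
from typing import List


def mix_average(models: List[List[float]]) -> List[float]:
    """Element-wise average: w = (1/n)(w_1 + ... + w_n)."""
    n = len(models)
    if n == 0:
        raise ValueError("need at least one model")
    dim = len(models[0])
    if any(len(m) != dim for m in models):
        raise ValueError("all models must have the same dimension")
    return [sum(m[i] for m in models) / n for i in range(dim)]


# Three servers, each holding its own weight vector.
w = mix_average([[1.0, 2.0], [3.0, 4.0], [5.0, 0.0]])
```

Here w is [3.0, 2.0]. If every server started from the same initial model, this average equals the initial model plus the average of the per-server diffs.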
- 41. Jubatus MIX (LSH / MinHash): signatures by ID. 1: 011010010; 2: 110001100; 3: 110010111; 4: 000100101; 5: 110101011; 6: 000010110. Mix: each of three servers holds 1: 011010010, 6: 000010110, ...
- 42. Agenda
- 43. Edge-heavy. Source: Edge-Heavy Data: CPS, GICTF 2012, http://www.gictf.jp/doc/20120709GICTF.pdf
- 44. Edge-Heavy Data: exhaust data
- 45.
- 46. Edge-Heavy Data (1): Jubatus on OpenBlocks (ARM). http://obdnmagazine.blogspot.jp/2012/11/jubatusopenblocks-ax3_21.html
- 47. Edge-Heavy Data (2): ZigBee; LIDAR; FPGA; GPGPU; CUDA; Xeon Phi; x86
- 48. Edge-Heavy Data: HW; 1. SSD; 2.; 3. GFS + MapReduce; 4.; 5. BI-4-1 Krill: PFI
- 49. P2P; HW/NW, SW; SSD; HW/NW/SW; Google, IBM, Oracle, Intel