How to Make a Data Scientist

How to Make a Data Scientist. PFI Seminar, 2012/09/13. Preferred Infrastructure, Inc. Researcher & Jubatus team leader, Shohei Hido

Upload: shohei-hido

Post on 04-Dec-2014


Category:

Technology



DESCRIPTION

Slides for the PFI Seminar "How to Make a Data Scientist", held 2012/09/13. A recording is available on Ustream → http://www.ustream.tv/recorded/25376704

TRANSCRIPT

  • 1. PFI Seminar, 2012/09/13, Preferred Infrastructure, Jubatus

  • 2. HIDO Shohei; TwitterID: @sla; 2006; 2006-2012: IBM; 2012-: Jubatus
  • 3. [Chart: 2010, 2011, 2012; Google; axis from 0 to 3500 in steps of 500]
  • 4. http://www.computerworld.jp/topics/1468/204704/ ; http://www.computerworld.jp/topics/1468/204705/ ; http://www.computerworld.jp/contents/204769
  • 5-8. [no text recovered]
  • 9. Data science, Applied statistics; Data scientist, Applied statistician
  • 10. Data science: Wikipedia; Data scientist: practitioner; http://en.wikibooks.org/wiki/Data_Science:_An_Introduction/
  • 11. Data science; William S. Cleveland, Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics, 2001; percentages shown: 15%, 20%, 25%, 20%, 15%, 5%; http://cm.bell-labs.com/cm/ms/departments/sia/doc/datascience.pdf
  • 12. Data scientist: math and statistics; "For Today's Graduate, Just One Word: Statistics", NYT, 2009; Google, Hal Varian; Web, Netflix Challenge, IBM BAO; "What is data science?", O'Reilly, 2010; Data products and Data-driven apps; CDDB, Data product; Google; http://radar.oreilly.com/2010/06/what-is-data-science.html ; http://www.nytimes.com/2009/08/06/technology/06stats.html
  • 13. Data scientist: for data jiujitsu; "The ability to take data, to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it, that's going to be a hugely important skill in the next decades." Hal Varian, Google
  • 14. Data science or R; Data scientist
  • 15. Wikipedia; http://www.st.keio.ac.jp/learning/0611.html
  • 16. R; ESTRELA, 2003/8 to 2009/7, 72; R; R 2013
  • 17. Data scientist, Big data; Big data; Wikipedia, Data science; Data scientist; Big data; Data science, Big data; http://www.computerworld.jp/topics/617/201766/
  • 18-19. [no text recovered]
  • 20. [Numbered list, items 1 to 14]; PDCA
  • 21. (1)
  • 22. (2)
  • 23. (3)
  • 24. (4) UP
  • 25. (5)
  • 26. (6) Weka, LIBSVM, SVM
  • 27. Twitter; Data Scientist @ Twitter x 3; Principal Data Scientist @ LinkedIn; Data Scientist @ Cloudera, creator of Crunch; Data scientist, blogger, and R evangelist at Revolution Analytics; DeNA; J!NS; Albert; ( @ Treasure Data )
  • 28. (1) 2100; Hadoop, Mahout, R; http://rikunabi-next.yahoo.co.jp/tech/docs/ct_s03600.jsp?p=001829
  • 29. (2) GREE (CEDEC 2011); PDCA; GREE Analytics; http://gigazine.net/news/20110914_gree_howto_cedec2011/
  • 30. Chief Analytics Officer; CAO; IT, CAO, IT
  • 31. 10; http://ci.nii.ac.jp/naid/110008722771
  • 32-33. [no text recovered]
  • 34. 1: [Chart: percentage breakdowns; values shown: 15%, 20%, 20%, 15%, 25%, 25%, 20%, 20%, 15%, 5%, 5%, 15%; label: R]
  • 35. [Chart: scale from 5 to 0; labels: BI, KPI, R/Matlab/, Weka, Hadoop/, RHadoopOSS, NoSQL, BI, DWH/BI/]
  • 36. 2: [Chart: scale from 5 to 0; labels: R/Matlab/, Weka, BI, Hadoop/, NoSQL, DWH/BI/]
  • 37. [no text recovered]