[Big Data Conference, Jeon Hee-Won]

Next Revolution Toward Open Platform R and RHive in Data Scientist’s toolbox NexR Data Scientist Jeon Hee-Won

Upload: jayoung-lim

Post on 29-Nov-2014

TRANSCRIPT

1. Next Revolution Toward Open Platform. R and RHive in Data Scientist's toolbox. NexR Data Scientist Jeon Hee-Won

2. [text not recovered]

3. SAS understands why R. "A key benefit of R is that it provides near-instant availability of new and experimental methods created by its user base without waiting for the development/release cycle of commercial software. SAS recognizes the value of R to our customer base." Michael Gilliland, Product Marketing Manager, SAS Institute, Inc.

4. Using R.
   http://www.kdnuggets.com/2011/08/poll-languages-for-data-mining-analytics.html
   http://blog.revolutionanalytics.com/2011/11/r-still-the-preferred-tool-of-predictive-modelers-competing-at-kaggle.html

5. R / ff, bigmemory, RevoScaleR; GB, 10GB; gc(), rm(); 32, 2^31-1; R 2.15, 2^51; No int64; TB; int64 package from Google; 64bit; Single Core; CPU 1; R 2.14 parallel

6. Motivation of RHive. select * from foo; Map/Reduce for data analysis? SQL for data analysis!

7. RHive; R

8. RHive Analytics; RHive

9. Usages. [Diagram labels: RHIVE (ETL); Network, Virtual Machine, Disk Volume; Network Log, VM Log, Disk Volume Log; (Aggregate) Account Level; R (Plotting); (Clustering); scale; SEG; Cluster; Output]

10. SNA with CDR. SNA? Image from https://www.facebook.com/notes/facebook-engineering/visualizing-friendships/469716398919

11. Big Data Problems; Big Data

12. Solving. Map/Reduce, MPI, ?; multicore programming; vertex, edge; sub-network; RHive

13. Group Network Tracking; SNA

14. SNA; Influential Customer; Inbound (node) Call; Network Group; Outbound Call; Fisher; Fishing, SMS Detection; Network SNA; Node Demographics; Network SNA; Network

15. Q&A. [email protected]
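Slide 5 lists several R constraints only as keywords (2^31-1, gc(), rm(), the parallel package in R 2.14). The sketch below is one possible base-R illustration of those keywords and is not taken from the talk. The object names (`int_max`, `big`, `squares`) and the worker count of 2 are assumptions chosen for the example.

```r
# Illustration of the keywords on slide 5, using base R only.
library(parallel)  # bundled with R since version 2.14

# The largest value of R's 32-bit integer type is 2^31 - 1.
int_max <- .Machine$integer.max

# Integer arithmetic past that limit overflows to NA (R emits a warning),
# while the same sum carried out in double precision does not.
overflowed <- suppressWarnings(int_max + 1L)
as_double  <- int_max + 1

# rm() drops the binding of a large object; gc() then asks R to reclaim memory.
big <- numeric(1e6)
rm(big)
invisible(gc())

# R evaluates on a single core by default; a local cluster of worker
# processes spreads independent tasks across cores.
cl <- makeCluster(2)  # 2 workers is an arbitrary choice for this sketch
squares <- parLapply(cl, 1:4, function(x) x^2)
stopCluster(cl)
```

In practice the worker count would come from `detectCores()`, and a data set that does not fit in memory would need one of the out-of-memory packages the slide names (ff, bigmemory, RevoScaleR) or a Hive-backed approach such as RHive.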