Digital Transformation: Big Data and Data Science Learning Path
Chula Data Science Center of Excellence in Multi-Disciplinary Big Data Analytics
Big Data and Data Science Learning Path
Digital Transformation #แบ่งปัน (#Sharing)
Asst. Prof. Natawut Nupairoj, Ph.D.
Head of Department, Dept. of Computer Engineering, Faculty of Engineering, Chulalongkorn University
[email protected]
@natawutn
http://natawutn.wordpress.com
http://www.slideshare.net/natawutnupairoj
Data Science = Sensors + Big Data + Data Analytics
The New Equation
Data Analytics Simplified
Descriptive • “A. Natawut drinks about 1 cup of coffee a day”
Diagnostic • “The number of cups that A. Natawut drinks depends on the number of meetings he has each day”
Predictive • “Tomorrow, A. Natawut has 2 meetings, so it is very likely that he will drink 2 cups tomorrow”
Prescriptive • “Inform the secretary to prepare 1 cup in the morning and one in the afternoon for A. Natawut”
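The four levels above can be sketched on a toy data set. This is only an illustration: the daily log is made up, the straight-line fit is one assumed way to express "depends on", and `predict_cups` and `plan_coffee` are hypothetical helper names, not part of the talk.

```python
from statistics import mean

# Hypothetical daily log (made-up numbers): (meetings, cups of coffee)
log = [(0, 0), (1, 1), (2, 2), (1, 1), (0, 1), (1, 1), (2, 2), (1, 0)]
meetings = [m for m, _ in log]
cups = [c for _, c in log]

# Descriptive: what happened? Average cups per day.
avg_cups = mean(cups)

# Diagnostic: why? Fit cups = slope * meetings + intercept by least squares.
mx, my = mean(meetings), mean(cups)
slope = sum((x - mx) * (y - my) for x, y in log) / sum((x - mx) ** 2 for x in meetings)
intercept = my - slope * mx

# Predictive: what will happen? Expected cups for a given number of meetings.
def predict_cups(n_meetings):
    return round(slope * n_meetings + intercept)

# Prescriptive: what should we do? Split the predicted cups over the day.
def plan_coffee(n_meetings):
    total = predict_cups(n_meetings)
    morning = (total + 1) // 2
    return {"morning": morning, "afternoon": total - morning}

print("average cups/day:", avg_cups)
print("cups per extra meeting:", slope)
print("prediction for 2 meetings:", predict_cups(2))
print("plan for 2 meetings:", plan_coffee(2))
```

Each step reuses the output of the previous one, which is why the prescriptive level is only as good as the prediction feeding it.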
Sensors = App / IoT / Social Network
Big Data = Processing Capabilities
Data Analytics = Domain-Oriented Machine Learning
Introducing FDA-Approved Ingestible Sensors in Pills
http://www.forbes.com/sites/singularity/2012/08/09/no-more-skipping-your-medicine-fda-approves-first-digital-pill/
Case study: Predictive Policing
Being used by 60 cities in the US, e.g. Atlanta, LA, etc.
Source: http://www.forbes.com/sites/ellenhuet/2015/02/11/predpol-predictive-policing
NHK Documentary: Disaster Big Data - Key to Recovery
Key Question
“How many people still reside in each area?”
Challenges
• How to process big data?
  • 122M subscribers + 2.5 years of data = 200-300 TB
• How to analyze data?
  • What is the definition of “residence”?
  • How to sample mobile subscribers correctly?
• How can we understand the results?
  • How to visualize data?
  • How to tell the story?
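As a rough sanity check on the data volume quoted above, the figures can be turned into a per-subscriber daily footprint. This is a back-of-envelope sketch only; it assumes decimal terabytes and 365-day years, neither of which is stated on the slide.

```python
SUBSCRIBERS = 122_000_000   # from the slide
DAYS = 2.5 * 365            # assumption: 365-day years
TB = 10 ** 12               # assumption: decimal terabytes

def bytes_per_subscriber_day(total_tb):
    """Average bytes stored per subscriber per day for a given total volume."""
    return total_tb * TB / (SUBSCRIBERS * DAYS)

low = bytes_per_subscriber_day(200)
high = bytes_per_subscriber_day(300)
print(f"{low:.0f} to {high:.0f} bytes per subscriber per day")
```

Under these assumptions the total works out to roughly 1.8 to 2.7 KB per subscriber per day, so the challenge comes from the number of subscribers and the time span rather than from the size of any single record.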
“Data Science is a Team Sport” – DJ Patil
Domain Knowledge
Math & Statistics
Computer Science
Data Scientist
Statistical Research
Data Processing
Machine Learning
Data Scientist Skills in the Context of NHK Documentary
Domain Knowledge
Math & Statistics
Computer Science
Statistical Research
Data Processing
Machine Learning
• How to store 300 TB of data?
• How to process 300 TB effectively?
• How about data cleansing?
• How to visualize data?

• How to sample data correctly?
• How to turn geolocation into structured data?
• How to predict population accurately?

• How to define “residence”?
• How to classify local people from workers?
• How to utilize these results?
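To make the questions about defining “residence” and separating local people from workers more concrete, here is a minimal sketch of one possible rule: treat the area where a subscriber is most often seen at night as home. The rule, the night-hour window, the `min_night_pings` threshold, and the sample records are all assumptions for illustration; the slides do not say how the documentary's analysts actually defined residence.

```python
from collections import Counter, defaultdict

# Assumption: a ping between 22:00 and 05:59 counts as a "night" observation.
NIGHT_HOURS = set(range(0, 6)) | {22, 23}

def infer_home_areas(pings, min_night_pings=2):
    """Map each subscriber to the area where they are most often seen at night.

    pings: iterable of (subscriber_id, hour_of_day, area_id) tuples.
    Subscribers with fewer than min_night_pings night pings in their top
    area are left unclassified.
    """
    night_counts = defaultdict(Counter)
    for subscriber, hour, area in pings:
        if hour in NIGHT_HOURS:
            night_counts[subscriber][area] += 1
    homes = {}
    for subscriber, counts in night_counts.items():
        area, n = counts.most_common(1)[0]
        if n >= min_night_pings:
            homes[subscriber] = area
    return homes

def residents_per_area(homes):
    """Count inferred residents in each area."""
    return Counter(homes.values())

# Made-up records: s2 and s3 are in area A only during working hours.
pings = [
    ("s1", 23, "A"), ("s1", 2, "A"), ("s1", 10, "B"),
    ("s2", 9, "A"), ("s2", 14, "A"), ("s2", 23, "C"), ("s2", 3, "C"),
    ("s3", 11, "A"), ("s3", 15, "A"),
]
homes = infer_home_areas(pings)
print(homes)
print(residents_per_area(homes))
```

In this toy data, area A has three daytime visitors but only one inferred resident, which is the kind of gap the domain-knowledge questions are meant to expose. Choosing the time window and threshold is exactly where domain knowledge and statistics have to meet.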
Modern Data Science Team
Source: http://www.slideshare.net/continuumio/why-open-data-science-matters-gartner-bi-analytics-summit-16
Understanding/Preparation/Modeling/Evaluation
Deployment
http://nirvacana.com/thoughts/becoming-a-data-scientist/
Most In-Demand Skills for Data Scientists in 2016
Source: https://www.crowdflower.com/what-skills-should-data-scientists-have-in-2016/
Final Thoughts
• A Good Data Scientist Communicates Effectively To Business Users
• A Good Data Scientist Knows Your Business
• A Good Data Scientist Understands Statistical Phenomena
• A Good Data Scientist Makes Efficient Predictions
• A Good Data Scientist Provides Production-Ready Solutions
• A Good Data Scientist Can Work On A Mass Scale
https://blog.dataiku.com/2013/11/10/the-six-core-skills-of-a-data-scientist