NVIDIA / PSCDS / UPSACLAY MEETING
March 30, 2015, LAL
Center for Data Science Paris-Saclay
CÉCILE GERMAIN, Pr / UPSud, LRI
ARNAK DALALYAN, Pr / ENSAE, Laboratoire de Statistique
ALEXANDRE GRAMFORT, MdC / Telecom ParisTech, LTCI
AKIN KAZAKÇI, MdC / Mines ParisTech, CGS
BALÁZS KÉGL, DR / CNRS LAL & LRI, CNRS & University Paris-Sud
OUTLINE
• Why did we launch this meeting?
• The Paris-Saclay CDS
  • challenges, courses, hackathons
• Where are we going?
  • the subject of the discussion after the talks
WHY DID WE LAUNCH THIS MEETING?
• We do research (both applied and basic) on deep learning
• We are running server-side-execution hackathons, which need significant resources for very short periods
  • typically 1 day every 6 weeks (not all of them are GPU-intensive)
  • mutualization (pooling of resources)
• Discuss a Saclay-wide initiative on research on GPUs, find the actors, discuss collaboration with NVIDIA
  • we, at the CDS, do not do research on GPUs; we usually use them through standard ML libraries
  • but mutualization can of course go beyond data science
VIRTUALDATA@P2IO
DATA SCIENCE
Design of automated methods
to analyze massive and complex data
to extract useful information
DATA SCIENCE=
BIG DATA
We are focusing on inference: data → knowledge
Interfacing with infrastructure, security, production
UNIVERSITÉ PARIS-SACLAY
19 founding partners
UNIVERSITÉ PARIS-SACLAY
+ horizontal multi-disciplinary and multi-partner initiatives (“lidex”) to create cohesion
A multi-disciplinary initiative to define, structure, and manage the data science ecosystem at the Université Paris-Saclay
http://www.datascience-paris-saclay.fr/
Biology & bioinformatics: IBISC/UEvry, LRI/UPSud, Hepatinov, CESP/UPSud-UVSQ-Inserm, IGM-I2BC/UPSud, MIA/Agro, MIAj-MIG/INRA, LMAS/Centrale
Chemistry: EA4041/UPSud
Earth sciences: LATMOS/UVSQ, GEOPS/UPSud, IPSL/UVSQ, LSCE/UVSQ, LMD/Polytechnique
Economy: LM/ENSAE, RITM/UPSud, LFA/ENSAE
Neuroscience: UNICOG/Inserm, U1000/Inserm, NeuroSpin/CEA
Particle physics, astrophysics & cosmology: LPP/Polytechnique, DMPH/ONERA, CosmoStat/CEA, IAS/UPSud, AIM/CEA, LAL/UPSud
The Paris-Saclay Center for Data Science
Data Science for scientific Data
250 researchers in 35 laboratories
Machine learning: LRI/UPSud, LTCI/Telecom, CMLA/Cachan, LS/ENSAE, LIX/Polytechnique, MIA/Agro, CMA/Polytechnique, LSS/Supélec, CVN/Centrale, LMAS/Centrale, DTIM/ONERA, IBISC/UEvry
Visualization: INRIA, LIMSI
Signal processing: LTCI/Telecom, CMA/Polytechnique, CVN/Centrale, LSS/Supélec, CMLA/Cachan, LIMSI, DTIM/ONERA
Statistics: LMO/UPSud, LS/ENSAE, LSS/Supélec, CMA/Polytechnique, LMAS/Centrale, MIA/AgroParisTech
Data science: statistics, machine learning, information retrieval, signal processing, data visualization, databases
Domain science: human society, life, brain, earth, universe
Tool building: software engineering, clouds/grids, high-performance computing, optimization
Data scientist
Applied scientist
Domain scientist
Data engineer
Software engineer
datascience-paris-saclay.fr
@SaclayCDS
LIST/CEA
THE DATA SCIENCE ECOSYSTEM
Data domains: energy and physical sciences, health and life sciences, Earth and environment, economy and society, brain
Data scientist
Data trainer
Applied scientist
Domain scientist
Software engineer
Data engineer
Data science: statistics, machine learning, information retrieval, signal processing, data visualization, databases
Tool building: software engineering, clouds/grids, high-performance computing, optimization
TOOLS
We are designing and learning to manage
tools
to accompany data science projects
with different needs
TOOLS: LANDSCAPE TO ECOSYSTEM
Data scientist
Data trainer
Applied scientist
Domain expert
Software engineer
Data engineer
Data science: statistics, machine learning, information retrieval, signal processing, data visualization, databases
Tool building: software engineering, clouds/grids, high-performance computing, optimization
Data domains: energy and physical sciences, health and life sciences, Earth and environment, economy and society, brain
• interdisciplinary projects
• matchmaking tool
• design and innovation strategy workshops
• data challenges
• coding sprints
• Open Software Initiative
• code consolidator and engineering projects
• bootcamps / hackathons
• IT platform for linked data
• annotation tool
• SaaS data science platform
TWO ANALYTICS TOOLS
RAPID ANALYTICS AND MODEL PROTOTYPING
DATA CHALLENGES
DATA CHALLENGES
• A data challenge is a recently developed, unconventional dissemination and communication tool
  • a scientific or industrial data producer arrives with a well-defined problem and a corresponding annotated data set
  • defines a quantitative goal
  • makes the problem and part of the data set (the training set) public on a dedicated site
  • data science experts then take the public training data and submit solutions (predictions) for a test set with hidden annotations
  • submissions are evaluated numerically using the quantitative measure
  • contestants are listed on a leaderboard
  • after a predefined time, typically a couple of months, the final results are revealed and the winners are awarded
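The scoring step described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the slides: the team names, the toy labels, the `Submission` class, the `leaderboard` helper, and the choice of plain accuracy as the quantitative measure (a real challenge defines its own metric).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Submission:
    contestant: str         # hypothetical contestant name
    predictions: List[int]  # predicted labels for the test set


def accuracy(y_true, y_pred):
    """Fraction of test examples whose predicted label matches the hidden one."""
    if len(y_true) != len(y_pred):
        raise ValueError("prediction count must match test set size")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def leaderboard(hidden_labels, submissions, metric=accuracy):
    """Score every submission against the hidden labels, best score first."""
    scored = [(s.contestant, metric(hidden_labels, s.predictions))
              for s in submissions]
    return sorted(scored, key=lambda row: row[1], reverse=True)


# Toy data (assumed): the organizer keeps these test annotations private.
hidden_test_labels = [1, 0, 1, 1, 0, 1, 0, 0]

# Contestants only ever upload predictions, never see the labels.
submissions = [
    Submission("team_a", [1, 0, 1, 0, 0, 1, 0, 1]),
    Submission("team_b", [1, 0, 1, 1, 0, 1, 0, 0]),
    Submission("team_c", [0, 0, 1, 1, 1, 1, 1, 0]),
]

board = leaderboard(hidden_test_labels, submissions)
for rank, (name, score) in enumerate(board, start=1):
    print(f"{rank}. {name}: {score:.3f}")
```

Because the metric is passed in as a function, the same ranking logic would work with whatever quantitative goal the data producer defines.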
DATA CHALLENGES
• Challenges are useful for
  • generating visibility in the data science community about novel application domains
  • benchmarking state-of-the-art techniques in a fair way on well-defined problems
  • finding talented data scientists
• Limitations
  • not necessarily suited to solving complex and open-ended data science problems in realistic environments
  • no direct access to solutions and data scientists
  • emphasizes competition
RAPID ANALYTICS AND MODEL PROTOTYPING
• Single-day coding sessions
  • 20-30 participants
  • preparation is similar to challenges
• Goals
  • focusing and motivating top talent
  • promoting collaboration, speed, and efficiency
  • solving (prototyping) real problems
ANALYTICS TOOLS TO PROMOTE COLLABORATION AND CODE REUSE
RAPID ANALYTICS AND MODEL PROTOTYPING
2015 Jan 15: replaying the HiggsML challenge
2015 Feb 9: Mortality prediction in septic patients
2015 Apr 10: Classifying variable stars
2015 May: Drug identification from spectra
2015 June: Insect classification
THANK YOU!