hardcore data science - in practice

Post on 06-Jan-2017

Hardcore Data Science in Practice

Dr. Mikio L. Braun, Delivery Lead for Recommendation and Search

StrataConf 2016, London

mikio.braun@zalando.de

@mikiobraun

tech.zalando.com

Mikio Braun, Hardcore Data Science in Practice, Strata+Hadoop World 2016, London

• 15 countries, 3 warehouses, 16+ million customers, 3bn€ revenue in 2015, …

• Heavily using data science for recommendation

Recommendations

Data Driven Recommendations

• Collaborative filtering

• Content based recommendation

• Personalised recommendations

• …
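To make the first bullet concrete, here is a minimal sketch of item-based collaborative filtering. It is not the method used at Zalando or in the talk; the toy data, the function names `item_similarities` and `recommend`, and the choice of cosine similarity over co-occurrence counts are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(interactions):
    """Cosine similarity between items, based on which users touched them.

    interactions: dict mapping user id -> set of item ids (hypothetical format).
    Returns a dict mapping (item_a, item_b) -> similarity for overlapping pairs.
    """
    # Invert the mapping: item -> set of users who interacted with it.
    item_users = defaultdict(set)
    for user, items in interactions.items():
        for item in items:
            item_users[item].add(user)

    sims = {}
    for a in item_users:
        for b in item_users:
            if a == b:
                continue
            overlap = len(item_users[a] & item_users[b])
            if overlap:
                norm = sqrt(len(item_users[a]) * len(item_users[b]))
                sims[(a, b)] = overlap / norm
    return sims


def recommend(user, interactions, sims, k=3):
    """Rank unseen items by summed similarity to the items the user has seen."""
    seen = interactions[user]
    scores = defaultdict(float)
    for (a, b), similarity in sims.items():
        if a in seen and b not in seen:
            scores[b] += similarity
    # Sort by descending score, breaking ties by item name for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))[:k]


# Hypothetical toy data, not taken from the talk.
interactions = {
    "u1": {"sneaker", "jeans"},
    "u2": {"sneaker", "jeans", "jacket"},
    "u3": {"jeans", "jacket"},
    "u4": {"sneaker"},
}
sims = item_similarities(interactions)
top = recommend("u4", interactions, sims)
```

The all-pairs loop is quadratic in the number of items, which is fine for a one-shot offline computation on toy data but would need a different design in a real-time system.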

For Example, One-pass Ranking Models

(Freno, Jenatton, Saveski, Archambeau, “One-Pass Ranking Models for Low-Latency Product Recommendations”, KDD 2015)

Hardcore Data Science to Production

• Usually one shot computation

• Sometimes done in Python

• Getting raw data hard initially

Production System

• Realtime system

• Usually done in Java/JVM based

• Events and article data continually upgraded

Data Science vs. Production

• A/B Test ⇔ offline evaluation

• Iterate on data science part

• Iterate on the whole system!

Data Scientists and Developers

DS&D: Coding

Very different approaches to coding…

← developers

data scientists →

DS&D: Collaboration

• What is the most productive way?

• Ideally, interface on code, not just documentation

• Production logs often become data analysis input!
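One way to read the last two bullets: if the production service writes its events in a structured format, the same records can be loaded directly for analysis. The sketch below assumes a JSON-lines log with hypothetical fields `event`, `variant`, and `article`; the helper names `log_event` and `click_through_rate` are invented for this example and do not describe any actual Zalando system.

```python
import json
from collections import Counter


def log_event(event_type, variant, article_id):
    """Serialize one event as a JSON line, as a production service might log it."""
    record = {"event": event_type, "variant": variant, "article": article_id}
    return json.dumps(record, sort_keys=True)


def click_through_rate(log_lines):
    """Compute clicks divided by impressions per variant from JSON log lines."""
    shown = Counter()
    clicked = Counter()
    for line in log_lines:
        record = json.loads(line)
        if record["event"] == "impression":
            shown[record["variant"]] += 1
        elif record["event"] == "click":
            clicked[record["variant"]] += 1
    # Only variants with at least one impression appear, so no division by zero.
    return {variant: clicked[variant] / shown[variant] for variant in shown}


# Hypothetical log excerpt, not real production data.
log_lines = [
    log_event("impression", "A", "a1"),
    log_event("impression", "A", "a2"),
    log_event("click", "A", "a1"),
    log_event("impression", "B", "a1"),
]
rates = click_through_rate(log_lines)
```

Sharing the logging helper between the service and the analysis code is one form of interfacing on code rather than on documentation alone: the field names live in a single place.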

Organization

• Cross-functional teams

• Communication!

• Microservices, at Zalando: STUPS (Docker on AWS)

Summary

• “Static” Data Analysis vs. Production: Real-time, frequently update & monitor.

• Facilitate fast iteration of data analysis & production system.

• Data Scientists and Developers: Different approaches, find a common ground.

• Organizations: Cross-functional teams, microservices.