RecSysFr #3
Recommendation @Deezer
RecSysFr, Paris, June 22nd, 2016
B. Mathieu, Head of Data Science
Deezer
/01
Deezer overview
Some technical numbers
● 420 employees in 20 cities
● 5M albums
● 40M tracks
● 100M playlists
● 16M MAU
● 6M subscribers
● ~500 servers
● 4.5 PB storage for audio files
● 1.5 TB of logs / day
● ~1B requests / day
● ~30k new albums each week
● Hadoop cluster with 1.5 PB storage, 4 TB RAM, 1000+ vcores
/02
Recommendation opportunities
Interactive Radios
● Interactive recommendation
● Understand user feedback
/03
Algorithms and Evaluation
Architecture overview
Content data:
- Tags
- Popularity
User data:
- Taste model
- Hot tracks
- Behaviors
Build tracklist
- Data cache
- User action history

- Update user models
- Consolidate tags data
- Build indexes
Action logs
A/B test evaluation metrics
● % of users listening for more than 10 min
● % of users who reconnected on more than 3 days in the last week
● % of users who like / dislike a track
=> pay attention to statistical confidence!
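One common way to check statistical confidence on percentage metrics like these is a two-proportion z-test. The sketch below is a generic textbook test, not necessarily the method used at Deezer; the function name and the user counts are made up for illustration.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for the difference between two proportions.

    Returns (z, p_value). Assumes samples are large enough for the
    normal approximation to hold.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled proportion under the null hypothesis "no difference".
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # degenerate case: all successes or all failures
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: users listening for more than 10 min in each group.
z, p_value = two_proportion_z_test(4200, 10_000, 4350, 10_000)
significant = p_value < 0.05
```

With these invented numbers a 1.5-point uplift on 10,000 users per group clears the usual 5% threshold only narrowly, which shows why small groups or short tests can easily mislead.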
Offline testing / benchmarking
● A/B tests are costly and slow
● We want to test more cases
Offline testing:
● Set up a benchmarking methodology
● Freeze data and evaluate algorithms against users' future actions
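The "freeze data, evaluate on future actions" idea can be sketched as a temporal split followed by a ranking metric. The example below is a minimal illustration under stated assumptions: events are `(user, track, timestamp)` tuples of positive interactions, the baseline is a simple popularity recommender, and the metric is precision@k. None of these choices are confirmed by the slides.

```python
from collections import Counter, defaultdict


def temporal_split(events, freeze_ts):
    """Split events into what was known at freeze time and what came after."""
    past = [e for e in events if e[2] < freeze_ts]
    future = [e for e in events if e[2] >= freeze_ts]
    return past, future


def popularity_recommender(past, k):
    """Baseline: recommend the most popular tracks the user has not seen."""
    counts = Counter(track for _, track, _ in past)
    seen = defaultdict(set)
    for user, track, _ in past:
        seen[user].add(track)
    ranked = [track for track, _ in counts.most_common()]

    def recommend(user):
        return [t for t in ranked if t not in seen[user]][:k]

    return recommend


def precision_at_k(recommend, future, k):
    """Mean fraction of the k recommended slots matched by future actions."""
    future_by_user = defaultdict(set)
    for user, track, _ in future:
        future_by_user[user].add(track)
    scores = []
    for user, actual in future_by_user.items():
        hits = sum(1 for t in recommend(user)[:k] if t in actual)
        scores.append(hits / k)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical interaction log: (user, track, timestamp).
events = [
    ("u1", "a", 1), ("u1", "b", 2), ("u2", "a", 3), ("u2", "c", 4),
    ("u3", "a", 5), ("u1", "c", 10), ("u2", "b", 11), ("u3", "d", 12),
]
past, future = temporal_split(events, freeze_ts=10)
score = precision_at_k(popularity_recommender(past, k=2), future, k=2)
```

Any candidate algorithm exposing the same `recommend(user)` interface can be scored on the same frozen split, which is what makes many offline comparisons cheap relative to an A/B test.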
[Diagram: evaluation funnel]
Candidates -> Offline Testing -> Best Offline Candidates -> User Study -> Best User Studies Candidates -> A/B Testing -> Final choice
Source: Shani, Gunawardana (2013)
Thanks for your attention
Enjoy RecSysFr #3 @Deezer!
http://www.deezer.com/jobs