![Page 1: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/1.jpg)
![Page 2: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/2.jpg)
SYSTEM TEARDOWN: SOLR AS A PRACTICAL RECOMMENDATION ENGINE
Michael Hausenblas
Chief Data Engineer EMEA, MapR Technologies
Twitter: @mhausenblas
![Page 3: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/3.jpg)
What does Machine Learning look like?
![Page 4: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/4.jpg)
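$$
\begin{bmatrix} A_1 & A_2 \end{bmatrix}^T \begin{bmatrix} A_1 & A_2 \end{bmatrix}
= \begin{bmatrix} A_1^T \\ A_2^T \end{bmatrix} \begin{bmatrix} A_1 & A_2 \end{bmatrix}
= \begin{bmatrix} A_1^T A_1 & A_1^T A_2 \\ A_2^T A_1 & A_2^T A_2 \end{bmatrix}
$$

$$
\begin{bmatrix} r_1 \\ r_2 \end{bmatrix}
= \begin{bmatrix} A_1^T A_1 & A_1^T A_2 \\ A_2^T A_1 & A_2^T A_2 \end{bmatrix}
\begin{bmatrix} h_1 \\ h_2 \end{bmatrix}
$$

$$
r_1 = \begin{bmatrix} A_1^T A_1 & A_1^T A_2 \end{bmatrix}
\begin{bmatrix} h_1 \\ h_2 \end{bmatrix}
$$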
What does Machine Learning look like?
$O(\kappa k d + k^3 d) = O(k^2 d \log n + k^3 d)$ for small $k$, high quality
$O(\kappa d \log k)$ or $O(d \log \kappa \log k)$ for larger $k$, looser quality
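The block-matrix identity on this slide can be checked on a toy example. The sketch below assumes that the rows of A1 and A2 are users, the columns are items, and h1 and h2 are one user's history vectors; the slide does not spell this out, and all numbers are made up. It computes r1 both from the two blocks and from the full product.

```python
# Toy check of r1 = [A1^T A1  A1^T A2] [h1; h2], using plain lists.
# Assumption: rows are users, columns are items; all values are made up.

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

A1 = [[1, 0], [1, 1], [0, 1]]   # first behavior block (3 users x 2 items)
A2 = [[1], [0], [1]]            # second behavior block (3 users x 1 item)
h1 = [1, 0]                     # the user's history for the first block
h2 = [1]                        # the user's history for the second block

# Block form: r1 = (A1^T A1) h1 + (A1^T A2) h2
A1t = transpose(A1)
r1 = [a + b for a, b in zip(matvec(matmul(A1t, A1), h1),
                            matvec(matmul(A1t, A2), h2))]

# Full form: r = (A^T A) h with A = [A1 A2] and h = [h1; h2]
A = [row1 + row2 for row1, row2 in zip(A1, A2)]
h = h1 + h2
r = matvec(matmul(transpose(A), A), h)

print(r1, r)
```

The first two entries of `r` agree with `r1`, which is all the last equation on the slide states.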
![Page 5: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/5.jpg)
• Input data for the recommender model: observed interactions between users and items (users taking actions on items)
• Goal: suggest additional appropriate or desirable interactions
• Example applications:
  – similar movies, music, books (topic, style, etc.)
  – map-based restaurant choices
  – suggesting sale items for e-stores or cash-register receipts
Recommendations as Machine Learning
![Page 6: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/6.jpg)
Recommendations
Recap: Behavior of a crowd helps us understand what individuals will do
![Page 7: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/7.jpg)
Recommendations
Alice got an apple and a puppy
Charles got a bicycle
Alice
Charles
![Page 8: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/8.jpg)
Recommendations
Charles got a bicycle
Bob got an apple
Alice
Bob
Charles
Alice got an apple and a puppy
![Page 9: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/9.jpg)
Recommendations
What else would Bob like?
Alice
Bob
Charles
?
![Page 10: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/10.jpg)
Recommendations
A puppy, of course!
Alice
Bob
Charles
![Page 11: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/11.jpg)
You get the idea of how recommenders work …
![Page 12: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/12.jpg)
Recommendations
What if everybody gets a pony?
?
Alice
Bob
Charles
Amelia
What else would you recommend for Amelia?
![Page 13: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/13.jpg)
Recommendations
?
Alice
Bob
Charles
Amelia
If everybody gets a pony, it’s not a very good indicator of what else to predict ...
![Page 14: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/14.jpg)
• Very popular items co-occur with everything
  – Examples: welcome document; elevator music
• Very widespread occurrence is not interesting as a way to generate indicators
  – Unless you want to offer an item that is constantly desired, such as razor blades
• What we want is anomalous co-occurrence
  – This is the source of interesting indicators of preference on which to base recommendations
Problems with Raw Co-occurrence
![Page 15: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/15.jpg)
1. Use log files to build a history matrix of users x items
   – Remember: this history of interactions will be sparse compared to all potential combinations
2. Transform to a co-occurrence matrix of items x items
3. Look for useful co-occurrence by looking for anomalous co-occurrences to make an indicator matrix
   – Log Likelihood Ratio (LLR) can help judge which co-occurrences can be used with confidence as indicators of preference
   – RowSimilarityJob in Apache Mahout uses LLR
Get Useful Indicators from Behaviors
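Steps 1 and 2 can be sketched in a few lines. The example below is a minimal illustration, not the Mahout implementation: it reuses the Alice, Bob and Charles story from the earlier slides (with the pony everybody gets), and the log format of (user, item) pairs is an assumption.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Assumed log format: one (user, item) pair per interaction.
log = [
    ("Alice", "apple"), ("Alice", "puppy"), ("Alice", "pony"),
    ("Bob", "apple"), ("Bob", "pony"),
    ("Charles", "bicycle"), ("Charles", "pony"),
]

# Step 1: history "matrix" of users x items, stored sparsely as sets.
history = defaultdict(set)
for user, item in log:
    history[user].add(item)

# Step 2: co-occurrence "matrix" of items x items, counting in how many
# user histories each unordered pair of items appears together.
cooccurrence = Counter()
for items in history.values():
    for pair in combinations(sorted(items), 2):
        cooccurrence[pair] += 1

for pair, count in sorted(cooccurrence.items()):
    print(pair, count)
```

In this toy data the pony co-occurs with every other item, which is why raw counts alone are a poor indicator and step 3 applies a test such as LLR.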
![Page 16: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/16.jpg)
Log Files
Alice
Bob
Charles
Alice
Bob
Charles
Alice
![Page 17: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/17.jpg)
Log Files
u1
u3
u2
u1
u3
u2
u1
t1
t4
t3
t2
t3
t3
t1
![Page 18: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/18.jpg)
Log Files and Dimensions
u1
u3
u2
u1
u3
u2
u1
t1
t4
t3
t2
t3
t3
t1
t1
t2
t3
t4
Things
u1 Alice
u2 Charles
u3 Bob
Users
![Page 19: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/19.jpg)
History Matrix: Users by Items
Alice
Bob
Charles
✔ ✔ ✔ ✔ ✔
✔ ✔
![Page 20: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/20.jpg)
Co-occurrence Matrix: Items by Items
[Figure: items-by-items matrix of co-occurrence counts]
How do you tell which co-occurrences are useful?
Use LLR test to turn co-occurrence into indicators…
![Page 21: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/21.jpg)
Co-occurrence Binary Matrix
[Figure: binary co-occurrence matrix]
![Page 22: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/22.jpg)
Spot the Anomaly
A not A
B 13 1000
not B 1000 100,000
A not A
B 1 0
not B 0 2
A not A
B 1 0
not B 0 10,000
A not A
B 10 0
not B 0 100,000
What conclusion do you draw from each situation?
![Page 23: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/23.jpg)
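• Root LLR is roughly like standard deviations
• In Apache Mahout, RowSimilarityJob uses LLR
Spot the Anomaly

| | A | not A |
|---|---|---|
| B | 13 | 1000 |
| not B | 1000 | 100,000 |

Root LLR: 0.90

| | A | not A |
|---|---|---|
| B | 1 | 0 |
| not B | 0 | 2 |

Root LLR: 1.95

| | A | not A |
|---|---|---|
| B | 1 | 0 |
| not B | 0 | 10,000 |

Root LLR: 4.52

| | A | not A |
|---|---|---|
| B | 10 | 0 |
| not B | 0 | 100,000 |

Root LLR: 14.3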
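The root LLR scores for the four tables can be reproduced with the standard log-likelihood ratio (G²) statistic for a 2x2 contingency table. The sketch below is an independent illustration rather than Mahout's code; the function names are made up, and it returns an unsigned root, with no sign to distinguish over-representation from under-representation.

```python
import math

def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 table of counts.

    k11: A and B together, k12: B without A,
    k21: A without B,      k22: neither.
    """
    n = k11 + k12 + k21 + k22
    row1, row2 = k11 + k12, k21 + k22
    col1, col2 = k11 + k21, k12 + k22
    cells = ((k11, row1, col1), (k12, row1, col2),
             (k21, row2, col1), (k22, row2, col2))
    # Empty cells contribute nothing (the limit of k * log k as k -> 0 is 0).
    total = sum(k * math.log(k * n / (r * c)) for k, r, c in cells if k > 0)
    return max(0.0, 2.0 * total)   # clamp tiny negative rounding errors

def root_llr(k11, k12, k21, k22):
    return math.sqrt(llr(k11, k12, k21, k22))

tables = [(13, 1000, 1000, 100000), (1, 0, 0, 2),
          (1, 0, 0, 10000), (10, 0, 0, 100000)]
scores = [root_llr(*t) for t in tables]
for table, score in zip(tables, scores):
    print(table, round(score, 2))
```

The first table has the largest co-occurrence count (13) but the lowest score, because A and B are both common; the last table has only 10 co-occurrences, yet they are far more surprising.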
![Page 24: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/24.jpg)
Indicator Matrix: Anomalous Co-occurrence
✔ ✔
Result: The marked row will be added to the indicator field in the item document …
Significant co-occurrences → indicators
![Page 25: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/25.jpg)
Indicator Matrix
✔
id: t4
title: puppy
desc: The sweetest little puppy ever.
keywords: puppy, dog, pet
indicators: (t1)
That one row from the indicator matrix becomes the indicator field in the Solr document used to deploy the recommendation engine.
Note: data for the indicator field is added directly to the metadata for a document in the Solr index. You don’t need to create a separate index for the indicators.
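As a rough sketch, the item document above could be serialized as JSON before being sent to Solr. The field names come from the slide; treating keywords and indicators as multi-valued lists is an assumption, and the actual schema used in the demo is not shown in the deck.

```python
import json

# Item document from the slide. Assumption: "keywords" and "indicators"
# are multi-valued fields, so they are represented as lists here.
doc = {
    "id": "t4",
    "title": "puppy",
    "desc": "The sweetest little puppy ever.",
    "keywords": ["puppy", "dog", "pet"],
    "indicators": ["t1"],   # the marked row of the indicator matrix
}

# A JSON array of documents, the shape accepted by Solr's JSON update handler.
payload = json.dumps([doc], indent=2)
print(payload)
```

The indicators live in the same document as the ordinary metadata, so a single index serves both search and recommendation.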
![Page 26: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/26.jpg)
Demo time!
![Page 27: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/27.jpg)
Internals of the Recommender Engine
![Page 28: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/28.jpg)
What to recommend if new user listened to 2122: Fats Domino & 303: Beatles? Recommendation is “1710 : Chuck Berry”
Looking Inside LucidWorks
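The lookup behind this example can be imitated in memory. The sketch below is a crude stand-in for what the search engine does: it scores each item by how many of its indicators appear in the user's history, whereas a real engine would use its own relevance scoring. Only the three artist ids and names come from the slide; every indicator list and the extra item are made up.

```python
# Hypothetical item documents: only the three artist ids and names come from
# the slide; every indicator list below is made up for illustration.
ITEMS = {
    "1710": {"title": "Chuck Berry", "indicators": {"2122", "303", "55"}},
    "2122": {"title": "Fats Domino", "indicators": {"1710"}},
    "303": {"title": "Beatles", "indicators": {"1710", "77"}},
    "77": {"title": "(hypothetical artist)", "indicators": {"303"}},
}

def recommend(history, items=ITEMS, limit=3):
    """Rank unseen items by how many of their indicators are in the history."""
    seen = set(history)
    scored = []
    for item_id, doc in items.items():
        if item_id in seen:
            continue                      # do not recommend what was already played
        score = len(doc["indicators"] & seen)
        if score > 0:
            scored.append((score, item_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item_id for _, item_id in scored[:limit]]

top = recommend(["2122", "303"])
print([(item_id, ITEMS[item_id]["title"]) for item_id in top])
```

With this made-up data, item 1710 ranks first because both history items appear among its indicators.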
![Page 29: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/29.jpg)
![Page 30: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/30.jpg)
User behavior generator (1)
Presentation tier (2)
Session collector (3)
Search engine (4)
Metrics and logs (5)
History collector (6)
Cooccurrence analysis (7)
Post to search engine (8)
Diagnostic browsing (9)
http://bita.ly/18vbbaT
![Page 31: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/31.jpg)
Example: search based recommendation
![Page 32: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/32.jpg)
• Sample Query (recommendation query)
  – Current location
  – Recent merchant descriptions
  – Recent merchant id’s
  – Recent SIC codes
  – Recent accepted offers
  – Local Top40
• Sample Document
  – Original data and metadata:
    Merchant Id, field for text description, phone, address, location
  – Derived from co-occurrence analysis:
    indicator merchant id’s, indicator industry (SIC) id’s, indicator offers, indicator text, Local Top40
Search-based recommendation
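A recommendation query of this kind might be assembled as follows. This is a sketch under stated assumptions: the field names (`indicator_merchant_ids` and so on), the core name `merchants`, the host and the example values are all hypothetical, and the values are assumed to contain no characters that need escaping in the query syntax.

```python
from urllib.parse import urlencode, parse_qs

def build_query(recent_by_field):
    """Turn recent user behavior into one OR-ed clause per indicator field."""
    clauses = []
    for field, values in recent_by_field.items():
        if values:                        # skip fields with no recent behavior
            clauses.append("{}:({})".format(field, " ".join(values)))
    return " OR ".join(clauses)

# Hypothetical recent behavior of one user.
recent = {
    "indicator_merchant_ids": ["m17", "m42"],
    "indicator_sic_ids": ["5812"],
    "indicator_offers": [],
}

q = build_query(recent)
params = urlencode({"q": q, "rows": 10, "wt": "json"})
url = "http://localhost:8983/solr/merchants/select?" + params   # hypothetical endpoint
print(url)
```

The user's recent behavior is the query and the indicator fields are what gets searched, so an ordinary search request returns the recommendations.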
![Page 33: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/33.jpg)
[Diagram labels: complete history, co-occurrence (Mahout), item metadata, Solr indexers, Solr indexing, index shards]
Analyze with MapReduce
![Page 34: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/34.jpg)
[Diagram labels: user history, web tier, Solr search, Solr indexers, item metadata, index shards]
Deploy with Conventional Search System
![Page 35: 2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine](https://reader033.vdocuments.pub/reader033/viewer/2022052505/55502e9db4c9058f2f8b4d12/html5/thumbnails/35.jpg)
• Kudos to Ted Dunning, Grant Ingersoll and LucidWorks, for the idea & the demo!
• Get in touch: Twitter—@mhausenblas, @MapR
• Ah, and, btw: we’re hiring ;)
Outro