Elasticsearch in Outbrain Recommender System - Looking at Content Recommendations through a Search Lens

Upload: sonya-liberman

Post on 18-Feb-2017


Page 1: Elasticsearch in Outbrain Recommender System - Looking at Content Recommendations through a Search Lens

Looking at Content Recommendation through a Search Lens

People want GREAT content


Content Recommendation Engine

Relevance

Rec Engine

Content Inventory


Challenges

• Personalization

• A Jungle of Market Rules: geo targeting, publisher blacklisting of sites, URLs, titles

• Scale: 35K req/sec, 50 ms latency, millions of potential content recommendations


Search Engines: What can they do?


1. Score documents by relevance to query

[Figure: a query ("Donald Trump") goes into the search engine, which returns documents scored by relevance]


2. Filter documents by certain attributes


3. Work Efficiently and at Scale


what the day brings



Open Source

Distributed

Scalable

RESTful

Real-time search


How Do We Reduce the Problem of Recommending Content to Users to a Search Problem?


Translate user and context to a query of interests and market rules

User and context: John, www.angelina.com
Interests: Television and Celebrities
Market rule: Blacklist Site: www.brad.com


Translate articles to searchable documents in the same feature space of user interests and market rules

Title: Breakup: What's Next?
Text: Brad's acting career continues to flourish while he films a new …
Is about: Celebrities
Site: www.brad.com


What is a Document About?

Semantic Features

Categories: Entertainment/Television

Topics: Story, Murder, Television

Entities: Dolores, Westworld, HBO

NLP


Constructing a User Profile

Time

User Profile


Indexing Our Inventory to Elasticsearch

Every ES document has one or more fields

Fields can be of different types

• Strings
• Numeric
• Boolean
• Array of [strings | numbers | …]


Indexing Our Inventory to Elasticsearch

Every article becomes an ES document
Every article feature becomes a field

{
  "title": "Westworld season 1 ends with explosive finale",
  "categories": ["entertainment_television"],
  "topics": ["story", "murder", "television"],
  "entities": ["dolores", "westworld", "hbo"]
}
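The mapping from article features to document fields can be sketched in a few lines of Python. This is only an illustration: the function names `normalize` and `article_to_document` are hypothetical, and the normalization rule (lowercase, separators replaced by underscores) is an assumption inferred from the example values on the slide, not a documented Outbrain convention.

```python
import json


def normalize(value: str) -> str:
    """Turn a feature label into a single lowercase token (assumed convention)."""
    return value.strip().lower().replace("/", "_").replace(" ", "_")


def article_to_document(title, categories, topics, entities):
    """Build the dict that would be sent to Elasticsearch as the document body."""
    return {
        "title": title,
        "categories": [normalize(c) for c in categories],
        "topics": [normalize(t) for t in topics],
        "entities": [normalize(e) for e in entities],
    }


doc = article_to_document(
    "Westworld season 1 ends with explosive finale",
    ["Entertainment/Television"],
    ["Story", "Murder", "Television"],
    ["Dolores", "Westworld", "HBO"],
)
print(json.dumps(doc, indent=2))
```

Normalizing at index time and at query time with the same function is what keeps articles and user interests in the same feature space.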


Querying Elasticsearch

{
  "query": {
    "filtered": {
      "query":  { "term": { "category": "celebrities" } },
      "filter": { "term": { "site": "www.cnn.com" } }
    }
  }
}
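A note for readers on newer clusters: the `filtered` query used on these slides was deprecated in Elasticsearch 2.0 and removed in 5.0, where a `bool` query with a `filter` clause plays the same role. The sketch below rewrites one form into the other; the helper name `filtered_to_bool` is hypothetical and the code only manipulates Python dicts, it does not talk to a cluster.

```python
def filtered_to_bool(filtered_body: dict) -> dict:
    """Rewrite the body of a legacy `filtered` query as a `bool` query."""
    bool_body = {}
    if "query" in filtered_body:
        # `must` keeps the clause in scoring context, like the old inner query.
        bool_body["must"] = filtered_body["query"]
    if "filter" in filtered_body:
        # `filter` restricts the result set without affecting the score.
        bool_body["filter"] = filtered_body["filter"]
    return {"bool": bool_body}


legacy = {
    "query": {"term": {"category": "celebrities"}},
    "filter": {"term": {"site": "www.cnn.com"}},
}
modern_query = {"query": filtered_to_bool(legacy)}
print(modern_query)
```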


Create Elasticsearch Query with User Interests

{
  "query": {
    "bool": {
      "should": [
        { "terms": { "categories": ["television", "celebrities"] } },
        { "terms": { "topics": ["business", "cinema", "murder"] } },
        { "terms": { "entities": ["hbo", "dolores", "nyse"] } }
      ]
    }
  }
}
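As a rough sketch of how such a query could be generated from a stored user profile, assume the profile is a plain mapping from field name to a list of interest values. The function name `interests_query` and the profile layout are hypothetical.

```python
def interests_query(profile: dict) -> dict:
    """Build one `terms` clause per non-empty profile field inside a bool/should."""
    should = [
        {"terms": {field: list(values)}}
        for field, values in profile.items()
        if values  # skip fields for which the user has no known interests
    ]
    return {"query": {"bool": {"should": should}}}


profile = {
    "categories": ["television", "celebrities"],
    "topics": ["business", "cinema", "murder"],
    "entities": ["hbo", "dolores", "nyse"],
}
query = interests_query(profile)
print(query)
```

Because the clauses sit under `should`, an article does not need to match every interest; each matching clause adds to its score.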


Using Weights to Improve Relevance

{
  "query": {
    "bool": {
      "should": [
        { "term": { "categories": { "value": "television", "boost": 2.3 } } },
        { "term": { "categories": { "value": "investments", "boost": 1.6 } } },
        { "term": { "entities": { "value": "dolores", "boost": 1.2 } } }
      ]
    }
  }
}
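A weighted variant can be sketched the same way, assuming each interest carries a numeric weight. It uses the `term` query's `value`/`boost` form; the function name `weighted_interests_query` and the nested-dict profile layout are hypothetical.

```python
def weighted_interests_query(weights: dict) -> dict:
    """Emit one boosted `term` clause per (field, value, weight) triple."""
    should = []
    for field, values in weights.items():
        for value, boost in values.items():
            should.append({"term": {field: {"value": value, "boost": boost}}})
    return {"query": {"bool": {"should": should}}}


weights = {
    "categories": {"television": 2.3, "investments": 1.6},
    "entities": {"dolores": 1.2},
}
weighted_query = weighted_interests_query(weights)
print(weighted_query)
```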


What about Cold-Start Users?

{
  "query": {
    "bool": {
      "should": [
        { "terms": { "categories": "?" } },
        { "terms": { "topics": "?" } },
        { "terms": { "entities": "?" } }
      ]
    }
  }
}


Display the most popular content

How? Index popularity score

{ "title": "Westworld season 1 ends ..", "categories": ["entertainment_television"], "popularity": 0.6 }

{ "title": "10 Best NY Restaurants", "categories": ["lifestyle/food"], "popularity": 0.3 }


Score by this field in the query

{
  "query": {
    "function_score": {
      "query": { "match_all": {} },
      "field_value_factor": { "field": "popularity" },
      "boost_mode": "replace"
    }
  }
}
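One way to wire the cold-start fallback into query construction is sketched below: use the interest query when the profile has any values, otherwise fall back to the popularity query. With `match_all` and `"boost_mode": "replace"`, each document's score is simply its `popularity` value. The function names `popularity_query` and `query_for` and the profile layout are hypothetical.

```python
def popularity_query() -> dict:
    """Score every document by its indexed `popularity` field."""
    return {
        "query": {
            "function_score": {
                "query": {"match_all": {}},
                "field_value_factor": {"field": "popularity"},
                "boost_mode": "replace",
            }
        }
    }


def query_for(profile: dict) -> dict:
    """Personalized query for known users, popularity ranking for cold-start users."""
    clauses = [
        {"terms": {field: list(values)}}
        for field, values in profile.items()
        if values
    ]
    if not clauses:  # nothing known about this user yet
        return popularity_query()
    return {"query": {"bool": {"should": clauses}}}


print(query_for({}))
print(query_for({"topics": ["cinema"]}))
```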


From Market Rules to Elasticsearch Filters

Query with Blacklisted Sites

"www.angelina.com"

Blacklisted: "www.brad.com"


Query with Blacklisted Sites

{ "must_not": [ { "terms": { "site": ["www.brad.com"] } } ] }

{ "title": "Breakup: what's next?", "site": "www.brad.com" }

Document is Filtered Out


{ "title": "Top news of the week", "site": "www.cnn.com" }

Document Passes Filter
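The blacklist walk-through above can be reproduced locally. The sketch below builds the `must_not` clause inside a complete `bool` query and mimics its effect on the two example documents in plain Python. The helper names are hypothetical, and the exact-match behaviour assumes `site` is indexed as an unanalyzed (keyword-style) field.

```python
def blacklist_query(blacklisted_sites) -> dict:
    """Wrap the must_not/terms clause in a full bool query."""
    return {
        "query": {
            "bool": {
                "must_not": [{"terms": {"site": sorted(blacklisted_sites)}}]
            }
        }
    }


def passes_blacklist(document: dict, blacklisted_sites) -> bool:
    """Local stand-in for must_not: a document passes unless its site is listed."""
    return document["site"] not in blacklisted_sites


blacklist = {"www.brad.com"}
documents = [
    {"title": "Breakup: what's next?", "site": "www.brad.com"},
    {"title": "Top news of the week", "site": "www.cnn.com"},
]
survivors = [d["title"] for d in documents if passes_blacklist(d, blacklist)]
print(survivors)
```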


From Market Rules to Elasticsearch Filters: Geo Targeting

"Music World – everything on NY Music Scene"

Targeting "US" users only


Index Geo Field in the Document

{
  "title": "Music World – everything on NY Music Scene",
  "categories": ["music"],
  "entities": ["aerosmith", "ny"],
  "geo": ["us"]
}


Add a Geo Filter to the Query

{
  "query": {
    "filtered": {
      "query":  { "terms": { … } },
      "filter": { "terms": { "geo": ["us"] } }
    }
  }
}


Apply Filter on Documents

{ "query": { "filtered": { "query": { "terms": { … } }, "filter": { "terms": { "geo": ["us"] } } } } }

{ "title": "Music World – everything on NY Music Scene", "categories": ["music"], "entities": ["aerosmith", "ny"], "geo": ["us"] }

Document Passes Filter


Apply Filter on Documents

{ "query": { "filtered": { "query": { "terms": { … } }, "filter": { "terms": { "geo": ["fr"] } } } } }

{ "title": "Music World – everything on NY Music Scene", "categories": ["music"], "entities": ["aerosmith", "ny"], "geo": ["us"] }

Document is Filtered Out


What about Documents Without a Specific Targeting?

{ "query": { "filtered": { "query": { "terms": { … } }, "filter": { "terms": { "geo": ["us"] } } } } }

{ "title": "Music World – everything on NY Music Scene", "categories": ["music"], "entities": ["aerosmith", "ny"], "geo": [""] }


{ "title": "Music Around the World", "categories": ["music"], "entities": ["coldplay", "muse"], "geo": [""] }

Document is Filtered Out


Solution – Index & Query the Value "all"

{ "query": { "filtered": { "query": { "terms": { … } }, "filter": { "terms": { "geo": ["us", "all"] } } } } }

{ "title": "Music Around the World", "categories": ["music"], "entities": ["coldplay", "muse"], "geo": ["all"] }

Document Passes Filter
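The "all" sentinel can be checked with a small local simulation of a `terms` filter over an array field (a document matches if any of its values is among the requested ones). The helper names `geo_field`, `geo_filter` and `matches_terms` are hypothetical; only the sentinel value "all" comes from the slides.

```python
def geo_field(targets) -> list:
    """Value to index: the targeted countries, or ["all"] when untargeted."""
    return sorted(targets) if targets else ["all"]


def geo_filter(user_country: str) -> dict:
    """Filter clause for a user: their own country plus the sentinel."""
    return {"terms": {"geo": [user_country, "all"]}}


def matches_terms(document: dict, clause: dict) -> bool:
    """Local stand-in for a terms filter on a multi-valued field."""
    field, wanted = next(iter(clause["terms"].items()))
    return any(value in wanted for value in document.get(field, []))


ny_scene = {"title": "Music World - everything on NY Music Scene", "geo": geo_field(["us"])}
worldwide = {"title": "Music Around the World", "geo": geo_field([])}

print(matches_terms(ny_scene, geo_filter("us")))   # US-targeted doc, US user
print(matches_terms(ny_scene, geo_filter("fr")))   # US-targeted doc, French user
print(matches_terms(worldwide, geo_filter("fr")))  # untargeted doc, French user
```

The design choice is to avoid a "field is missing" check at query time: every document carries a `geo` value, so a single `terms` filter handles both targeted and untargeted content.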


Adding Domain Specific Functionality to Elasticsearch


Indexing Marketer Cost Per Click Without Indexing

CPC values change rapidly

Limitation: Elasticsearch cannot update a document in place; every update reindexes the whole document

Requirement: to keep up with throughput, the index should be immutable

Solution: store & update CPCs in a separate off-heap storage
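The idea of keeping fast-changing CPCs outside the index can be illustrated with a toy example. Everything here is an assumption made for illustration: `CpcStore` is an ordinary in-memory dict standing in for the off-heap storage, and multiplying relevance by CPC is just one plausible way to combine them; the slides do not say how CPC enters the final score.

```python
class CpcStore:
    """In-memory stand-in for the separate CPC storage, keyed by document id."""

    def __init__(self):
        self._cpc = {}

    def update(self, doc_id: str, cpc: float) -> None:
        # Updating a CPC never touches the search index.
        self._cpc[doc_id] = cpc

    def get(self, doc_id: str, default: float = 0.0) -> float:
        return self._cpc.get(doc_id, default)


def rescore(hits, store: CpcStore):
    """Combine relevance with the current CPC (illustrative formula only)."""
    rescored = [(doc_id, relevance * store.get(doc_id)) for doc_id, relevance in hits]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


store = CpcStore()
store.update("a", 0.10)
store.update("b", 0.40)
hits = [("a", 2.0), ("b", 1.0)]  # (document id, relevance score from the index)

first = rescore(hits, store)[0][0]
store.update("a", 0.50)          # CPC changes; no reindexing needed
second = rescore(hits, store)[0][0]
print(first, second)
```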


Writing a Custom Scoring Function

Combining high-granularity behavioral signals

Applying supervised learning models to compute scores

Use dynamic scripting (e.g. Groovy)

OR

Use native Java via the Elasticsearch plugin mechanism
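To make "applying supervised learning models to compute scores" concrete, here is a minimal sketch of a scoring function that evaluates a logistic regression over per-document features. The slides do not name the model or its features; logistic regression, the feature names, and the weights below are all hypothetical stand-ins, and the code is Python for readability rather than the Groovy or Java that would run inside Elasticsearch.

```python
import math


def model_score(features: dict, weights: dict, bias: float = 0.0) -> float:
    """Logistic regression: sigmoid of the weighted feature sum."""
    z = bias + sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical learned weights and two candidate documents.
weights = {"relevance": 1.5, "popularity": 0.8}
strong = model_score({"relevance": 2.0, "popularity": 0.6}, weights)
weak = model_score({"relevance": 0.2, "popularity": 0.3}, weights)
print(strong > weak)
```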

Thank You