gaming analytics on gcp

Creating a Gaming Analytics Platform 최명근, Cloud Platform Sales Engineer

Upload: myunggeun-choi

Post on 19-Jan-2017


Category:

Data & Analytics



TRANSCRIPT

Page 1: Gaming analytics on gcp

Creating a Gaming Analytics Platform

최명근, Cloud Platform Sales Engineer

Page 2: Gaming analytics on gcp

Confidential & ProprietaryGoogle Cloud Platform 2

Free-to-play mobile gaming is delivered as a service.

Player engagement is key.

Page 3: Gaming analytics on gcp

The goal is to have a unified view of the player.

Page 4: Gaming analytics on gcp


Diverse Data Sources

Page 5: Gaming analytics on gcp


Diverse Data Sources

Data from user acquisition campaigns

Data from Google Play and App Store

Turnkey gaming metrics (e.g. player churn and spend predictions from Play Games Services)

Page 6: Gaming analytics on gcp

Diverse Data Sources

User behavior data from your website and mobile apps

Page 7: Gaming analytics on gcp

Diverse Data Sources

Custom game events

Custom logs

Custom player telemetry specific to your games

Page 8: Gaming analytics on gcp


Continuum of Gaming Analytics

Turnkey: standard metrics

● DAU, MAU, ARPPU
● Player Progression
● Feature Engagement
● Spend
● Retention / Churn
● Daily revenue targets
● Fraud and cheating

Custom: key indicators specific to your game

● Activity in communities, joining guilds, # of friends in-game
● Reached meaningful milestone or achievement
● Time to first meaningful transaction
● Player response to specific A/B tests
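To make the standard metrics concrete, here is a minimal sketch of how DAU and ARPPU could be derived from raw events. The `Event` class, its fields, and the sample data are assumptions made for illustration and do not come from the deck; ARPPU is taken here as total revenue divided by the number of distinct paying users.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: DAU and ARPPU computed from raw game events. */
final class StandardMetrics {

    /** Assumed event shape: who acted, on which day, and how much they spent. */
    static final class Event {
        final String userId;
        final String day;       // calendar day, e.g. "2016-04-29"
        final double spendUsd;  // 0.0 for events that are not purchases

        Event(String userId, String day, double spendUsd) {
            this.userId = userId;
            this.day = day;
            this.spendUsd = spendUsd;
        }
    }

    /** DAU: number of distinct users with at least one event on the given day. */
    static int dailyActiveUsers(List<Event> events, String day) {
        Set<String> users = new HashSet<>();
        for (Event e : events) {
            if (e.day.equals(day)) {
                users.add(e.userId);
            }
        }
        return users.size();
    }

    /** ARPPU: total revenue divided by the number of distinct paying users. */
    static double arppu(List<Event> events) {
        Set<String> payers = new HashSet<>();
        double revenue = 0.0;
        for (Event e : events) {
            if (e.spendUsd > 0) {
                payers.add(e.userId);
                revenue += e.spendUsd;
            }
        }
        return payers.isEmpty() ? 0.0 : revenue / payers.size();
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
            new Event("alice", "2016-04-29", 0.0),
            new Event("alice", "2016-04-29", 4.0),
            new Event("bob", "2016-04-29", 0.0),
            new Event("carol", "2016-04-30", 2.0));
        System.out.println("DAU on 2016-04-29: " + dailyActiveUsers(events, "2016-04-29"));
        System.out.println("ARPPU: " + arppu(events));
    }
}
```

At real volumes the same aggregations would more likely be expressed as queries in the analytics engine; the in-memory version only shows the definitions.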

Page 9: Gaming analytics on gcp

Ask custom questions

● How many players made it to stage 12?

● What path did they take through the stage?

● Health and other key stats at this point in time?

● Of the players who took the same route where a certain condition was true, how many made an in-app purchase?

● What are the characteristics of the player segment who didn’t make the purchase vs. those who did?

● Why was this custom event so successful in driving in-app purchases compared to others?
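Two of these questions (who reached stage 12, and how many of them purchased) can be sketched as set operations over custom events. The `GameEvent` class and the event names "stagereached" and "purchase" are hypothetical, chosen only for this illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: answering two funnel-style questions over custom events. */
final class CustomQuestions {

    /** Assumed event shape; the event names used below are illustrative. */
    static final class GameEvent {
        final String userId;
        final String eventId;  // "stagereached" or "purchase" in this sketch
        final int stage;       // stage the player was on when the event fired

        GameEvent(String userId, String eventId, int stage) {
            this.userId = userId;
            this.eventId = eventId;
            this.stage = stage;
        }
    }

    /** Players with at least one "stagereached" event at or beyond the given stage. */
    static Set<String> reachedStage(List<GameEvent> events, int stage) {
        Set<String> players = new HashSet<>();
        for (GameEvent e : events) {
            if (e.eventId.equals("stagereached") && e.stage >= stage) {
                players.add(e.userId);
            }
        }
        return players;
    }

    /** Members of the cohort who made at least one in-app purchase. */
    static Set<String> purchasers(List<GameEvent> events, Set<String> cohort) {
        Set<String> buyers = new HashSet<>();
        for (GameEvent e : events) {
            if (e.eventId.equals("purchase") && cohort.contains(e.userId)) {
                buyers.add(e.userId);
            }
        }
        return buyers;
    }

    public static void main(String[] args) {
        List<GameEvent> events = Arrays.asList(
            new GameEvent("alice", "stagereached", 12),
            new GameEvent("bob", "stagereached", 12),
            new GameEvent("carol", "stagereached", 7),
            new GameEvent("alice", "purchase", 12),
            new GameEvent("carol", "purchase", 7));
        Set<String> cohort = reachedStage(events, 12);
        Set<String> buyers = purchasers(events, cohort);
        System.out.println(cohort.size() + " players reached stage 12; "
            + buyers.size() + " of them purchased");
    }
}
```

Comparing the purchasing and non-purchasing halves of the cohort is then a matter of joining these user sets against whatever player attributes are available.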

Page 10: Gaming analytics on gcp

3 Things to Remember

1. Speed up Batch Processing
2. Speed up from Batch to Real-Time
3. Speed up Development Time

Page 11: Gaming analytics on gcp


1 - Speed up Batch Processing

Page 12: Gaming analytics on gcp

"Getting Started" Pattern

Page 13: Gaming analytics on gcp


Demo

Page 14: Gaming analytics on gcp


Page 15: Gaming analytics on gcp


Page 16: Gaming analytics on gcp

Some of DeNA's Hadoop+Hive woes:

● Many bottlenecks & failure points
● 3 hour data ingestion lag
● Too many analysts at peak time
● Slow queries
● ...

Page 17: Gaming analytics on gcp

Google confidential │ Do not distribute

BigQuery is different because

Page 18: Gaming analytics on gcp

BigQuery is different because

Analyze TBs in seconds

Page 19: Gaming analytics on gcp


BigQuery is different because

No servers

Page 20: Gaming analytics on gcp


BigQuery is different because

Terabit network

Page 21: Gaming analytics on gcp


BigQuery is different because

No pre-planning

Page 22: Gaming analytics on gcp


BigQuery is different because

No indexes

Page 23: Gaming analytics on gcp


BigQuery is different because

Large joins, any key

Page 24: Gaming analytics on gcp


BigQuery is different because

Always On

Page 25: Gaming analytics on gcp


BigQuery is different because

Planned downtime?

Page 26: Gaming analytics on gcp


BigQuery is different because

Unlimited users

Page 27: Gaming analytics on gcp


BigQuery is different because

Stream in <100k rps

Page 28: Gaming analytics on gcp


BigQuery is different because

Secure, but shares

Page 29: Gaming analytics on gcp

BigQuery is different because

Free monthly quota (1 TB)

Page 30: Gaming analytics on gcp


Page 31: Gaming analytics on gcp


If that's the "Getting Started" pattern...

...what's NEXT?

Page 32: Gaming analytics on gcp

Cloud Dataflow (Apache Beam) Pattern

Page 33: Gaming analytics on gcp


Let's Dive Deeper:

Cloud Dataflow (Apache Beam)

Page 34: Gaming analytics on gcp

Dataflow (Apache Beam) Pattern

Page 35: Gaming analytics on gcp


Page 36: Gaming analytics on gcp

Dataflow (Apache Beam) Pattern

Page 37: Gaming analytics on gcp

Dataflow goodies

● Autoscaling mid-job
● Fully managed - No-Ops
● Intuitive Data Processing Framework
● Batch and Stream Processing in one
● Liquid sharding mid-job

// Pipeline as sketched on the slide; ExtractTags and ExpandPrefixes are user-defined DoFns.
Pipeline p = Pipeline.create();
p.begin()
    .apply(TextIO.Read.from("gs://…"))
    .apply(ParDo.of(new ExtractTags()))
    .apply(Count.create())
    .apply(ParDo.of(new ExpandPrefixes()))
    .apply(Top.largestPerKey(3))
    .apply(TextIO.Write.to("gs://…"));
p.run();

Page 38: Gaming analytics on gcp

Dataflow goodies

Page 39: Gaming analytics on gcp

Dataflow goodies

Autoscaling mid-job

[Figure: load levels over the job: 800 RPS, 1200 RPS, 5000 RPS, 50 RPS]

*means 100% cluster utilization by definition

Page 40: Gaming analytics on gcp

Dataflow goodies

Page 41: Gaming analytics on gcp

Dataflow goodies

Batch and Stream Processing in one

// Batch pipeline, as sketched on the slide.
Pipeline p = Pipeline.create();
p.begin()
    .apply(TextIO.Read.from("gs://…"))
    .apply(ParDo.of(new ExtractTags()))
    .apply(Count.create())
    .apply(ParDo.of(new ExpandPrefixes()))
    .apply(Top.largestPerKey(3))
    .apply(TextIO.Write.to("gs://…"));
p.run();

// Streaming lines shown alongside it: Pub/Sub source and sink, plus a fixed 5-minute window.
    .apply(PubsubIO.Read.from("input_topic"))
    .apply(Window.<Integer>by(FixedWindows.of(5, MINUTES)))
    .apply(PubsubIO.Write.to("output_topic"));
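What a fixed 5-minute window does to a stream can be sketched without the SDK. The class below is a conceptual illustration only, assuming windows aligned to the Unix epoch; it is not the Dataflow implementation, and `FixedWindowSketch` and its methods are hypothetical names.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Conceptual sketch of fixed windowing: bucket timestamps into equal-sized windows. */
final class FixedWindowSketch {

    /** Start of the window containing the timestamp, with windows aligned to the epoch. */
    static Instant windowStart(Instant timestamp, Duration size) {
        long ms = timestamp.toEpochMilli();
        return Instant.ofEpochMilli(ms - Math.floorMod(ms, size.toMillis()));
    }

    /** Counts elements per window, keyed by window start in chronological order. */
    static Map<Instant, Integer> countPerWindow(List<Instant> timestamps, Duration size) {
        Map<Instant, Integer> counts = new TreeMap<>();
        for (Instant t : timestamps) {
            counts.merge(windowStart(t, size), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Instant> eventTimes = Arrays.asList(
            Instant.parse("2016-04-29T05:20:44Z"),
            Instant.parse("2016-04-29T05:22:39Z"),
            Instant.parse("2016-04-29T05:26:00Z"));
        // The first two timestamps share the 05:20 window; the third falls in the 05:25 window.
        countPerWindow(eventTimes, Duration.ofMinutes(5))
            .forEach((start, n) -> System.out.println(start + " -> " + n));
    }
}
```

Because each element is assigned to a window by its own timestamp, the same aggregation logic applies whether the input is a bounded file or an unbounded stream.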

Page 42: Gaming analytics on gcp


2 - Speed up from Batch to Real-time

Page 43: Gaming analytics on gcp

{
  "eventTime": "2016-04-29T05:22:39.201477414Z",
  "userId": "[email protected]",
  "sessionId": "413b4d97-634f-9463-3bc7-9f9f45ecdcb0",
  "sessionStartTime": "2016-04-29T05:20:44.201477414Z",
  "eventId": "playermissednpc",
  "currentQuest": 156,
  "npcId": "boss156",
  "battleId": "928f86db-187a-5ba4-f7cb-b6a7ded0056a",
  "playerAttackPoints": 15,
  "playerHitPoints": 2,
  "playerMaxHitPoints": 15,
  "playerArmorClass": 15,
  "npcAttackPoints": 15,
  "npcHitPoints": 8,
  "npcMaxHitPoints": 15,
  "npcArmorClass": 15,
  "attackRoll": 12
}

Page 44: Gaming analytics on gcp

{
  "eventTime": "2016-04-29T05:22:39.201477414Z",
  "userId": "[email protected]",
  "sessionId": "413b4d97-634f-9463-3bc7-9f9f45ecdcb0",
  "sessionStartTime": "2016-04-29T05:20:44.201477414Z",
  "eventId": "playermissednpc",
  "currentQuest": 156,
  "npcId": "boss156",
  "battleId": "928f86db-187a-5ba4-f7cb-b6a7ded0056a",
  "player": [
    {"attackPoints": 15},
    {"hitPoints": 2},
    {"maxHitPoints": 15},
    {"armorClass": 15}
  ],
  "npcAttackPoints": 15,
  "npcHitPoints": 8,
  "npcMaxHitPoints": 15,
  "npcArmorClass": 15,
  "attackRoll": 12
}

Page 45: Gaming analytics on gcp

Transform and Load: Cloud Dataflow

[Diagram: Streaming Pipeline. Real-time events (iOS) → Cloud Pub/Sub (asynchronous messaging) → Cloud Dataflow (parallel data processing) → BigQuery (analytics engine)]

Real-time events:

{ ... "userId":"[email protected]", "damageRoll":13, ...}
{ ... "userId":"[email protected]", "damageRoll":8, ...}

Resulting rows:

userId | ... | damageRoll | ...
[email protected] | ... | 13 | ...
[email protected] | ... | 8 | ...

Page 46: Gaming analytics on gcp

Leaderboard Example

Reads game data published in near real-time, and uses that data to perform two separate processing tasks:

● Calculates the total score for every unique user and publishes speculative results for every ten minutes of processing time.

● Calculates the team scores for each hour that the pipeline runs using fixed-time windowing. In addition, the team score calculation uses Dataflow's trigger mechanisms to provide speculative results for each hour (which update every five minutes until the hour is up), and to also capture any late data and add it to the specific hour-long window to which it belongs.
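The handling of late data in the hourly team score can be sketched as follows. This is a conceptual illustration under stated assumptions, not the Dataflow trigger API: `TeamScoreSketch` and its methods are hypothetical names, windows are aligned to the epoch, and the running total returned by `add` stands in for a speculative result.

```java
import java.util.HashMap;
import java.util.Map;

/** Conceptual sketch: hourly team scores keyed by event time, so late data lands in its own hour. */
final class TeamScoreSketch {
    static final long HOUR_MS = 60L * 60L * 1000L;

    // Running totals keyed by team and window start, e.g. "red@3600000".
    private final Map<String, Integer> totals = new HashMap<>();

    /**
     * Adds a score to the hour-long window chosen by event time (not arrival time)
     * and returns the updated, still speculative, total for that window.
     */
    int add(String team, int score, long eventTimeMs) {
        long windowStart = eventTimeMs - Math.floorMod(eventTimeMs, HOUR_MS);
        return totals.merge(team + "@" + windowStart, score, Integer::sum);
    }

    /** Current total for a team in the window starting at windowStartMs. */
    int total(String team, long windowStartMs) {
        return totals.getOrDefault(team + "@" + windowStartMs, 0);
    }

    public static void main(String[] args) {
        TeamScoreSketch board = new TeamScoreSketch();
        long minute = 60_000L;
        board.add("red", 10, 5 * minute);   // event in the first hour
        board.add("red", 7, 65 * minute);   // event in the second hour
        board.add("red", 3, 50 * minute);   // late arrival that belongs to the first hour
        System.out.println("first hour: " + board.total("red", 0L));
        System.out.println("second hour: " + board.total("red", HOUR_MS));
    }
}
```

A real pipeline also has to decide how long to keep a window open for late data and when to emit each update, which is the part the trigger mechanisms cover.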

Page 47: Gaming analytics on gcp
Page 48: Gaming analytics on gcp

Sample Code on GitHub

Cloud Dataflow and Spark examples: http://goo.gl/vz1Cj5

● UserScore: Basic Score Processing in Batch

● HourlyTeamScore: Advanced Processing in Batch with Windowing

● LeaderBoard: Streaming Processing with Real-Time Game Data

● GameStats: Abuse Detection and Usage Analysis

Page 49: Gaming analytics on gcp

US Mobile Game Company goes Real-time Streaming

[Diagram: Streaming Pipeline. Real-time events (iOS) → Cloud Pub/Sub (asynchronous messaging) → Cloud Dataflow (parallel data processing) → BigQuery (analytics engine)]

Page 50: Gaming analytics on gcp


Thing 3: Speed up Development Time

Page 51: Gaming analytics on gcp

Building what’s next 51

Time to Understanding

Typical Big Data Processing

Programming

Resource provisioning

Performance tuning

Monitoring

Reliability

Deployment & configuration

Handling growing scale

Utilization improvements

Page 52: Gaming analytics on gcp


Time to Understanding

Big Data with Google: Focus on insight, not infrastructure.

Programming

Page 53: Gaming analytics on gcp

A sample data streaming logic (Dataflow vs Spark)

Page 54: Gaming analytics on gcp

● Speed: Provisions new services in seconds instead of days

● 10B logs: Google App Engine syncs with BigQuery to automatically store tens of billions of application logs so TabTale can analyze issues on a moment's notice

● TBs of info: Run queries on terabytes of information in a few seconds

● 10x faster: Can now deliver new backend features 10 times faster without dealing with infrastructure maintenance

“Our ability to provision new services in seconds saves us a lot of time, since it used to take days. The gaming industry is characterized by short-term projects, so it’s important for us to have a backend that is flexible and works fast.”

Page 55: Gaming analytics on gcp

Architecture breakdown: Batch

Page 56: Gaming analytics on gcp

Architecture breakdown: Stream


Page 57: Gaming analytics on gcp

GCP gaming telemetry reference architecture


Page 58: Gaming analytics on gcp

http://goo.gl/IdYxaa

Page 59: Gaming analytics on gcp

Machine Learning - TensorFlow

TensorFlow: open source manifestation of our ML capability

Machine Learning - Vision API

Label / Entity Detection, Facial Detection, OCR, Logo Detection, Safe Search

Machine Learning - Cloud Dataproc

Managed Hadoop, Hive, Spark; 90 secs to start cluster

Page 60: Gaming analytics on gcp


Like you, Google is committed to gaming

Use Google’s latest technologies to build, distribute, and monetize your games

Page 61: Gaming analytics on gcp

Thanks!