Gaming Analytics on GCP

Post on 19-Jan-2017


Creating a Gaming Analytics Platform

최명근, Cloud Platform Sales Engineer

Confidential & ProprietaryGoogle Cloud Platform 2

Free-to-play mobile gaming is delivered as a service.

Player engagement is key.

The goal is to have a unified view of the player.

Diverse Data Sources

● Data from user acquisition campaigns
● Data from Google Play and App Store
● Turnkey gaming metrics (e.g. player churn and spend predictions from Play Games Services)
● User behavior data from your website and mobile apps
● Custom game events
● Custom logs
● Custom player telemetry specific to your games

Continuum of Gaming Analytics (from Turnkey to Custom)

Standard metrics:
● DAU, MAU, ARPPU
● Player Progression
● Feature Engagement
● Spend
● Retention / Churn
● Daily revenue targets
● Fraud and cheating

Key indicators specific to your game:
● Activity in communities, joining guilds, # of friends in-game
● Reached meaningful milestone or achievement
● Time to first meaningful transaction
● Player response to specific A/B tests

Ask custom questions:
● How many players made it to stage 12?
● What path did they take through the stage?
● Health and other key stats at this point in time?
● Of the players who took the same route where a certain condition was true, how many made an in-app purchase?
● What are the characteristics of the player segment who didn’t make the purchase vs. those who did?
● Why was this custom event so successful in driving in-app purchases compared to others?
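A standard metric such as DAU and a custom question such as "How many players made it to stage 12?" both reduce to simple aggregations over game events. The following is a minimal plain-Java sketch of that idea, not anything shown in the talk. The `GameEvent` class and its fields (`userId`, `day`, `stage`) are hypothetical stand-ins for whatever schema a game actually logs.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical event shape; a real game would log many more fields.
final class GameEvent {
    final String userId;
    final LocalDate day;
    final int stage;

    GameEvent(String userId, LocalDate day, int stage) {
        this.userId = userId;
        this.day = day;
        this.stage = stage;
    }
}

final class PlayerMetrics {

    // DAU: number of distinct users seen on each day.
    static Map<LocalDate, Integer> dailyActiveUsers(List<GameEvent> events) {
        Map<LocalDate, Set<String>> usersByDay = new TreeMap<>();
        for (GameEvent e : events) {
            usersByDay.computeIfAbsent(e.day, d -> new HashSet<>()).add(e.userId);
        }
        Map<LocalDate, Integer> dau = new TreeMap<>();
        for (Map.Entry<LocalDate, Set<String>> entry : usersByDay.entrySet()) {
            dau.put(entry.getKey(), entry.getValue().size());
        }
        return dau;
    }

    // Distinct users with at least one event at or beyond the given stage.
    static long playersReachingStage(List<GameEvent> events, int stage) {
        Set<String> users = new HashSet<>();
        for (GameEvent e : events) {
            if (e.stage >= stage) {
                users.add(e.userId);
            }
        }
        return users.size();
    }

    public static void main(String[] args) {
        LocalDate day1 = LocalDate.of(2016, 4, 29);
        LocalDate day2 = LocalDate.of(2016, 4, 30);
        List<GameEvent> events = Arrays.asList(
            new GameEvent("gamer@example.com", day1, 11),
            new GameEvent("gamer@example.com", day1, 12),
            new GameEvent("player@example.com", day1, 3),
            new GameEvent("player@example.com", day2, 13));
        System.out.println("DAU: " + dailyActiveUsers(events));
        System.out.println("Reached stage 12: " + playersReachingStage(events, 12));
    }
}
```

At production scale the same aggregations would more likely be expressed as queries or pipeline steps; the in-memory loops here only make the logic explicit.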

3 Things to Remember

1. Speed up Batch Processing
2. Speed up from Batch to Real-Time
3. Speed up Development Time

1 - Speed up Batch Processing


"Getting Started" Pattern


Demo


Some of DeNA's Hadoop+Hive woes:

● Many bottlenecks & failure points
● 3-hour data ingestion lag
● Too many analysts at peak time
● Slow queries
● ...

Google confidential │ Do not distribute

BigQuery is different because:

● Analyze TBs in secs
● No servers
● Terabit network
● No pre-planning
● No indexes
● Large joins, any key
● Always On
● Planned downtime?
● Unlimited users
● Stream in <100k rps
● Secure, but shareable
● Free monthly quota (1 TB)


If that's the "Getting Started" pattern...

...what's NEXT?


Cloud Dataflow (Apache Beam) Pattern


Let's Dive Deeper:

Cloud Dataflow (Apache Beam)


Dataflow goodies

● Autoscaling mid-job
● Fully managed - No-Ops
● Intuitive Data Processing Framework
● Batch and Stream Processing in one
● Liquid sharding mid-job

Pipeline p = Pipeline.create();
p.begin()
    .apply(TextIO.Read.from("gs://…"))
    .apply(ParDo.of(new ExtractTags()))
    .apply(Count.create())
    .apply(ParDo.of(new ExpandPrefixes()))
    .apply(Top.largestPerKey(3))
    .apply(TextIO.Write.to("gs://…"));
p.run();
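The slide does not define `ExtractTags` or `ExpandPrefixes`. Assuming the pipeline is a hashtag-autocomplete job (extract hashtags, count them, expand each tag into its prefixes, keep the top three tags per prefix), a plain-Java, in-memory sketch of the same logic might look like the following. It does not use the Dataflow SDK, and every class and method name in it is illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TagAutocomplete {

    private static final Pattern TAG = Pattern.compile("#(\\w+)");

    // Assumed behavior of ExtractTags: pull every #hashtag out of a line of text.
    static List<String> extractTags(String line) {
        List<String> tags = new ArrayList<>();
        Matcher m = TAG.matcher(line);
        while (m.find()) {
            tags.add(m.group(1).toLowerCase());
        }
        return tags;
    }

    // Counting step: number of occurrences of each tag across all lines.
    static Map<String, Integer> countTags(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String tag : extractTags(line)) {
                counts.merge(tag, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Assumed behavior of ExpandPrefixes followed by a top-n-per-key step:
    // for every prefix of every tag, keep the n most frequent tags.
    static Map<String, List<String>> topPerPrefix(Map<String, Integer> counts, int n) {
        Map<String, List<String>> byPrefix = new TreeMap<>();
        for (String tag : counts.keySet()) {
            for (int len = 1; len <= tag.length(); len++) {
                byPrefix.computeIfAbsent(tag.substring(0, len), k -> new ArrayList<>()).add(tag);
            }
        }
        // Highest count first; ties broken alphabetically so results are deterministic.
        Comparator<String> byCountDesc = (a, b) -> {
            int c = Integer.compare(counts.get(b), counts.get(a));
            return c != 0 ? c : a.compareTo(b);
        };
        for (Map.Entry<String, List<String>> e : byPrefix.entrySet()) {
            List<String> tags = e.getValue();
            tags.sort(byCountDesc);
            if (tags.size() > n) {
                e.setValue(new ArrayList<>(tags.subList(0, n)));
            }
        }
        return byPrefix;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "#boss fight #boss", "#bonus round", "#boss #bot #bonus", "#bot #boom #boom #boom");
        System.out.println(topPerPrefix(countTags(lines), 3));
    }
}
```

A distributed runner would perform the counting and top-n steps per key in parallel; the sketch only shows what each stage computes.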


[Figure: autoscaling mid-job as load changes: 800 RPS, 1200 RPS, 5000 RPS, 50 RPS]

* means 100% cluster utilization by definition


Dataflow goodies: Batch and Stream Processing in one

Pipeline p = Pipeline.create();
p.begin()
    .apply(TextIO.Read.from("gs://…"))
    .apply(ParDo.of(new ExtractTags()))
    .apply(Count.create())
    .apply(ParDo.of(new ExpandPrefixes()))
    .apply(Top.largestPerKey(3))
    .apply(TextIO.Write.to("gs://…"));
p.run();

Streaming variant (Pub/Sub source and sink, with fixed windows):

    .apply(PubsubIO.Read.from("input_topic"))
    .apply(Window.<Integer>by(FixedWindows.of(5, MINUTES)))
    .apply(PubsubIO.Write.to("output_topic"));
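Fixed windowing, as in the `FixedWindows.of(5, MINUTES)` step above, assigns each element to a non-overlapping time bucket based on its timestamp. The following SDK-free sketch shows only that assignment step, assuming windows are aligned to the Unix epoch. `FixedWindowSketch`, `windowStart`, and `countPerWindow` are illustrative names, not part of the Dataflow SDK.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class FixedWindowSketch {

    // Start of the fixed window that contains the timestamp.
    // Assumption: windows are aligned to the Unix epoch.
    static Instant windowStart(Instant timestamp, Duration size) {
        long sizeMs = size.toMillis();
        long ts = timestamp.toEpochMilli();
        return Instant.ofEpochMilli(ts - Math.floorMod(ts, sizeMs));
    }

    // Number of elements that fall into each window, keyed by window start.
    static Map<Instant, Integer> countPerWindow(List<Instant> timestamps, Duration size) {
        Map<Instant, Integer> counts = new TreeMap<>();
        for (Instant t : timestamps) {
            counts.merge(windowStart(t, size), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Instant> timestamps = Arrays.asList(
            Instant.parse("2016-04-29T05:20:44Z"),
            Instant.parse("2016-04-29T05:22:39Z"),
            Instant.parse("2016-04-29T05:25:00Z"));
        System.out.println(countPerWindow(timestamps, Duration.ofMinutes(5)));
    }
}
```

Because the window is derived from the element's own timestamp, the same grouping applies whether the elements arrive from a file or from a live stream.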


2 - Speed up from Batch to Real-time

{
  "eventTime": "2016-04-29T05:22:39.201477414Z",
  "userId": "user529287@example.com",
  "sessionId": "413b4d97-634f-9463-3bc7-9f9f45ecdcb0",
  "sessionStartTime": "2016-04-29T05:20:44.201477414Z",
  "eventId": "playermissednpc",
  "currentQuest": 156,
  "npcId": "boss156",
  "battleId": "928f86db-187a-5ba4-f7cb-b6a7ded0056a",
  "playerAttackPoints": 15,
  "playerHitPoints": 2,
  "playerMaxHitPoints": 15,
  "playerArmorClass": 15,
  "npcAttackPoints": 15,
  "npcHitPoints": 8,
  "npcMaxHitPoints": 15,
  "npcArmorClass": 15,
  "attackRoll": 12
}

{
  "eventTime": "2016-04-29T05:22:39.201477414Z",
  "userId": "user529287@example.com",
  "sessionId": "413b4d97-634f-9463-3bc7-9f9f45ecdcb0",
  "sessionStartTime": "2016-04-29T05:20:44.201477414Z",
  "eventId": "playermissednpc",
  "currentQuest": 156,
  "npcId": "boss156",
  "battleId": "928f86db-187a-5ba4-f7cb-b6a7ded0056a",
  "player": [
    {"attackPoints": 15},
    {"hitPoints": 2},
    {"maxHitPoints": 15},
    {"armorClass": 15}
  ],
  "npcAttackPoints": 15,
  "npcHitPoints": 8,
  "npcMaxHitPoints": 15,
  "npcArmorClass": 15,
  "attackRoll": 12
}
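The two events above carry the same player stats, once as flat `player…` fields and once nested under a `player` array. The slides do not say which form is loaded or whether one is converted into the other. Assuming a pipeline step did need to turn the nested form into the flat one, a minimal sketch could look like this. `EventFlattener` is a hypothetical name, and the event is modeled as an already-parsed `Map` so that no JSON library is required.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class EventFlattener {

    // Copies the event, replacing a nested "player" list of single-key objects
    // with top-level fields such as "playerAttackPoints".
    static Map<String, Object> flatten(Map<String, Object> event) {
        Map<String, Object> flat = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : event.entrySet()) {
            boolean nestedPlayer =
                field.getKey().equals("player") && field.getValue() instanceof List;
            if (!nestedPlayer) {
                flat.put(field.getKey(), field.getValue());
                continue;
            }
            for (Object item : (List<?>) field.getValue()) {
                if (!(item instanceof Map)) {
                    continue; // skip anything that is not a stat object
                }
                for (Map.Entry<?, ?> stat : ((Map<?, ?>) item).entrySet()) {
                    String name = String.valueOf(stat.getKey());
                    if (name.isEmpty()) {
                        continue;
                    }
                    String flatName =
                        "player" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
                    flat.put(flatName, stat.getValue());
                }
            }
        }
        return flat;
    }

    public static void main(String[] args) {
        Map<String, Object> event = new LinkedHashMap<>();
        event.put("eventId", "playermissednpc");
        event.put("player", Arrays.asList(
            Collections.singletonMap("attackPoints", 15),
            Collections.singletonMap("hitPoints", 2)));
        event.put("attackRoll", 12);
        System.out.println(flatten(event));
    }
}
```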

Transform and Load: Cloud Dataflow

Streaming Pipeline:
1. Real-time Events (e.g. from iOS clients)
2. Cloud Pub/Sub: Asynchronous messaging
3. Cloud Dataflow: Parallel data processing
4. BigQuery: Analytics Engine

Real-time events such as

{ ... "userId":"gamer@example.com", "damageRoll":13, ...}
{ ... "userId":"player@example.com", "damageRoll":8, ...}

are loaded as table rows:

... | userId             | ... | damageRoll | ...
... | gamer@example.com  | ... | 13         | ...
... | player@example.com | ... | 8          | ...


Leaderboard Example

Reads game data published in near real-time, and uses that data to perform two separate processing tasks:

● Calculates the total score for every unique user and publishes speculative results for every ten minutes of processing time.

● Calculates the team scores for each hour that the pipeline runs using fixed-time windowing.

● In addition, the team score calculation uses Dataflow's trigger mechanisms to provide speculative results for each hour (which update every five minutes until the hour is up), and to also capture any late data and add it to the specific hour-long window to which it belongs.
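The two calculations in the Leaderboard example can be sketched without the SDK. The code below is a simplified, in-memory illustration and not the sample code itself: it totals scores per user and groups team scores into hour-long windows by event time, so an event that arrives late still lands in the hour in which it occurred. Triggers and speculative results are not modeled. `ScoreEvent` and its fields are hypothetical.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical score event; the field names are illustrative.
final class ScoreEvent {
    final String user;
    final String team;
    final int score;
    final Instant eventTime;

    ScoreEvent(String user, String team, int score, String eventTime) {
        this.user = user;
        this.team = team;
        this.score = score;
        this.eventTime = Instant.parse(eventTime);
    }
}

final class LeaderboardSketch {

    // Total score for every unique user across all events.
    static Map<String, Integer> userTotals(List<ScoreEvent> events) {
        Map<String, Integer> totals = new TreeMap<>();
        for (ScoreEvent e : events) {
            totals.merge(e.user, e.score, Integer::sum);
        }
        return totals;
    }

    // Team totals per hour-long window, keyed by window start and then by team.
    // The window is chosen from the event time, not the arrival order.
    static Map<Instant, Map<String, Integer>> hourlyTeamScores(List<ScoreEvent> events) {
        Map<Instant, Map<String, Integer>> byHour = new TreeMap<>();
        for (ScoreEvent e : events) {
            Instant hour = e.eventTime.truncatedTo(ChronoUnit.HOURS);
            byHour.computeIfAbsent(hour, h -> new TreeMap<>())
                  .merge(e.team, e.score, Integer::sum);
        }
        return byHour;
    }

    public static void main(String[] args) {
        // Listed in arrival order; the last event arrives after a 06:05 event
        // but belongs to the 05:00 window.
        List<ScoreEvent> events = Arrays.asList(
            new ScoreEvent("alice", "red", 5, "2016-04-29T05:10:00Z"),
            new ScoreEvent("bob", "blue", 3, "2016-04-29T05:50:00Z"),
            new ScoreEvent("carol", "red", 7, "2016-04-29T06:05:00Z"),
            new ScoreEvent("alice", "red", 2, "2016-04-29T05:59:00Z"));
        System.out.println(userTotals(events));
        System.out.println(hourlyTeamScores(events));
    }
}
```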


Sample Code on GitHub: Cloud Dataflow and Spark examples

http://goo.gl/vz1Cj5

● UserScore: Basic Score Processing in Batch
● HourlyTeamScore: Advanced Processing in Batch with Windowing
● LeaderBoard: Streaming Processing with Real-Time Game Data
● GameStats: Abuse Detection and Usage Analysis

US Mobile Game Company goes Real-time Streaming

[Diagram: Streaming Pipeline: Real-time Events (iOS) to Cloud Pub/Sub (asynchronous messaging) to Cloud Dataflow (parallel data processing) to BigQuery (analytics engine)]

Thing 3: Speed up Development Time

Building what’s next 51

Time to Understanding

Typical Big Data Processing:
● Programming
● Resource provisioning
● Performance tuning
● Monitoring
● Reliability
● Deployment & configuration
● Handling growing scale
● Utilization improvements

Big Data with Google: Focus on insight, not infrastructure.
● Programming

Sample data streaming logic (Dataflow vs Spark)

● Speed: Provisions new services in seconds instead of days
● 10B logs: Google App Engine syncs with BigQuery to automatically store tens of billions of application logs so TabTale can analyze issues on a moment's notice
● TBs of info: Run queries on terabytes of information in a few seconds
● 10x faster: Can now deliver new backend features 10 times faster without dealing with infrastructure maintenance

“Our ability to provision new services in seconds saves us a lot of time, since it used to take days. The gaming industry is characterized by short-term projects, so it’s important for us to have a backend that is flexible and works fast.”

Architecture breakdown: Batch

Architecture breakdown: Stream

GCP gaming telemetry reference architecture

http://goo.gl/IdYxaa

Machine Learning - TensorFlow: open source manifestation of our ML capability

Machine Learning - Vision API: Label / Entity Detection, Facial Detection, OCR, Logo Detection, Safe Search

Machine Learning - Cloud Dataproc: Managed Hadoop, Hive, Spark; 90 secs to start cluster

Like you, Google is committed to gaming

Use Google’s latest technologies to build, distribute, and monetize your games

Thanks!
