B-2 LINE Game Cloud - Our Personal EC2

Post on 16-Apr-2017


Currently on the Cloud

1 Our situation in 2014
2 How we improved
3 Sweet things
4 More sweet things
5 Future

Our situation in 2014

1st gen: “HSP” (2014), from the cybercafe / manga café / PC bang platform

2nd gen: “LGC” (cloud release)

3rd gen: “Trident” (current)

ECDH-based key exchange

Platform: Billing/AAA/Monitoring, etc. & Game servers

Globalization: issues abroad

Loading… Loading… Loading… Loading…

Fail!

Process

Dev/QA/Sandbox/REAL…

Get VM

Get L4 binding

ACLs/storage, etc.

From hours to days…

Thus, the Game Cloud project began…

How we improved

Global

In our case: GSLB (Global Server Load Balancing)

HAProxy instead of hardware L4

Multi-team effort: client, server, cloud

We get: more flexibility, less latency
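As a rough illustration of why GSLB can reduce latency, the sketch below picks the healthy POP with the lowest measured round-trip time for a client. The POP names are taken from the LINE Global POP list; the latency figures and the `pick_pop` helper are invented for illustration and do not describe LINE's actual GSLB logic.

```python
from typing import Dict, Set


def pick_pop(latency_ms: Dict[str, float], healthy: Set[str]) -> str:
    """Return the healthy POP with the lowest measured latency."""
    candidates = {pop: ms for pop, ms in latency_ms.items() if pop in healthy}
    if not candidates:
        raise RuntimeError("no healthy POP available")
    return min(candidates, key=candidates.get)


# Hypothetical measurements from one client region (values are invented).
measured = {"Tokyo": 95.0, "Seoul": 110.0, "Singapore": 40.0}

all_up = pick_pop(measured, healthy={"Tokyo", "Seoul", "Singapore"})
sg_down = pick_pop(measured, healthy={"Tokyo", "Seoul"})
print(all_up)   # Singapore
print(sg_down)  # Tokyo
```

When the closest POP is marked unhealthy, the same call falls back to the next-best one, which is the flexibility a software load balancer gives over a fixed hardware binding.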

LINE Global POP

New York, Tokyo, Seoul, Hong Kong, Singapore, Beijing, Frankfurt

Network layer control

High latency

Fit for cloud

Global: Testing in Thailand

[Chart: paths TH → KR and TH → SG → KR, labeled GSLB and HAProxy; axis ticks 500, 1000, 1500; data values not recoverable]

Process

Dev

Ops Government

The structure of your organization affects the structure of your software. And vice-versa!

DevOps Small Startup CHOOSE

Process

Progressive / Easy, simple

Requirements for our new platform: we have many third parties and technology stacks involved...

Etc.

Process: For ourselves

simple, reliable, future-proof

Process: Why not…

Docker Swarm or CoreOS

Mesos

Kubernetes

Process

KEEP THINGS SIMPLE AND RELIABLE!

For distributed systems, minimize coordination

A good paper: https://blog.acolyer.org/2016/01/19/dcft/

• Polling
• One-way dataflow
• Idempotency
• Commutativity
• Limited trust
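To make polling and idempotency concrete, here is a minimal sketch of a reconcile step that an agent could run on every poll: it compares the desired instance counts with what is running and returns only the difference. The `reconcile` function and the service names are hypothetical, not part of the LGC code base.

```python
from typing import Dict, List, Tuple

Action = Tuple[str, str, int]  # (verb, service, instance count)


def reconcile(desired: Dict[str, int], running: Dict[str, int]) -> List[Action]:
    """Move `running` toward `desired` and return the actions that were needed."""
    actions: List[Action] = []
    for service in sorted(set(desired) | set(running)):
        want = desired.get(service, 0)
        have = running.get(service, 0)
        if want > have:
            actions.append(("start", service, want - have))
        elif want < have:
            actions.append(("stop", service, have - want))
        # Record the new state; services that are no longer wanted disappear.
        if want:
            running[service] = want
        else:
            running.pop(service, None)
    return actions


desired = {"game-a": 3, "game-b": 1}
running = {"game-a": 1, "old": 2}

first = reconcile(desired, running)
second = reconcile(desired, running)  # same input again: nothing left to do
print(first)   # [('start', 'game-a', 2), ('start', 'game-b', 1), ('stop', 'old', 2)]
print(second)  # []
```

Because the second poll produces no actions, a lost or repeated poll is harmless, and the agent never needs to coordinate with anything beyond reading the desired state.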

LGC Story

Games planned for release were suddenly canceled but we needed to show results!

Strong “sales” efforts to release other games on the LGC platform

Putting Out Fires

The release was a success, followed by quick scaling-up, and then our first fires…

TECHNICAL

Riak fire: the system works with Riak down

OE fire: the system works with OE down

Hardware and conf fires (TDI! Soon to come!)

Full container reboot improved our design through limited trust

Sweet things

Domain HAP

Launch service

Configure / load balance

Expose ports

Bind URL

In one click!

Monitoring

More sweet things

Gearbox Auto Scaling System

High availability Low cost

Why Do We Need It?

How Does It Work?

How Did We Build it?

[Diagram: Gearbox]

Components: Data Collector, Predictor, Scaler

External APIs: Monitoring API, Game Cloud API

Data labels: Raw Metrics, Metrics, States, Execute Scaling
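The three stages named in the diagram (collect, predict, scale) can be sketched as a small pipeline. Everything specific here is an assumption made for illustration: the CPU-based metric, the target-utilisation formula, and the `Policy` fields are not taken from Gearbox itself.

```python
import math
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class Policy:
    target_cpu: float    # desired average CPU utilisation, between 0 and 1
    min_instances: int
    max_instances: int


def collect(raw_cpu_samples: List[float]) -> float:
    """Collector stage: reduce raw samples to a single metric."""
    return mean(raw_cpu_samples)


def predict(avg_cpu: float, current: int, policy: Policy) -> int:
    """Predictor stage: instances needed to bring CPU back to the target."""
    return math.ceil(current * avg_cpu / policy.target_cpu)


def scale(wanted: int, policy: Policy) -> int:
    """Scaler stage: clamp to the policy bounds before calling the cloud API."""
    return max(policy.min_instances, min(policy.max_instances, wanted))


policy = Policy(target_cpu=0.5, min_instances=2, max_instances=10)
busy = scale(predict(collect([0.9, 0.8, 0.7]), current=4, policy=policy), policy)
idle = scale(predict(collect([0.05, 0.05]), current=4, policy=policy), policy)
print(busy, idle)  # 7 2
```

Keeping each stage a pure function of its input makes it easy to log the intermediate values and to replay a scaling decision later.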

Challenges

Complex queries

Plenty of metrics: millions of records per day

Scalability of the auto-scaling system itself

Solutions

[Diagram: Gearbox with storage and admin site]

Module: Data Collector, Predictor, Scaler (Gearbox)

External APIs: Monitoring API, Game Cloud API

Storage: ElasticSearch

Admin Site

Labels shown: Metrics, States, Execute Scaling, Strategy, Scaler Log, Predictor Log

Numbered inputs shown: 1. Strategy 2. Metrics; 1. Strategy 2. Scaler Log

Knife Admin Site

Deploying a New Service

Upgrade

Configuring the Auto-Scaling Policy

Back to Jojo: what’s coming next!

Future

QUIC

SDN: ACL, IP by container, VLAN, etc.

Cloud storage

TDI

Distributed GC – link paper

DCTCP

Image GC

Future

UX

Helpers/presets

Speed

Doc/tests/guides…

Reliability

QUIC: Quick UDP Internet Connections

Cloud Storage

SDN: Software-Defined Networking

Container-specific IP, ACLs, VLANs

TDI: Test-Driven Infrastructure

Automated testing for:

hardware
OS configuration
images
backup/restore
firmware, version, etc.
Etc.
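One way to picture test-driven infrastructure is a declared set of expected host facts plus a check that reports every mismatch. This is only a sketch: the fact names, the expected values, and the `check_host` helper are invented, and a real setup would gather the observed facts from the machines themselves.

```python
from typing import Dict, List

# Hypothetical expectations for a host (values are invented for illustration).
EXPECTED: Dict[str, str] = {
    "firmware": "1.2.0",
    "kernel": "3.18",
    "swap_enabled": "no",
}


def check_host(observed: Dict[str, str], expected: Dict[str, str] = EXPECTED) -> List[str]:
    """Return one failure message per fact that is missing or different."""
    failures: List[str] = []
    for key, want in sorted(expected.items()):
        got = observed.get(key, "<missing>")
        if got != want:
            failures.append(f"{key}: expected {want}, got {got}")
    return failures


good_host = {"firmware": "1.2.0", "kernel": "3.18", "swap_enabled": "no"}
bad_host = {"firmware": "1.1.9", "kernel": "3.18"}

print(check_host(good_host))  # []
print(check_host(bad_host))
```

An empty list means the host matches its specification; anything else can fail a build before the host takes traffic.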

Distributed GC

http://arxiv.org/pdf/1504.02578.pdf

          Max       Avg. GC pause   Median   Std. Dev.   Mean
GC off    7.847     0.0             2.296    0.579       2.312
Blade     7.743     12.243          2.294    0.582       2.311
GC on     164.206   12.339          2.297    3.395       2.403

DCTCP: Data Center TCP

- high burst tolerance
- low latency
- high throughput

Added in Linux 3.18: https://kernelnewbies.org/Linux_3.18

http://simula.stanford.edu/~alizade/Site/DCTCP.html

Distributed GC Because we generate tons of Docker images
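A minimal sketch of how unused Docker images might be selected for collection, assuming a simple rule: an image is garbage when no running service references it and it is older than a grace period. The rule, the `images_to_delete` helper, and the tags and timestamps are assumptions for illustration, not the design described in the talk.

```python
from typing import Dict, List, Set


def images_to_delete(
    images: Dict[str, int],   # image tag -> creation time (seconds)
    in_use: Set[str],         # tags referenced by running services
    now: int,
    min_age: int,
) -> List[str]:
    """Return tags that are unreferenced and at least `min_age` seconds old."""
    return sorted(
        tag
        for tag, created in images.items()
        if tag not in in_use and now - created >= min_age
    )


# Invented registry contents.
images = {"game-a:1": 100, "game-a:2": 900, "game-b:1": 200}

garbage = images_to_delete(images, in_use={"game-a:2"}, now=1000, min_age=300)
print(garbage)  # ['game-a:1', 'game-b:1']
```

The grace period keeps freshly pushed images from being removed before any service has had a chance to reference them.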

And more and more…

Riak / Choose a Safe and Simple Friend

• AP
• Optional CP
• Index/search
• CRDT
• Multiple backends
• User ACL support

Make a deliberate choice of consistency model
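For readers unfamiliar with CRDTs, the sketch below shows the textbook grow-only counter (G-Counter): each replica increments its own slot, and merging takes the element-wise maximum, so replicas converge regardless of merge order. This is a generic illustration of the idea, not Riak's implementation or client API.

```python
from typing import Dict


class GCounter:
    """Grow-only counter: one slot per replica, merged with element-wise max."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        # max() is commutative, associative, and idempotent, so merge order
        # and repeated merges do not change the result.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

    def value(self) -> int:
        return sum(self.counts.values())


a, b = GCounter("a"), GCounter("b")
a.increment(2)
b.increment(3)
a.merge(b)
b.merge(a)
a.merge(b)  # merging again changes nothing
print(a.value(), b.value())  # 5 5
```

This is the kind of structure that lets an AP store accept writes on both sides of a partition and still reach a single value afterwards.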

SQL

NoSQL

But actually…

With the authorization of Kyle Kingsbury (Aphyr)

DataScript / Maintain Queries
