Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016


Upload: datastax

Post on 16-Apr-2017


TRANSCRIPT

Page 1: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

Cassandra @

Page 2: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

About me

Satoshi Konno
http://www.cybergarage.org

• Engineering Manager of NoSQL Team @ Yahoo! Japan

• Open Source Software Developer for Virtual Reality, IoT and Cloud Computing

• Doctoral Course Student @ JAIST, Défago Lab : The φ accrual failure detector

Page 3: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

Agenda

• Company Profile

• Summary of C* Clusters

• Issues and Solutions of C*

• Next Generation Infrastructures for C*

Page 4: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

Company Profile


Page 5: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

Company Profile

Founded : January 31, 1996

Businesses : Internet Advertising, e-Commerce, Members Services, etc.

Web Services : 100+

Smartphone Apps : 50+ (iOS), 50+ (Android)

Employees : 5,800+ (as of June 30, 2016)

Head Office : Chiyoda-ku, Tokyo, Japan

Page 6: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

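Shareholder Composition

An independent and public company in the Japanese market

[Figure: shareholders in the U.S. and Japan, with stakes of 35.5% and 42.9%; market caps of $22 billion, $29 billion, and $60 billion. Company labels are not recoverable from the transcript.]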

Page 7: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

18th Largest Internet Company in Market Cap

[Bar chart: market value of the largest internet companies worldwide, in billion U.S. dollars (axis 0 to 600)]

Source: http://www.statista.com/statistics/277483/market-value-of-the-largest-internet-companies-worldwide/

Page 8: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

Continued Growth Sustained

19 years

Revenue ¥652B, Operating Income ¥171B (FY2015)

Page 9: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

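Revenue Portfolio (FY2015)

[Pie chart: segments Marketing Solutions, Consumer, and Others, with shares of 60%, 32%, and 8%. The segment-to-share mapping is not recoverable from the transcript.]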

Page 10: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

Extensive Reach to a Wide Range of Users

80% of all Japanese Internet users use Yahoo! JAPAN

Source: Nielsen NetView June 2015 : Data by Brands. Access from home and work using PCs (excl. internet applications)

Page 11: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

Many Strong Services

[Figure: services in the US and JP, by category]

Categories : Media, Search, Video, Answer, Mail, Membership, C2C, Payment, C2C EC, B2C EC, Local

Service labels : Search, Knowledge search, Mail, News, YAHUOKU!, Premium, Wallet, Loco

Page 12: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

Summary of C* Clusters


Page 13: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

Yahoo! JAPAN Database Platforms

300+ Systems

100+ Services

NoSQL Team

Page 14: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

OSS Database Platforms

300+ Systems (Yahoo Japan)

• MySQL : 180 Systems, 630 DBs (RDB Team)

• Cassandra : 100 Systems, 130 DBs (NoSQL Team)

[Chart values without recoverable labels: 30, 70, 60, 40]

Page 15: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

Cassandra @ Yahoo! JAPAN

[Timeline: 2010, 2012, 2014, 2016]

• Service Departments : C* 0.5, 0.8, 1.x

• Our Team (NoSQL Team) : C* 0.8, 1.x, 2.x, 3.x

Page 16: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

Our Cassandra Clusters (2016)

• 30 Clusters, 1000+ Nodes, 3 DCs

• 30 TB Usage

• 300,000 Reads/sec, 100,000 Writes/sec

• Cluster size : 10 Nodes / Cluster to 160 Nodes / Cluster

• 1 Shared Cluster and 30 Special Clusters

• Figure labels : 30 Systems, 50 Systems

Page 17: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

Our Use Case Summary on Cassandra

100 Systems

• Database Caching : 20

• Advertising Services : 10

• User Databases : 40

• Service Databases : 50

Browsing History

Impression Data

・・・・

Meta Data

Aggregated Data

・・・・

Generated Data

Session Data

Meta Data

Aggregated Data

・・・・

Generated Data

Recommendation

Demographic Data

Life Log

・・・・

Preference Data

Behavior History

Page 18: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

Our Issues and Solutions


Page 19: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

ISSUE #1 : C10k Problem – C* Proxy

• PC + Tablet : 3.36B PV

• Smart Device : 3.45B PV

• Total : 6.8 Billion PV / month

Page 20: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

ISSUE #1 : C10k Problem – C* Proxy

Yahoo Japan Services : 10 to 200 Front-end Servers / Service

Page 21: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

ISSUE #1 : C10k Problem – C* Proxy

• PROBLEM : 200 front-end servers * 128 processes * 2 (C* request + C* heart beat) = 51,200 connections / node

Page 22: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016


Page 23: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

ISSUE #1 : C10k Problem – C* Proxy

• PROBLEM : With 51,200 connections / node, the process goes down.

Page 24: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

ISSUE #1 : C10k Problem – C* Proxy

• SOLUTION : Put 1 proxy on each front-end server in front of its 128 processes, so that only the proxy connects to C*: 200 front-end servers * 1 proxy * 2 (C* request + C* heart beat) = 400 connections / node
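The connection counts on the C10k slides can be checked with a few lines of arithmetic. This is a minimal sketch, assuming one proxy per front-end server (inferred from 400 = 200 * 1 * 2); the function name `connections_per_node` is hypothetical and not part of any C* tool.

```python
def connections_per_node(front_end_servers, clients_per_server, conns_per_client=2):
    """Connections one C* node sees from a service's front-end tier.

    conns_per_client=2 follows the slides: one connection for C* requests
    and one for the C* heart beat.
    """
    return front_end_servers * clients_per_server * conns_per_client


# Without a proxy: each of the 128 processes on each server connects directly.
direct = connections_per_node(front_end_servers=200, clients_per_server=128)

# With one proxy per server (assumption): only the proxy connects to the node.
proxied = connections_per_node(front_end_servers=200, clients_per_server=1)

print(direct)   # 51200
print(proxied)  # 400
```

Under these assumptions the proxy removes the factor of 128, which is the whole saving.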

Page 25: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

ISSUE #2 : Bootstrap Problem - Driver

• Heavy Services : more than 3000 qps/node = C* cluster with real servers (SSD is recommended)

• Light Services : less than 1000 qps/node and less than 3 GB/node = C* cluster with virtual servers on OpenStack

Heavy Service : CPU = Good

Light Service : vCPU = Cheap
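The heavy/light rule of thumb above can be written as a small function. This is only a sketch of the thresholds stated on the slide; the function name `choose_cluster_type` is hypothetical, and the slide does not say what happens between the two ranges, so that case returns `None` here.

```python
def choose_cluster_type(qps_per_node, gb_per_node):
    """Map a service's per-node load to the cluster type named on the slide."""
    if qps_per_node > 3000:
        # Heavy service: real servers, SSD recommended.
        return "real servers"
    if qps_per_node < 1000 and gb_per_node < 3:
        # Light service: virtual servers on OpenStack.
        return "virtual servers on OpenStack"
    # The slide gives no rule for loads between the two ranges.
    return None


print(choose_cluster_type(qps_per_node=5000, gb_per_node=50))
print(choose_cluster_type(qps_per_node=300, gb_per_node=1))
```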

Page 26: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

ISSUE #2 : Bootstrap Problem - Driver

• PROBLEM : When a new C* node is added to the cluster, all processes on every front-end server try to connect to it at the same time.

vCPU = Cheap

Page 27: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

ISSUE #2 : Bootstrap Problem - Driver

• PROBLEM : C* authentication is based on BCrypt, which is heavy processing for the vCPU nodes.

vCPU : Authentication (BCrypt) is heavy !

Page 28: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

ISSUE #2 : Bootstrap Problem - Driver

• PROBLEM : Most processes cannot connect to the C* clusters on OpenStack because of the authentication processing. They time out and then retry endlessly, without waiting between attempts.

vCPU : Authentication (BCrypt) is heavy !

All vCPU Usages = 100% !

Timeout ! Retry !

Page 29: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

ISSUE #2 : Bootstrap Problem - Driver

• SOLUTION : Improve the C* drivers so that they do not all reconnect simultaneously when a connection fails.
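The slides do not say how the drivers were changed. One common way to keep many clients from reconnecting in lockstep is exponential backoff with random jitter, sketched below as an assumption rather than as the actual driver patch. The names `backoff_delays` and `connect_with_backoff` and the default values are hypothetical.

```python
import random


def backoff_delays(base=0.5, cap=30.0, attempts=6, rng=None):
    """Yield one randomized delay (seconds) per attempt ("full jitter")."""
    rng = rng or random.Random()
    for attempt in range(attempts):
        # The ceiling doubles each attempt, up to the cap.
        ceiling = min(cap, base * (2 ** attempt))
        # A random delay spreads clients out instead of retrying together.
        yield rng.uniform(0.0, ceiling)


def connect_with_backoff(connect, sleep, **kwargs):
    """Call connect() until it succeeds, sleeping a jittered delay after each failure."""
    last_error = ConnectionError("no connection attempts were made")
    for delay in backoff_delays(**kwargs):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            sleep(delay)
    raise last_error
```

In real use `sleep` would be `time.sleep` and `connect` would open the driver session; both are passed in so the retry logic can be exercised without a cluster.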

Page 30: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

ISSUE #3 : Multi-tenancy – Slow Query

• Small Services : (less than 500 qps and less than 10 GB) / keyspace = Shared C* cluster with real servers

Shared Cluster : 50 Services

Page 31: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

ISSUE #3 : Multi-tenancy – Slow Query

• PROBLEM : We could not identify which service was causing the high-load queries in the multi-tenant cluster.

Shared Cluster : Which services ?

Page 32: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

ISSUE #3 : Multi-tenancy – Slow Query

• SOLUTION : CASSANDRA-12403 - Slow query detection

[Figure: a slow query is detected in the Shared Cluster, and the service is removed to a Special Cluster]
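Once slow queries are detected, attributing them to a tenant is a matter of grouping. The sketch below assumes each service on the shared cluster has its own keyspace and that slow-query records are available as `(keyspace, elapsed_ms)` pairs. Both the record shape and the 500 ms threshold are hypothetical and do not reflect the log format introduced by CASSANDRA-12403.

```python
from collections import Counter


def slow_queries_by_keyspace(records, threshold_ms=500):
    """Count queries slower than threshold_ms, grouped by keyspace."""
    counts = Counter()
    for keyspace, elapsed_ms in records:
        if elapsed_ms > threshold_ms:
            counts[keyspace] += 1
    # Most frequent offenders first.
    return counts.most_common()


# Hypothetical sample data for illustration only.
sample = [("shop", 12), ("news", 950), ("shop", 40), ("news", 1300), ("mail", 700)]
print(slow_queries_by_keyspace(sample))  # [('news', 2), ('mail', 1)]
```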

Page 33: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

ISSUE #4 : Multi-racking – Inbound Params

• PROBLEM : Our C* clusters are built alongside other services in the same rack or under the same core switch.

Page 34: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

ISSUE #4 : Multi-racking – Inbound Params

• PROBLEM : C* streaming occurs when a node is added or removed, either by our operations or by failure detection.

Page 35: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

ISSUE #4 : Multi-racking – Inbound Params

• PROBLEM : C* streaming generates heavy traffic, which disrupts the other services.

[Figure: several streams, each throttled only by stream_throughput_outbound]

Stop C* streaming !

Page 36: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

ISSUE #4 : Multi-racking – Inbound Params

• SOLUTION : CASSANDRA-11303 - New inbound throughput parameters for streaming

[Figure: several streams, each throttled by stream_throughput_outbound on the sending side and stream_throughput_inbound on the receiving side]
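A short calculation shows why an inbound limit helps when outbound limits alone do not. It assumes the outbound limit applies per sending node, so a node receiving from several senders can be offered the sum of their limits. The function name and the Mbps figures are hypothetical.

```python
def inbound_rate_mbps(senders, outbound_limit_mbps, inbound_limit_mbps=None):
    """Peak streaming rate that can arrive at one receiving node."""
    # Each sender is throttled on its own, so their rates add up at the receiver.
    offered = senders * outbound_limit_mbps
    if inbound_limit_mbps is None:
        return offered
    # An inbound limit caps the total regardless of the number of senders.
    return min(offered, inbound_limit_mbps)


print(inbound_rate_mbps(senders=10, outbound_limit_mbps=200))                          # 2000
print(inbound_rate_mbps(senders=10, outbound_limit_mbps=200, inbound_limit_mbps=400))  # 400
```

Under this model, traffic on the receiving node's rack link is bounded by a single parameter rather than by the number of peers streaming to it.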

Page 37: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

Next Generation Infrastructures for C*


Page 38: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

OpenStack @ Yahoo! JAPAN

• PURPOSE : To abstract our data center resources using OpenStack.

[Figure: Apps, Platforms, and Infrastructures layers connected through APIs]

50,000+ instances

Page 39: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

TRIAL #1 : Special Hypervisor for C*

• PROBLEM : Our OpenStack hypervisors host both C* VMs and other services' VMs.

Noisy Neighbours

Page 40: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

TRIAL #1 : Special Hypervisor for C*

• SOLUTION : We are trying to offer special hypervisors that run only C* VMs.

Optimal Flavors for C* : vCPU : 8+, Mem : 16 GiB+, SSD : 100 GiB+

10 Gbps x 2

Page 41: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

TRIAL #2 : Bare Metal Clusters for C*

• PROBLEM : OpenStack vCPUs are too cheap (underpowered) to run a C* node in our particular service environment, for example with its many connections.

vCPU : Authentication (BCrypt) is heavy !

Page 42: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016

TRIAL #2 : Bare Metal Clusters for C*

• SOLUTION : We are trying to offer special bare metal clusters that run only C*, using OpenStack Ironic.

Ironic : Xeon D-1541 2.1 GHz (1 CPU), 32 GB MEM / SATA SSD 400 GB

10 Gbps x 2

Page 43: Cassandra @ Yahoo Japan (Satoshi Konno, Yahoo) | Cassandra Summit 2016