cassandra @ yahoo japan (satoshi konno, yahoo) | cassandra summit 2016
Cassandra @
About me
Satoshi Konno
http://www.cybergarage.org
• Engineering Manager of NoSQL Team @ Yahoo! Japan
• Open Source Software Developer for Virtual Reality, IoT and Cloud Computing
• Doctoral Course Student @ JAIST, Défago Lab : The φ accrual failure detector
Agenda
• Company Profile
• Summary of C* Clusters
• Issues and Solutions of C*
• Next Generation Infrastructures for C*
Company Profile
Founded : January 31, 1996
Businesses : Internet Advertising, e-Commerce, Members Services, etc.
Web Services : 100+
Smartphone Apps : 50+ (iOS), 50+ (Android)
Employees : 5,800+ (as of June 30, 2016)
Head Office : Chiyoda-ku, Tokyo, Japan
Shareholder Composition
An independent and public company in the Japanese Market
[Chart: shareholder composition, U.S. and Japan; shares of 35.5% and 42.9%; market caps of $22 billion, $29 billion, and $60 billion]
18th Largest Internet Company in market cap
[Bar chart: market value of the largest internet companies worldwide, in billion U.S. dollars, axis from 0 to 600]
http://www.statista.com/statistics/277483/market-value-of-the-largest-internet-companies-worldwide/
19 years
Revenue ¥652B, Operating Income ¥171B (FY2015)
Continued Growth Sustained
Revenue Portfolio (FY2015)
[Pie chart: segments Marketing Solutions, Consumer, and Others; shares of 60%, 32%, and 8%]
Extensive Reach to a Wide Range of Users
80% of all Japanese Internet users use Yahoo! JAPAN
Source : Nielsen NetView, June 2015 : Data by Brands. Access from home and work using PCs (excl. internet applications)
Many Strong Services
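[Figure: US and JP services compared by category]
Categories (first row) : Media, Search, Video, Answer, Mail
Categories (second row) : Membership C2C Payment C2C EC B2C EC Local
JP services : Search, Knowledge search, Mail, News, YAHUOKU!, Premium, Wallet, Loco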
Summary of C* Clusters
Yahoo! JAPAN Database Platforms
300+ Systems
NoSQL Team
100+ Services
OSS Database Platforms
Yahoo Japan : 300+ Systems
RDB Team : 180 Systems, MySQL, 630 DBs
NoSQL Team : 100 Systems, Cassandra, 130 DBs
[Chart values: 30, 70, 60, 40]
Cassandra @ Yahoo! JAPAN
Timeline : 2010, 2012, 2014, 2016
Service Departments : C* 0.5, 0.8, 1.x
Our Team (NoSQL Team) : C* 0.8, 1.x, 2.x, 3.x
Our Cassandra Clusters
30 Clusters
30 TB Usage
1000+ Nodes
300,000 Reads/sec
100,000 Writes/sec
3 DCs
10 Nodes / Cluster … 160 Nodes / Cluster (2016)
1 Shared Cluster : 50 Systems
30 Special Clusters : 30 Systems
Our Use Case Summary on Cassandra
100 Systems
Database Caching : 20
Advertising Services : 10
User Databases : 40
Service Databases : 50
Browsing History
Impression Data
・・・・
Meta Data
Aggregated Data
・・・・
Generated Data
Session Data
Meta Data
Aggregated Data
・・・・
Generated Data
Recommendation
Demographic Data
Life Log
・・・・
Preference Data
Behavior History
Our Issues and Solutions
ISSUE #1 : C10k Problem – C* Proxy
PC + Tablet : 3.36B PV
Smart Device : 3.45B PV
6.8 Billion PV / month
Yahoo Japan Services : 10 to 200 Front-end Servers / Service
• PROBLEM : 200 front-end servers * 128 processes * 2 (C* request + C* heartbeat) = 51,200 connections / node
Process down
• SOLUTION : Put 1 proxy on each front-end server, in front of its 128 processes : 200 front-end servers * 1 proxy * 2 (C* request + C* heartbeat) = 400 connections / node
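The connection counts on these slides can be checked with a few lines of arithmetic. This is only a sketch: the function name and parameters are illustrative and not from the talk, and it assumes each client holds two connections per C* node (request and heartbeat) and that one proxy runs on each front-end server, which is what makes the 400 figure work out.

```python
def connections_per_node(servers, clients_per_server, sockets_per_client=2):
    """Connections a single C* node sees from one service's front-end fleet."""
    return servers * clients_per_server * sockets_per_client


# Every one of the 128 processes connects to the node by itself.
direct = connections_per_node(servers=200, clients_per_server=128)

# One proxy per front-end server multiplexes its 128 local processes (assumed layout).
proxied = connections_per_node(servers=200, clients_per_server=1)

print(direct, proxied)
```

The 128 processes drop out of the product once they share a proxy, so the load on each node grows with the number of servers rather than the number of processes.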
ISSUE #2 : Bootstrap Problem - Driver
• Heavy Services : ↑3000qps/node = C* cluster with real servers (SSD is recommended)
• Light Services : ↓1000qps/node and ↓3GB/node = C* cluster with virtual servers on OpenStack
Heavy Service : CPU = Good
Light Service : vCPU = Cheap
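The sizing rule on this slide can be written down as a small function. This is a sketch, not the team's tooling: the function name and return strings are made up for illustration, the arrows are read as "above" (↑) and "below" (↓), and the slide does not say what happens between 1000 and 3000 qps/node, so that range is reported as unspecified.

```python
def recommended_cluster(qps_per_node, gb_per_node):
    """Map a service's per-node load to the cluster type named on the slide."""
    if qps_per_node > 3000:
        # Heavy service: real servers, SSD recommended.
        return "real servers"
    if qps_per_node < 1000 and gb_per_node < 3:
        # Light service: virtual servers on OpenStack.
        return "virtual servers on OpenStack"
    # The slide gives no rule for this range.
    return "unspecified"


print(recommended_cluster(5000, 50))
print(recommended_cluster(500, 1))
```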
• PROBLEM : All processes on every front-end server try to connect, at the same time, to a new C* node that has just been added to the cluster ...
• PROBLEM : The authentication of C*, which is based on BCrypt, is heavy processing for the vCPU nodes.
• PROBLEM : Most processes cannot connect to the C* clusters on OpenStack because of the authentication processing; they time out and retry endlessly, without waiting between attempts …
Timeout ! Retry !
All vCPU Usages = 100% !
• SOLUTION : Improve the C* drivers so that they do not all reconnect simultaneously when a connection fails.
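The slides do not say how the drivers were changed. One common way to keep many clients from retrying in lockstep is randomized exponential backoff, sketched below under that assumption. The function names, the `base` and `cap` values, and the `connect` callable are all hypothetical and are not part of any Cassandra driver API.

```python
import random
import time


def backoff_delays(attempts, base=0.5, cap=30.0, rng=random):
    """Return one random wait in seconds per attempt, with a doubling upper bound."""
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * 2 ** attempt)
        # Picking uniformly below the ceiling spreads clients out in time.
        delays.append(rng.uniform(0.0, ceiling))
    return delays


def connect_with_backoff(connect, attempts=5, sleep=time.sleep):
    """Call connect() until it succeeds, waiting a random delay after each failure."""
    last_error = None
    for delay in backoff_delays(attempts):
        try:
            return connect()
        except ConnectionError as error:
            last_error = error
            sleep(delay)
    raise ConnectionError("gave up after %d attempts" % attempts) from last_error
```

Because each process draws its own delay, a newly added node receives authentication requests spread over time instead of all at once, which matters when every BCrypt check is expensive for a vCPU.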
ISSUE #3 : Multi-tenancy – Slow Query
• Small Services : (↓500qps and ↓10GB) / keyspace = Shared C* cluster with real servers
Shared Cluster : 50 Services
• PROBLEM : We could not find which service was issuing the high-load queries in the multi-tenant cluster.
• SOLUTION : CASSANDRA-12403 - Slow query detection
[Diagram: Shared Cluster, Slow Query !, Service Remove, Special Cluster]
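Once slow queries are reported, finding the responsible tenant is a grouping problem. The sketch below is not the CASSANDRA-12403 implementation: it assumes each service owns one keyspace, that slow-query reports have already been parsed into hypothetical `(keyspace, elapsed_ms)` pairs, and that 500 ms is the threshold. The keyspace names are invented.

```python
from collections import Counter


def slow_queries_by_keyspace(records, threshold_ms=500):
    """Count queries slower than threshold_ms per keyspace, worst offender first."""
    counts = Counter()
    for keyspace, elapsed_ms in records:
        if elapsed_ms > threshold_ms:
            counts[keyspace] += 1
    return counts.most_common()


# Hypothetical parsed records: (keyspace, elapsed milliseconds).
records = [("shop", 120), ("news", 900), ("news", 1500), ("mail", 40), ("shop", 700)]
print(slow_queries_by_keyspace(records))
```

The keyspace at the top of the list is the candidate for moving out of the shared cluster.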
ISSUE #4 : Multi-racking – Inbound Params
• PROBLEM : Our C* clusters are built alongside other services in the same rack or under the same core switch.
• PROBLEM : C* streaming occurs when a node is added or removed, either by our operations or by failure detection.
• PROBLEM : C* streaming generates heavy traffic, which disturbs the other services.
Stop C* streaming !
stream_throughput_outbound
• SOLUTION : CASSANDRA-11303 - New inbound throughput parameters for streaming
stream_throughput_outbound
stream_throughput_inbound
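One way to read the diagram is that an outbound limit is applied per sending node, so a node receiving from several senders at once can still see the sum of their rates. The sketch below illustrates that reading with made-up numbers; the function name and the 200 Mbps figures are assumptions, not values from the talk.

```python
def inbound_mbps(senders, outbound_limit_mbps, inbound_limit_mbps=None):
    """Worst-case streaming rate arriving at one node from several senders."""
    total = senders * outbound_limit_mbps
    if inbound_limit_mbps is not None:
        # A receiver-side cap bounds the sum regardless of the sender count.
        total = min(total, inbound_limit_mbps)
    return total


print(inbound_mbps(senders=3, outbound_limit_mbps=200))
print(inbound_mbps(senders=3, outbound_limit_mbps=200, inbound_limit_mbps=200))
```

With only outbound limits, the traffic on the receiver's rack link scales with the number of senders; an inbound limit keeps it fixed.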
Next Generation Infrastructures for C*
OpenStack @ Yahoo! JAPAN
• PURPOSE : To abstract our data center resources using OpenStack.
[Diagram: Apps, Platforms, and Infrastructures layers, connected through APIs]
50,000+ instances
TRIAL #1 : Special Hypervisor for C*
• PROBLEM : Our OpenStack hypervisors host C* VMs together with other services' VMs.
Noisy Neighbours
• SOLUTION : Trying to offer special hypervisors that run only C* VMs.
Optimal Flavors for C* : vCPU : 8+, Mem : 16GiB+, SSD : 100GiB+
Network : 10Gbps x 2
TRIAL #2 : Bare Metal Clusters for C*
• PROBLEM : The vCPUs of OpenStack are too weak to run a C* node in our particular service environment, for example with its many connections.
vCPU : Authentication (BCrypt) is heavy !
• SOLUTION : Trying to offer special bare metal clusters that run only C*, using OpenStack Ironic.
Ironic : Xeon D-1541 2.1GHz (1 CPU), 32GB MEM, SATA SSD 400GB
Network : 10Gbps x 2