Distributed Caching for Beginners
TRANSCRIPT
[email protected] http://charsyam.wordpress.com
Slacker server developer at NHN
Interests!!!
Cloud, Big Data
Specialty!!!
Winging presentations!!!
What is a Cache?
Cache is a component
that transparently stores data so
that future requests for that
data can be served faster.
In Wikipedia
A cache stores results that will be requested later, so they can be served quickly.
Lots of Data ↔ Cache
CPU Cache
Browser Cache
Why Cache?
Use Case: Login
Common Case Read From DB
Select * from Users where id='charsyam';
Typical DB setup
Master
Slave
REPLICATION/FailOver
The Master handles all traffic; the Slave stands by for failover
So the Master gets overloaded!!!
A fork in the road!!!
Scale UP Vs
Scale OUT
Scale UP
1,000 TPS
-> 3,000 TPS
Deploy one server that can handle 3x the load
Scale OUT
1,000 TPS
-> 2,000 TPS
-> 3,000 TPS
We chose Scale Out!!! (If you have the money, Scale Up is fine too.)
Request analysis
70% reads, 30% writes
Conclusion
70% reads, so let's distribute the reads!!!
One Write Master
+ Multi Read Slave
Client
Master
Slave
ONLY WRITE
Slave Slave
Only READ
REPLICATION
Eventual Consistency
Ex) Replication Master
Slave
REPLICATION
Replication can lag because the Slave is under load or for other reasons, but eventually the copies converge.
What about login data that must never diverge, even briefly? => Back to square one!!!
Early on, life was good.
Client
Master
Slave Slave Slave
REPLICATION
Slave Slave
Then the load grew even more!!!
Client
Master
Slave
REPLICATION
Slave Slave Slave Slave Slave
Slave Slave Slave Slave Slave Slave
Performance barely improved!!! WHY?
A machine's I/O is zero-sum: every Slave still replays every write from replication, so each new Slave adds less and less read capacity.
Partitioning
Client
PART 1
Web Server
DBMS
PART 2
Web Server
DBMS
Scalable Partitioning
Partitioning: performance, but also management issues and cost
A simple solution
Use SSDs for the DB servers' disks, and more memory than data
=> Money, money, money!!!
Why Cache?
Use Case: Login
Read From Cache
Get charsyam
Use Case: Login
Apply Cache For Read
General Cache Layer
Cache
DBMS
Storage Layer
Application Server
READ
WRITE
WRITE UPDATE
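The READ/WRITE flow above is the classic look-aside pattern: reads try the cache first and fall back to the DB on a miss; writes go to the DB and then refresh the cache. A minimal sketch, with plain dicts standing in for memcached and MySQL (all names and data are illustrative):

```python
# Plain dicts stand in for the real cache server and DBMS.
db = {"charsyam": {"name": "charsyam", "last_login": "2012-01-01"}}
cache = {}

def read_user(user_id):
    """READ path: try the cache first, fall back to the DB, then populate."""
    if user_id in cache:
        return cache[user_id]      # cache hit: no DB round trip
    row = db.get(user_id)          # cache miss: SELECT ... WHERE id=user_id
    if row is not None:
        cache[user_id] = row       # store for future requests
    return row

def write_user(user_id, row):
    """WRITE path: update the DB first, then update the cache (WRITE UPDATE)."""
    db[user_id] = row
    cache[user_id] = row           # keep the cache in sync
```

The second `read_user("charsyam")` call never touches `db`, which is exactly the point of the cache layer.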
Type 1: 1 Key – N Items
KEY: Profile:User_ID
VALUE: Profile { LastLoginTime, UserName, Host, Name }
Type 2: 1 Key – 1 Item
KEY: name:User_ID -> VALUE: UserName
KEY: LastLoginTime:User_ID -> VALUE: LastLoginTime
Pros And Cons
Type 1: 1 Key – N Items
Pros: Just 1 get.
Cons: if one item changes, the whole value must be rewritten, which requires a cache update and opens a race condition
Pros And Cons
Type 2: 1 Key – 1 Item
Pros: if one item changes, update just that item. Cons: some items can be evicted while others survive
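The two layouts can be sketched as follows, with a dict standing in for the cache (keys and field values are illustrative):

```python
# A dict stands in for the cache server.
cache = {}

# Type 1: 1 key, N items: the whole profile lives under one key.
cache["Profile:charsyam"] = {
    "LastLoginTime": "2012-01-01",
    "UserName": "charsyam",
    "Host": "1.2.3.4",
    "Name": "charsyam",
}

# Type 2: 1 key, 1 item: every field gets its own key.
cache["name:charsyam"] = "charsyam"
cache["LastLoginTime:charsyam"] = "2012-01-01"

# Updating one field under Type 2 touches only that key...
cache["LastLoginTime:charsyam"] = "2012-01-02"

# ...while Type 1 needs a read-modify-write of the whole value,
# which is the race-condition window from the Cons slide.
profile = dict(cache["Profile:charsyam"])
profile["LastLoginTime"] = "2012-01-02"
cache["Profile:charsyam"] = profile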
Data that doesn't change
Data that does change
Divide them
Don't update the cache when you didn't read from the DB first
VALUE: Profile { LastLoginTime, UserName, Host, Name }
EVENT: Update Only LastLoginTime
Read Cache: User Profile
Update Data & DB
Just Save Cache
This creates a race condition
Need Global Lock
=> Skipping that today!!!
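The race the slides warn about is a classic lost update: two writers read the same cached profile, each changes a different field, and the later write-back silently discards the other's change. A small sketch of that interleaving, assuming an unlocked 1-key-N-items value (names are illustrative):

```python
# A sketch of the lost-update race on a 1-key-N-items cached value.
cache = {"Profile:u1": {"LastLoginTime": "t0", "Name": "old"}}

a = dict(cache["Profile:u1"])   # writer A reads the profile
b = dict(cache["Profile:u1"])   # writer B reads the same snapshot

a["LastLoginTime"] = "t1"       # A changes only the login time
b["Name"] = "new"               # B changes only the name

cache["Profile:u1"] = a         # A writes the whole value back
cache["Profile:u1"] = b         # B writes last: A's update is silently lost

# cache["Profile:u1"] is now {"LastLoginTime": "t0", "Name": "new"}
```

Preventing this without a global lock is exactly the part the talk skips.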
How to Test
Using real data
From 25 million user IDs
100,000 requests
Replayed from logs
Test Result
136 vs 1613 seconds
Memcached vs MySQL
Result of Applying Cache
Select Query
CPU utility
Uniform response times!!!
Reduced load!!!
What's Hard?
Key and Value – SYNC: the easy case
1. Update the DB
2. The transaction fails
Fail!!! (nothing was cached, so nothing diverges)
Key and Value – SYNC: the hard case
1. Update the DB
2. The cache update fails!!!
Key and Value – how to handle it
1. Update the DB
2. The cache update fails!!!
RETRY / BATCH / DELETE
Data! The most important thing
If data is not important
Cache Updating is not important
Login Count
Last Login Time
ETC
BUT!
If data is important
Cache Updating is important
Server Address
Data Path
ETC
HOW!
RETRY! Retry, retry, retry solves over 9x% of failures. Or just delete the cache entry!!!
Batch! Queuing Service
Error Log Queue
Error Log
Error Log
Error Log
Error Log
Batch Processor
Cache Server
UPDATE
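The retry-then-batch flow above can be sketched as follows. `flaky_set` is a hypothetical cache client that fails a few times before recovering, and a deque stands in for the queuing service:

```python
from collections import deque

cache = {}
error_log_queue = deque()        # stands in for the queuing service

def set_with_retry(cache_set, key, value, retries=3):
    """RETRY: attempt the cache update a few times; if it keeps failing,
    append the write to the error-log queue for the batch processor."""
    for _ in range(retries):
        try:
            cache_set(key, value)
            return True
        except ConnectionError:
            continue
    error_log_queue.append((key, value))
    return False

def run_batch_processor(cache_set):
    """BATCH: drain the error-log queue and re-apply each failed
    update to the cache server."""
    while error_log_queue:
        key, value = error_log_queue.popleft()
        cache_set(key, value)

# A fake cache client that fails a set number of times, then recovers.
failures = [5]
def flaky_set(key, value):
    if failures[0] > 0:
        failures[0] -= 1
        raise ConnectionError("cache server down")
    cache[key] = value
```

While the cache server is down, `set_with_retry(flaky_set, "k", "v")` gives up and queues the write; once it recovers, `run_batch_processor(flaky_set)` replays the queue and the cache catches up.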
Caches
Memcache
Redis
Memcached
Key:Value
Atomic operations
Single thread
Over 100,000 TPS
Max item size: 1 MB
LRU eviction, expire time
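The atomic operations above are what make memcached safe under concurrent writers; memcached exposes this as `gets`/`cas` (check-and-set, with a version token), and also has a native atomic `incr`. A pure-Python sketch of the idea, with a dict plus a version counter standing in for the server, and an increment modeled as a CAS retry loop (names are illustrative):

```python
# key -> (value, version); the version plays the role of the cas token.
store = {}

def gets(key):
    """Read a value together with its version token."""
    return store.get(key, (None, 0))

def cas(key, value, version):
    """Write only if nobody has modified the key since our read."""
    _, current = store.get(key, (None, 0))
    if current != version:
        return False                   # someone else won the race
    store[key] = (value, current + 1)
    return True

def incr(key):
    """Increment via a CAS retry loop (real memcached has native incr)."""
    while True:
        value, version = gets(key)
        if cas(key, (value or 0) + 1, version):
            return (value or 0) + 1
```

A `cas` with a stale token is rejected, which is how two racing writers avoid the lost-update problem from earlier.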
Redis
Key:Value
Expire time, replication, snapshots
Collections: List, Sorted Set, Hash
Caveats
Item size
1 to XXX KB; the smaller the item, the better.
Cache In
Global Service
Memcached In Facebook
• Facebook, Google, and many other companies
• Facebook – 500 million logins per day (800 million per month) (latest; the figures below are from 2010/04)
– 70 million active users
– growing by 1 million users every 4 days
– 10,000 web servers, 20 million web requests per second
– 805 Memcached servers -> 15 TB, hit rate: 95%
– 1,800 MySQL servers, Master/Slave (900 each)
• Mem: 25 TB, 500,000 SQL queries per second
Cache In Twitter(Old) API WEB
DB DB
Page Cache
Cache In Twitter(NEW) API WEB
DB DB DB
Page Cache
Fragment Cache
Row Cache
Vector Cache
Redis In Wonga
Using Redis for DataStore Write/Read
Redis In Wonga
Flash Client Ruby Backend
NoCache In Wonga
1 million daily Users
200 million daily HTTP Requests
NoCache In Wonga
1 million daily Users
200 million daily HTTP Requests
100,000 DB Operation Per Sec
40,000 DB update Per Sec
NoCache In Wonga
First Scale Out
The first setup lasted 3 months
More Traffic
MySQL hiccups – DB Problems Began
Analysis About DBMS
ActiveRecord's status check caused 20% extra DB load
60% of all updates were done on the 'tiles' table
Applying Sharding 2xD
Result of Applying Sharding 2xD
Doubling MySQL
Result of Doubling MySQL
50,000 TPS on EC2
Wonga chose Redis! But hit some failures
=> Skipping that today!!!
Result!!!
Consistent Hashing
Origin: Proxy -> Server x5
User Request
K = 10,000, N = 5
One server fails: Proxy -> Server x4
User Request
K = 10,000, N = 4
FAIL: redistributes about 2,000 users
The server recovers: Proxy -> Server x5
User Request
K = 10,000, N = 5
RECOVER: redistributes about 2,500 users
Add servers A, B, C to the ring
Add item 1
Add item 2
Add items 3, 4, 5
Server B fails!! (items 2, 3, 4, 5 remain)
Add item 1 again -> allocated to server C
Recover server B -> add item 1
Thank You!
Real Implementation: each server is placed on the ring as multiple virtual nodes (A+1, A+2, A+3, A+4, B+1, B+2, B+3, C+1, C+2, C+3)
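The virtual-node ring above can be sketched as follows. The server names, the MD5 hash, and the replica count are illustrative choices, not the talk's exact implementation:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Consistent hash ring with virtual nodes ("A+1", "A+2", ...,
    as on the slide)."""

    def __init__(self, replicas=100):
        self.replicas = replicas
        self.ring = {}          # hash point on the ring -> server name
        self.sorted_keys = []   # sorted hash points, for bisecting

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, server):
        # Place `replicas` virtual nodes for this server on the ring.
        for i in range(self.replicas):
            point = self._hash("%s+%d" % (server, i + 1))
            self.ring[point] = server
            bisect.insort(self.sorted_keys, point)

    def remove(self, server):
        # Removing a server deletes only its own virtual nodes, so only
        # the keys it owned remap to the next server around the ring.
        for i in range(self.replicas):
            point = self._hash("%s+%d" % (server, i + 1))
            del self.ring[point]
            self.sorted_keys.remove(point)

    def get(self, key):
        # A key belongs to the first virtual node clockwise from its hash.
        if not self.sorted_keys:
            return None
        idx = bisect.bisect(self.sorted_keys, self._hash(key))
        if idx == len(self.sorted_keys):
            idx = 0  # wrap around the ring
        return self.ring[self.sorted_keys[idx]]
```

Removing a server remaps only the keys it owned, which is why the failure on the earlier slide redistributed only about K/N of the users instead of nearly all of them.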