Aerospike & GCE (LSPE Talk)

[email protected]

Uploaded by sayyaparaju-sunil, 12-Jan-2017

TRANSCRIPT

Page 1: Aerospike & GCE (LSPE Talk)

Aerospike & GCE

[email protected]

Page 2: Aerospike & GCE (LSPE Talk)

Database Landscape

Structured data:

TRANSACTIONS (OLTP)
• Response time: seconds
• Gigabytes of data
• Balanced reads/writes

ANALYTICS (OLAP)
• Response time: hours, minutes
• TB to PB
• Compute intensive

Unstructured data:

BIG DATA ANALYTICS
• Response time: seconds
• Terabytes of data
• Read intensive

REAL-TIME BIG DATA
• Real-time transactions
• Response time: < 10 ms
• 1-20 TB
• Balanced reads/writes
• 24x7x365 availability

Page 3: Aerospike & GCE (LSPE Talk)

Minimalistic Architecture

Page 4: Aerospike & GCE (LSPE Talk)

DNA: No Hotspots

• Data distribution
• Node-node communication
• Node-client communication
• Thread level
• CPU level
• Network level
• SSD level
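A rough sketch of the "no hotspots" data distribution: every key is hashed to a fixed-size digest, and the digest picks one of Aerospike's 4096 partitions, which are spread evenly across nodes so no single node becomes hot. Aerospike actually uses RIPEMD-160; `sha1sum` stands in for it here, and the 3-hex-digit truncation is an illustrative simplification.

```shell
# Map a key to one of 4096 partitions via a hash digest.
# sha1sum is a stand-in for Aerospike's RIPEMD-160.
partition_of() {
    d=$(printf '%s' "$1" | sha1sum | cut -c1-3)   # 3 hex digits -> 0..4095
    echo $(( 0x$d ))
}

partition_of "user:42"   # same key always lands on the same partition
```

Because the hash output is uniform, hot keys spread across the cluster instead of piling onto one node.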

Page 5: Aerospike & GCE (LSPE Talk)

GCE Network

• Andromeda - SDN
• VPC: Virtual Private Cloud
  • No multicast though
• KVM-based virtio
• DPDK but no SR-IOV

Page 6: Aerospike & GCE (LSPE Talk)

The Challenge (Oct 2014)

• 1 million write TPS
• Google’s Cassandra benchmark: 300 nodes
  • Median latency = 10.3 ms
  • 95% < 23 ms latency
• Aerospike: 50 nodes
  • Median latency = 7 ms
  • 83% < 16 ms latency & 96.5% < 32 ms
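The headline numbers above imply a large per-node throughput gap, which is easy to check with shell arithmetic:

```shell
# Per-node write throughput implied by the benchmark numbers
cass_per_node=$(( 1000000 / 300 ))   # Cassandra: ~3333 write TPS per node
aero_per_node=$(( 1000000 / 50 ))    # Aerospike: 20000 write TPS per node
echo "$cass_per_node $aero_per_node"
```

Roughly a 6x difference in per-node write throughput, at lower median latency.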

Page 7: Aerospike & GCE (LSPE Talk)

CPU

• Not able to use the CPU fully; it saturates at around 50-60%
• Too much CPU going to system
• Software interrupts are high (up to 30%) under high network load

top - 08:03:52 up 25 min, 2 users, load average: 2.25, 2.12, 1.57
Tasks: 84 total, 1 running, 83 sleeping, 0 stopped, 0 zombie
%Cpu0 : 22.8 us, 26.7 sy, 0.0 ni, 43.9 id, 0.0 wa, 0.0 hi, 6.7 si, 0.0 st
%Cpu1 : 23.5 us, 24.5 sy, 0.0 ni, 44.8 id, 0.0 wa, 0.0 hi, 7.2 si, 0.0 st
%Cpu2 : 24.0 us, 24.7 sy, 0.0 ni, 45.5 id, 0.0 wa, 0.0 hi, 5.7 si, 0.0 st
%Cpu3 : 24.5 us, 23.8 sy, 0.0 ni, 45.5 id, 0.0 wa, 0.0 hi, 6.3 si, 0.0 st
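One way to confirm where the high si% is going is the kernel's per-CPU softirq counters: if NET_RX (the packet-receive softirq) dominates, the time is being spent on network interrupt handling rather than in the application.

```shell
# Per-CPU softirq counters; the NET_RX row is the packet-receive softirq.
# A huge NET_RX count concentrated on a few CPUs matches the si% pattern in top.
awk 'NR==1 || /NET_RX|NET_TX/' /proc/softirqs
```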

Page 8: Aerospike & GCE (LSPE Talk)

Signature of Network Bottleneck (CPU-based, Non-GCE)

top - 12:51:38 up 5:40, 4 users, load average: 2.86, 2.13, 1.15
Tasks: 152 total, 2 running, 150 sleeping, 0 stopped, 0 zombie
Cpu0 : 1.9%us, 4.4%sy, 0.0%ni, 88.7%id, 0.0%wa, 2.5%hi, 0.0%si, 2.5%st
Cpu1 : 5.0%us, 2.5%sy, 0.0%ni, 87.6%id, 0.0%wa, 2.5%hi, 0.0%si, 2.5%st
Cpu2 : 2.5%us, 4.4%sy, 0.0%ni, 88.7%id, 0.0%wa, 2.5%hi, 0.0%si, 1.9%st
Cpu3 : 5.0%us, 2.5%sy, 0.0%ni, 87.0%id, 0.0%wa, 3.1%hi, 0.0%si, 2.5%st
Cpu4 : 1.3%us, 15.3%sy, 0.0%ni, 0.3%id, 0.0%wa, 1.3%hi, 81.7%si, 0.0%st
Cpu5 : 2.0%us, 0.7%sy, 0.0%ni, 92.8%id, 0.0%wa, 2.6%hi, 0.0%si, 2.0%st
Cpu6 : 2.5%us, 3.2%sy, 0.0%ni, 89.8%id, 0.0%wa, 2.5%hi, 0.0%si, 1.9%st
Cpu7 : 0.3%us, 14.8%sy, 0.0%ni, 0.3%id, 0.0%wa, 1.0%hi, 83.6%si, 0.0%st
Cpu8 : 1.2%us, 1.9%sy, 0.0%ni, 92.5%id, 0.0%wa, 1.9%hi, 0.0%si, 2.5%st
Cpu9 : 1.9%us, 1.3%sy, 0.0%ni, 92.9%id, 0.0%wa, 1.9%hi, 0.0%si, 1.9%st
Cpu10 : 1.3%us, 0.7%sy, 0.0%ni, 93.5%id, 0.0%wa, 2.0%hi, 0.0%si, 2.6%st
Cpu11 : 1.3%us, 1.3%sy, 0.0%ni, 94.2%id, 0.0%wa, 1.3%hi, 0.0%si, 1.9%st
Cpu12 : 2.8%us, 4.8%sy, 0.0%ni, 88.3%id, 0.0%wa, 2.1%hi, 0.0%si, 2.1%st
Cpu13 : 0.6%us, 1.3%sy, 0.0%ni, 94.3%id, 0.0%wa, 1.9%hi, 0.0%si, 1.9%st
Cpu14 : 1.9%us, 2.5%sy, 0.0%ni, 91.8%id, 0.0%wa, 1.3%hi, 0.0%si, 2.5%st
Cpu15 : 2.9%us, 3.6%sy, 0.0%ni, 89.8%id, 0.0%wa, 1.5%hi, 0.0%si, 2.2%st
Mem: 30620324k total, 2384264k used, 28236060k free, 27308k buffers
Swap: 0k total, 0k used, 0k free, 190364k cached

Note Cpu4 and Cpu7: 81.7% and 83.6% si while the other 14 cores sit nearly idle. All softirq (packet) processing is landing on two cores; that is the bottleneck signature.

Page 9: Aerospike & GCE (LSPE Talk)

Tricks

• Use standard instances
  • Balances network & CPU
• Use taskset and leave out 1 or 2 cores
• Result
  • Latencies improve
  • Throughput marginally improved
  • Less CPU going to system
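The taskset trick above can be sketched as follows. The daemon name (`asd`) and the exact core split are illustrative; the point is to keep the application off 1-2 cores so the kernel's network softirq work has CPU headroom.

```shell
# Pin the application to all cores except core 0, leaving core 0
# free for interrupt/softirq handling. The core split is illustrative.
NCORES=$(nproc)
APP_CORES="1-$(( NCORES - 1 ))"
echo "would run: taskset -c $APP_CORES asd"

# taskset works on any command; trivial demonstration:
taskset -c 0 echo "pinned to core 0"
```

`taskset -p <pid>` can also retrofit an affinity mask onto an already-running process.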

Page 10: Aerospike & GCE (LSPE Talk)

Network Virtualization

Page 11: Aerospike & GCE (LSPE Talk)

DPDK

Page 12: Aerospike & GCE (LSPE Talk)

3x Improvement (Aug 15)

• 20 Aerospike nodes
  • 1.2M write TPS, 94% < 4 ms latency
  • 4.2M read TPS, 90% < 4 ms latency
• Changes
  • DPDK
  • NIC queue depth: 256 -> 16k
  • ??
• Takeaway
  • Don’t blindly trust top, iostat.
  • Keep pushing till you see a bottleneck (and resolve it if possible)
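The NIC queue-depth change above is the sort of thing done with `ethtool`'s ring-buffer options (`-g` to show, `-G` to set); the interface name and sizes below are illustrative. A tiny parser is included and demonstrated on canned `ethtool -g` output, since the real command needs hardware access.

```shell
# Inspect and raise NIC ring buffer sizes (needs root and a real NIC):
#   ethtool -g eth0                      # show current and pre-set maximums
#   ethtool -G eth0 rx 16384 tx 16384    # raise toward the 16k from the talk

# Helper: extract the *current* RX ring size from `ethtool -g` output.
current_rx() { awk '/^Current hardware settings:/{f=1} f && /^RX:/{print $2; exit}'; }

# Demo on canned output shaped like `ethtool -g`:
sample='Ring parameters for eth0:
Pre-set maximums:
RX: 16384
TX: 16384
Current hardware settings:
RX: 256
TX: 256'

printf '%s\n' "$sample" | current_rx   # prints the pre-tuning default: 256
```

A ring of 256 descriptors overflows quickly at millions of packets per second; drops at the ring look like a mysterious throughput ceiling rather than an error in `top` or `iostat`.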

Page 13: Aerospike & GCE (LSPE Talk)

Live Migrations

Page 14: Aerospike & GCE (LSPE Talk)

Live Migrations : Implications

• Blackout period depends on workload
  • The higher the memory dirty rate, the longer the blackout
  • Timeouts in application code will get triggered
• Affects clustering-based solutions (Aerospike, …)
  • Missing heartbeats
• Clock times of VMs jump
  • Implications for any code tightly dependent on clock time

Page 15: Aerospike & GCE (LSPE Talk)

Live Migrations : Handling

• Write better code
• Scheduling policy offered by GCE
  • onHostMaintenance: Migrate/Terminate
  • automaticRestart: True/False
• Since June 2016: live migrate notification
  • 60 seconds’ prior intimation
  • Via the metadata server
  • Values: MIGRATE_ON_HOST_MAINTENANCE, SHUTDOWN_ON_HOST_MAINTENANCE
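A sketch of consuming that notification. The metadata endpoint is GCE's `instance/maintenance-event` path; the reactions inside `handle_event` are illustrative placeholders for whatever drain/quiesce logic an application needs.

```shell
# React to a GCE live-migration notification from the metadata server.
# In production the poll would block on the metadata endpoint:
#   curl -s -H 'Metadata-Flavor: Google' \
#     'http://metadata.google.internal/computeMetadata/v1/instance/maintenance-event?wait_for_change=true'
# and feed the returned value into a handler like this:
handle_event() {
    case "$1" in
        MIGRATE_ON_HOST_MAINTENANCE)  echo "drain: pause writes, let cluster quiesce" ;;
        SHUTDOWN_ON_HOST_MAINTENANCE) echo "shutdown: remove node from cluster cleanly" ;;
        *)                            echo "no maintenance pending" ;;
    esac
}

handle_event "MIGRATE_ON_HOST_MAINTENANCE"
```

With 60 seconds' notice, a clustered database can stop accepting writes on the affected node before the blackout, instead of discovering the migration via missed heartbeats.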

Page 16: Aerospike & GCE (LSPE Talk)

Local SSDs

• Similar to ephemeral SSDs in AWS (non-persistent)
• Not to be confused with persistent SSD (network attached)
• Good cost alternative to RAM
• Can be attached to any instance type
• Spec
  • NVMe / SCSI options
  • Available in chunks of 375 GB
  • ~1 ms latency
  • 680K read IOPS
  • 360K write IOPS

Page 17: Aerospike & GCE (LSPE Talk)

Aerospike benchmark of Local SSD

• Summary: they are pretty good

Page 18: Aerospike & GCE (LSPE Talk)

Local SSD with Aerospike

• Use shadow device configuration in Aerospike
  • All reads are from local SSD
  • All writes (buffered) go to both local SSD & persistent HDD/SSD (network attached)
• Bcache is no longer recommended by Aerospike
  • Saw some kernel-level implementation bugs
  • Saw drive lockups in rare occurrences
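A minimal sketch of the shadow-device setup in `aerospike.conf`: in the `storage-engine device` block, a line naming two devices makes the second the shadow, so writes land on both while reads are served from the first. Device names and the namespace name here are illustrative.

```
namespace test {
    storage-engine device {
        # primary (local SSD) followed by its shadow (network-attached
        # persistent disk); device paths are illustrative
        device /dev/nvme0n1 /dev/sdb
    }
}
```

If the instance is terminated and the local SSD is lost, the node can be rebuilt from the shadow device, giving persistence at local-SSD read latency.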

[Diagram: Aerospike writing to both the local SSD and network-attached storage]

Page 19: Aerospike & GCE (LSPE Talk)

Work in progress

• Aerospike on Docker in GCE

Page 20: Aerospike & GCE (LSPE Talk)

Thank You