target: performance tuning cassandra at target
TRANSCRIPT
1 about Danny
2 Target’s C* Background
3 performance tuning
4 miscellaneous
5 Q&A
2© 2015. All Rights Reserved.
Danny Parker @dcparker88
• engineer at Target
• API Team
• administrate our team’s c* rings
• mostly ops side of DevOps
Target’s API Team
• internal and external consumers
• full-stack ownership
• great deal of autonomy
• example APIs: products, inventory, location
C* and the API Team
• first team at Target to implement c*
• operational data store
• made many mistakes
• learned a great deal
Setting the Stage
• open source 2.1.x
• multiple clusters
• largest/first cluster:
  • 24 nodes
  • ~8k reads/sec, ~6k writes/sec
  • locations, products
Our Traffic Patterns
• follows shopping activity of guests
• more than 10x during peak (holiday season)
• affected by sales, limited releases
• products is the largest
General Tuning Tips
• change one thing at a time and test
• you absolutely need:
  • metrics (system and c*)
  • logs
  • jvm access (jstat is good)
  • nodetool
  • production-like traffic
• go through many iterations
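One way to act on the "jvm access (jstat is good)" point is to turn `jstat -gcutil` output into numbers that can be compared between tuning iterations. The following is a minimal sketch, assuming the usual header-plus-rows text layout; the function names are hypothetical and the sample values are made up for illustration, not taken from the talk.

```python
def parse_jstat_gcutil(text):
    """Parse `jstat -gcutil` output into a list of dicts keyed by column name."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    samples = []
    for row in rows:
        if len(row) != len(header):
            continue  # skip truncated or malformed lines
        samples.append({name: float(value) for name, value in zip(header, row)})
    return samples


def young_gc_rate(samples, interval_seconds):
    """Young-generation collections per hour between first and last sample."""
    if len(samples) < 2:
        return 0.0
    elapsed = interval_seconds * (len(samples) - 1)
    delta = samples[-1]["YGC"] - samples[0]["YGC"]
    return delta * 3600.0 / elapsed


# Illustrative sample only (made-up numbers, one-second sampling interval).
SAMPLE = """
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
  0.00  42.10  63.50  30.20  97.10  93.00   1200   36.000     2    0.800   36.800
 40.00   0.00  12.00  30.40  97.10  93.00   1205   36.150     2    0.800   36.950
"""

if __name__ == "__main__":
    parsed = parse_jstat_gcutil(SAMPLE)
    print(young_gc_rate(parsed, interval_seconds=1))
```

Because the parser is driven by the header row, it should cope with the different column sets printed by different JVM versions, as long as every value is numeric.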
Cassandra Issues
• too many tombstones
• compaction causing high load
• cpu-bound servers
• long gc times
• many SSTables per read
Cassandra configuration changes
• recommended settings
• offheap_objects (memtables)
• concurrent_reads/concurrent_writes
• memtable_flush_writers
• JSON logs with logback
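The slide does not give the values that were chosen. As a starting point, the comments in the stock `cassandra.yaml` suggest `concurrent_reads` of 16 per data drive and `concurrent_writes` of 8 per core. A minimal sketch of that arithmetic follows, together with the off-heap memtable setting named above; the function names and the example hardware are assumptions, and `memtable_flush_writers` is left out because the slide gives no number for it.

```python
def suggested_settings(num_data_drives, num_cores):
    """Starting-point values based on the guidance in cassandra.yaml comments."""
    if num_data_drives < 1 or num_cores < 1:
        raise ValueError("drive and core counts must be positive")
    return {
        "concurrent_reads": 16 * num_data_drives,
        "concurrent_writes": 8 * num_cores,
        # Cassandra 2.1 setting that moves memtable cell data off the heap.
        "memtable_allocation_type": "offheap_objects",
    }


def render_yaml_fragment(settings):
    """Render the settings as lines that could be merged into cassandra.yaml."""
    return "\n".join(f"{key}: {value}" for key, value in settings.items())


if __name__ == "__main__":
    # Hypothetical node: 2 data drives, 16 cores.
    print(render_yaml_fragment(suggested_settings(2, 16)))
```

These are only initial values; in the spirit of the deck, change one of them at a time and measure under production-like traffic.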
Schema Changes
• change to soft deletes
• secondary indices (boo)
• leveled vs size tiered compaction
• tombstone_threshold
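To make the schema bullets concrete, here is a small sketch that builds the CQL involved: an `ALTER TABLE` that switches compaction strategy and sets `tombstone_threshold`, and an `UPDATE` that marks a row deleted instead of issuing a `DELETE` (which would write a tombstone). The keyspace, table, key, and `deleted` column names are hypothetical, and the threshold value is a placeholder rather than the one used in the talk.

```python
def compaction_cql(keyspace, table,
                   strategy="LeveledCompactionStrategy",
                   tombstone_threshold=None):
    """Build an ALTER TABLE statement that sets the compaction options."""
    options = {"class": strategy}
    if tombstone_threshold is not None:
        options["tombstone_threshold"] = tombstone_threshold
    rendered = ", ".join(f"'{key}': '{value}'" for key, value in options.items())
    return f"ALTER TABLE {keyspace}.{table} WITH compaction = {{{rendered}}};"


def soft_delete_cql(keyspace, table, key_column, flag_column="deleted"):
    """Build a parameterized UPDATE that flags a row instead of deleting it."""
    return (f"UPDATE {keyspace}.{table} SET {flag_column} = true "
            f"WHERE {key_column} = ?;")


if __name__ == "__main__":
    print(compaction_cql("api", "products", tombstone_threshold=0.1))
    print(soft_delete_cql("api", "products", "product_id"))
```

With soft deletes, readers have to filter out flagged rows themselves, so this trades tombstone pressure for a little application logic.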
JVM Changes
• tenuring threshold...use it!
• we have a larger heap
  • eden: 10g
  • total: 20g
• parallel gc threads
• cassandra-8150
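The heap sizes above map onto standard HotSpot flags roughly as sketched below. This is an approximation under stated assumptions: `-Xmn` sizes the whole young generation (eden plus survivor spaces), so using it for "eden: 10g" is not exact, and the tenuring threshold and GC thread count are placeholder values because the slide does not state them. The ParNew/CMS collector flags are assumed from the later mention of ParNew.

```python
def jvm_gc_flags(heap_gb=20, young_gb=10,
                 max_tenuring_threshold=4, parallel_gc_threads=8):
    """Build a list of JVM options for a ParNew + CMS heap layout."""
    if young_gb >= heap_gb:
        raise ValueError("young generation must be smaller than the total heap")
    return [
        f"-Xms{heap_gb}G",            # fixed heap size: min == max
        f"-Xmx{heap_gb}G",
        f"-Xmn{young_gb}G",           # young generation (eden + survivors)
        "-XX:+UseParNewGC",
        "-XX:+UseConcMarkSweepGC",
        f"-XX:MaxTenuringThreshold={max_tenuring_threshold}",  # placeholder value
        f"-XX:ParallelGCThreads={parallel_gc_threads}",        # placeholder value
    ]


if __name__ == "__main__":
    print(" ".join(jvm_gc_flags()))
```

In a Cassandra 2.1 install these options would normally be set in `cassandra-env.sh`; test each change against production-like traffic before keeping it.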
Client Changes
• datastax driver
• use dc and token aware
• make sure the cassandra connection settings are sane
• trim down # of columns
• pagination
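The last two bullets can be illustrated without tying the example to a particular driver. The sketch below names the columns explicitly instead of selecting everything, and consumes results one page at a time. `fetch_page` is a hypothetical callable standing in for whatever paging mechanism the driver provides, and the in-memory fetcher exists only to exercise the loop.

```python
def select_columns_cql(keyspace, table, columns):
    """Build a SELECT that names only the columns the caller needs."""
    if not columns:
        raise ValueError("name the columns explicitly instead of selecting all")
    return f"SELECT {', '.join(columns)} FROM {keyspace}.{table}"


def iter_rows(fetch_page):
    """Yield rows page by page.

    fetch_page(state) must return (rows, next_state); next_state is None
    once the last page has been returned.
    """
    state = None
    while True:
        rows, state = fetch_page(state)
        for row in rows:
            yield row
        if state is None:
            break


def make_fake_fetcher(data, page_size):
    """In-memory stand-in for a paged query, for illustration only."""
    def fetch_page(state):
        start = state or 0
        end = start + page_size
        next_state = end if end < len(data) else None
        return data[start:end], next_state
    return fetch_page


if __name__ == "__main__":
    print(select_columns_cql("api", "products", ["product_id", "title"]))
    print(list(iter_rows(make_fake_fetcher([1, 2, 3, 4, 5], page_size=2))))
```

Keeping pages small bounds how much each request holds in memory on both the client and the coordinator node.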
General Tips
• bump compaction throughput overnight
• compaction throttling
• manage your repairs!
  • use some sort of job manager
  • make sure to get logs/stats from any process
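"Bump compaction throughput overnight" can be scripted around `nodetool setcompactionthroughput`, which changes the throttle at runtime. The sketch below only picks a value by hour of day and builds the command; the overnight window and the overnight value are assumptions, and 16 MB/s is the stock `cassandra.yaml` default rather than a figure from the talk.

```python
def compaction_throughput_for(hour, day_mb_s=16, night_mb_s=64,
                              night_start=22, night_end=6):
    """Return the compaction throttle (MB/s) to use at a given hour (0-23)."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    is_night = hour >= night_start or hour < night_end
    return night_mb_s if is_night else day_mb_s


def nodetool_command(hour, host="127.0.0.1"):
    """Build the nodetool invocation for the throttle that applies at `hour`."""
    value = compaction_throughput_for(hour)
    return ["nodetool", "-h", host, "setcompactionthroughput", str(value)]


if __name__ == "__main__":
    # A job manager could run this hourly and pass the list to subprocess.run,
    # capturing stdout/stderr so the logs are kept.
    print(" ".join(nodetool_command(hour=23)))
```

Running it from the same job manager that schedules repairs keeps the logs and exit codes in one place.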
Current State
• ParNew much faster
  • 5x increase in speed
• fewer GCs
  • 15k/hr -> 3k/hr
• much more stable
  • c* no longer the bottleneck
What We’re Working Toward
• track all inventory in c* (high volume)
• transition c* to become system of record
• utilize the public cloud
  • spanned cluster (dc and cloud)
  • specific clusters for workloads
Miscellaneous
• http://target.github.io/
• https://github.com/target/dse-cookbook