Overprov: A Tool for Cluster Overprovisioning Detection Del Bao

Post on 16-Apr-2017


Page 1: Overprov  a tool for cluster overprovisioning detection

Overprov: A Tool for Cluster Overprovisioning Detection

Del Bao

Page 2: Overprov  a tool for cluster overprovisioning detection

Problem

ad_backend cpu.idle uswest2-prod

Page 3: Overprov  a tool for cluster overprovisioning detection

Problem (2)

bizfeed oldgen gc count a day

Page 4: Overprov  a tool for cluster overprovisioning detection

Problem (3)

generic cassandra byte_percentfree

Page 5: Overprov  a tool for cluster overprovisioning detection

What does the tool do?

Page 6: Overprov  a tool for cluster overprovisioning detection

Design Goals

• save cost in the long run

• based on simple rules

• eliminate false positives

• extensible

Page 7: Overprov  a tool for cluster overprovisioning detection

Code Structure

● run()

    for cluster_name in clusters:
        dt = detector.ClusterOverprovDetector(
            product,
            ecosystem,
            cluster_name,
            metric_list,
            start,
            stop,
            signalfx_auth_token,
        )
        dt.execute()

● metric_list

    metric_list_cass = [
        ModuleClass('overprov.analyzers.cpu_idle_analyzer', 'CpuIdleAnalyzer'),
        ModuleClass('overprov.analyzers.cass_gc_count_analyzer', 'CassGcCountAnalyzer'),
        ModuleClass('overprov.analyzers.cass_disk_free_analyzer', 'CassDiskFreeAnalyzer'),
    ]
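The metric list pairs a module path with a class name, which suggests the analyzers are imported by name at run time. The slides do not show how ModuleClass is defined or how the detector resolves it, so the following is only a minimal sketch of that pattern. The namedtuple field names and the load_class and load_classes helpers are assumptions, and standard-library classes stand in for the real analyzers so the sketch runs without the overprov package.

```python
import importlib
from collections import namedtuple

# Hypothetical shape: the slides show ModuleClass(module_path, class_name)
# but not its definition, so these field names are assumptions.
ModuleClass = namedtuple('ModuleClass', ['module', 'class_name'])


def load_class(entry):
    """Import entry.module and return the attribute named entry.class_name."""
    module = importlib.import_module(entry.module)
    return getattr(module, entry.class_name)


def load_classes(entries):
    """Resolve every ModuleClass entry in a metric list to its class object."""
    return [load_class(entry) for entry in entries]


# Standard-library stand-ins for the analyzer modules.
demo_list = [
    ModuleClass('collections', 'OrderedDict'),
    ModuleClass('fractions', 'Fraction'),
]
loaded = load_classes(demo_list)
```

With this approach a new metric only needs a new entry in the list; the loop that runs the analyzers does not change.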

Page 8: Overprov  a tool for cluster overprovisioning detection

You can extend it

• create your own analyzer

• pass in your own start and stop days
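The slides do not show the interface an analyzer must implement, so the following is a purely illustrative sketch of what a custom analyzer could look like. BaseAnalyzer, is_overprovisioned, the memory.free_percent metric name and the 60 percent floor are all hypothetical and are not taken from the overprov code.

```python
class BaseAnalyzer:
    """Hypothetical base class; the real overprov interface is not shown in the slides."""

    metric_name = None

    def is_overprovisioned(self, values):
        raise NotImplementedError


class MemFreeAnalyzer(BaseAnalyzer):
    """Illustrative analyzer: flags a cluster whose free memory never drops to the floor."""

    metric_name = 'memory.free_percent'  # made-up metric name

    def __init__(self, floor_percent=60.0):
        self.floor_percent = floor_percent

    def is_overprovisioned(self, values):
        # An empty series is treated as "no evidence", not as overprovisioned.
        if not values:
            return False
        return min(values) > self.floor_percent
```

Under these assumptions, MemFreeAnalyzer().is_overprovisioned([72.0, 65.5, 80.1]) returns True, because every value stays above the 60 percent floor.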

Page 9: Overprov  a tool for cluster overprovisioning detection

Assumptions

• it is a static check, so daily/hourly resolution (e.g., p95) is fine

• the cluster is roughly well balanced, so taking the max/min across the cluster's hosts in a region represents the entire cluster
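The two assumptions above can be combined into a simple rule: reduce the per-host series to one cluster-level value per day, then compare it with a threshold. The sketch below assumes the per-host daily rollups (for example a daily p95) are already computed. The function names, the 80 percent threshold and the sample numbers are made up for illustration and do not come from the tool.

```python
def cluster_series(per_host_daily, reducer=min):
    """Collapse {host: [one value per day]} into one value per day across hosts."""
    return [reducer(day_values) for day_values in zip(*per_host_daily.values())]


def idle_stays_high(per_host_daily, threshold=80.0):
    """True when the least idle host is above the threshold on every day."""
    series = cluster_series(per_host_daily, reducer=min)
    # No data means no evidence of overprovisioning.
    return bool(series) and all(value > threshold for value in series)


# Made-up numbers: daily cpu.idle rollups (percent) for three hosts over four days.
sample = {
    'host-a': [91.0, 89.5, 92.3, 90.1],
    'host-b': [88.2, 90.0, 91.7, 89.9],
    'host-c': [90.4, 87.6, 93.0, 91.2],
}
```

Taking the minimum for cpu.idle picks the busiest host each day, so a balanced cluster is flagged only when even that host is mostly idle. For a metric where high values indicate load, passing reducer=max plays the same role.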

Page 10: Overprov  a tool for cluster overprovisioning detection

What it’s Not

• Fleetmiser
  – instantaneous autoscaling of the spot fleet for seagull clusters
  – a signal at a 10 min interval

• Paasta
  – similar to above, only for paasta services

Page 11: Overprov  a tool for cluster overprovisioning detection

Demo

• virtualenv_run/bin/overprov -p cassandra -c ad_backend --start 60 --stop 30 -e prod -k ./api_token

• virtualenv_run/bin/overprov -p cassandra -c ad_backend --start 60 --stop 30 -e prod -k ./api_token --debug

• virtualenv_run/bin/overprov -p elasticsearch -c ads144 --start 60 --stop 30 -e prod -k ./api_token
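The flags used in the demo commands could be parsed with argparse along the following lines. This is a sketch, not the tool's actual entry point: only the flag spellings come from the slides, while the dest names, the integer type for --start and --stop, and marking options as required are assumptions.

```python
import argparse


def build_parser():
    """Parser mirroring the flags seen in the demo commands (dest names are assumed)."""
    parser = argparse.ArgumentParser(prog='overprov')
    parser.add_argument('-p', dest='product', required=True)
    parser.add_argument('-c', dest='cluster', required=True)
    parser.add_argument('--start', type=int, required=True)
    parser.add_argument('--stop', type=int, required=True)
    parser.add_argument('-e', dest='ecosystem', required=True)
    parser.add_argument('-k', dest='token_file', required=True)
    parser.add_argument('--debug', action='store_true')
    return parser


# Parse the first demo command line instead of sys.argv so the sketch is self-contained.
args = build_parser().parse_args(
    '-p cassandra -c ad_backend --start 60 --stop 30 -e prod -k ./api_token'.split()
)
```

Here args.product is 'cassandra', args.start is 60 and args.debug is False; adding --debug to the command line sets it to True.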

Page 12: Overprov  a tool for cluster overprovisioning detection

Questions