© 2015 NTT Software Innovation Center
Masakari: Virtual Machine-HA for OpenStack
27/Oct/2015 Masahito Muroi, NTT
Copyright©2015 NTT corp. All Rights Reserved.
What’s Masakari
• In general context
  • まさかり (Masakari) is a Japanese word for an “axe” or a “hatchet”
  • Used for cutting down trees, not as a weapon
  • Trademark of 金太郎 (KINTARO)
    • Kintaro is the name of a Japanese fairy tale and of its main character
• In engineering context
  • “まさかりを投げる (masakari wo nageru)”
  • Roughly translated: “throwing a Masakari”
  • Meaning: “to point out a mistake in conferences or presentations”
• In OpenStack context
  • Virtual Machine High Availability (VM-HA) service
  • Rescues virtual machines when errors occur
  • Published as OSS on GitHub: https://github.com/ntt-sic/masakari
Copyright © いらすとや. All Rights Reserved.
Motivations
• Pets vs Cattle
• Unable to change all Apps to Cloud Native at once
• Open Source
Requirements for Pets Model
• Detect 3 types of VM down
  • Unexpected VM down
  • VM manager down
  • Host down
• Recover VM within 10 mins
• Work automatically
Architecture Overview
[Figure: architecture diagram; labels: Compute Nodes, Controller Nodes & Backend Nodes]
How to detect the 3 types of down
• VM down
  • Monitoring libvirt’s events
• Manager process down
  • Monitoring the manager process
• Host down
  • Using Pacemaker
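The three failure types and how each is handled can be summarized as a small lookup table. This is only an illustrative sketch: the names `FailureType`, `Handling`, `HANDLING`, and `describe` are hypothetical and not taken from Masakari's code; the detection mechanisms and recovery actions are the ones named on the slides.

```python
from dataclasses import dataclass
from enum import Enum


class FailureType(Enum):
    """The three kinds of "down" the slides say must be detected."""
    VM_DOWN = "vm_down"
    PROCESS_DOWN = "process_down"
    HOST_DOWN = "host_down"


@dataclass(frozen=True)
class Handling:
    detected_by: str  # mechanism named on the slides
    recovery: str     # recovery action named on the slides


# Hypothetical table; wording summarizes the slides, not Masakari's source.
HANDLING = {
    FailureType.VM_DOWN: Handling(
        "libvirt event monitoring", "rebuild the VM"),
    FailureType.PROCESS_DOWN: Handling(
        "manager process monitoring",
        "restart the process; if restarts fail, disable scheduling to the host"),
    FailureType.HOST_DOWN: Handling(
        "Pacemaker", "evacuate all VMs on the host"),
}


def describe(failure: FailureType) -> str:
    """Return a one-line summary for a failure type."""
    h = HANDLING[failure]
    return f"{failure.value}: detected by {h.detected_by}; recovery: {h.recovery}"


for failure in FailureType:
    print(describe(failure))
```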
Detect VM Down
[Figure: two hosts, each running Libvirt and a Libvirt Monitor that detects VM down; VM1, VM2, VM3 and VM5, VM6 run on the hosts, and one VM is marked Down; Masakari and Nova are shown outside the hosts]
1. Notify the down VM’s info (VM-ID, host name, etc.)
2. Call the Rebuild API for the down VM
3. Rebuild the VM
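The notify-then-rebuild flow on this slide might look roughly like the sketch below. Everything here is an assumption made for illustration: `VMDownNotification`, `FakeNova`, and `handle_vm_down` are hypothetical names, and `FakeNova.rebuild` merely records the call rather than invoking the real Nova Rebuild API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VMDownNotification:
    """Step 1: the info sent about a down VM (fields named on the slide)."""
    vm_id: str
    host_name: str


class FakeNova:
    """Hypothetical stand-in for a Nova client; records calls only."""

    def __init__(self):
        self.rebuilt = []

    def rebuild(self, vm_id):
        # Step 3 would happen inside Nova; here we just remember the request.
        self.rebuilt.append(vm_id)


def handle_vm_down(notification, nova):
    """Step 2: request a rebuild of the VM named in the notification."""
    nova.rebuild(notification.vm_id)
    return notification.vm_id


nova = FakeNova()
handle_vm_down(VMDownNotification(vm_id="vm-1", host_name="host-a"), nova)
print(nova.rebuilt)  # ['vm-1']
```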
Manager Process Down
[Figure: Host A and Host B each run Libvirt, Nova-compute, and a Process Monitor; one process is marked Down; Masakari and Nova are shown outside the hosts]
1. Restart the manager process when it is down
2. Notify that the manager process is down if restarting fails a few times
3. Notify Nova to disable scheduling for Host A
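The restart-then-notify behaviour of the Process Monitor can be sketched as a bounded retry loop. This is a minimal sketch under assumptions: `monitor_process`, `FakeProcess`, and the default of 3 restart attempts are hypothetical (the slide only says "few times"), and the `notify` callback stands in for the notification that ultimately leads to scheduling being disabled for the host.

```python
def monitor_process(is_alive, restart, notify, max_restarts=3):
    """Check a process once; restart it if down, notify if restarts fail.

    Returns "running", "restarted", or "notified".
    """
    if is_alive():
        return "running"
    for _ in range(max_restarts):
        restart()
        if is_alive():
            return "restarted"
    # Every restart attempt failed: report the process as down.
    notify()
    return "notified"


class FakeProcess:
    """Hypothetical process that recovers after N restarts, or never."""

    def __init__(self, recovers_after=None):
        self.recovers_after = recovers_after  # None means it never recovers
        self.restart_calls = 0

    def is_alive(self):
        if self.recovers_after is None:
            return False
        return self.restart_calls >= self.recovers_after

    def restart(self):
        self.restart_calls += 1


notified_hosts = []
flaky = FakeProcess(recovers_after=2)
dead = FakeProcess(recovers_after=None)

print(monitor_process(flaky.is_alive, flaky.restart,
                      lambda: notified_hosts.append("host-a")))  # restarted
print(monitor_process(dead.is_alive, dead.restart,
                      lambda: notified_hosts.append("host-a")))  # notified
print(notified_hosts)  # ['host-a']
```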
Host Down
[Figure: Host A and Host B each run pacemaker (CIB, node’s status, and RAs with Start/Stop/Monitor operations), a WatchDog & Shutdowner, and a Host Fail Monitor that polls to check its host’s status; the hosts exchange heartbeat communications; one host is marked Down; Masakari and Nova are shown outside the hosts]
1. Notify that another host is down
2. Call the Evacuate API for all VMs on Host B
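Step 2 amounts to looping over every VM on the failed host and requesting an evacuation for each. The sketch below illustrates that loop only; `FakeCompute`, `handle_host_down`, and the choice of a reserved host as the evacuation target are assumptions for illustration, and no real Nova Evacuate API call is made.

```python
class FakeCompute:
    """Hypothetical stand-in for Nova; tracks which VMs run on which host."""

    def __init__(self, vms_by_host):
        self.vms_by_host = {h: list(v) for h, v in vms_by_host.items()}

    def list_vms(self, host):
        # Return a copy so callers can iterate while VMs are being moved.
        return list(self.vms_by_host.get(host, []))

    def evacuate(self, vm_id, source, target):
        self.vms_by_host[source].remove(vm_id)
        self.vms_by_host.setdefault(target, []).append(vm_id)


def handle_host_down(failed_host, reserved_host, compute):
    """Request evacuation of every VM on the failed host; return moved IDs."""
    moved = []
    for vm_id in compute.list_vms(failed_host):
        compute.evacuate(vm_id, failed_host, reserved_host)
        moved.append(vm_id)
    return moved


compute = FakeCompute({"host-a": ["vm-1"], "host-b": ["vm-5", "vm-6"]})
moved = handle_host_down("host-b", "reserved-1", compute)
print(moved)                           # ['vm-5', 'vm-6']
print(compute.list_vms("reserved-1"))  # ['vm-5', 'vm-6']
```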
How to use Masakari
1. Prerequisites
  • Set up Nova and compute nodes with KVM
  • Set up shared storage per cluster for ephemeral disks (e.g. NFS)
2. Install and configure Masakari
  • Download the source from GitHub
    • https://github.com/ntt-sic/masakari
  • Install Masakari’s package
  • Initialize Masakari’s DB
  • Configure Masakari’s 4 config files
3. Start Masakari
  • Start all processes
  • Add a reserved host prepared for host down
4. Wait for an error
  • Masakari only acts when an error occurs
Challenges
• No branch from OpenStack master
Other sessions related to Masakari
• Korejanai Story: How To Integrate OpenStack Into Your Business Strategy (http://sched.co/49wG)
GitHub: https://github.com/ntt-sic/masakari
Mail: [email protected]
Place: S14 NTT Group