Replication (online)
TRANSCRIPT, Summer 2012
MongoDB: open source, high performance database
Why Have Replication?
Use cases
• High Availability (auto-failover)
• Read Scaling (extra copies to read from)
• Backups
  – Online, delayed copy (protects against "fat finger" mistakes)
  – Point-in-Time (PiT) backups
• Use a (hidden) replica for secondary workloads (a configuration sketch follows this list)
  – Analytics
  – Data processing
  – Integration with external systems
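As a rough sketch of the hidden-replica idea, a member can be added to the replica set configuration with hidden: true and priority: 0, so ordinary clients never read from it and it is free to serve analytics. This is illustrative only; the hostnames, member _id, and set name are hypothetical, written in Python with pymongo.

    # Hypothetical sketch (Python / pymongo): add a hidden, priority-0 member
    # that can absorb an analytics workload without serving normal client reads.
    from pymongo import MongoClient

    client = MongoClient("db1.example.net", 27017)  # assume this reaches the primary

    config = client.admin.command("replSetGetConfig")["config"]
    config["version"] += 1  # every reconfig must bump the config version
    config["members"].append({
        "_id": 3,
        "host": "analytics.example.net:27017",
        "priority": 0,   # never eligible to become primary
        "hidden": True,  # invisible to clients' read preferences
    })
    client.admin.command("replSetReconfig", config)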
Types of outage
• Planned
  – Hardware upgrade
  – O/S or file-system tuning
  – Relocation of data to new file-system / storage
  – Software upgrade
• Unplanned
  – Hardware failure
  – Data center failure
  – Region outage
  – Human error
  – Application corruption
Replica Set features
• A cluster of N servers
• All writes go to the primary
• Reads go to the primary (default) or, optionally, to a secondary (see the connection sketch below)
• Any (one) node can be primary
• Consensus election of the primary
• Automatic failover
• Automatic recovery
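To make the read routing concrete, here is a minimal connection sketch in Python with pymongo; the hostnames and the set name rs0 are illustrative, not from the slides.

    # Hypothetical sketch (Python / pymongo): the driver discovers the members,
    # sends writes to the primary, and follows automatic failovers.
    from pymongo import MongoClient, ReadPreference

    client = MongoClient(
        "mongodb://db1.example.net,db2.example.net,db3.example.net/?replicaSet=rs0"
    )

    # Reads go to the primary by default; opt in to secondary reads per database.
    db = client.get_database("test", read_preference=ReadPreference.SECONDARY_PREFERRED)
    db.things.insert_one({"x": 1})      # writes always go to the primary
    doc = db.things.find_one({"x": 1})  # this read may be served by a secondary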
How MongoDB Replication works
• A Replica Set is made up of 2 or more nodes
[Diagram: three nodes, Member 1, Member 2, and Member 3]
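A hypothetical initiation sketch in Python with pymongo; the hostnames and the set name rs0 are illustrative, and each mongod is assumed to have been started with the matching --replSet rs0 option.

    # Hypothetical sketch (Python / pymongo): initiate a three-member replica
    # set by sending replSetInitiate to one of the members.
    from pymongo import MongoClient

    member1 = MongoClient("db1.example.net", 27017, directConnection=True)
    member1.admin.command("replSetInitiate", {
        "_id": "rs0",
        "members": [
            {"_id": 0, "host": "db1.example.net:27017"},
            {"_id": 1, "host": "db2.example.net:27017"},
            {"_id": 2, "host": "db3.example.net:27017"},
        ],
    })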
How MongoDB Replication works
• An election establishes the PRIMARY
• Data replicates from the PRIMARY to the SECONDARY members
[Diagram: Member 2 is PRIMARY; Members 1 and 3 replicate from it]
How MongoDB Replication works
• The PRIMARY may fail
• A new PRIMARY is elected automatically if a majority still exists
[Diagram: Member 2 is DOWN; Members 1 and 3 negotiate a new master]
How MongoDB Replication works
• New PRIMARY elected
• Replica Set re-established
[Diagram: Member 2 is DOWN; Member 3 has been elected PRIMARY]
How MongoDB Replication works
• Automatic recovery
[Diagram: Member 2 is RECOVERING; Member 3 remains PRIMARY alongside Member 1]
How MongoDB Replication works
• Replica Set re-established
[Diagram: Member 2 is back as a SECONDARY; Member 3 remains PRIMARY]
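One way to watch this sequence from a client, as a hypothetical sketch in Python with pymongo (hostname illustrative):

    # Hypothetical sketch: replSetGetStatus reports each member's state, so you
    # can watch PRIMARY / SECONDARY / RECOVERING transitions during a failover.
    from pymongo import MongoClient

    client = MongoClient("db1.example.net", 27017)
    status = client.admin.command("replSetGetStatus")
    for member in status["members"]:
        print(member["name"], member["stateStr"])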
Understanding automatic failover
Primary Election
[Diagram: a three-member replica set, one PRIMARY and two SECONDARY nodes]
As long as a partition can see a majority (>50%) of the cluster, it will elect a primary.
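The majority rule is simple integer arithmetic; the helper below is an illustration only, not part of the original slides.

    # A strict majority of n members is floor(n/2) + 1.
    def majority(n_members: int) -> int:
        return n_members // 2 + 1

    print(majority(3))  # 2 -> a 3-node set can lose 1 node and still elect
    print(majority(4))  # 3 -> a 4-node set can also only lose 1 node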
Simple Failure
[Diagram: one member has failed; the PRIMARY and one SECONDARY remain visible]
66% of the cluster is visible, so a primary is elected.
Simple Failure
[Diagram: two members have failed; only one SECONDARY remains visible]
33% of the cluster is visible, so the surviving member stays in read-only mode (as sketched below).
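What read-only mode looks like from a client, as a hypothetical sketch in Python with pymongo (hostname and set name illustrative):

    # Hypothetical sketch: with no primary, writes cannot be acknowledged, but
    # reads can still be served if a secondary read preference is used.
    from pymongo import MongoClient, ReadPreference
    from pymongo.errors import ServerSelectionTimeoutError

    client = MongoClient("mongodb://db3.example.net/?replicaSet=rs0",
                         serverSelectionTimeoutMS=2000)
    try:
        client.test.things.insert_one({"x": 1})  # requires a primary
    except ServerSelectionTimeoutError:
        print("no primary available; writes are blocked")

    ro = client.get_database("test", read_preference=ReadPreference.SECONDARY)
    print(ro.things.find_one())  # still answered by the surviving secondary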
Network Partition
[Diagram: a three-member set, one PRIMARY and two SECONDARY nodes, before the partition]
Network Partition
[Diagram: the side with two members elects a PRIMARY; the third member appears failed to them]
66% of the cluster is visible on the majority side, so a primary is elected.
Network Partition
[Diagram: the minority side sees only one member; the other two appear failed]
33% of the cluster is visible on the minority side, so that member runs in read-only mode.
Even Cluster Size
[Diagram: a four-member set, one PRIMARY and three SECONDARY nodes]
Even Cluster Size
[Diagram: the four-member set splits two and two; each side sees only two members]
50% of the cluster is visible from either side. 50% is not a strict majority, so no primary can be elected and the set runs in read-only mode.
Avoid single points of failure
Avoid single points of failure
[Diagram: all three members in a single rack, behind one top-of-rack switch]
Single points of failure: the top-of-rack switch fails, or the rack falls over.
Better
[Diagram: the members split across separate racks, still in one building]
Remaining single points of failure: loss of internet connectivity, or the building burns down.
Better yet
[Diagram: the PRIMARY and one SECONDARY in San Francisco, one SECONDARY in Dallas]
Priorities
[Diagram: the PRIMARY and one SECONDARY in San Francisco, both priority 1; one SECONDARY in Dallas, priority 0]
The priority 0 member sits in the disaster recovery data center and will never become primary automatically.
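The layout above could be expressed in a replica set config roughly like this; the hostnames are hypothetical, and the Python dict would be passed to replSetInitiate or replSetReconfig as in the earlier sketches.

    # Hypothetical sketch: two priority-1 members in San Francisco, plus a
    # priority-0 disaster-recovery member in Dallas that can vote in elections
    # but will never be elected primary.
    config = {
        "_id": "rs0",
        "members": [
            {"_id": 0, "host": "sf1.example.net:27017", "priority": 1},
            {"_id": 1, "host": "sf2.example.net:27017", "priority": 1},
            {"_id": 2, "host": "dallas1.example.net:27017", "priority": 0},
        ],
    }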
Even Better
[Diagram: one member each in San Francisco, Dallas, and New York]
Fast recovery
2 Replicas + Arbiter??
[Diagram: a PRIMARY, a SECONDARY, and an Arbiter]
Is this a good idea?
2 Replicas + Arbiter??
[Diagram, step 1: PRIMARY, SECONDARY, and Arbiter all healthy]
[Diagram, step 2: the SECONDARY is lost]
[Diagram, step 3: a replacement SECONDARY performs a Full Sync from the PRIMARY]
Uh oh: the Full Sync is going to use a lot of resources on the PRIMARY, so I may have downtime or degraded performance.
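For reference, the arbiter layout sketched above would be configured roughly like this (hypothetical hostnames, same Python dict style as before):

    # Hypothetical sketch: two data-bearing members plus an arbiter. The arbiter
    # votes in elections but stores no data, so it can never be a sync source.
    config = {
        "_id": "rs0",
        "members": [
            {"_id": 0, "host": "db1.example.net:27017"},
            {"_id": 1, "host": "db2.example.net:27017"},
            {"_id": 2, "host": "arb1.example.net:27017", "arbiterOnly": True},
        ],
    }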
With 3 replicas
[Diagram, step 1: a PRIMARY and two SECONDARY members, all healthy]
[Diagram, step 2: one SECONDARY is lost]
[Diagram, step 3: a replacement SECONDARY performs a Full Sync from the remaining SECONDARY]
The sync can happen from a secondary, which will not impact traffic on the PRIMARY.
Replica Set Topology
• Avoid single points of failure
  – Separate racks
  – Separate data centers
• Avoid long recovery downtime
  – Use journaling
  – Use 3+ replicas
• Keep your actives close
  – Use priority to control where failovers happen
Q&A after this session