Replication Online

Upload: mongodb

Post on 21-Jun-2015

Page 1: Replication Online

1

Summer 2012

Open source, high performance database

Replication

Page 2: Replication Online

2

Why Have Replication?

Page 3: Replication Online

3

Use cases

• High Availability (auto-failover)

• Read Scaling (extra copies to read from)

• Backups
  – Online, Delayed Copy (fat finger)
  – Point in Time (PiT) backups

• Use (hidden) replica for secondary workload
  – Analytics
  – Data-processing
  – Integration with external systems
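A hidden, delayed member of the kind listed above could be described by a member document like the one below. This is only a sketch written as a plain Python dict: the set name, the host names and the one-hour delay are made-up values, and the field names (`priority`, `hidden`, `slaveDelay`) are those of the replica set configuration document in MongoDB releases of this era. Later releases rename the delay field, so check the documentation for the version in use.

```python
# Illustrative replica set configuration as a plain Python dict.
# The set name, host names and delay value are hypothetical.
config = {
    "_id": "rs0",
    "members": [
        {"_id": 0, "host": "db1.example.net:27017"},
        {"_id": 1, "host": "db2.example.net:27017"},
        {
            "_id": 2,
            "host": "db3.example.net:27017",
            "priority": 0,       # never eligible to become primary
            "hidden": True,      # not advertised to ordinary clients
            "slaveDelay": 3600,  # apply operations one hour late
        },
    ],
}


def hidden_members(cfg):
    """Return the hosts of members marked hidden in the config."""
    return [m["host"] for m in cfg["members"] if m.get("hidden", False)]


print(hidden_members(config))
```

The delayed copy gives a window in which a "fat finger" mistake has not yet reached that member, and hiding it keeps analytics or backup jobs away from normal application reads.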

Page 4: Replication Online

4

Types of outage

Planned
  – Hardware upgrade
  – O/S or file-system tuning
  – Relocation of data to new file-system / storage
  – Software upgrade

Unplanned
  – Hardware failure
  – Data center failure
  – Region outage
  – Human error
  – Application corruption

Page 5: Replication Online

5

Replica Set features

• A cluster of N servers
• All writes to primary
• Reads can be to primary (default) or a secondary
• Any (one) node can be primary
• Consensus election of primary
• Automatic failover
• Automatic recovery

Page 6: Replication Online

6

• Replica Set is made up of 2 or more nodes

How MongoDB Replication works

Member 1

Member 2

Member 3

Page 7: Replication Online

7

How MongoDB Replication works

• Election establishes the PRIMARY
• Data replication from PRIMARY to SECONDARY

Member 1

Member 2 (Primary)

Member 3

Page 8: Replication Online

8

How MongoDB Replication works

• PRIMARY may fail
• Automatic election of new PRIMARY if majority exists

Member 1

Member 2 (DOWN)

Member 3

negotiate new master

Page 9: Replication Online

9

How MongoDB Replication works

• New PRIMARY elected
• Replica Set re-established

Member 1

Member 2 (DOWN)

Member 3 (Primary)

negotiate new master

Page 10: Replication Online

10

How MongoDB Replication works

• Automatic recovery

Member 1

Member 2 (Recovering)

Member 3 (Primary)

Page 11: Replication Online

11

How MongoDB Replication works

• Replica Set re-established

Member 1

Member 2

Member 3 (Primary)

Page 12: Replication Online

12

Understanding automatic failover

Page 13: Replication Online

13

Primary Election

Primary

Secondary

Secondary

As long as a partition can see a majority (>50%) of the cluster, it will elect a primary.
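The majority rule above can be written down as a short check. This is a minimal sketch of the arithmetic only, not of MongoDB's election code; the function names are invented for the illustration.

```python
def has_majority(visible: int, total: int) -> bool:
    """True if `visible` members are strictly more than half of `total`."""
    return 2 * visible > total


def partition_state(visible: int, total: int) -> str:
    """Describe what a partition seeing `visible` of `total` members can do."""
    if has_majority(visible, total):
        return "can elect a primary"
    return "read only"


# The three cases shown on the failure slides: 2 of 3, 1 of 3, 2 of 4.
for visible, total in [(2, 3), (1, 3), (2, 4)]:
    pct = 100 * visible // total
    print(f"{visible}/{total} visible ({pct}%): {partition_state(visible, total)}")
```

Note that exactly half is not enough: the comparison is strict, which is why a two-against-two split leaves both sides read only.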

Page 14: Replication Online

14

Simple Failure

Primary

Failed Node

Secondary

66% of cluster visible. Primary is elected

Page 15: Replication Online

15

Simple Failure

Failed Node

Failed Node

Secondary

33% of cluster visible. Read only mode.

Page 16: Replication Online

16

Network Partition

Primary

Secondary

Secondary

Page 17: Replication Online

17

Network Partition

[Diagram: the full set (Primary, Secondary, Secondary) and the view from the majority side of the partition (Primary, Secondary, Failed Node)]

66% of cluster visible. Primary is elected

Page 18: Replication Online

18

Network Partition

[Diagram: the full set (Primary, Secondary, Secondary) and the view from the minority side of the partition (Secondary, Failed Node, Failed Node)]

33% of cluster visible. Read only mode.

Page 19: Replication Online

19

Even Cluster Size

Primary

Secondary

Secondary

Secondary

Page 20: Replication Online

20

Even Cluster Size

[Diagram: the full set (Primary, Secondary, Secondary, Secondary) and the view from one half of the partition (Secondary, Secondary, Failed Node, Failed Node)]

50% of cluster visible. Read only mode.

Page 21: Replication Online

21

Even Cluster Size

[Diagram: the full set (Primary, Secondary, Secondary, Secondary) and the view from one half of the partition (Secondary, Secondary, Failed Node, Failed Node)]

50% of cluster visible. Read only mode.
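The even-size case can be checked with a little arithmetic: a set needs more than half of its members to elect a primary, so adding a fourth member to a three-member set does not let it survive any more failures. The sketch below only illustrates that counting; the function names are invented here.

```python
def majority(total: int) -> int:
    """Smallest number of members that is more than half of `total`."""
    return total // 2 + 1


def fault_tolerance(total: int) -> int:
    """How many members can be lost while a majority still remains."""
    return total - majority(total)


print("members  majority  failures tolerated")
for n in range(2, 6):
    print(f"{n:>7}  {majority(n):>8}  {fault_tolerance(n):>18}")
```

Three and four members both tolerate one failure, and a four-member set can additionally split two against two, leaving neither side able to elect a primary.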

Page 22: Replication Online

22

Avoid single points of failure

Page 23: Replication Online

23

Avoid Single points of failure

Page 24: Replication Online

24

Avoid Single points of failure

Primary

Secondary

Secondary

Top of rack switch

Rack falls over

Page 25: Replication Online

25

Better

Primary

Secondary

Secondary

Loss of internet

Building burns down

Page 26: Replication Online

26

Better yet

Primary

Secondary

Secondary

San Francisco

Dallas

Page 27: Replication Online

27

Priorities

Primary

Secondary

Secondary

San Francisco

Dallas

Priority 1

Priority 1

Priority 0

Disaster recovery data center. Will never become primary automatically.
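The priorities on this slide could be expressed in a replica set configuration along the following lines. This is a sketch as a plain Python dict, assuming two members in San Francisco and one in Dallas; the set name and host names are hypothetical, while `priority` is the per-member field of the configuration document.

```python
# Hypothetical layout: two San Francisco members, one Dallas member.
config = {
    "_id": "rs0",
    "members": [
        {"_id": 0, "host": "sf1.example.net:27017", "priority": 1},
        {"_id": 1, "host": "sf2.example.net:27017", "priority": 1},
        # Disaster recovery copy: priority 0 means it is never elected.
        {"_id": 2, "host": "dal1.example.net:27017", "priority": 0},
    ],
}


def electable_hosts(cfg):
    """Hosts that may become primary (priority above 0; default is 1)."""
    return [m["host"] for m in cfg["members"] if m.get("priority", 1) > 0]


print(electable_hosts(config))
```

With this layout a failover can only move the primary between the two San Francisco members; promoting the Dallas copy would require changing the configuration by hand.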

Page 28: Replication Online

28

Even Better

Primary

Secondary

Secondary

San Francisco

Dallas

New York

Page 29: Replication Online

29

Fast recovery

Page 30: Replication Online

30

2 Replicas + Arbiter??

Primary

Arbiter

Secondary

Is this a good idea?

Page 31: Replication Online

31

2 Replicas + Arbiter??

[Diagram, step 1: Primary, Arbiter, Secondary]

Page 32: Replication Online

32

2 Replicas + Arbiter??

[Diagram, steps 1 and 2: each step shows Primary, Arbiter, Secondary]

Page 33: Replication Online

33

2 Replicas + Arbiter??

[Diagram, steps 1 to 3: each step shows Primary, Arbiter, Secondary; in step 3 an additional Secondary performs a Full Sync]

Uh oh. Full Sync is going to use a lot of resources on the primary, so I may have downtime or degraded performance.

Page 34: Replication Online

34

With 3 replicas

[Diagram, step 1: Primary, Secondary, Secondary]

Page 35: Replication Online

35

With 3 replicas

[Diagram, steps 1 and 2: each step shows Primary, Secondary, Secondary]

Page 36: Replication Online

36

With 3 replicas

[Diagram, steps 1 to 3: each step shows Primary, Secondary, Secondary; in step 3 an additional Secondary performs a Full Sync]

Sync can happen from secondary, which will not impact traffic on Primary.
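The difference between the two layouts can be illustrated with a toy source-selection rule. This is not MongoDB's actual sync-source algorithm, only a sketch of the idea on these slides; the member names and the `pick_sync_source` helper are invented for the example.

```python
def pick_sync_source(members):
    """Return the name of the member a new node should copy from.

    Prefers any SECONDARY so the PRIMARY is spared the read load of a
    full sync; falls back to the PRIMARY, and returns None when no
    data-bearing member is available.
    """
    for wanted in ("SECONDARY", "PRIMARY"):
        for member in members:
            if member["state"] == wanted:
                return member["name"]
    return None


# Survivors after one data-bearing member is lost, in each layout.
two_plus_arbiter = [
    {"name": "A", "state": "PRIMARY"},
    {"name": "arb", "state": "ARBITER"},  # holds no data
]
three_replicas = [
    {"name": "A", "state": "PRIMARY"},
    {"name": "B", "state": "SECONDARY"},
]

print(pick_sync_source(two_plus_arbiter))  # only the primary has data
print(pick_sync_source(three_replicas))    # a secondary can serve the sync
```

With an arbiter in place of the third replica, the replacement node has nowhere to copy from except the primary, which is the situation the "Uh oh" slide warns about.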

Page 37: Replication Online

37

Replica Set Topology

• Avoid single points of failure
  – Separate racks
  – Separate data centers

• Avoid long recovery downtime
  – Use journaling
  – Use 3+ replicas

• Keep your actives close
  – Use priority to control where failovers happen
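Some of these recommendations can be turned into a quick sanity check over a planned layout. The sketch below is an assumption-laden illustration: the `rack`, `data_center` and `arbiter` keys are made up for this example and are not part of any MongoDB configuration format, and journaling and priorities are not checked.

```python
def topology_warnings(members):
    """Return a list of warnings for a planned replica set layout."""
    warnings = []
    data_bearing = [m for m in members if not m.get("arbiter", False)]
    if len(data_bearing) < 3:
        warnings.append("fewer than 3 data-bearing replicas")
    if len({m["rack"] for m in members}) < 2:
        warnings.append("all members share one rack")
    if len({m["data_center"] for m in members}) < 2:
        warnings.append("all members share one data center")
    return warnings


# Hypothetical layouts: everything in one rack versus spread out.
single_rack = [
    {"rack": "r1", "data_center": "sf"},
    {"rack": "r1", "data_center": "sf"},
    {"rack": "r1", "data_center": "sf"},
]
spread_out = [
    {"rack": "r1", "data_center": "sf"},
    {"rack": "r2", "data_center": "sf"},
    {"rack": "r7", "data_center": "dallas"},
]

print(topology_warnings(single_rack))
print(topology_warnings(spread_out))
```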

Page 38: Replication Online

38

Q&A after this session