[OSDC 2013] Hadoop Cluster HA: Lessons from Experience


Upload: tsu-fen-han

Post on 08-May-2015

DESCRIPTION

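Hadoop HA is a popular and important topic. Many current designs center on Namenode HA and then extend it to the Job Tracker and HMaster. When implementing Hadoop Cluster HA, however, considering only the Namenode, Job Tracker, and HMaster is not rigorous enough. In a production environment, a Hadoop cluster usually also depends on services outside the Hadoop ecosystem, such as PostgreSQL, Kerberos, Puppet, and NTP. These services must be planned and designed together so that the Hadoop cluster still operates correctly after HA is triggered. This session introduces the Hadoop HA solutions from Apache Hadoop, Cloudera, and Hortonworks, along with their latest developments, and shares first-hand how HA is done in the Etu Appliance. Speaker: 韓祖棻, currently Technical Manager at Etu.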

TRANSCRIPT

Page 1: [OSDC 2013] Hadoop Cluster HA: Lessons from Experience

Hadoop Cluster HA: Lessons from Experience

Etu 韓祖棻 [email protected]

Page 2: Who am I

韓祖棻 (Jerry), Technical Manager at Etu

• Database Management

• Windows/Linux Application Developer

• Web Developer

• Developer of Etu

[email protected]

Page 3: Agenda

• Background
• Facebook Namenode High Availability
• Hadoop 1.0 Namenode High Availability
• Hortonworks High Availability
• Cloudera High Availability
• Etu Appliance High Availability
• Conclusion

Page 4: Background

Page 5: The Hadoop Ecosystem

[Diagram: the Hadoop ecosystem. Layers: Data Store, Data Processing Layer. Components: HDFS (Hadoop Distributed File System), HBase, MapReduce, Pig, HiveQL, Hive Meta Store, Mahout, Zookeeper, Avro (Serialization). External systems: RDBMS, ETL Tools, BI Reporting.]

Page 6: HDFS Architecture (Master/Slave)

An HDFS cluster consists of a single Namenode.

[Diagram: HDFS architecture. Clients send metadata ops to the Namenode, which holds the metadata (Name, replicas..) (/var/disk/data, 1.., and read and write blocks on the Datanodes in Rack1 and Rack2. The Namenode sends block ops to the Datanodes, and blocks are replicated between Datanodes.]

The Namenode was a single point of failure (SPOF) in an HDFS cluster.

Page 7: Facebook Namenode High Availability

Page 8: AvatarNode

Page 9: Hadoop 1.0 Namenode High Availability

Page 10: Backup Namenode Approach

• Use case 3f:
  – The Active is running and the Standby is down for maintenance. The Active dies and cannot start. The Standby is started and takes over as active.

Page 11: Hortonworks High Availability

Page 12: HDP's Full-Stack HA Architecture

Page 13: HA for HDFS NameNode Using VMware

Do not use the NameNode VM for running any other master daemon.

Page 14: HA for Hadoop Using RHEL (v5.x, v6.x)

Page 15: Cloudera High Availability

Page 16: Shared Storage Using NFS (After CDH 4.0)

[Diagram: an Active NN and a Standby NN share NN state on shared storage with a single writer (fencing); the shared storage is marked SPOF. Each NN has a FailoverController (Active and Standby) that monitors the health of the NN, OS, and HW and exchanges heartbeats with ZK. Three ZK nodes and three DNs are shown.]
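The single-writer rule on the shared NN state can be illustrated with a small Python sketch. It is not HDFS code: the `SharedEditLog` class, the epoch counter, and the node names `nn1` and `nn2` are assumptions made for illustration. The point shown is that a failover fences the old writer first, so a late write from the former Active NN is rejected instead of reaching the shared state.

```python
class FencedError(Exception):
    """Raised when a fenced (stale) writer tries to append."""


class SharedEditLog:
    """Hypothetical shared edit log that admits one writer at a time."""

    def __init__(self):
        self.epoch = 0        # bumped on every change of writer
        self.writer = None    # name of the NN currently allowed to write
        self.entries = []

    def become_writer(self, name):
        # Fencing: bumping the epoch invalidates every older writer token.
        self.epoch += 1
        self.writer = name
        return self.epoch

    def append(self, name, epoch, edit):
        if name != self.writer or epoch != self.epoch:
            raise FencedError(f"{name} (epoch {epoch}) is fenced")
        self.entries.append(edit)


log = SharedEditLog()
nn1_epoch = log.become_writer("nn1")
log.append("nn1", nn1_epoch, "mkdir /data")

# Failover: nn2 takes over, which fences nn1.
nn2_epoch = log.become_writer("nn2")
log.append("nn2", nn2_epoch, "mkdir /logs")

try:
    # Late write from the old Active NN; it must not reach the log.
    log.append("nn1", nn1_epoch, "delete /data")
except FencedError as err:
    rejected = str(err)
```

The sketch does nothing about the availability of the shared storage itself, which remains a single component in this design.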

Page 17: Quorum-based Storage (After CDH 4.1)

[Diagram: an Active NN and a Standby NN, each with a QJM (Quorum Journal Manager), share edits through a set of Journal Nodes (JN, JN, JN). Each NN has a FailoverController (Active and Standby) that monitors the health of the NN, OS, and HW and exchanges heartbeats with ZK. Three ZK nodes and three DNs are shown.]
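Quorum-based storage replaces the single shared directory with a set of Journal Nodes: an edit counts as durable once a majority of them acknowledge it, so one JN out of three can fail without blocking the Active NN. The sketch below models only that majority rule. `JournalNode`, `quorum_write`, and the `up` flag are hypothetical names, not the QJM API.

```python
class JournalNode:
    """Hypothetical stand-in for a JN; `up` simulates availability."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.edits = []

    def write(self, edit):
        if not self.up:
            return False
        self.edits.append(edit)
        return True


def quorum_write(journal_nodes, edit):
    """Return True if a strict majority of JNs acknowledged the edit."""
    acks = sum(1 for jn in journal_nodes if jn.write(edit))
    return acks > len(journal_nodes) // 2


jns = [JournalNode("jn1"), JournalNode("jn2"), JournalNode("jn3")]

ok_all_up = quorum_write(jns, "edit-1")     # 3 of 3 acks

jns[2].up = False
ok_one_down = quorum_write(jns, "edit-2")   # 2 of 3 acks, still a majority

jns[1].up = False
ok_two_down = quorum_write(jns, "edit-3")   # 1 of 3 acks, no majority
```

A real implementation also has to recover edits that reached only a minority of the Journal Nodes; the sketch leaves that out.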

Page 18: Etu Appliance High Availability

Page 19: Summary of Previous Solutions

                              Solution              AutoFailover  HA Type        External Storage
Facebook                      Avatar Node           X             Namenode       ○
Apache Hadoop 1.0             Backup Namenode       X             Namenode       ○
Hortonworks                   VMware (*1)           ○             Namenode       ○
Hortonworks                   RHEL (*2)             ○             System-wide    ○
Cloudera (Apache Hadoop 2.X)  Shared Storage        ○             Namenode (*3)  Optional
Cloudera (Apache Hadoop 2.X)  Quorum-based Storage  ○             Namenode (*3)  Optional

*1. 2 ESX Servers + SAN Arch. (vSphere HA Cluster)
*2. RHEL Cluster HA and Power Fencing Device
*3. Implementing the Fencing Method for System-wide HA.

Page 20: Two Roles

[Diagram: cluster nodes in two roles, Master node and Worker.]

Page 21: Services on Master and Workers

                           Master              Worker
Hadoop Ecosystem Services  Name Node           Data Node
                           Job Tracker         Task Tracker
                           HBase Master        Region Server
                           Zookeeper (Leader)  Zookeeper
                           Hive
System Services            MySQL/PostgreSQL    Syslog
                           Kerberos
                           NTP Server
                           Syslog

Page 22: HA Architecture (Active/Standby)

Page 23: HA based on CDH4.0.1

[Diagram: an Active NN and a Standby NN connected through a Synchronized File System. Each NN has a FailoverController (Active and Standby) that monitors the health of the NN, OS, and HW and exchanges heartbeats with ZK. Three ZK nodes and three DNs are shown.]

Page 24: Data Synchronization

• Hadoop ecosystem
  – Configurations are stored in Zookeeper
  – Hive metadata is stored in PostgreSQL
• PostgreSQL
  – Using PostgreSQL Replication
• User data
• System configurations or data
  – PostgreSQL, Kerberos, NTP server, Syslog

Page 25: Requirements

[Diagram: an Active Master, a Standby Master, and Workers, with three ZK nodes, one of them labeled ZK Leader.]

- HDFS service is running on the Active Master
- Zookeeper cluster is ready
- Standby Master is ready to activate the High Availability service
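The three requirements can be read as a precondition check that runs before HA is activated. A minimal sketch, assuming a hypothetical `ClusterState` record filled in by whatever monitoring is available; none of these names come from the Etu Appliance, and treating "ready" as "a majority of Zookeeper nodes is up" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Hypothetical snapshot of the facts the slide lists."""
    hdfs_running_on_active: bool
    zookeeper_nodes_up: int
    zookeeper_nodes_total: int
    standby_master_ready: bool


def unmet_requirements(state):
    """Return the requirements that currently block HA activation."""
    problems = []
    if not state.hdfs_running_on_active:
        problems.append("HDFS service is not running on the Active Master")
    # Assumption: "ready" is taken to mean the ensemble has a majority.
    if state.zookeeper_nodes_up <= state.zookeeper_nodes_total // 2:
        problems.append("Zookeeper cluster is not ready")
    if not state.standby_master_ready:
        problems.append("Standby Master is not ready")
    return problems


ready = ClusterState(True, 3, 3, True)
degraded = ClusterState(True, 1, 3, False)
```

An empty list from `unmet_requirements` means activation may proceed; anything else names what has to be fixed first.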

Page 26: Failover Scenario

[Diagram: Active Master, Standby Master, and three Workers.]

- Active Namenode service failure
- Active Namenode JVM failure
- Active ZKFC service failure
- Etu Active Master OS failure
- Etu Active Master machine power failure
- Failure of NICs on the Etu Active Master machine
- Network failure for the Etu Active Master machine
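Each scenario on this slide is something a health monitor has to detect before failover can start. The sketch below is one way to picture that, with invented probe names. It treats a probe that does not report at all as failed, because a machine without power or network cannot report its own state. How the Etu Appliance actually detects these cases is not described on the slide.

```python
# Hypothetical probe names mapped to the scenarios on the slide.
SCENARIOS = {
    "namenode_service": "Active Namenode service failure",
    "namenode_jvm": "Active Namenode JVM failure",
    "zkfc_service": "Active ZKFC service failure",
    "os": "Etu Active Master OS failure",
    "power": "Etu Active Master machine power failure",
    "nic": "Failure of NICs on the Etu Active Master machine",
    "network": "Network failure for the Etu Active Master machine",
}


def detected_failures(probe_results):
    """List the scenarios whose probe failed or did not report at all."""
    return [
        label
        for probe, label in SCENARIOS.items()
        if not probe_results.get(probe, False)
    ]


def should_fail_over(probe_results):
    """Fail over as soon as any scenario is detected."""
    return bool(detected_failures(probe_results))


healthy = {probe: True for probe in SCENARIOS}
jvm_crash = dict(healthy, namenode_jvm=False)
silent = {}  # nothing reported, e.g. after a power loss
```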

Page 27: Design Details - Enabling HA

[Diagram: Active Master and Standby Master, each with a Namenode and an FC, plus JT, HMaster, ... and Kerberos, NTP, Syslog, ...; edit logs and DB Replication connect the two Masters.]

1. Stopping services dependent on HDFS (JobTracker, HMaster, ...).
2. Stopping Namenode and Datanode services.
3. Configuring HDFS and FC service.
4. Creating Synchronized File System.
5. Initializing Synchronized File System for shared edit logs.
6. Starting Active FC service.
7. Initializing Standby Master.
8. Starting Standby FC service.
9. Synchronizing system configurations and data.
10. Starting Active Namenode and Datanode services.
11. Starting Standby Namenode and Datanode services.
12. Checking services status.
13. Starting services dependent on HDFS (JobTracker, HMaster, ...).
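The enabling procedure is an ordered runbook: each step assumes the ones before it succeeded. A minimal sketch of such a runner follows. The step names paraphrase the slide, while `run_steps` and the `execute` callback are hypothetical and stand in for whatever actually performs each step.

```python
ENABLE_HA_STEPS = [
    "Stop services dependent on HDFS",
    "Stop Namenode and Datanode services",
    "Configure HDFS and FC service",
    "Create Synchronized File System",
    "Initialize Synchronized File System for shared edit logs",
    "Start Active FC service",
    "Initialize Standby Master",
    "Start Standby FC service",
    "Synchronize system configurations and data",
    "Start Active Namenode and Datanode services",
    "Start Standby Namenode and Datanode services",
    "Check services status",
    "Start services dependent on HDFS",
]


def run_steps(steps, execute):
    """Run steps in order and stop at the first one that fails.

    `execute` is a caller-supplied callable returning True on success.
    Returns (completed_steps, failed_step_or_None).
    """
    completed = []
    for step in steps:
        if not execute(step):
            return completed, step
        completed.append(step)
    return completed, None


# Dry run in which every step succeeds.
done, failed = run_steps(ENABLE_HA_STEPS, lambda step: True)

# Dry run in which step 7 fails: steps 8 to 13 are never attempted.
partial, failed_at = run_steps(
    ENABLE_HA_STEPS, lambda step: step != "Initialize Standby Master"
)
```

Stopping at the first failure keeps the HDFS-dependent services (step 13) from starting on a half-configured pair of Masters.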

Page 28: Design Details - Failover

[Diagram: before failover, Active Master and Standby Master, each with a Namenode and an FC, plus JT, HMaster, ... and Kerberos, NTP, Syslog, ..., connected by edit logs and DB Replication. After failover, the former Active Master is fenced, and the new Active Master runs Namenode, JT, HMaster, ... and Kerberos, NTP, ....]

1. Fencing Active Master from Standby Master.
   a. Stopping network service.
   b. Stopping Hadoop related services.
   c. Stopping system services.
   d. Configuring network environment.
   e. Removing default services.
2. Stopping Standby FC service.
3. Stopping Standby Namenode service.
4. Removing Synchronized File System.
5. Removing DB Replication.
7. Transition Standby Master to Active Master.
   a. Stopping network service.
   b. Stopping system services.
   c. Configuring network environment.
   d. Configuring host information.
   e. Configuring system services.
   f. Starting network service.
   g. Starting system services.
8. Configuring Hadoop related services.
9. Starting Namenode and Datanode services.
10. Starting Hadoop related services.
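The order of the failover steps matters: the old Active Master is fenced (step 1) before the Standby Master is transitioned to Active (step 7). In active/standby designs this ordering is what prevents two masters from acting as active at the same time. The sketch below shows only that guard. `fail_over` and its callbacks are hypothetical and do not reproduce the Etu implementation.

```python
class FailoverAborted(Exception):
    """Raised when the old Active Master could not be fenced."""


def fail_over(fence_old_active, promote_standby, log):
    """Fence first, promote second; never promote after a failed fence."""
    log.append("fence old Active Master")
    if not fence_old_active():
        log.append("abort")
        raise FailoverAborted("old Active Master is not fenced")
    log.append("promote Standby Master")
    promote_standby()
    return log


# Fencing succeeds, so the Standby is promoted.
ok_log = fail_over(lambda: True, lambda: None, [])

# Fencing fails, so promotion must not happen.
bad_log = []
promoted = []
try:
    fail_over(lambda: False, lambda: promoted.append(True), bad_log)
except FailoverAborted:
    pass
```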

Page 29: Use case - Active Namenode maintenance

[Diagram: Active Master, Standby Master, and three Workers.]

- Stop NN
- Restart NN

Page 30: Use case - Standby Master failure

[Diagram: Active Master, Standby Master, and three Workers.]

- OS failure
- Power failure
- Failure of NICs
- Network failure

Page 31: Use case - Cluster power failure

[Diagram: Active Master, Standby Master, and three Workers.]

Page 32: Use case - Cluster network failure

[Diagram: Active Master, Standby Master, and three Workers.]

Page 33: Demo - Non-HA (VM002)

Activating HA with One-Click

Page 34: Demo - Activating (VM002 - VM007)

Page 35: Demo - Activating Done (VM002 - VM007)

Page 36: Demo - Failover (VM002 -> VM007)

Page 37: Demo - Failover Done (VM007)

Page 38: Conclusion

• Leverages a Synchronized File System to share Namenode edit logs and system data between Masters.

• Implements an improved fencing method to handle failover.

• Provides system-wide high availability, not only for the Hadoop Namenode service.

Page 39: Reference

• Hadoop 1.0.4 Documentation
  – http://hadoop.apache.org/docs/stable/index.html
  – https://issues.apache.org/jira/secure/attachment/12480489/NameNode%20HA_v2_1.pdf
• Hadoop 2.0.3-alpha Documentation
  – http://hadoop.apache.org/docs/r2.0.3-alpha/index.html
• Hadoop AvatarNode High Availability
  – http://hadoopblog.blogspot.tw/2010/02/hadoop-namenode-high-availability.html
• Hortonworks Data Platform
  – http://hortonworks.com/products/hortonworksdataplatform/
  – http://www.vmware.com/files/pdf/Apache-Hadoop-VMware-HA-solution.pdf

Page 40: Reference

• CDH4.2.0 Documentation
  – http://www.cloudera.com/content/support/en/documentation/cdh4-documentation/cdh4-documentation-v4-latest.html