Hadoop Multi-Node Cluster Installation & Fabric & Ansible


Post on 24-May-2015


Intern Tech Sharing, 2014/08/11

黃泓霖

1. Hadoop Multi-node Cluster Setup

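1.1 Hadoop Cluster Overview

• Hadoop pairs HDFS, a distributed file system, with MapReduce, a distributed computation model, to combine multiple servers into a cluster for distributed computation and storage, providing the capacity to store and process massive data sets.

• MapReduce → split the job into tasks, process them in parallel, then aggregate the results

• Master & Slaves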
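To make the split / process / aggregate flow concrete, here is a minimal single-process sketch in Python. It only imitates the MapReduce phases on an in-memory list of strings; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are illustrative and are not part of Hadoop's API.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: aggregate the values of each key into a final count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    # Each chunk stands in for a split that a separate worker would process.
    pairs = []
    for chunk in chunks:
        pairs.extend(map_phase(chunk))
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    splits = ["hadoop stores data", "hadoop processes data"]
    print(word_count(splits))
```

On a real cluster the map calls would run on different slaves and the shuffle would move data across the network; here everything runs in one loop purely to show the data flow.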


• Hadoop 2.2.x → YARN

• YARN splits the JobTracker's two main responsibilities, resource management and job lifecycle management, into separate components (a ResourceManager and a per-application ApplicationMaster).


1.2 Setup

Software Versions

• Ubuntu Linux 12.04.4 LTS

• Hadoop 2.2.0

• Install Java and Hadoop on every VM: http://www.slideshare.net/recast203/hadoop-cluster

• Networking: edit /etc/hosts on the master AND on every slave (use ifconfig to look up each machine's IP)

sudo vim /etc/hosts

10.0.3.176 master
10.0.3.184 honglin101
10.0.3.223 honglin102
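Because the same entries have to appear on every node, it can help to generate them from a single mapping. The Python sketch below renders the /etc/hosts lines from the addresses listed above; render_hosts and CLUSTER_HOSTS are hypothetical names, and actually writing the result to /etc/hosts (which requires root) is deliberately left out.

```python
# Hostname-to-IP mapping taken from the slide above.
CLUSTER_HOSTS = {
    "master": "10.0.3.176",
    "honglin101": "10.0.3.184",
    "honglin102": "10.0.3.223",
}


def render_hosts(hosts):
    """Return /etc/hosts lines, one "IP hostname" entry per node."""
    return "\n".join("%s %s" % (ip, name) for name, ip in hosts.items())


if __name__ == "__main__":
    print(render_hosts(CLUSTER_HOSTS))
```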

• SSH access: generate an RSA key pair with an empty passphrase, then copy the public key to every node so that hduser can log in without a password

ssh-keygen -t rsa -P ""
ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@honglin101
ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@honglin102
ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@master
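The ssh-copy-id step is repeated once per node, so it is a natural candidate for a loop. The following Python sketch only builds the argument lists and executes nothing; ssh_copy_id_commands is a hypothetical helper, and the default key path is an assumption matching the commands above.

```python
import os


def ssh_copy_id_commands(user, hosts, pubkey="~/.ssh/id_rsa.pub"):
    """Build one ssh-copy-id argument list per host (nothing is executed)."""
    key_path = os.path.expanduser(pubkey)
    return [
        ["ssh-copy-id", "-i", key_path, "%s@%s" % (user, host)]
        for host in hosts
    ]


if __name__ == "__main__":
    for cmd in ssh_copy_id_commands("hduser", ["honglin101", "honglin102", "master"]):
        print(" ".join(cmd))
```

Each argument list can be passed to subprocess.check_call to run the command for real; ssh-copy-id will then prompt for the remote password once per host.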

• Edit core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/etc/hadoop/tmp</value>
  </property>
</configuration>
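Every node needs the same core-site.xml, so generating it from a dictionary is one way to keep the copies identical. This is a minimal sketch using Python's standard xml.etree.ElementTree; build_site_xml and CORE_SITE are hypothetical names, and the output omits the XML declaration and indentation.

```python
import xml.etree.ElementTree as ET


def build_site_xml(properties):
    """Return a Hadoop-style <configuration> document as a string."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# The two properties from the slide above.
CORE_SITE = {
    "fs.default.name": "hdfs://master:9000",
    "hadoop.tmp.dir": "/usr/local/hadoop/etc/hadoop/tmp",
}

if __name__ == "__main__":
    print(build_site_xml(CORE_SITE))
```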

• Edit yarn-site.xml

<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8032</value>
  </property>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>${hadoop.tmp.dir}/nodemanager/local</value>
  </property>
  <property>
    <name>yarn.nodemanager.address</name>
    <value>0.0.0.0:8034</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>${hadoop.tmp.dir}/nodemanager/remote</value>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>${hadoop.tmp.dir}/nodemanager/logs</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
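YARN looks up the class of each auxiliary service under the key yarn.nodemanager.aux-services.<service name>.class, so the service name and the class key have to agree. The Python sketch below checks that for a site file; load_properties, missing_aux_service_classes, and SAMPLE are hypothetical names, and the check only covers what the file states explicitly, not values supplied by Hadoop's defaults.

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.findall("property")}


def missing_aux_service_classes(props):
    """List aux-services that have no matching '<service>.class' property."""
    services = props.get("yarn.nodemanager.aux-services", "")
    names = [s.strip() for s in services.split(",") if s.strip()]
    return [
        name for name in names
        if "yarn.nodemanager.aux-services.%s.class" % name not in props
    ]


# Trimmed-down sample with only the two aux-service properties.
SAMPLE = """<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(missing_aux_service_classes(load_properties(SAMPLE)))
```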

• On the master only, format the NameNode:

hadoop namenode -format

• Start / stop DFS:

start-dfs.sh
stop-dfs.sh

• Run jps to check that the daemons are running
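The jps check can be automated by comparing its output against the daemons each node is expected to run. This is a sketch under stated assumptions: jps prints one "<pid> <name>" pair per line, the master typically runs NameNode and SecondaryNameNode after start-dfs.sh while the slaves run DataNode, and SAMPLE_MASTER is invented output rather than a capture from the cluster in these slides.

```python
def running_daemons(jps_output):
    """Extract the process names from jps output ("<pid> <name>" per line)."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            names.add(parts[1])
    return names


def missing_daemons(jps_output, expected):
    """Return the expected daemons that do not appear in the jps output."""
    return sorted(set(expected) - running_daemons(jps_output))


# Hypothetical output; real PIDs and the process list will differ.
SAMPLE_MASTER = """2101 NameNode
2350 SecondaryNameNode
2712 Jps
"""

if __name__ == "__main__":
    print(missing_daemons(SAMPLE_MASTER, ["NameNode", "SecondaryNameNode"]))
```

To check a live node, the text returned by subprocess.check_output(["jps"]).decode() can be passed in place of SAMPLE_MASTER.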

• Start / stop YARN:

start-yarn.sh
stop-yarn.sh


2. A command-line tool for managing multiple Linux machines

Fabric

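2.1 Introduction of Fabric

• Fabric is a framework for automating the batch execution of programs on multiple machines over SSH.

• With a project configuration file prepared in advance, it can automate a project's deployment and maintenance.

• Everything is driven from the current directory on the local machine, which is very convenient.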

2.2 Installation

• Install with pip:

sudo pip install fabric

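2.3 Basic Usage

• Fabric can be run directly from the command line or by loading a fabfile.py file. By default, fab looks for fabfile.py in the directory where the fab command is run (and then in its parent directories), so keep the file in the directory you run fab from.

• Without a fabfile.py, running fab on its own fails with the error "Couldn't find any fabfiles!"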



3. Ansible