hadoop operations #cwt2013

Hadoop Operations: Key Points for Building and Operating Hadoop. Cloudera Customer Operations Engineer | @d1ce_ 2013/11/7

Upload: cloudera-japan

Post on 28-May-2015


Category:

Technology



DESCRIPTION

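#cwt2013 Slides by Cloudera's Kobayashi (@d1ce_) on key points for building and operating Hadoop are now available. They cover 2013 hardware selection and how to think about HA configurations, and also present cases actually encountered in support.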

TRANSCRIPT

  • 1. Hadoop Operations: Key Points for Building and Operating Hadoop. Cloudera | @d1ce_ 2013/11/7

  • 2. email: [email protected] twitter: d1ce_
  • 3. Hadoop 311 (?) Cloudera, Eric Sammer
  • 4. Hadoop: http://www.slideshare.net/Cloudera_jp/hadoop-15944692 HDFS/MapReduce, HBase
  • 7. = Hadoop: http://www.cloudera.co.jp/blog/how-to-select-the-right-hardware-for-your-new-hadoop-cluster.html 12-24 x 1-4TB HDD (JBOD); CPU: 2-2.5GHz 4/6/8-core CPU x 2; RAM: 64-512GB (Impala 128GB); 10Gbit (20 1Gbit)
  • 8. IO, CPU
  • 9. IO: 16-24 x 2TB-4TB; CPU: 4-core x 2; RAM: 48-96GB. CPU: 4-8 x 1TB-2TB; CPU: 6-core x 2; RAM: 64-512GB
  • 10. NIC, RAID1; 4-6 x 1TB HDD: OS (1) + fsimage (2: RAID1) + ZooKeeper (1) + JournalNode (1); CPU: 2-2.5GHz 4/6/8-core CPU x 2; RAM: 64-128GB; 10Gbit (20 1Gbit)
  • 12. CDH4 (GA); CDH4.4 (11/7); HA; SPOF; HBase, Hive
  • 13. HA: (NFS) / JournalNode (edits) 3, 5; HDFS HA; ZooKeeper 3; ZooKeeperFailoverController (ZKFC) 2; http://www.slideshare.net/Cloudera_jp/hdfs-ha
  • 15. HA /; ZooKeeper 3; ZooKeeperFailoverController (ZKFC) 2
  • 16. 4 -> 3; Hadoop () / HA 3
  • 17. 1 [diagram: ZooKeeper, ZKFC, JournalNode]
  • 18. 2 [diagram: ZooKeeper, ZKFC, HMaster, Impala StateStore, JournalNode]
  • 20. OS: Transparent Huge Page: # echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag; RHEL 6.2 and 6.3, CentOS 6.2 and 6.3, SLES 11 SP2; http://structureddata.org/2012/06/18/linux-6-transparent-huge-pages-and-hadoop-workloads/; vm.swappiness: default 60, set to 0: # sysctl -w vm.swappiness=0
  • 21. HDFS: 100 1GB; 128MB; fsimage: dfs.namenode.name.dir, RAID1; JournalNode: dfs.journalnode.edits.dir, RAID1; small files problem
  • 22. HDFS: 30 1-4GB /; dfs.datanode.balance.bandwidthPerSec; $ sudo -u hdfs hdfs dfsadmin -setBalancerBandwidth; dfs.namenode.replication.work.multiplier.per.iteration 2 (5)
  • 23. HDFS: Hive HA; Hive metatool: $ hive --service metatool -listFSRoot; $ hive --service metatool -updateLocation hdfs://nameservice1 hdfs://oldnamenode.com; Hive HA non-HA
  • 24. HDFS 1: missing replica
  • 25. HDFS 1: missing replica; DN hdfs-site.xml dfs.datanode.data.dir; HDFS; Cloudera Manager
  • 26. HDFS 2: ZKFC
  • 27. HDFS 2: ZKFC; ZKFC, ZK; ZK (60); maxClientCnxns
  • 29. HDFS: 3 -> 25%; 4-5 + OS; HDFS; 1: 32TB, 8TB
  • 30. MapReduce
  • 31. MapReduce: 100; 5-6KB -> 10000: 50MB -> 5GB; mapred.jobtracker.completeuserjobs.maximum 5
  • 32. MapReduce: JT WebUI 5; 30000: 170MB; mapred.job.tracker.jobhistory.lru.cache.size
  • 33. MapReduce: WebUI; http://docs.oracle.com/javase/jp/6/api/java/lang/Runtime.html#totalMemory%28%29 http://docs.oracle.com/javase/jp/6/api/java/lang/Runtime.html#maxMemory%28%29; Hadoop: http://:50030/jmx java.lang:type=Memory -> "used"
  • 34. MapReduce: = CPU -; (Map mapred.tasktracker.map.tasks.maximum + Reduce mapred.tasktracker.reduce.tasks.maximum) (mapred.child.java.opts -Xmx); TaskTracker
  • 35. HBase: HMaster 2-4GB; (RS) 3; GC -> ZK RS; OutOfMemoryError -> 16GB -> RS; HMaster WebUI; http://www.slideshare.net/Cloudera_jp/hbase-hcj13w
  • 36. HBase: ZK () RS ZK; hbase.zookeeper.property.maxClientCnxns: CDH4 300, CDH3 30; 300 - 1000; hbase hbck -details 2>&1 | tee filename.txt; echo "scan '.META.'" | hbase shell > META-output.txt
  • 37. HBase: HBase 2 MR; TT RS
  • 38. HBase: (A) (B) [diagram: RS, ZK, RS, RS, MR]
  • 40. Hadoop
  • 41. Hadoop, HBase; Hadoop: http://enterprisezine.jp/article/corner/220/; Hadoop, HBase; Cloudera: http://cloudera.co.jp/university
  • 42. Hadoop 3
  • 43. Cloudera Impala PDF; Cloudera, John Russell; Hadoop, HBase, Hadoop Hive; Cloudera; Cloudera World Tokyo
  • 44. We are Hiring! Cloudera; Hadoop; [email protected]
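
The sizing figures on slides 29, 31 and 34 can be checked with a few lines of arithmetic. The Python sketch below rests on assumptions, because the slide text survives only as fragments. Slide 29 is read as replication factor 3 with 25% of raw disk set aside for non-HDFS temporary data, so that 32TB of raw disk yields 8TB of usable HDFS capacity. Slide 31 is read as roughly 5KB of JobTracker heap per retained task, so that a 10000-task job costs about 50MB and 100 retained jobs about 5GB. Slide 34 is read as (map slots + reduce slots) x task heap (the -Xmx value in mapred.child.java.opts) having to fit in the memory of a TaskTracker node. All function and parameter names are hypothetical and are not part of Hadoop.

```python
# Sizing arithmetic sketched from slides 29, 31 and 34.
# Every formula here is an assumed reading of the surviving slide numbers,
# not a formula stated in the deck or in the Hadoop documentation.


def usable_hdfs_tb(raw_tb, replication=3, temp_fraction=0.25):
    """Usable HDFS capacity: raw disk minus temp space, divided by replication."""
    return raw_tb * (1.0 - temp_fraction) / replication


def retained_job_memory_mb(tasks_per_job, retained_jobs, kb_per_task=5):
    """Approximate JobTracker heap (decimal MB) held by retained completed jobs."""
    return tasks_per_job * kb_per_task * retained_jobs / 1000.0


def task_heap_total_mb(map_slots, reduce_slots, child_xmx_mb):
    """Worst-case heap if every map and reduce slot runs a task at full -Xmx."""
    return (map_slots + reduce_slots) * child_xmx_mb


def slots_fit_in_node(map_slots, reduce_slots, child_xmx_mb,
                      node_ram_mb, reserved_mb):
    """True if all task heaps fit in RAM after reserving memory for the OS
    and the DataNode/TaskTracker daemons (reserved_mb is an assumed input)."""
    needed = task_heap_total_mb(map_slots, reduce_slots, child_xmx_mb)
    return needed <= node_ram_mb - reserved_mb
```

For example, `usable_hdfs_tb(32)` reproduces the 8TB figure on slide 29, and `retained_job_memory_mb(10000, 100)` reproduces the 5GB figure on slide 31, which is the case for lowering mapred.jobtracker.completeuserjobs.maximum.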