MapReduce Study Report

MapReduce Implementation Intro. Anty.Rao (@gmail.com) Nov 2,2011



MapReduce Implementation Intro.

Anty.Rao(@gmail.com)

Nov 2,2011


Outline

• Map Reduce Overview
• Map Phase
• Reduce Phase
• Potential Optimization


Map Reduce Overview

Hadoop: The Definitive Guide


Map Phase


Map Phase Diagram


Steps of Map Phase

• Records emitted by the map function are written continuously into a circular buffer.

• When buffer usage exceeds io.sort.mb * io.sort.spill.percent, a spill starts: records are sorted by partition and then by key, and the buffer contents are written to disk together with an index file recording the position where each partition begins.

• A final merge combines all the intermediate spill files into a single large file, plus an index file.
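
The spill step described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Hadoop's implementation: the function names `spill_threshold_bytes` and `spill`, the tuple layout, and the example values are hypothetical; only io.sort.mb and io.sort.spill.percent come from the slides.

```python
def spill_threshold_bytes(io_sort_mb, io_sort_spill_percent):
    """Buffer usage (in bytes) at which a spill starts."""
    return int(io_sort_mb * 1024 * 1024 * io_sort_spill_percent)


def spill(records):
    """Sort buffered (partition, key, value) records and build a partition index.

    Returns the sorted records (standing in for the spill file) and a dict
    mapping each partition to the position of its first record (standing in
    for the index file).
    """
    spilled = sorted(records, key=lambda r: (r[0], r[1]))
    index = {}
    for position, (partition, _key, _value) in enumerate(spilled):
        index.setdefault(partition, position)
    return spilled, index


# Example values are assumptions for illustration, not recommendations.
threshold = spill_threshold_bytes(io_sort_mb=100, io_sort_spill_percent=0.80)

buffered = [(1, "pear", 1), (0, "fig", 1), (1, "apple", 1), (0, "date", 1)]
spill_file, index_file = spill(buffered)
```

With these inputs, the records of partition 0 come first in `spill_file`, and `index_file` tells a reader where partition 1 starts without scanning the whole file.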


Main map-side tuning Knobs


Reduce Phase


Reduce Phase Diagram


Steps of Reduce Phase

• Pull map output over from the map tasks. If memory is available and the file is smaller than 25% * HeapSize * mapred.job.shuffle.input.buffer.percent, keep the file in memory; otherwise store it directly on disk.
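
The placement rule above amounts to a two-part check. The sketch below is only an illustration: `shuffle_destination` is a hypothetical helper, and the heap size and percentage are assumed example values rather than settings taken from the slides.

```python
def shuffle_destination(file_size, memory_free, heap_size, input_buffer_percent):
    """Return "memory" or "disk" for one fetched map output (sizes in bytes)."""
    # Largest single map output allowed in memory, per the rule above:
    # 25% * HeapSize * mapred.job.shuffle.input.buffer.percent.
    max_single_file = 0.25 * heap_size * input_buffer_percent
    if file_size <= memory_free and file_size < max_single_file:
        return "memory"
    return "disk"


MB = 1024 * 1024
HEAP = 1024 * MB   # assumed reduce-task heap size
PERCENT = 0.70     # assumed value of mapred.job.shuffle.input.buffer.percent

small = shuffle_destination(50 * MB, 500 * MB, HEAP, PERCENT)
large = shuffle_destination(300 * MB, 500 * MB, HEAP, PERCENT)
no_room = shuffle_destination(50 * MB, 10 * MB, HEAP, PERCENT)
```

Here only the 50 MB output fetched while 500 MB is free stays in memory; the 300 MB output exceeds the per-file limit, and the last case fails for lack of free space.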


Steps of Reduce Phase (Cont.)

• The merge operation merges and sorts data from memory and/or disk and writes the result to disk. It comes in two flavors:
– In-memory merge: triggered when the accumulated in-memory data exceeds mapred.job.shuffle.merge.percent of the shuffle buffer.
– On-disk merge: triggered when the number of files on disk exceeds the configured threshold.
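
A toy model may help show how the two triggers interact. It is a sketch under simplifying assumptions: segment size is counted in records rather than bytes, plain lists stand in for disk files, and the class name and threshold values are hypothetical.

```python
import heapq


class ReduceSideMerger:
    """Toy model of the two merge triggers; all thresholds are assumptions."""

    def __init__(self, buffer_capacity, merge_percent, max_disk_files):
        self.memory_limit = buffer_capacity * merge_percent
        self.max_disk_files = max_disk_files
        self.in_memory = []   # sorted segments held in memory
        self.on_disk = []     # sorted runs standing in for files on disk

    def add_segment(self, segment):
        """Accept one sorted map-output segment (a list of keys)."""
        self.in_memory.append(segment)
        used = sum(len(s) for s in self.in_memory)
        if used > self.memory_limit:
            self._merge_in_memory()
        if len(self.on_disk) > self.max_disk_files:
            self._merge_on_disk()

    def _merge_in_memory(self):
        # Merge every in-memory segment into one sorted run "on disk".
        self.on_disk.append(list(heapq.merge(*self.in_memory)))
        self.in_memory = []

    def _merge_on_disk(self):
        # Merge all on-disk runs into a single larger sorted run.
        self.on_disk = [list(heapq.merge(*self.on_disk))]


merger = ReduceSideMerger(buffer_capacity=10, merge_percent=0.5, max_disk_files=1)
for seg in (["a", "d"], ["b", "e"], ["c", "f"], ["g", "h"], ["i", "j"], ["k", "l"]):
    merger.add_segment(seg)
```

In this run the in-memory merge fires twice (after the third and sixth segments), and the second firing pushes the run count past the limit, which triggers one on-disk merge.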


Steps of Reduce Phase (Cont.)

• When shuffle and sort complete, the following constraints must hold before data is fed to the reduce function:
– memory used for buffering reduce input must not exceed mapred.job.reduce.input.buffer.percent of the heap;
– the number of files on disk must not exceed io.sort.factor.


Notes about Reduce

• Shuffle and sort use a percentage of the reduce task's heap to buffer shuffle data. This is possible because the reduce function can't start until shuffle and sort complete. In the map phase, by contrast, the buffer size is set by io.sort.mb.

• Reduce input may consist of multiple files, not necessarily a single file. A heap-based iterator merges them on the fly to feed the reduce function.
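
The heap-iterator idea can be illustrated with Python's standard `heapq.merge`, which lazily merges already-sorted inputs. This is an analogy rather than Hadoop's code: `reduce_input` and the sample segments are hypothetical, and summing the values stands in for a word-count reduce function.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def reduce_input(*sorted_segments):
    """Yield (key, values) groups from several sorted (key, value) segments."""
    # heapq.merge keeps one cursor per segment in a heap, so the segments
    # are never concatenated or re-sorted as a whole.
    merged = heapq.merge(*sorted_segments, key=itemgetter(0))
    for key, group in groupby(merged, key=itemgetter(0)):
        yield key, [value for _key, value in group]


# Hypothetical segments: one still in memory, two already on disk.
in_memory = [("apple", 1), ("cherry", 1)]
disk_file_1 = [("apple", 1), ("banana", 1)]
disk_file_2 = [("banana", 1), ("cherry", 1), ("date", 1)]

word_counts = {
    key: sum(values)
    for key, values in reduce_input(in_memory, disk_file_1, disk_file_2)
}
```

Because the merge is lazy, the reduce side never needs a single fully merged file; it only needs each input segment to be sorted.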


Reduce-side Key parameters


Optimization Tuning

• We can use mapred.job.reduce.input.buffer.percent, which specifies how much memory can be set aside as a reduce input buffer.

• Compare the following cases:
– Case-1
– Case-2
– Case-3


Case-1

All reduce input resides on disk


Case-2

Part of the reduce input is in memory, the rest is on disk


Case-3

Much better, all data in memory


• If the reduce function doesn't put much pressure on memory, we can spare some memory to buffer reduce input and boost overall performance.

• What's more, if the input data is small, we can let the reduce tasks hold all intermediate data in memory, avoiding disk access altogether.
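
The three cases can be reproduced with a small calculation. This is a simplified model, assuming the limit is heap size times mapred.job.reduce.input.buffer.percent and that anything over the limit is written to disk before reduce starts; `reduce_input_split` and all sizes are hypothetical.

```python
def reduce_input_split(in_memory_bytes, on_disk_bytes, heap_size, input_buffer_percent):
    """Return (bytes kept in memory, bytes read from disk) when reduce starts."""
    # Memory the reduce phase may keep for buffered map outputs.
    allowed = int(heap_size * input_buffer_percent)
    kept = min(in_memory_bytes, allowed)
    # Whatever does not fit is written to disk before reduce begins.
    spilled = in_memory_bytes - kept
    return kept, on_disk_bytes + spilled


MB = 1024 * 1024
HEAP = 1024 * MB  # assumed reduce-task heap size

# Case-1: a percentage of 0 forces every buffered output to disk.
case_1 = reduce_input_split(200 * MB, 300 * MB, HEAP, 0.0)
# Case-2: a small percentage keeps part of the input in memory.
case_2 = reduce_input_split(200 * MB, 300 * MB, HEAP, 0.125)
# Case-3: small input and a generous percentage, so nothing touches disk.
case_3 = reduce_input_split(200 * MB, 0, HEAP, 0.5)
```

In this model only Case-3 reads nothing from disk, which is the situation the last bullet describes for small inputs.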


Potential Optimization (?)

• Since reduce input files reside on the local file system, maybe we can optimize disk access (read and write) with local file system APIs such as mmap, without going through the HDFS API.

• When transferring local map output during the shuffle phase, maybe we can use a more efficient network API, such as sendfile(), to improve data transfer between nodes.

• Currently a reduce task chooses at random which map to fetch output from; maybe a smarter scheduling policy could improve shuffle performance.

• Map and reduce tasks may have different memory needs, so configure their JVM options separately:
– mapred.map.child.java.opts
– mapred.reduce.child.java.opts