[2a3]big data launching episodes

53

Upload: naver-d2

Post on 02-Dec-2014

Category:

Technology


DESCRIPTION

DEVIEW 2014 [2A3]Big Data Launching Episodes

TRANSCRIPT

Page 1: [2A3]Big Data Launching Episodes
Page 2: [2A3]Big Data Launching Episodes
Page 3: [2A3]Big Data Launching Episodes

안성화 Manager / Data Tech Lab SK Telecom

Big Data Launching Episodes

Page 4: [2A3]Big Data Launching Episodes

CONTENTS

1. Accessibility
2. Expansion
3. Lessons Learned
4. Future

Page 5: [2A3]Big Data Launching Episodes

1. Accessibility

10 GB/Hour, 100 MB/Hour

GroupBy & Sum

SKT's first Hadoop system

Page 6: [2A3]Big Data Launching Episodes

From MapReduce to Hive: 1. Group By & Sum, 2. UDF & UDAF

Accessibility

1. Group By & Sum

Map: collect records by Group By key

Reduce: sum by Group By key

Select key1, sum(key1) From Table Group by key1;

2. UDF & UDAF

Map: UDF

Reduce: UDAF

Select udf(key1) From Table;

Select key1, udaf(key1) From Table Group by key1;
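As a rough illustration of the Map and Reduce roles listed for Group By & Sum, here is a minimal in-memory sketch. It is not how Hive or Hadoop is implemented; the function names (map_phase, shuffle, reduce_phase, group_by_sum) and the sample rows are assumptions made for this example only.

```python
from collections import defaultdict


def map_phase(rows):
    """Map: emit one (group-by key, value) pair per input row."""
    for key, value in rows:
        yield key, value


def shuffle(pairs):
    """Shuffle: collect every value that shares the same group-by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: sum the collected values of each key."""
    return {key: sum(values) for key, values in groups.items()}


def group_by_sum(rows):
    """Equivalent in spirit to: SELECT key, sum(value) ... GROUP BY key."""
    return reduce_phase(shuffle(map_phase(rows)))


# Hypothetical sample rows, not data from the talk.
sample_rows = [("a", 1), ("a", 38), ("b", 9)]
print(group_by_sum(sample_rows))
```

The single Hive statement on the slide replaces all three hand-written phases, which is the accessibility gain the slide points at.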

Page 7: [2A3]Big Data Launching Episodes

From MapReduce to Hive: 3. Transform

Accessibility

Map: Transform

Reduce: Transform

FROM (
  FROM records2
  MAP year, temperature, quality
  USING 'is_good_quality.py'
  AS year, temperature) map_output
REDUCE year, temperature
USING 'max_temperature_reduce.py'
AS year, temperature;

Source: Hadoop: The Definitive Guide
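With MAP ... USING, Hive streams each row to the script as a tab-separated line on standard input and reads tab-separated lines back from standard output. The sketch below shows what a filter in the role of 'is_good_quality.py' might look like. The filtering rule (skip the missing-value marker "9999", keep quality codes 0, 1, 4, 5, 9) and the function name are assumptions for illustration, not the script used in the talk.

```python
# Assumed filtering rule, for illustration only.
MISSING_TEMPERATURE = "9999"
GOOD_QUALITY_CODES = frozenset("01459")


def filter_good_rows(lines):
    """Yield 'year<TAB>temperature' for rows that pass the quality check.

    Each input line is expected to hold three tab-separated fields:
    year, temperature, quality. Malformed lines are skipped.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue
        year, temperature, quality = fields
        if temperature != MISSING_TEMPERATURE and quality in GOOD_QUALITY_CODES:
            yield f"{year}\t{temperature}"


# In a real streaming script the input would be sys.stdin;
# a small hypothetical list stands in for it here.
sample_input = ["1950\t22\t1\n", "1950\t9999\t1\n", "1949\t111\t2\n"]
for output_line in filter_good_rows(sample_input):
    print(output_line)
```

Keeping the logic in a generator function separates the filtering rule from the input and output plumbing, so the rule can be checked without a cluster.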

Page 8: [2A3]Big Data Launching Episodes

From MapReduce to Hive: 4. GUI (Hue)

Accessibility

http://gethue.com/wp-content/uploads/2014/03/hue-3.6.png

Page 9: [2A3]Big Data Launching Episodes

Accessibility

insert overwrite table tmp_daily_recommendation partition(silo)
select x.probe_mgmt_string, x.marang_made_spring, x.marang_made_nm, x.bf_m2_series, x.chg_dev_silo,
       x.prob_simple, x.prob_simple_ko, x.prob_outs, x.prob_outs_ko, x.choosen, x.silo
from (
  select a.probe_mgmt_string, e.marang_made_spring, e.marang_made_nm, e.bf_m2_series, e.chg_dev_silo,
         a.mouse_prob_simple as prob_simple, a.mouse_rank_simple as prob_simple_ko,
         a.mouse_prob_outs as prob_outs, a.mouse_rank_outs as prob_outs_ko,
         'chg' as choosen, trim(' 20140927 ') as silo
  from (select distinct probe_mgmt_string from tmp_old_report) d
  right outer join (
    select probe_mgmt_string, mouse_prob_simple, mouse_prob_outs, mouse_rank_simple, mouse_rank_outs
    from tmp_comp_post_prob_new
    where silo = trim(' 20140927 ') and pado_spring in ('1','2') and testman_sample_spring >= '021'
      and months >= 15 and event_type = 'event1' and mouse_rank_simple <= mouse_rank_outs) a
    on a.probe_mgmt_string = d.probe_mgmt_string
  join (
    select s.probe_mgmt_string, s.marang_made_spring, s.marang_made_nm,
           round(avg(s.bf_m2_series),0) as bf_m2_series,
           max(case when s.prev_intro_chg_silo like '#%' then s.probe_test_silo else s.prev_intro_chg_silo end) as chg_dev_silo
    from (
      select probe_mgmt_string, marang_made_spring, marang_made_nm, prev_intro_chg_silo, probe_test_silo,
             bf_m2_series, bf_m3_series, bf_m4_series, pado_spring, probe_st_spring, testman_sample_spring
      from master_mers_mst where silo = trim(' 20140927 ')
      union all
      select probe_mgmt_string, marang_made_spring, marang_made_nm, prev_intro_chg_silo, probe_test_silo,
             bf_m2_series, bf_m3_series, bf_m4_series, pado_spring, probe_st_spring, testman_sample_spring
      from master_mers_mst_enc where silo = trim(' 2014-09-27 ')) s
    where s.pado_spring in ('1','2') and s.probe_st_spring = 'song' and s.testman_sample_spring >= '21'
      and (s.bf_m2_series + s.bf_m3_series + s.bf_m4_series) >= 60000
    group by s.probe_mgmt_string, s.marang_made_spring, s.marang_made_nm) e
    on a.probe_mgmt_string = e.probe_mgmt_string
  left outer join (
    select probe_mgmt_string, min(chg_silo) as chg_silo
    from master_mers_evthist
    where chg_silo between date_add(concat(substr(trim(' 20140927 '),1,4),'-',substr(trim(' 20140927 '),5,2),'-',substr(trim(' 20140927 '),7,2)),-60) and trim(' 20140927 ')
      and ((probe_chg_spring in ('1','2') and probe_chg_result_spring in ('11', '12')) or probe_chg_spring in ('1','2'))
    group by probe_mgmt_string) c
    on a.probe_mgmt_string = c.probe_mgmt_string
  where c.probe_mgmt_string is null and d.probe_mgmt_string is null
  order by prob_simple_ko asc limit 10000
  union all
  select a.probe_mgmt_string, e.marang_made_spring, e.marang_made_nm, e.bf_m2_series, e.chg_dev_silo,
         a.mouse_prob_simple as prob_simple, a.mouse_rank_simple as prob_simple_ko,
         a.mouse_prob_outs as prob_outs, a.mouse_rank_outs as prob_outs_ko,
         'out' as choosen, trim(' 20140927 ') as silo
  from (select distinct probe_mgmt_string from tmp_old_report) d
  right outer join (
    select probe_mgmt_string, mouse_prob_simple, mouse_prob_outs, mouse_rank_simple, mouse_rank_outs
    from tmp_comp_post_prob_new
    where silo = trim(' 20140927 ') and pado_spring in ('1','2') and testman_sample_spring >= '021'
      and months >= 15 and event_type = 'event1' and mouse_rank_simple > mouse_rank_outs) a
    on a.probe_mgmt_string = d.probe_mgmt_string
  join (
    select s.probe_mgmt_string, s.marang_made_spring, s.marang_made_nm,
           round(avg(s.bf_m2_series),0) as bf_m2_series,
           max(case when s.prev_intro_chg_silo like '#%' then s.probe_test_silo else s.prev_intro_chg_silo end) as chg_dev_silo
    from (
      select probe_mgmt_string, marang_made_spring, marang_made_nm, prev_intro_chg_silo, probe_test_silo,
             bf_m2_series, bf_m3_series, bf_m4_series, pado_spring, probe_st_spring, testman_sample_spring
      from master_mers_mst where silo = trim(' 20140927 ')
      union all
      select probe_mgmt_string, marang_made_spring, marang_made_nm, prev_intro_chg_silo, probe_test_silo,
             bf_m2_series, bf_m3_series, bf_m4_series, pado_spring, probe_st_spring, testman_sample_spring
      from master_mers_mst_enc where silo = trim(' 2014-09-27 ')) s
    where s.pado_spring in ('1','2') and s.probe_st_spring = 'song' and s.testman_sample_spring >= '021'
      and (s.bf_m2_series + s.bf_m3_series + s.bf_m4_series) >= 60000
    group by s.probe_mgmt_string, s.marang_made_spring, s.marang_made_nm) e
    on a.probe_mgmt_string = e.probe_mgmt_string
  left outer join (
    select probe_mgmt_string, min(chg_silo) as chg_silo
    from master_mers_evthist
    where chg_silo between date_add(concat(substr(trim(' 20140927 '),1,4),'-',substr(trim(' 20140927 '),5,2),'-',substr(trim(' 20140927 '),7,2)),-60) and trim(' 20140927 ')
      and ((probe_chg_spring in ('1','2') and probe_chg_result_spring in ('11','12')) or probe_chg_spring in ('31','32'))
    group by probe_mgmt_string) c
    on a.probe_mgmt_string = c.probe_mgmt_string
  where c.probe_mgmt_string is null and d.probe_mgmt_string is null
  order by prob_outs_ko asc limit 10000) x

Page 10: [2A3]Big Data Launching Episodes

Accessibility

6491.4

3200

Page 11: [2A3]Big Data Launching Episodes

Accessibility

Job ID | … | Map % | Map Total | Maps Completed
Job_1 | … | 12% | 120,000 | 12,000
Job_2 | … | 0% | 512 | 0
… | … | … | … | …

Page 12: [2A3]Big Data Launching Episodes

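Setting per-queue quotas with the Fair Scheduler

Accessibility

[Chart: share of the cluster (0% to 100%) for the queues tom, jerry, and default, split into Fair Share and Over Fair Share; bar labels: 30%, 30%, 20%, 40%, 40%; annotation: Load 1 (Transform)]

Somewhat restrained when only one particular queue is in use

Still a problem when many queues are in use at the same time

Solving the cluster monopolization problem

set mapred.job.queue.name=tom;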

Page 13: [2A3]Big Data Launching Episodes

Accessibility

code | value
a | 1
a | 38
a | 45
b | 9
a | 34
a | 12
a | 78

SELECT code, sum(value) FROM Table GROUP BY code;

Mapper, Mapper, Mapper

a: Reducer

b: Reducer

"Why won't it finish at 99%?!"
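The stall at 99% is typical of key skew: every record with the same group-by key goes to the same reducer, so one hot key keeps a single reducer busy after the others have finished. The sketch below counts records per reducer for the rows on this slide. The CRC32-based partition function is a deterministic stand-in chosen for this example, not the partitioner Hive or Hadoop actually uses.

```python
import zlib
from collections import Counter


def partition(key, num_reducers):
    """Assign a key to a reducer (illustrative stand-in for a hash partitioner)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reducer_load(rows, num_reducers):
    """Count how many records each reducer receives."""
    load = Counter()
    for code, _value in rows:
        load[partition(code, num_reducers)] += 1
    return load


# The (code, value) rows shown on the slide.
slide_rows = [("a", 1), ("a", 38), ("a", 45), ("b", 9), ("a", 34), ("a", 12), ("a", 78)]
load = reducer_load(slide_rows, num_reducers=2)
print(dict(load))
print("busiest reducer handles", max(load.values()), "of", len(slide_rows), "records")
```

Adding reducers does not help here, because all six "a" records still land on one of them.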

Page 14: [2A3]Big Data Launching Episodes

Accessibility

select /*+ MAPJOIN(b) */ count(*) from tableA a join tableB b on (a.id = b.id);

"It's far slower than it was before!!!"

Page 15: [2A3]Big Data Launching Episodes

Accessibility

hadoop fs -text xxx.snappy > xxx.gzip
hadoop fs -put xxx.gzip /

fasdjlkfjlaksjdfljasdfjlkasdjfljau82n381qslfj8329ruqw9ufoiau8qwue899288uq98r912ioquq

UnSplittable!!  Only  1  Mapper!!
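A gzip stream cannot be split at arbitrary offsets, so a job reading one gzip file gets a single input split however large the file is. The following sketch estimates mapper counts under a simplified model in which a splittable file yields one mapper per split. The 10 GiB file size and 128 MiB split size are assumptions for illustration; the talk gives no such figures.

```python
import math


def estimated_mappers(file_size_bytes, split_size_bytes, splittable):
    """Estimate the number of map tasks for a single input file.

    Simplified model: a splittable file yields one mapper per split,
    while a non-splittable file (for example gzip) yields exactly one.
    """
    if not splittable:
        return 1
    return max(1, math.ceil(file_size_bytes / split_size_bytes))


GIB = 1024 ** 3
MIB = 1024 ** 2

# Hypothetical sizes, chosen only to show the contrast.
print(estimated_mappers(10 * GIB, 128 * MIB, splittable=True))
print(estimated_mappers(10 * GIB, 128 * MIB, splittable=False))
```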

Page 16: [2A3]Big Data Launching Episodes

http://www.bbc.co.uk/bitesize/ks3/maths/shape_space/2d_shapes/revision/3/

Accessibility

Page 17: [2A3]Big Data Launching Episodes

Accessibility

[The same tmp_daily_recommendation query as on Page 9]

Page 18: [2A3]Big Data Launching Episodes

1,000,000,000,000

Accessibility

Page 19: [2A3]Big Data Launching Episodes

1,000,000,000,000

100 TB

Accessibility

Page 20: [2A3]Big Data Launching Episodes

1,000,000,000,000

100 TB

4 days 4 hours

Accessibility

Page 21: [2A3]Big Data Launching Episodes

4 days 4 hours

Accessibility

[The same tmp_daily_recommendation query as on Page 9]

Page 22: [2A3]Big Data Launching Episodes

http://www.ldn.net.au/wp-content/uploads/2013/09/boost-your-marketing-effectiveness.png

Page 23: [2A3]Big Data Launching Episodes

http://fc06.deviantart.net/fs70/f/2009/357/f/c/Hopelessness_by_sarafim.jpg

January, February, March, April

Page 24: [2A3]Big Data Launching Episodes

From Hive to Tajo/Impala

Accessibility

[Chart: Data Size / Date, April to July, axis from 0 to 100; series: Impala, Tajo]

Page 25: [2A3]Big Data Launching Episodes

4 days 4 hours

Accessibility

[The same tmp_daily_recommendation query as on Page 9]

2 days 2 hours

Page 26: [2A3]Big Data Launching Episodes

2. Expansion

Page 27: [2A3]Big Data Launching Episodes

Expansion

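Job Detail

Current System: Analysis, Storage

Standardization: ETL, Cleansing, Lineage, etc.

Storage: 20 PB storage capacity

Supply: smooth data supply (Real Time / Batch)

Processing / Analysis: analysis-centric, with R, Python, etc.

DW/Realtime: Low Latency, Event Processing

Current status: Data Size (compressed / original)

Day: 50 TB / 250 TB

Year: 18.25 PB / 91.25 PB

Job Type

Storage: standardization / storage / supply

Processing: analysis / Real Time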
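The yearly data sizes on this slide follow from the daily ones multiplied by 365, using decimal units (1 PB = 1,000 TB). A small check, with a helper name (yearly_petabytes) introduced only for this example:

```python
DAYS_PER_YEAR = 365
TB_PER_PB = 1000  # decimal units, which is what the slide's figures imply


def yearly_petabytes(daily_terabytes):
    """Convert a daily volume in TB to a yearly volume in PB."""
    return daily_terabytes * DAYS_PER_YEAR / TB_PER_PB


print(yearly_petabytes(50))   # compressed volume per year, in PB
print(yearly_petabytes(250))  # original volume per year, in PB
```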

Page 28: [2A3]Big Data Launching Episodes

Expansion

Jupiter (Analysis): 3 PB

Saturn (Storage): 20 PB

Neptune (Real Time): 1 PB

Flume

BigBang

Page 29: [2A3]Big Data Launching Episodes

Saturn (Storage) Cluster Topology

Expansion

[Diagram: racks of nodes; labels: 10G X 2, 2G Bonding, Rack awareness, 4TB X 12, 40G DSAS]

Page 30: [2A3]Big Data Launching Episodes

Disk Fault at Datanode

Expansion : Saturn (Storage)

RoundRobin / Available Space

High IO

Low IO

Eject!

Page 31: [2A3]Big Data Launching Episodes

High Temperature at Datanode

Expansion : Saturn (Storage)

Disk Controller

Putting up with it and using it as is

http://rlv.zcache.com/suppressed_laughing_yellow_smiley_face_stickers-r200e51f37ff941a38208de69f6c51657_v9waf_8byvr_512.jpg

Page 32: [2A3]Big Data Launching Episodes

Saturn (Storage) Cluster Topology + Flume

Expansion : Saturn (Storage) + Flume

[Diagram: nodes sending data through Flume]

Flume

Dynamic Frequency Scaling

Maximum Performance

Compress & Send / 1 minute

Page 33: [2A3]Big Data Launching Episodes

Saturn (Storage) Cluster Topology + Flume

Expansion : Saturn (Storage) + Flume

Flume

Eventual Sending using SSD

Sending…

Sudden Fault Disk at DataNode

Page 34: [2A3]Big Data Launching Episodes

Neptune (DW) Cluster Topology

Expansion : Neptune (DW)

[Diagram: racks of nodes; labels: 10G X 2, 20G Bonding, Rack awareness, 1TB X 23 SAS, 40G DSAS]

Page 35: [2A3]Big Data Launching Episodes

Things you must check on the network when bandwidth is high

Expansion : Neptune (DW)

20G Bonding

$ ifconfig
…
eth0 Link encap:Ethernet HWaddr 38:EA:A7:38:53:24
  UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
  RX packets:7170284853 errors:2456 dropped:25019 overruns:0 frame:2456
  TX packets:31088639355 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:10000
  RX bytes:41081083208513 (37.3 TiB) TX bytes:40786177694493 (37.0 TiB)
…

GBIC

$ ethtool -S eth0
…
rx_crc_errors : 2456
…
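Checks like this are easier to repeat across many nodes when the RX counters are pulled out programmatically. The sketch below parses the RX line in the older ifconfig output format shown on this slide. The function name and the regular expression are assumptions for this example, and newer ifconfig versions print the counters in a different layout that this pattern would not match.

```python
import re

# Matches the older ifconfig layout shown on the slide.
RX_PATTERN = re.compile(
    r"RX packets:(\d+) errors:(\d+) dropped:(\d+) overruns:(\d+) frame:(\d+)"
)
RX_FIELDS = ("packets", "errors", "dropped", "overruns", "frame")


def parse_rx_counters(ifconfig_text):
    """Return the RX counters of the first interface found, as integers."""
    match = RX_PATTERN.search(ifconfig_text)
    if match is None:
        raise ValueError("no RX counter line found")
    return dict(zip(RX_FIELDS, (int(group) for group in match.groups())))


slide_output = (
    "eth0 Link encap:Ethernet HWaddr 38:EA:A7:38:53:24\n"
    "  RX packets:7170284853 errors:2456 dropped:25019 overruns:0 frame:2456\n"
)
counters = parse_rx_counters(slide_output)
print(counters)
print("dropped per million packets:", 1e6 * counters["dropped"] / counters["packets"])
```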

Page 36: [2A3]Big Data Launching Episodes

Things you must check on the network when bandwidth is high

Expansion : Neptune (DW)

https://c0da80aa54a5e1ed7d2b945327c31140a345bfe8.googledrive.com/host/0BxotWZXnwSAGSS1qRE02eWVrU28/2013-07-kernel-networking-ring-buffer.png

Page 37: [2A3]Big Data Launching Episodes

Things you must check on the network when bandwidth is high

Expansion : Neptune (DW)

# ethtool -g eth0
Ring parameters for eth0:
Pre-set maximums:
RX: 4096 (maximum)
RX Mini: 0
RX Jumbo: 0
TX: 4096
Current hardware settings:
RX: 512 (current)
RX Mini: 0
RX Jumbo: 0
TX: 512

# ethtool -G eth0 rx 2048

Frequent frame (packet) drops occurred

Page 38: [2A3]Big Data Launching Episodes

Things you must check on the network when bandwidth is high

Expansion : Neptune (DW)

[Diagram labels: socket, Receive Buffer (R), Send Buffer (S), tcp_mem, rmem_max, wmem_max, tcp_rmem, tcp_wmem]

net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 204800 204800 16777216
net.ipv4.tcp_wmem = 204800 204800 16777216

Page 39: [2A3]Big Data Launching Episodes

Expansion

Jupiter (Analysis): 3 PB

Saturn (Storage): 20 PB

Neptune (Real Time): 1 PB

Flume

BigBang

Page 40: [2A3]Big Data Launching Episodes

For resource management across multiple processing engines

Expansion : Yarn

NodeManager

yarn.nodemanager.resource.cpu-vcores: number of CPU cores managed by the NodeManager

yarn.nodemanager.resource.memory-mb: total memory managed by the NodeManager

Resource Manager

yarn.scheduler.minimum-allocation-mb: minimum memory per container that can be allocated on each NodeManager

Page 41: [2A3]Big Data Launching Episodes

For resource management across multiple processing engines

Expansion : Yarn

NodeManager

Available cores: 18

Total memory: 3G

Resource Manager minimum memory: 3G

Only 1 container is created per node

Container: forks a Task JVM (e.g. one TaskTracker)

Operated like this for a week

Page 42: [2A3]Big Data Launching Episodes

For resource management across multiple processing engines

Expansion : Yarn

NodeManager

Available cores: 18

Total memory: 54G

Resource Manager minimum memory: 3G

Up to 18 containers can be created per node

Container: forks a Task JVM (e.g. one TaskTracker)
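The jump from 1 to 18 containers per node follows from dividing node memory by the minimum allocation and capping the result by the core count. The sketch below is a simplified model of that arithmetic, assuming one vcore per container. Real YARN behaviour also depends on the scheduler and its resource calculator, and the function name is hypothetical.

```python
def max_containers_per_node(node_memory_mb, node_vcores, min_allocation_mb,
                            vcores_per_container=1):
    """Upper bound on concurrently running minimum-size containers on one node.

    Simplified model: every container takes the minimum memory allocation
    and a fixed number of vcores; the tighter of the two limits wins.
    """
    by_memory = node_memory_mb // min_allocation_mb
    by_cores = node_vcores // vcores_per_container
    return min(by_memory, by_cores)


GB = 1024  # MB per GB

# Misconfigured week: 3 GB node memory with a 3 GB minimum allocation.
print(max_containers_per_node(3 * GB, 18, 3 * GB))
# After the fix: 54 GB node memory with the same minimum allocation.
print(max_containers_per_node(54 * GB, 18, 3 * GB))
```

With only 3 GB registered per node, memory is the binding limit and 17 of the 18 cores sit idle.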

Page 43: [2A3]Big Data Launching Episodes

We heard Snappy was good, so...

Expansion : Compress

Capacity was plentiful, so we used it freely!

Raw Data 250 TB/day, Snappy 90 TB/day, 32 PB/year

Capacity ran short

Snappy 90 TB/day, GZip 50 TB/day

It took 2 months.
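The compression figures on this slide can be compared directly. The sketch below computes each codec's compression ratio and how many days of data would fit in the 20 PB storage cluster mentioned earlier in the talk. It assumes decimal units and ignores HDFS replication and any other overhead, so the day counts are illustrative rather than operational numbers.

```python
RAW_TB_PER_DAY = 250
CAPACITY_TB = 20_000  # 20 PB in decimal units; replication is ignored (assumption)

DAILY_TB_BY_CODEC = {"snappy": 90, "gzip": 50}


def compression_ratio(compressed_tb):
    """Raw size divided by compressed size."""
    return RAW_TB_PER_DAY / compressed_tb


def days_until_full(compressed_tb):
    """Whole days of compressed data that fit in the assumed capacity."""
    return CAPACITY_TB // compressed_tb


for codec, daily_tb in DAILY_TB_BY_CODEC.items():
    print(codec, round(compression_ratio(daily_tb), 2), days_until_full(daily_tb))
```

Under these assumptions gzip stores nearly twice as many days as Snappy, at the cost of the recompression effort the slide describes.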

Page 44: [2A3]Big Data Launching Episodes

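We thought automatic failover meant we could relax

Expansion : NameNode HA

Zookeeper Timeout : 60 seconds

NameNode GC took 3 minutes 30 seconds

Whole-cluster outage

It failed over to the Standby, but the Hadoop clients kept connecting only to the original Active

Zookeeper Timeout : 10 minutes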
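The incident comes down to comparing a pause with a timeout: a process that cannot send heartbeats for longer than its ZooKeeper session timeout loses its session, which is what lets a failover start. The sketch below is a deliberately simplified model of that comparison using the numbers on this slide. The function name is hypothetical, and real failover involves more steps than this check.

```python
def loses_session(pause_seconds, session_timeout_seconds):
    """True if a stop-the-world pause outlasts the ZooKeeper session timeout."""
    return pause_seconds > session_timeout_seconds


GC_PAUSE_SECONDS = 3 * 60 + 30  # NameNode GC took 3 minutes 30 seconds

# Original setting: 60-second timeout.
print(loses_session(GC_PAUSE_SECONDS, 60))
# Revised setting: 10-minute timeout.
print(loses_session(GC_PAUSE_SECONDS, 10 * 60))
```

A longer timeout avoids spurious failovers during long GC pauses, at the price of slower detection when the active NameNode is really down.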

Page 45: [2A3]Big Data Launching Episodes

MR V1 & Datanode

Expansion : Too Many CLOSE_WAIT

TaskTracker, DataNode

1. connect

2. request block

3. send block

4. close

FIN_WAIT / CLOSE_WAIT

Does not go away within 2 hours.

Client socket ports exhausted

TT restart

Page 46: [2A3]Big Data Launching Episodes

3. Lessons Learned

Page 47: [2A3]Big Data Launching Episodes

Lessons Learned: Accessibility

1. Anyone should be able to get access easily.

2. Even those who cannot program must understand how it works.

3. If it is easy, many people will use it.

4. Everyone is becoming an analyst.

Page 48: [2A3]Big Data Launching Episodes

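Lessons Learned: Expansion

1. The network is like Hadoop's blood vessels.

2. It is still too early to use Yarn. There are too many configuration settings, and their interdependencies are too complex.

3. With Hadoop HA, the clients must always be checked as well.

4. Even with HA, Hadoop is not truly safe.

5. Yahoo's 2,000 nodes probably had small disks.

6. There is still a lot of work to do.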

Page 49: [2A3]Big Data Launching Episodes

4. Future

Page 50: [2A3]Big Data Launching Episodes

Approximate Query Engine

select sum(val1) from table where key='a' within 3 seconds

select sum(val1) from table where key='a' Error Rate 10%

Like BlinkDB
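One common way to trade accuracy for speed in a query such as sum(val1) is to aggregate a uniform random sample and scale the result up to the full row count. The sketch below shows that idea only. It is not BlinkDB's implementation and not the engine planned in the talk, and the function name, sample size, and data are assumptions for illustration.

```python
import random


def approximate_sum(values, sample_size, seed=0):
    """Estimate sum(values) from a uniform random sample.

    The sample mean is scaled by the total row count. Reading fewer rows
    is what makes the answer faster, at the cost of sampling error.
    """
    total_rows = len(values)
    if total_rows == 0:
        return 0.0
    k = min(sample_size, total_rows)
    rng = random.Random(seed)          # fixed seed keeps the sketch repeatable
    sample = rng.sample(values, k)
    return total_rows * (sum(sample) / k)


# Hypothetical column with one million values.
column = [float(i % 100) for i in range(1_000_000)]
exact = sum(column)
estimate = approximate_sum(column, sample_size=10_000)
print("exact:", exact)
print("estimate:", estimate)
print("relative error:", abs(estimate - exact) / exact)
```

A larger sample narrows the error but takes longer, which is the trade-off that the "within 3 seconds" and "Error Rate 10%" clauses on the slide would let the user choose.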

Page 51: [2A3]Big Data Launching Episodes

Approximate Query Engine

Zoomable Data Navigation

select sum(val1) from table where age between 1 and 10 within 1 seconds

select sum(val1) from table where age between 1 and 10 within 10 seconds

Page 52: [2A3]Big Data Launching Episodes

Q&A

Page 53: [2A3]Big Data Launching Episodes

THANK YOU