martyn hadoop aws

Upload: nguyen-ba-nhiem

Post on 03-Jun-2018

  • 8/11/2019 Martyn Hadoop Aws

    1/55

    Hadoop and AWS

    http://bit.ly/HvMC1Z

    2/55

    Log in to your AWS account.

    Select the EC2 service.

    3/55

    Click on Launch Instance

    4/55

    Click Select

    Click on Quick Launch Wizard

    Select Ubuntu Server 12.04.3 LTS (a free tier instance).

    5/55

    Click on Review and Launch...

    6/55

    Click on Launch...

    7/55

    Select Create a new key pair from the top drop-down box...

    8/55

    Give the key pair a name and click on Download Key Pair.

    Save the key pair where you can find it, and click on Launch Instances.

    9/55

    Your instance is now being launched. Click on View Instances to see it.

    10/55

    Our instance is initialised.

    Click the instance (it'll have a green light next to it) to display information about it.

    This will be important in a minute.

    11/55

    12/55

    Select the Java SSH Client option. Enter the path to the key pair file you downloaded, i.e. right-click on the file if you're not sure.

    13/55

    14/55

    Start PuTTYgen (Start menu, click All Programs > PuTTY > PuTTYgen).

    Click on the Load button.

    Find the folder with your .pem key in.

    Select All Files (*.*) and click on your AWS .pem key.

    Setting up PuTTY for AWS instance connection

    15/55

    A success message should appear; now we need to save the key in PuTTY's own format.

    Click on Save private key.

    Confirm you wish to save without a passphrase, and save in the same directory.

    16/55

    Connecting to our instance using PuTTY SSH

    Go to Start > All Programs > PuTTY > PuTTY to load up PuTTY SSH.

    Switch back to the AWS console and copy the address of your instance; it'll look something like ec2-xx-xxx-xxx-xxx.eu-west-1.compute.amazonaws.com

    This is the address that we'll use to connect to the instance.

    17/55

    Paste the address here

    18/55

    Scroll down and click on Auth

    19/55

    Now click on Browse and navigate to the key you just saved (it ends with the '.ppk' extension).

    Now click on Open.

    Click on Yes when the security alert appears.

    20/55

    Type ubuntu as the login name and press the Enter key.

    We don't need a password because PuTTY authenticates to the instance with our key.

    21/55

    Success! We're now logged in to our Ubuntu instance.

    22/55

    Installing Java:

    Whilst in the terminal, enter the following commands:

    sudo apt-get update
    sudo apt-get install openjdk-7-jre

    23/55

    Installing Hadoop:

    Get the file from the external site:

    wget http://archive.apache.org/dist/hadoop/core/hadoop-0.22.0/hadoop-0.22.0.tar.gz

    Unpack it:

    tar xzf hadoop-0.22.0.tar.gz

    Copy it to somewhere more sensible, like /usr/local:

    sudo cp -r hadoop-*/ /usr/local

    (There's a space between hadoop-*/ and /usr/local.)

    24/55

    Edit the terminal script:

    nano HOME=usr/
    export HADOOP_HOME=/usr/local/hadoop-0.22.0

    Save the file (Ctrl-X and type 'y'):

    Add it to the terminal environment:

    source

    25/55

    Let's move into the main directory of the application:

    cd /usr/local/hadoop-*

    Now edit Hadoop's setup script:

    sudo nano conf/hadoop-env.sh

    export JAVA_HOME=/usr

    Save (Ctrl-X, then type 'y').

    26/55

    Add the configuration file to the terminal's scope:

    source conf/hadoop-env.sh

    Running an example using single node mode:

    Calculating PI:

    sudo bin/hadoop

    27/55

    Another example, using some actual data

    Create a directory to put our data in:

    sudo mkdir input

    Copy the very interesting README.txt file to our new input folder:

    sudo cp README.txt LICENSE.txt input

    Now we count up the total words and what they are (Hadoop will create the output folder for us):

    sudo bin/hadoop

    28/55

    What's happening?

    29/55

    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {

        // Reused for every token: the constant count 1 and the current word.
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // Split the input line on whitespace and emit (word, 1) for each token.
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    The Mapper: splits up the words
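To see what the map step emits without a cluster, here is a minimal plain-Java sketch. It uses no Hadoop classes: `MapStepSketch` and its `map` method are hypothetical names chosen for illustration, and a list of pairs stands in for Hadoop's `Context`.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Plain-Java stand-in for the map step (assumption: not part of the original slides).
class MapStepSketch {

    // Splits a line on whitespace and returns one (word, 1) pair per token,
    // in the order the tokens appear.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            pairs.add(new AbstractMap.SimpleEntry<>(itr.nextToken(), 1));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Repeated words are emitted repeatedly; nothing is summed at this stage.
        System.out.println(map("the cat saw the dog"));
    }
}
```

Note that the mapper does no counting of its own: a word that appears twice produces two separate pairs.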

    30/55

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Add up every count received for this word.
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    The Reducer: takes the input of <word, count> and sums up the counts
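The reduce step can be tried the same way. This is a sketch under the assumption that the framework has already grouped all counts for one word together; `ReduceStepSketch` is a hypothetical name and plain `Integer` values stand in for `IntWritable`.

```java
import java.util.Arrays;

// Plain-Java stand-in for the reduce step (assumption: not part of the original slides).
class ReduceStepSketch {

    // Receives one word and every count emitted for it, and returns the total.
    static int reduce(String key, Iterable<Integer> values) {
        int sum = 0;
        for (int val : values) {
            sum += val;
        }
        return sum;
    }

    public static void main(String[] args) {
        // "the" was emitted three times by the mappers, each time with a count of 1.
        System.out.println("the\t" + reduce("the", Arrays.asList(1, 1, 1)));
    }
}
```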

    31/55

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Strip Hadoop's generic options; what remains should be <in> and <out>.
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }

        Job job = new Job(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        // The reducer doubles as the combiner.
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }

    Main runs and configures the MapReduce job
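The job setup registers the same class as both combiner and reducer. A rough local simulation shows why that is safe for word count: summing is associative, so pre-summing each mapper's output does not change the final totals. Everything here is an illustrative assumption (`LocalWordCountSketch`, `run`, `sumByKey`, and treating each input line as one map task); it is not how Hadoop schedules work internally.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Local, single-process imitation of map -> (optional combine) -> reduce.
class LocalWordCountSketch {

    // Map step: one (word, 1) pair per whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            pairs.add(new AbstractMap.SimpleEntry<>(itr.nextToken(), 1));
        }
        return pairs;
    }

    // Groups pairs by word and sums their counts. Used as combiner and reducer.
    static Map<String, Integer> sumByKey(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    // Runs the whole pipeline; each line is treated as a separate map task.
    static Map<String, Integer> run(List<String> lines, boolean useCombiner) {
        List<Map.Entry<String, Integer>> shuffled = new ArrayList<>();
        for (String line : lines) {
            List<Map.Entry<String, Integer>> mapped = map(line);
            if (useCombiner) {
                // Pre-aggregate this task's output before it is "sent" to the reducer.
                shuffled.addAll(sumByKey(mapped).entrySet());
            } else {
                shuffled.addAll(mapped);
            }
        }
        return sumByKey(shuffled);
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add("the cat");
        lines.add("the dog the end");
        System.out.println(run(lines, false));
        System.out.println(run(lines, true));
    }
}
```

Both calls in `main` print the same totals; the combiner only reduces how many pairs reach the final reduce step.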

    32/55

    One last example, this time using AWS to create the Hadoop cluster for us.

    We will use the word count example used previously.

    Hadoop in the AWS Cloud

    33/55

    First we need a place to put the data after it has been produced...

    Amazon S3 (Simple Storage Service):

    An online storage web service providing storage through web services interfaces (REST, SOAP, and BitTorrent).

    34/55

    Select S3 from the console

    35/55

    36/55

    Give it a name (not MyBucket; something unique, and NO CAPITAL LETTERS).

    Choose Ireland from the region list (it's closer, so less latency).

    37/55

    Your new bucket

    Running a MapReduce program in AWS

    38/55

    Select Elastic MapReduce in the AWS console

    39/55

    Select Create New Job Flow

    40/55

    Select Run a sample application. Choose the Word Count example from the drop-down menu.

    41/55

    Replace <your bucket> with the name of the S3 bucket we just created:

    42/55

    Next, specify how many instances you want; just leave it at two for now (the more instances, the more £££ it will cost to run your job).

    43/55

    Input data:

    eu-west-1.elasticmapreduce/samples/wordcount/input

    Output data:

    This is going to be stored on our S3 bucket...

    s3n://lazyeels/wordcount/output/2013-11-01

    Today's date
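Ending the output location with the current date gives each day's run its own folder. A small sketch of building such a path follows; `OutputPathSketch` and `outputPath` are hypothetical names, and the `wordcount/output` layout is simply copied from the example above.

```java
import java.time.LocalDate;

// Builds an S3 output location that ends in an ISO date (yyyy-MM-dd).
class OutputPathSketch {

    // LocalDate.toString() already yields the ISO form, e.g. 2013-11-01.
    static String outputPath(String bucket, LocalDate date) {
        return "s3n://" + bucket + "/wordcount/output/" + date;
    }

    public static void main(String[] args) {
        // Replace the bucket name with your own; "lazyeels" is the one from the slides.
        System.out.println(outputPath("lazyeels", LocalDate.now()));
    }
}
```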

    44/55

    Select your keypair

    45/55

    46/55

    47/55

    48/55

    Once it's done, go back to the AWS console.

    Select S3

    49/55

    Select your S3 bucket.

    Select Refresh from the Actions menu.

    Finding your output

    The results have been written to the output folder in parts (HDFS format).

    50/55

    The results have been written to the output folder in parts (HDFS format).

    Double-click to download.

    Open in a text editor (Notepad, gedit).
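Because the results arrive as several part files, it can be handy to merge them once they are downloaded. The sketch below assumes the files are named `part-*` and that each line holds a word and its count separated by a tab, which is what Hadoop's default text output produces; `MergePartsSketch` and `merge` are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

// Merges downloaded word-count part files into one sorted map.
class MergePartsSketch {

    // Reads every file matching part-* in outputDir; other files are ignored.
    static Map<String, Integer> merge(Path outputDir) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        try (DirectoryStream<Path> parts = Files.newDirectoryStream(outputDir, "part-*")) {
            for (Path part : parts) {
                for (String line : Files.readAllLines(part)) {
                    String[] fields = line.split("\t");
                    if (fields.length == 2) {
                        // Sum in case the same word shows up in more than one part.
                        counts.merge(fields[0], Integer.parseInt(fields[1].trim()), Integer::sum);
                    }
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Pass the folder you downloaded the parts into; defaults to ./output.
        Path dir = Paths.get(args.length > 0 ? args[0] : "output");
        merge(dir).forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```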

    51/55

    You can delete the results by right-clicking on the folder and selecting Delete.

    Some notes

    Amazon charges for storage, so this is worth doing if you no longer need it.

    Hadoop will fail if it finds a folder with the same name when it writes the output.

    The S3 bucket is where you would upload your .
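The note about Hadoop failing on an existing output folder can be turned into a quick check before a local run. This is only a local-filesystem analogue written with the JDK; `OutputDirCheckSketch` and `requireAbsent` are hypothetical names, and it does not inspect S3 or HDFS.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Refuses to proceed when the chosen output folder is already there.
class OutputDirCheckSketch {

    // Throws if outputDir exists, mirroring the failure described in the notes.
    static void requireAbsent(Path outputDir) {
        if (Files.exists(outputDir)) {
            throw new IllegalStateException("Output directory already exists: " + outputDir);
        }
    }

    public static void main(String[] args) {
        Path output = Paths.get(args.length > 0 ? args[0] : "output");
        requireAbsent(output);
        System.out.println("Output path is free: " + output);
    }
}
```

Using a dated output folder, or deleting the old one first, avoids the failure.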

    52/55

    Shutting down your instance

    Amazon charges by the hour, so make sure you close your instance after each session.

    Select the instance that is running through the EC2 option in the AWS console.

    Right-click and select Terminate to kill the instance.

    Some tips:

    53/55

    Hadoop is not designed to run on Windows. Consider using Cygwin, VirtualBox (https://www.virtualbox.org), or installing Linux Mint (http://www.linuxmint.com/) alongside your Windows install (at home).

    Stick to earlier versions of Hadoop such as 0.22.0 (they keep moving things around, especially the class files that you'll need to compile your code to

    54/55

    Get in the habit of stopping your instances when you're finished!

    Hadoop in Action is your friend (if you're using Java, consider getting a copy).

    Chapter 2

    Shows you how to set everything up from scratch.

    Chapter 3

    Provides some good templates to base your code on.

    Chapter 4

    Discusses issues you may encounter with the different API versions.

    Chapter

    Tells you how to launch your MapReduce programs from the command line and AWS console, as well as using S3 buckets for data storage and how to access it.

    Some useful links

    55/55

    Installing and usage:

    http://www.higherpass.com/linux/Tutorials/Installing-And-Using-Hadoop/

    Running a