martyn hadoop aws
Post on 03-Jun-2018
8/11/2019 Martyn Hadoop Aws
Hadoop and AWS
http://bit.ly/HvMC1Z
Log in to your AWS account.
Select the EC2 service.
Click on Launch Instance
Click Select
Click on Quick Launch Wizard
Select Ubuntu Server 12.04.3 LTS (a free tier instance).
Click on Review and Launch...
Click on Launch...
Select Create a new key pair from the top drop-down box...
Give the key pair a name and click on Download Key Pair.
Save the key pair where you can find it, and click on Launch Instances.
Your instance is now being launched. Click on View Instances to see it.
Our instance is initialised.
Click the instance (it'll have a green light next to it) to display information about it.
This will be important in a minute.
Select the Java SSH Client option.
Enter the path to the key pair file you downloaded (right-click on the file if you're not sure of the path).
Setting up PuTTY for AWS instance connection
Start PuTTYgen (Start menu, click All Programs > PuTTY > PuTTYgen).
Click on the Load button.
Find the folder with your .pem key in.
Select All Files (*.*) and click on your AWS .pem key.
A success message should appear. Now we need to save the key in PuTTY's own format.
Click on Save private key.
Confirm you wish to save without a passphrase, and save in the same directory.
Connecting to our instance using PuTTY SSH
Go to Start > All Programs > PuTTY > PuTTY to load up PuTTY SSH.
Switch back to the AWS console and copy the address of your instance. It'll look something like ec2-NN-NNN-NNN-NNN.eu-west-1.compute.amazonaws.com (the digits are specific to your instance).
This is the address of the instance that we'll be connecting to.
Paste the address here.
Scroll down and click on Auth.
Now click on Browse and navigate to the key you just saved (it ends with the '.ppk' extension).
Now click on Open.
Click on Yes when the security alert appears.
Type ubuntu as the login name and press the Enter key.
We don't need a password, as our key is used to authenticate us to the instance.
Success! We're now logged in to our Ubuntu instance.
Installing Java:
Whilst in the terminal enter the following commands:
    sudo apt-get update
    sudo apt-get install openjdk-6-jre
Installing Hadoop:
Get the file from the external site:
    wget http://archive.apache.org/dist/hadoop/core/hadoop-0.22.0/hadoop-0.22.0.tar.gz
Unpack it:
    tar xf hadoop-0.22.0.tar.gz
Copy it to somewhere more sensible, like our local user directory:
    sudo cp -r hadoop-*/ /usr/local
(There's a space between hadoop-*/ and /usr/local.)
Edit the terminal script:
    nano
    export JAVA_HOME=/usr/
    export HADOOP_HOME=/usr/local/hadoop-0.22.0
Save the file (ctrl-x and type 'y').
Add it to the terminal environment:
    source
Let's move in to the main directory of the application:
    cd /usr/local/hadoop-*
Now edit Hadoop's set up script:
    sudo nano conf/hadoop-env.sh
    export JAVA_HOME=/usr
Save (ctrl-x, then type 'y').
Add the configuration file to the terminal's scope:
    source conf/hadoop-env.sh
Running an example using Single node mode:
Calculating Pi:
    sudo bin/hadoop
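For a feel of what the Pi example computes: Hadoop's bundled example estimates pi by sampling points, with the map tasks counting points and a reducer adding the counts up. The sketch below is a single-process stand-in, not Hadoop code. It uses ordinary pseudo-random points instead of Hadoop's own sampling scheme, and the class name PiSketch, the seed and the sample count are made up for illustration.

```java
import java.util.Random;

// Hypothetical stand-in for the idea behind the pi example (no Hadoop classes).
class PiSketch {
    // Drop `samples` random points into the unit square and count how many
    // fall inside the quarter circle of radius 1; that fraction approximates pi/4.
    static double estimatePi(long samples, long seed) {
        Random random = new Random(seed);
        long inside = 0;
        for (long i = 0; i < samples; i++) {
            double x = random.nextDouble();
            double y = random.nextDouble();
            if (x * x + y * y <= 1.0) {
                inside++;
            }
        }
        return 4.0 * inside / samples;
    }

    public static void main(String[] args) {
        // Seed and sample count are arbitrary illustrative values.
        System.out.println(estimatePi(1_000_000L, 42L));
    }
}
```

More samples give a closer estimate, which is why the work is worth splitting across several map tasks.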
Another example, using some actual data
Create a directory to put our data in:
    sudo mkdir input
Copy the very interesting README.txt file to our new input folder:
    sudo cp README.txt LICENSE.txt input
Now we count up the total words and what they are (Hadoop will create the output folder for us):
    sudo bin/hadoop
What's happening?
public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
        }
    }
}
The Mapper: splits up the words
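To see what the mapper emits without a Hadoop install, here is a plain-Java imitation of the map step. It is only a sketch: the class name MapperSketch and the sample sentence are made up, and a list of pairs stands in for Hadoop's Context.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Plain-Java imitation of the map step; no Hadoop classes are used.
class MapperSketch {
    // Emit one (word, 1) pair per whitespace-separated token, as TokenizerMapper does.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            pairs.add(new SimpleEntry<String, Integer>(itr.nextToken(), 1));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // A repeated word is emitted once per occurrence; nothing is summed yet.
        System.out.println(map("the cat saw the dog"));
    }
}
```

Note that the mapper does no counting of its own. Every occurrence of a word becomes a separate pair with the value 1.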
public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}
The Reducer: takes the input of <word, count> and sums up the counts
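Between the mapper and the reducer, Hadoop groups the emitted pairs by key, so each reduce call sees one word with all of its counts. The sketch below imitates that grouping and the summing in plain Java. ReducerSketch and its method names are made up for illustration, and a sorted map stands in for Hadoop's shuffle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java imitation of the shuffle and reduce steps; no Hadoop classes are used.
class ReducerSketch {
    // Stand-in for the shuffle: collect every emitted 1 under its word, keys sorted.
    static Map<String, List<Integer>> group(List<String> words) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String word : words) {
            grouped.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
        }
        return grouped;
    }

    // Same arithmetic as IntSumReducer.reduce: add up the values for one key.
    static int reduce(Iterable<Integer> values) {
        int sum = 0;
        for (int val : values) {
            sum += val;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> grouped =
                group(Arrays.asList("the", "cat", "saw", "the", "dog"));
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            System.out.println(entry.getKey() + "\t" + reduce(entry.getValue()));
        }
    }
}
```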
public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
        System.err.println("Usage: wordcount <in> <out>");
        System.exit(2);
    }
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}
Main runs and configures the MapReduce
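The job setup uses IntSumReducer as both the combiner and the reducer. That works because adding is order-independent: summing each mapper's counts first and then summing those partial totals gives the same answer as summing everything in one go. A small sketch of that reasoning, with made-up names and made-up counts:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrates why a summing reducer can double as a combiner (no Hadoop classes).
class CombinerSketch {
    static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // No combiner: the reducer receives every single count from every mapper.
    static int withoutCombiner(List<List<Integer>> countsPerMapper) {
        List<Integer> all = new ArrayList<>();
        for (List<Integer> counts : countsPerMapper) {
            all.addAll(counts);
        }
        return sum(all);
    }

    // With a combiner: each mapper's counts are pre-summed, and the reducer
    // only adds up one partial total per mapper.
    static int withCombiner(List<List<Integer>> countsPerMapper) {
        List<Integer> partials = new ArrayList<>();
        for (List<Integer> counts : countsPerMapper) {
            partials.add(sum(counts));
        }
        return sum(partials);
    }

    public static void main(String[] args) {
        // Hypothetical counts for one word, seen by two different mappers.
        List<List<Integer>> counts = Arrays.asList(Arrays.asList(1, 1, 1), Arrays.asList(1, 1));
        System.out.println(withoutCombiner(counts) + " == " + withCombiner(counts));
    }
}
```

The benefit of the combiner is less data sent between machines, not a different result.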
Hadoop in the AWS Cloud
One last example, this time using AWS to create the Hadoop cluster for us.
We will use the word count example used previously.
First we need a place to put the data after it has been produced...
Amazon S3 (Simple Storage Service):
An online storage web service providing storage through web services interfaces (REST, SOAP, and BitTorrent).
Select S3 from the console.
Give it a name (not MyBucket): something unique, also NO CAPITAL LETTERS.
Choose Ireland from the region list (it's closer, so less latency).
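If you want to check a bucket name before typing it in, a rough test can be sketched like this. It covers only the basic S3 naming rules (3 to 63 characters; lowercase letters, digits, dots and hyphens; starting and ending with a letter or digit) and cannot tell you whether the name is already taken. The class and method names are made up.

```java
import java.util.regex.Pattern;

// Simplified bucket-name check; it does not cover every S3 rule or uniqueness.
class BucketNameSketch {
    private static final Pattern SIMPLE_RULE =
            Pattern.compile("[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]");

    static boolean looksValid(String name) {
        return SIMPLE_RULE.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksValid("MyBucket")); // capital letters are rejected
        System.out.println(looksValid("lazyeels"));
    }
}
```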
Your new bucket
Running a MapReduce program in AWS
Select Elastic MapReduce in the AWS console.
Select Create New Job Flow.
Select Run a sample application.
Choose the Word Count example from the drop-down menu.
Replace <yourbucket> with the name of the S3 bucket we just created:
Next, specify how many instances you want; just leave it at two for now (the more instances, the more $$$ it will be to run your job).
Input data:
    eu-west-1.elasticmapreduce/samples/wordcount/input
Output data:
This is going to be stored on our S3 bucket...
    s3n://lazyeels/wordcount/output/2013-11-01
(The last part of the path is today's date.)
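Putting the date at the end of the output path gives each day's run its own folder. If you build the path in code, it could look like the sketch below. The class and method names are made up; the bucket name and path layout are copied from the example above.

```java
import java.time.LocalDate;

// Builds an output location in the same shape as the example path above.
class OutputPathSketch {
    static String outputPath(String bucket, LocalDate date) {
        // LocalDate prints as yyyy-MM-dd, which matches the date format used above.
        return "s3n://" + bucket + "/wordcount/output/" + date;
    }

    public static void main(String[] args) {
        System.out.println(outputPath("lazyeels", LocalDate.now()));
    }
}
```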
Select your keypair
Once it's done, go back to the AWS console.
Select S3.
Select your S3 bucket.
Select Refresh from the Actions menu.
Finding your output
The results have been written to the output folder in parts (HDFS format).
Double-click to download.
Open in a text editor (Notepad, gedit).
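Once the parts are downloaded, you may want them as a single sorted list. The sketch below assumes each line holds a word, a tab, then its count (the default text output of the word count job) and that you pass the downloaded part files (named something like part-r-00000, which is an assumption) on the command line. The class name is made up.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Merges word-count part files into one sorted word -> count table.
class PartFilesSketch {
    static Map<String, Integer> merge(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            int tab = line.lastIndexOf('\t');
            if (tab < 0) {
                continue; // skip blank or unexpected lines
            }
            String word = line.substring(0, tab);
            int count = Integer.parseInt(line.substring(tab + 1).trim());
            counts.merge(word, count, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = new ArrayList<>();
        for (String file : args) {
            lines.addAll(Files.readAllLines(Paths.get(file)));
        }
        merge(lines).forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```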
You can delete the results by right-clicking on the folder and selecting Delete.
Some notes
Amazon charges for storage, so this is worth doing if you no longer need it.
Hadoop will fail if it finds a folder with the same name when it writes the output.
The S3 bucket is where you would upload your .
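One way round the "folder already exists" failure when running locally is to pick an output folder name that isn't in use yet. A minimal sketch for the local file system, assuming a folder called output as in the single node example; the class and method names are made up:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Picks an output folder name that does not exist yet (local file system only).
class FreshOutputSketch {
    // Returns `base` if nothing exists there, otherwise base-1, base-2, ...
    static Path firstUnused(Path base) {
        Path candidate = base;
        int suffix = 1;
        while (Files.exists(candidate)) {
            candidate = base.resolveSibling(base.getFileName() + "-" + suffix);
            suffix++;
        }
        return candidate;
    }

    public static void main(String[] args) {
        System.out.println(firstUnused(Paths.get("output")));
    }
}
```

The other option is simply to delete the old folder first, as described above.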
Shutting down your instance
Amazon charges by the hour, so make sure you close your instance after each session.
Select the instance that is running through the EC2 option in the AWS console.
Right-click and select Terminate to kill the instance.
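To get a feel for why forgetting an instance matters, here is a back-of-envelope cost sketch. It assumes a partly used hour is billed as a whole hour, and the hourly rate in main is a made-up number, not a real price, so check Amazon's pricing page for actual figures. All names are hypothetical.

```java
// Rough cost estimate under the assumptions stated above; not a billing tool.
class CostSketch {
    // Whole billed hours: a partly used hour counts as a full one (assumption).
    static long billedHours(long minutesRunning) {
        return (minutesRunning + 59) / 60;
    }

    static double estimate(int instances, long minutesRunning, double hourlyRate) {
        return instances * billedHours(minutesRunning) * hourlyRate;
    }

    public static void main(String[] args) {
        double hypotheticalRate = 0.10; // made-up rate per instance-hour
        // Two instances left running for 90 minutes are billed as 2 x 2 hours.
        System.out.println(estimate(2, 90, hypotheticalRate));
    }
}
```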
Some tips:
Hadoop is not designed to run on Windows. Consider using Cygwin, Virtualbox (https://www.virtualbox.org), or installing Linux Mint (http://www.linuxmint.com/) alongside your Windows install (at home).
Stick to earlier versions of Hadoop such as 0.22.0 (they keep moving things around, especially the class files that you'll need to compile your code to
Get in the habit of stopping your instances when you're finished!
Hadoop in Action is your friend (if you're using Java, consider getting a copy).
Chapter 2
Shows you how to set everything up from scratch.
Chapter 3
Provides some good templates to base your code on.
Chapter 4
Discusses issues you may encounter with the different API versions.
Chapter
Tells you how to launch your MapReduce programs from the command line and AWS console, as well as using S3 buckets for data storage and how to access it.
Some useful links
Installing and usage:
http://www.higherpass.com/linux/Tutorials/Installing-And-Using-Hadoop/
Running a