building apps with hbase - big data techcon boston
TRANSCRIPT
Building applications with HBase
Amandeep Khurana | Solutions Architect
Big Data TechCon, Boston, April 2013
Friday, April 12, 13
About me
•Solutions Architect, Cloudera Inc
•Amazon Web Services
•Interested in large scale distributed systems
•Co-author, HBase In Action
•Twitter: amansk
[Image: cover of HBase In Action by Nick Dimiduk and Amandeep Khurana, published by Manning]
About the talk
•Set up local HBase environment
•Simple client
•Deeper dive on API
•Data model
What is HBase?
•Scalable, distributed data store
•Open source avatar of Google’s Bigtable
•Sorted map of maps
•Sparse
•Multi-dimensional
•Tightly integrated with Hadoop
•Not an RDBMS
So, what’s the big deal?
•Give up system complexity and sophisticated features for ease of scale and manageability
•Simple system != easy to build applications on top
•Goal: predictable read and write performance at scale
  •Effective API usage
  •Optimal schema design
  •Understand internals and leverage at scale
Use cases
•Canonical use case: storing crawl data and indices for search
[Figure: Web Search powered by Bigtable]
Indexing the Internet
1. Crawlers constantly scour the Internet for new pages. Those pages are stored as individual records in Bigtable.
2. A MapReduce job runs over the entire table, generating search indexes for the Web Search application.
Searching the Internet
3. The user initiates a Web Search request.
4. The Web Search application queries the Search Indexes and retrieves matching documents directly from Bigtable.
5. Search results are presented to the user.
Use cases (continued)
•Capturing incremental trickle data
•Content serving / OLTP
•Blob store
Installing HBase
•Get it here -> http://hbase.apache.org/
•Downloads -> Choose a mirror -> Choose a release
•Download the tarball
•Untar it and run it
Basics
•Modes of operation
  •Standalone
  •Pseudo-distributed
  •Fully distributed
•Start up in standalone mode
  •cd hbase-0.94.1
  •bin/start-hbase.sh
  •Logs are in ./logs/
•Check out http://localhost:60010 in your browser
Ways to interact
•Shell
•Java client
•Thrift interface
•REST interface
•Asynchbase
Shell
•Basic interaction tool
•Written in JRuby
•Uses the Java client library
•Good place to start poking around
Shell (continued)
•bin/hbase shell
•Create table
  •create 'users', 'info'
•List tables
  •list
•Describe table
  •describe 'users'
Shell (continued)
•Put a row
  •put 'users', 'u1', 'info:name', 'Mr. Rogers'
•Get a row
  •get 'users', 'u1'
•Put more
  •put 'users', 'u1', 'info:email', '[email protected]'
  •put 'users', 'u1', 'info:password', 'foobar'
•Get a row
  •get 'users', 'u1'
•Scan table
  •scan 'users'
Demo
- Standalone mode
- Shell
... can’t really build an application just with the shell
Java API
Important terms
• Table
  • Consists of rows and columns
• Row
  • Has a bunch of columns
  • Identified by a rowkey (primary key)
• Column Qualifier
  • Dynamic column name
• Column Family
  • Column groups - logical and physical (similar access pattern)
• Cell
  • The actual element that contains the data for a row-column intersection
• Version
  • Every cell has multiple versions
Introducing TwitBase
•Twitter clone built on HBase
•Designed to scale
•It’s an example, not a production app
•Start with users and twits for now
•Users
  •Have profiles
  •Post twits
•Twits
  •Have text posted by users
•Get sample code at: https://github.com/hbaseinaction/twitbase
Java API
•Configuration - holds config information about the cluster. Roughly equivalent to a JDBC connection string
•HConnection - represents a connection to a cluster
•HBaseAdmin - admin tool to handle DDL operations
•HTablePool - pool of connections to the cluster
•HTable (HTableInterface) - handle on a single HBase table. Sends commands to the table
Connecting to HBase
•Using default confs:
  HTableInterface usersTable = new HTable("users");
•Passing custom conf:
  Configuration myConf = HBaseConfiguration.create();
  myConf.set("parameter_name", "parameter_value");
  HTableInterface usersTable = new HTable(myConf, "users");
•Connection pooling:
  HTablePool pool = new HTablePool();
  HTableInterface usersTable = pool.getTable("users");
  // work with the table
  usersTable.close();
Inserting data
•Put call on HTable
  Put p = new Put(Bytes.toBytes("TheRealMT"));
  p.add(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Mark Twain"));
  p.add(Bytes.toBytes("info"), Bytes.toBytes("email"), Bytes.toBytes("[email protected]"));
  p.add(Bytes.toBytes("info"), Bytes.toBytes("password"), Bytes.toBytes("Langhorne"));
  usersTable.put(p);
Data coordinates
•Row is addressed using rowkey
•Cell is addressed using [rowkey + family + qualifier]
Updating data
•Update == Write
  Put p = new Put(Bytes.toBytes("TheRealMT"));
  p.add(Bytes.toBytes("info"), Bytes.toBytes("password"), Bytes.toBytes("abc123"));
  usersTable.put(p);
Write path
[Figure: write path, showing the Client, the WAL, the Memstore and HFiles]
Write goes to the write-ahead log (WAL) as well as the in-memory write buffer called the Memstore. Clients don't interact directly with the underlying HFiles during writes.
Memstore gets flushed to disk to form a new HFile when filled up with writes.
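The write path described on this slide can be sketched as a toy model in plain Java. This is not HBase code: the class name ToyWritePath, the entry-count flush threshold, and the in-memory lists standing in for the on-disk WAL and HFiles are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy model of the write path; NOT the HBase implementation.
class ToyWritePath {
    private final int flushThreshold;                                  // hypothetical: entries allowed before a flush
    final List<String> wal = new ArrayList<>();                        // stands in for the write-ahead log
    SortedMap<String, String> memstore = new TreeMap<>();              // sorted in-memory write buffer
    final List<SortedMap<String, String>> hfiles = new ArrayList<>();  // immutable "files", oldest first

    ToyWritePath(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void put(String key, String value) {
        wal.add(key + "=" + value);   // 1. append to the WAL
        memstore.put(key, value);     // 2. buffer the write in the Memstore
        if (memstore.size() >= flushThreshold) {
            flush();                  // 3. a full Memstore becomes a new HFile
        }
    }

    private void flush() {
        hfiles.add(Collections.unmodifiableSortedMap(memstore));
        memstore = new TreeMap<>();   // start a fresh, empty Memstore
    }

    public static void main(String[] args) {
        ToyWritePath store = new ToyWritePath(2);
        store.put("TheRealMT/info:name", "Mark Twain");
        store.put("TheRealMT/info:password", "Langhorne");  // second write triggers a flush
        store.put("TheRealMT/info:password", "abc123");     // lands in the fresh Memstore
        System.out.println("WAL entries: " + store.wal.size());
        System.out.println("HFiles: " + store.hfiles.size());
        System.out.println("Memstore: " + store.memstore);
    }
}
```

Note that the client only ever calls put: in this sketch, as on the slide, new HFiles appear as a side effect of flushes rather than through direct client writes.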
Reading data
•Get the entire row
  Get g = new Get(Bytes.toBytes("TheRealMT"));
  Result r = usersTable.get(g);
•Get selected columns
  Get g = new Get(Bytes.toBytes("TheRealMT"));
  g.addColumn(Bytes.toBytes("info"), Bytes.toBytes("password"));
  Result r = usersTable.get(g);
•Extract the value out of the Result object
  byte[] b = r.getValue(Bytes.toBytes("info"), Bytes.toBytes("email"));
  String email = Bytes.toString(b);
Reading data (continued)
•Scan
  Scan s = new Scan(startRow, stopRow);
  ResultScanner rs = twits.getScanner(s);
  for (Result r : rs) {
    // iterate over scanner results
  }
  rs.close(); // release the scanner when done
•Single threaded iteration over defined set of rows. Could be the entire table too.
What happens when you read data?
[Figure: read path, showing the Client, the BlockCache, the Memstore and HFiles]
Deleting data
•Another simple API call
•Delete entire row
  Delete d = new Delete(Bytes.toBytes("TheRealMT"));
  usersTable.delete(d);
•Delete selected columns
  Delete d = new Delete(Bytes.toBytes("TheRealMT"));
  d.deleteColumns(Bytes.toBytes("info"), Bytes.toBytes("email"));
  usersTable.delete(d);
Delete internals
•Tombstone markers
•Deletion only happens during major compactions
•Compactions
  •Minor
  •Major
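A minimal sketch of the tombstone idea in plain Java, assuming an append-only list of entries in place of real storage. The class ToyTombstones and its methods are hypothetical, and the model keeps only the newest entry per key, which is simpler than what HBase does with versions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Toy model of delete markers; NOT the HBase implementation.
class ToyTombstones {
    static final class Entry {
        final String key;
        final String value;       // null for a tombstone
        final boolean tombstone;

        Entry(String key, String value, boolean tombstone) {
            this.key = key;
            this.value = value;
            this.tombstone = tombstone;
        }
    }

    final List<Entry> entries = new ArrayList<>();  // append-only, oldest first

    void put(String key, String value) {
        entries.add(new Entry(key, value, false));
    }

    // A delete only appends a marker; older data is masked, not removed.
    void delete(String key) {
        entries.add(new Entry(key, null, true));
    }

    // Reads scan from newest to oldest; a tombstone hides everything older.
    Optional<String> get(String key) {
        for (int i = entries.size() - 1; i >= 0; i--) {
            Entry e = entries.get(i);
            if (e.key.equals(key)) {
                return e.tombstone ? Optional.empty() : Optional.of(e.value);
            }
        }
        return Optional.empty();
    }

    // Major compaction: keep the newest entry per key, then drop the tombstones.
    void majorCompact() {
        Map<String, Entry> newest = new TreeMap<>();
        for (Entry e : entries) {
            newest.put(e.key, e);  // later (newer) entries replace older ones
        }
        entries.clear();
        for (Entry e : newest.values()) {
            if (!e.tombstone) {
                entries.add(e);
            }
        }
    }

    public static void main(String[] args) {
        ToyTombstones t = new ToyTombstones();
        t.put("TheRealMT/info:password", "Langhorne");
        t.put("TheRealMT/info:name", "Mark Twain");
        t.delete("TheRealMT/info:password");
        System.out.println(t.get("TheRealMT/info:password").isPresent()); // false: masked by the marker
        System.out.println(t.entries.size());                             // 3: nothing physically removed yet
        t.majorCompact();
        System.out.println(t.entries.size());                             // 1: only the name cell remains
    }
}
```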
Compactions
[Figure: Memstore and HFiles before and after a compaction]
Two or more HFiles get combined to form a single HFile during a compaction.
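As a rough illustration of combining HFiles, the sketch below merges several sorted maps into one. It assumes each "HFile" is a sorted map of key to value and that the newer file wins on duplicate keys; the class ToyCompaction is hypothetical, and a real compaction also has to deal with versions and delete markers.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy model of a compaction; NOT the HBase implementation.
class ToyCompaction {
    // Combines sorted "HFiles" (passed oldest first) into one sorted, immutable file.
    // When a key appears in several files, the value from the newer file wins.
    static SortedMap<String, String> compact(List<SortedMap<String, String>> hfilesOldestFirst) {
        TreeMap<String, String> merged = new TreeMap<>();
        for (SortedMap<String, String> hfile : hfilesOldestFirst) {
            merged.putAll(hfile);
        }
        return Collections.unmodifiableSortedMap(merged);
    }

    public static void main(String[] args) {
        SortedMap<String, String> older = new TreeMap<>();
        older.put("TheRealMT/info:name", "Mark Twain");
        older.put("TheRealMT/info:password", "Langhorne");

        SortedMap<String, String> newer = new TreeMap<>();
        newer.put("TheRealMT/info:password", "abc123");

        SortedMap<String, String> compacted = compact(Arrays.asList(older, newer));
        System.out.println(compacted);
    }
}
```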
DAO
•Like any other application
•Makes data access management easy
  •Cleaner code
  •Table info at a single place
  •Optimizations like connection pooling etc.
  •Easier management of database configs
Admin
HBaseAdmin
• Administrative API to the cluster.
• Provides an interface to manage HBase database table metadata and general administrative functions.
Table Operations
• createTable()
• deleteTable()
• addColumn()
• modifyColumn()
• deleteColumn()
• listTables()
Demo
- Java client
More API features
•Filters
•Compare and Swap
•Coprocessors
It’s not a relational database system
Data Model
HBase is ...
•Column family oriented database
  •Column family oriented
  •Tables consisting of rows and columns
•Persisted Map
  •Sparse
  •Multi-dimensional
  •Sorted
  •Indexed by rowkey, column and timestamp
•Key Value store
  •[rowkey, col family, col qualifier, timestamp] -> cell value
HBase is not ...
•A relational database
  •No SQL query language
  •No joins
  •No secondary indexing
  •No transactions
Tabular representation
[Figure: the users table with a single column family, info, holding the columns name, email and password]
Rowkeys: GrandpaD, HMS_Surprise, SirDoyle, TheRealMT
Names: Fyodor Dostoyevsky, Patrick O'Brien, Sir Arthur Conan Doyle, Mark Twain
The table is lexicographically sorted on the rowkeys.
Cells: each cell has multiple versions, typically represented by the timestamp of when they were inserted into the table (ts2 > ts1), with ts1=1329088321289 and ts2=1329088818321. The password cell of TheRealMT holds two versions, Langhorne and abc123.
The coordinates used to identify data in an HBase table are: (1) rowkey, (2) column family, (3) column qualifier, (4) version.
Key-Value store
Keys -> Values
[TheRealMT, info, password, 1329088818321] -> abc123
[TheRealMT, info, password, 1329088321289] -> Langhorne
Each key together with its value is a single KeyValue instance.
Key-Value store
Keys -> Values
1. Start with coordinates of full precision
[TheRealMT, info, password, 1329088818321] -> abc123
2. Drop version and you're left with a map of version to values
[TheRealMT, info, password] -> { 1329088818321 : "abc123", 1329088321289 : "Langhorne" }
3. Omit qualifier and you have a map of qualifiers to the previous maps
[TheRealMT, info] -> {
  "name" : { 1329088321289 : "Mark Twain" },
  "email" : { 1329088321289 : "[email protected]" },
  "password" : { 1329088818321 : "abc123", 1329088321289 : "Langhorne" }
}
4. Finally, drop the column family and you have a row, a map of maps
[TheRealMT] -> {
  "info" : {
    "name" : { 1329088321289 : "Mark Twain" },
    "email" : { 1329088321289 : "[email protected]" },
    "password" : { 1329088818321 : "abc123", 1329088321289 : "Langhorne" }
  }
}
Sorted map of maps
{
  "TheRealMT" : {
    "info" : {
      "name" : { 1329088321289 : "Mark Twain" },
      "email" : { 1329088321289 : "[email protected]" },
      "password" : { 1329088818321 : "abc123", 1329088321289 : "Langhorne" }
    }
  }
}
Levels of nesting, outermost first: rowkey, column family, column qualifiers, versions, values.
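The "sorted map of maps" view can be mimicked with nested TreeMaps in plain Java. This is only an in-memory sketch of the logical model, not how HBase stores data; the class SortedMapOfMaps and its methods are hypothetical, and keeping versions newest first is an assumption of the sketch.

```java
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy in-memory model of the logical data model; NOT the HBase implementation.
class SortedMapOfMaps {
    // rowkey -> column family -> column qualifier -> version -> value
    final NavigableMap<String, NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>>> rows =
            new TreeMap<>();

    void put(String row, String family, String qualifier, long version, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(family, f -> new TreeMap<>())
            // Assumption: versions are kept newest first.
            .computeIfAbsent(qualifier, q -> new TreeMap<Long, String>(Comparator.reverseOrder()))
            .put(version, value);
    }

    // Newest value of a cell addressed by rowkey, family and qualifier.
    String latest(String row, String family, String qualifier) {
        return rows.get(row).get(family).get(qualifier).firstEntry().getValue();
    }

    public static void main(String[] args) {
        SortedMapOfMaps t = new SortedMapOfMaps();
        t.put("TheRealMT", "info", "name", 1329088321289L, "Mark Twain");
        t.put("TheRealMT", "info", "password", 1329088321289L, "Langhorne");
        t.put("TheRealMT", "info", "password", 1329088818321L, "abc123");

        // Drop the version: a map of versions to values.
        System.out.println(t.rows.get("TheRealMT").get("info").get("password"));
        // Drop the qualifier: the qualifiers present in the family.
        System.out.println(t.rows.get("TheRealMT").get("info").keySet());
        // Newest version of a single cell.
        System.out.println(t.latest("TheRealMT", "info", "password"));
    }
}
```

Each call to get peels off one coordinate, which mirrors the way the slides drop the version, then the qualifier, then the column family.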
HFiles and physical data model
•HFiles are
  •Immutable
  •Sorted on rowkey + qualifier + timestamp
  •In the context of a column family per region
HFile for the info column family in the users table:
"TheRealMT", "info", "email", 1329088321289, "[email protected]"
"TheRealMT", "info", "name", 1329088321289, "Mark Twain"
"TheRealMT", "info", "password", 1329088818321, "abc123"
"TheRealMT", "info", "password", 1329088321289, "Langhorne"
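The sort order inside an HFile can be illustrated with a small comparator in plain Java. The nested KeyValue class here is a hypothetical stand-in, not HBase's own KeyValue, and ordering timestamps newest first is an assumption that the slide does not spell out.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of HFile ordering; NOT the HBase implementation.
class HFileOrder {
    // Hypothetical stand-in for one stored cell within a single column family.
    static final class KeyValue {
        final String row;
        final String qualifier;
        final long timestamp;
        final String value;

        KeyValue(String row, String qualifier, long timestamp, String value) {
            this.row = row;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Rowkey ascending, then qualifier ascending, then timestamp descending (assumed).
    static int compare(KeyValue a, KeyValue b) {
        int c = a.row.compareTo(b.row);
        if (c != 0) {
            return c;
        }
        c = a.qualifier.compareTo(b.qualifier);
        if (c != 0) {
            return c;
        }
        return Long.compare(b.timestamp, a.timestamp);
    }

    // Returns a sorted copy, leaving the input untouched.
    static List<KeyValue> sorted(List<KeyValue> cells) {
        List<KeyValue> out = new ArrayList<>(cells);
        out.sort(HFileOrder::compare);
        return out;
    }

    public static void main(String[] args) {
        List<KeyValue> cells = new ArrayList<>();
        cells.add(new KeyValue("TheRealMT", "password", 1329088321289L, "Langhorne"));
        cells.add(new KeyValue("TheRealMT", "name", 1329088321289L, "Mark Twain"));
        cells.add(new KeyValue("TheRealMT", "password", 1329088818321L, "abc123"));
        for (KeyValue kv : sorted(cells)) {
            System.out.println(kv.row + ", " + kv.qualifier + ", " + kv.timestamp + ", " + kv.value);
        }
    }
}
```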
... what really goes on under the hood
HBase in distributed mode
Distributed mode
•Table is split into Regions
Full table T1:
00001 John 415-111-1234
00002 Paul 408-432-9922
00003 Ron 415-993-2124
00004 Bob 818-243-9988
00005 Carly 206-221-9123
00006 Scott 818-231-2566
00007 Simon 425-112-9877
00008 Lucas 415-992-4432
00009 Steve 530-288-9832
00010 Kelly 916-992-1234
00011 Betty 650-241-1192
00012 Anne 206-294-1298
Table T1 split into 3 regions - R1, R2 and R3:
T1R1: rows 00001 to 00004
T1R2: rows 00005 to 00008
T1R3: rows 00009 to 00012
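Because a table is sorted on rowkey, finding the region for a row amounts to a range lookup. The sketch below assumes each region is identified by the first rowkey it covers; the class RegionLookup is hypothetical and only mirrors the T1 example from the slide.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy range lookup over region start keys; NOT the HBase implementation.
class RegionLookup {
    // Region start key -> region name, kept sorted by start key.
    private final TreeMap<String, String> regionByStartKey = new TreeMap<>();

    void addRegion(String startKey, String regionName) {
        regionByStartKey.put(startKey, regionName);
    }

    // The owning region is the one with the greatest start key <= rowkey.
    String regionFor(String rowkey) {
        Map.Entry<String, String> e = regionByStartKey.floorEntry(rowkey);
        if (e == null) {
            throw new IllegalArgumentException("No region covers rowkey " + rowkey);
        }
        return e.getValue();
    }

    public static void main(String[] args) {
        RegionLookup t1 = new RegionLookup();
        t1.addRegion("00001", "T1R1");
        t1.addRegion("00005", "T1R2");
        t1.addRegion("00009", "T1R3");
        System.out.println(t1.regionFor("00003")); // T1R1
        System.out.println(t1.regionFor("00007")); // T1R2
        System.out.println(t1.regionFor("00012")); // T1R3
    }
}
```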
Distributed mode (continued)
•Regions are served by RegionServers
•RegionServers are Java processes, typically collocated with the DataNodes
[Figure: three hosts, each running a DataNode and a RegionServer]
Host1: T1R1 (rows 00001 to 00004)
Host2: T1R2 (rows 00005 to 00008)
Host3: T1R3 (rows 00009 to 00012)
Finding your region
•Two special tables: -ROOT- and .META.
[Figure: region lookup hierarchy]
-ROOT- table, consisting of only 1 region, hosted on region server RS1
.META. table, consisting of 3 regions (M1, M2 and M3), hosted on RS1, RS2 and RS3
User tables T1 and T2, consisting of 3 and 4 regions respectively, distributed amongst RS1, RS2 and RS3
Regions distributed across the cluster
[Figure: three hosts (Host1, Host2, Host3), each running a DataNode and a RegionServer]
RegionServer 1 (RS1), hosting regions R1 of the user table T1 and M2 of .META.
RegionServer 2 (RS2), hosting regions R2, R3 of the user table and M1 of .META.
RegionServer 3 (RS3), hosting just -ROOT-
-ROOT-
M:T100001-M:T100009  M1  RS2
M:T100010-M:T100012  M2  RS1
.META. - M1
T1:00001-T1:00004  R1  RS1
T1:00005-T1:00008  R2  RS2
.META. - M2
T1:00009-T1:00012  R3  RS2
What the client does
[Figure: the client looks up row TheRealMT in MyTable by going through ZooKeeper, -ROOT- and .META.]
-ROOT-: row per META region
.META.: row per table region
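One way to read the lookup figure is as a chain of range lookups: -ROOT- points to the right .META. region, and that .META. region points to the user-table region and its server. The sketch below is a toy model under that reading. The class ToyRegionLocator, the "region@server" strings and the sample start keys are all hypothetical, and the ZooKeeper step (learning where -ROOT- lives) is only noted in a comment.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Toy model of the catalog lookup; NOT the HBase client.
class ToyRegionLocator {
    // -ROOT-: one row per .META. region, keyed by the first key that region covers.
    final TreeMap<String, String> root = new TreeMap<>();
    // .META.: one row per user-table region, grouped by .META. region name.
    final Map<String, TreeMap<String, String>> meta = new HashMap<>();

    void addMetaRegion(String startKey, String metaRegion) {
        root.put(startKey, metaRegion);
        meta.put(metaRegion, new TreeMap<>());
    }

    void addUserRegion(String metaRegion, String startKey, String regionAndServer) {
        meta.get(metaRegion).put(startKey, regionAndServer);
    }

    String locate(String table, String rowkey) {
        String key = table + ":" + rowkey;
        // Step 1 (ask ZooKeeper where -ROOT- lives) is skipped: "root" is already in hand.
        Map.Entry<String, String> metaRow = root.floorEntry(key);   // step 2: ask -ROOT-
        if (metaRow == null) {
            throw new IllegalArgumentException("No .META. region covers " + key);
        }
        Map.Entry<String, String> regionRow =
                meta.get(metaRow.getValue()).floorEntry(key);       // step 3: ask .META.
        if (regionRow == null) {
            throw new IllegalArgumentException("No table region covers " + key);
        }
        return regionRow.getValue();                                // step 4: talk to that server
    }

    public static void main(String[] args) {
        ToyRegionLocator locator = new ToyRegionLocator();
        locator.addMetaRegion("T1:00001", "M1");
        locator.addMetaRegion("T1:00009", "M2");
        locator.addUserRegion("M1", "T1:00001", "R1@RS1");
        locator.addUserRegion("M1", "T1:00005", "R2@RS2");
        locator.addUserRegion("M2", "T1:00009", "R3@RS2");
        System.out.println(locator.locate("T1", "00007")); // R2@RS2
        System.out.println(locator.locate("T1", "00011")); // R3@RS2
    }
}
```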