My Life with HBase (FOSDEM 2010, NoSQL)


TRANSCRIPT

  • 8/8/2019 Hbase +Fosdem+2010+Nosql 2

    1/43

    My Life with HBase

    Lars George, CTO of WorldLingo

    Apache Hadoop HBase Committer

    www.worldlingo.com | www.larsgeorge.com


    2/43

    WorldLingo

    Co-founded 1999

    Machine Translation Services

    Professional Human Translations

    Offices in US and UK

    Microsoft Office Provider since 2001

    Web based services

    Customer Projects

    Multilingual Archive


    3/43

    Multilingual Archive

    SOAP API

    Simple calls

    putDocument()

    getDocument()

    search()

    command()

    putTransformation()

    getTransformation()
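The slide only lists the call names of the archive's SOAP API; a minimal in-memory sketch of what such an interface might look like follows. The signatures, the String/byte[] types, and the backing maps are all assumptions for illustration, not WorldLingo's actual API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the archive API surface named on the slide.
// Signatures and storage are invented; only the method names come from the talk.
public class ArchiveSketch {
    private final Map<String, byte[]> docs = new HashMap<>();
    private final Map<String, String> transformations = new HashMap<>();

    public void putDocument(String id, byte[] content) { docs.put(id, content); }
    public byte[] getDocument(String id) { return docs.get(id); }
    public void putTransformation(String id, String xslt) { transformations.put(id, xslt); }
    public String getTransformation(String id) { return transformations.get(id); }

    public static void main(String[] args) {
        ArchiveSketch archive = new ArchiveSketch();
        archive.putDocument("doc-1", "hello".getBytes());
        System.out.println(new String(archive.getDocument("doc-1"))); // prints "hello"
    }
}
```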


    4/43

    Multilingual Archive (cont.)

    Planned already, implemented as a customer project

    Scale:

    500 million documents

    Random Access

    100% Uptime

    Technologies? Database

    Zip-Archives on file system, or Hadoop


    5/43

    RDBMS Woes

    Scaling MySQL hard, Oracle expensive (and hard)

    Machine cost goes up faster than speed

    Turn off all relational features to scale

    Turn off secondary indexes too

    Tables can be a problem at sizes as low as 500GB

    Hard to read data quickly at these sizes

    Write speed degrades with table size

    Future growth uncertain


    6/43

    MySQL Limitations

    Master becomes a problem

    What if your write speed is greater than a single machine's?

    All slaves must have the same write capacity as the master (can't cheap out on slaves)

    Single point of failure, no easy failover

    Can (sort of) solve this with sharding


    7/43

    Sharding


    8/43

    Sharding Problems

    Requires either a hashing function or mapping table to determine the shard

    Data access code becomes complex

    What if shard sizes become too large?
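The hash-function approach the slide mentions can be sketched in a few lines. The shard count and key format are assumptions for illustration:

```java
// Hedged sketch of hash-based shard selection as described on the slide.
public class ShardRouter {
    private final int numShards;

    public ShardRouter(int numShards) { this.numShards = numShards; }

    // Map a row key to a shard index; Math.floorMod keeps the result
    // non-negative even when hashCode() is negative.
    public int shardFor(String key) {
        return Math.floorMod(key.hashCode(), numShards);
    }

    public static void main(String[] args) {
        ShardRouter router = new ShardRouter(4);
        System.out.println("user:42 -> shard " + router.shardFor("user:42"));
    }
}
```

Note how this sketch also exposes the resharding problem from the next slide: changing `numShards` remaps almost every key, forcing a bulk data migration.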


    9/43

    Resharding


    10/43

    Schema Changes

    What about schema changes or migrations?

    MySQL not your friend here

    Only gets harder with more data


    11/43

    HBase to the Rescue

    Clustered, commodity(-ish) hardware

    Mostly schema-less

    Dynamic distribution

    Spreads writes out over the cluster


    12/43

    HBase

    Distributed database modeled on Bigtable

    "Bigtable: A Distributed Storage System for Structured Data" by Chang et al.

    Runs on top of Hadoop Core

    Layers on HDFS for storage

    Native connections to MapReduce

    Distributed, High Availability, High Performance, Strong Consistency

    http://labs.google.com/papers/bigtable.html

    13/43

    HBase

    Column-oriented store

    Wide table costs only the data stored

    NULLs in row are 'free'

    Good compression: columns of similar type

    Column name is arbitrary

    Rows stored in sorted order

    Can random read and write

    Goal of billions of rows x millions of cells

    Petabytes of data across thousands of servers


    14/43


    15/43


    16/43

    Tables

    Table is split into roughly equal sized regions

    Each region is a contiguous range of keys, from [start, end)

    Regions split as they grow, thus dynamically adjusting to your data set
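Because regions own contiguous, sorted key ranges, locating the region for a row is a floor lookup on the sorted start keys. A minimal sketch of that idea (region names and split points are invented examples, not HBase's internal API):

```java
import java.util.TreeMap;

// Hedged sketch: each region owns [startKey, nextStartKey); finding a row's
// region is floorEntry() on a sorted map of start keys.
public class RegionMap {
    // start key -> region name; "" marks the first region's open start
    private final TreeMap<String, String> regions = new TreeMap<>();

    public void addRegion(String startKey, String name) { regions.put(startKey, name); }

    // Greatest start key <= rowKey identifies the owning region.
    public String regionFor(String rowKey) {
        return regions.floorEntry(rowKey).getValue();
    }

    public static void main(String[] args) {
        RegionMap map = new RegionMap();
        map.addRegion("",  "region-1"); // [ "",  "g" )
        map.addRegion("g", "region-2"); // [ "g", "t" )
        map.addRegion("t", "region-3"); // [ "t", +inf )
        System.out.println(map.regionFor("mouse")); // region-2
    }
}
```

A region split just inserts a new start key into this map, which is why the table can rebalance dynamically as data grows.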


    17/43

    Tables (cont.)

    Tables are sorted by Row

    Table schema defines column families

    Families consist of any number of columns

    Columns consist of any number of versions

    Everything except table name is byte[]

    (Table, Row, Family:Column, Timestamp) → Value


    18/43

    Tables (cont.)

    As a data structure:

    SortedMap(
      RowKey, List(
        SortedMap(
          Column, List(
            Value, Timestamp
          )
        )
      )
    )
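The nested-map view above can be made concrete with Java's sorted maps. This is a simplified model (String keys and values instead of byte[], and versions kept in a reverse-ordered map so the newest timestamp wins), not HBase's actual implementation:

```java
import java.util.Comparator;
import java.util.TreeMap;

// Sketch of the slide's nested-map model: row -> column -> (timestamp -> value),
// with timestamps sorted descending so the latest version is first.
public class TableModel {
    private final TreeMap<String, TreeMap<String, TreeMap<Long, String>>> rows =
            new TreeMap<>();

    public void put(String row, String column, long ts, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(column, c -> new TreeMap<Long, String>(Comparator.reverseOrder()))
            .put(ts, value);
    }

    // Latest version wins: first entry of the reverse-ordered timestamp map.
    public String get(String row, String column) {
        return rows.get(row).get(column).firstEntry().getValue();
    }

    public static void main(String[] args) {
        TableModel t = new TableModel();
        t.put("row1", "content:type", 1L, "text/plain");
        t.put("row1", "content:type", 2L, "text/html");
        System.out.println(t.get("row1", "content:type")); // text/html
    }
}
```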


    19/43

    Server Architecture

    Similar to HDFS:

    Master ↔ Namenode

    Regionserver ↔ Datanode

    Often run these alongside each other!

    Difference: HBase stores state in HDFS

    HDFS provides robust data storage across machines, insulating against failure

    Master and Regionserver fairly stateless and machine independent


    20/43

    Region Assignment

    Each region from every table is assignedto a Regionserver

    Master Duties:

    Responsible for assignment and handling regionserver problems (if any!)

    When machines fail, move regions

    When regions split, move regions to balance

    Could move regions to respond to load

    Can run multiple backup masters


    21/43

    Master

    The master does NOT:

    Handle any write requests (not a DB master!)

    Handle location finding requests

    Not involved in the read/write path

    Generally does very little most of the time


    22/43

    Distributed Coordination

    Zookeeper is used to manage master election and server availability

    Set up as a cluster, provides distributed coordination primitives

    An excellent tool for building cluster management systems


    23/43

    HBase Storage Architecture


    24/43

    HBase Public Timeline

    November 2006: Google releases paper on Bigtable

    February 2007: Initial HBase prototype created as Hadoop contrib

    October 2007: First "usable" HBase (with Hadoop 0.15.0)

    December 2007: First HBase User Group

    January 2008: Hadoop becomes TLP, HBase becomes subproject

    October 2008: HBase 0.18.1 released

    January 2009: HBase 0.19.0 released

    September 2009: HBase 0.20.0 released


    25/43

    HBase WorldLingo Timeline


    26/43

    HBase - Example

    Store web crawl data

    Table crawl with family content

    Row key is the URL, with columns:

    content:data stores raw crawled data

    content:language stores the HTTP language header

    content:type stores the HTTP content-type header

    If processing raw data for hyperlinks and images, add families links and images

    links: one column for each hyperlink

    images: one column for each image
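The crawl-row layout described above can be sketched with plain maps, one cell per family:qualifier. The URL and cell values are invented examples; in real HBase the keys and values would be byte[] and the row would be written via the client API, so this is a data-layout illustration only:

```java
import java.util.HashMap;
import java.util.Map;

// Hedged sketch of one row of the crawl table from the slide.
// family:qualifier -> value; link/image qualifiers name their target URL.
public class CrawlRow {
    public static Map<String, String> makeRow() {
        Map<String, String> cells = new HashMap<>();
        cells.put("content:data", "<html>...</html>");
        cells.put("content:language", "en");
        cells.put("content:type", "text/html");
        // one column per outgoing hyperlink, one per embedded image
        cells.put("links:http://example.org/about", "About");
        cells.put("images:http://example.org/logo.png", "logo");
        return cells;
    }

    public static void main(String[] args) {
        Map<String, String> row = makeRow();
        System.out.println(row.get("content:type")); // text/html
    }
}
```

Because column names are arbitrary (an earlier slide's point), each discovered link becomes its own column without any schema change.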


    27/43

    HBase - Clients

    Native Java client/API:

    get(Get get)

    put(Put put)

    Non-Java clients:

    Thrift server (Ruby, C++, Erlang, etc.)

    REST server (Stargate)

    TableInputFormat/TableOutputFormat for MapReduce

    HBase shell (JRuby)


    28/43

    Scaling HBase

    Add more machines to scale

    Automatic rebalancing

    Base model (Bigtable) scales past 1000TB

    No inherent reason why HBase couldn't


    29/43

    What to store in HBase

    Maybe not your raw log data...

    ... but the results of processing it with Hadoop!

    By storing the refined version in HBase, you can keep up with huge data demands and serve it to your website


    30/43

    !HBase

    NoSQL Database!

    No joins

    No sophisticated query engine

    No transactions (sort of)

    No column typing

    No SQL, no ODBC/JDBC, etc. (but there is HBql now!)

    Not a replacement for your RDBMS...

    Matching Impedance!


    31/43

    Why HBase?

    Datasets are reaching Petabytes

    Traditional databases are expensive to scale and difficult to distribute

    Commodity hardware is cheap and powerful (but HBase can make use of powerful machines too!)

    Need for random access and batch processing (which Hadoop does not offer)


    32/43

    Numbers

    Single reads are 1-10ms, depending on disk seeks and caching

    Scans can return hundreds of rows in dozens of ms

    Serial read speeds


    33/43

    Multilingual Archive (cont.)

    44 Dell PESC1435, 12GB RAM, 2 x 1TB SATA drives

    Java 6

    Tomcat 5.5

    88 Xen domUs

    Apache

    Hadoop/HBase

    Tomcat application servers

    Currently split into two clusters


    34/43

    Lucene Search Server

    43 fields indexed

    166GB index size

    Automated merging/warm-up/swap

    Looking into scalable solutions:

    Katta

    Hyper Estraier

    DLucene

    Sorting?


    35/43

    Multilingual Archive (cont.)

    5 Tables

    Up to 5 column families

    XML Schemas

    Automated table schema updates

    Standard options tweaked over time

    Garbage Collection!

    MemCached(b) layer


    36/43

    Layers

    (Architecture diagram, flattened here; recoverable layers and components:)

    Network: Firewall, Director 1 ... Director n

    Web: Apache 1 ... Apache n

    App: Tomcat 1 ... Tomcat n

    LWS: Tomcat 1 ... Tomcat n

    Cache: MemCached 1 ... MemCached n

    Data: HBase


    37/43

    Map/Reduce

    Backup/Restore

    Index building

    Cache filling

    Mapping

    Updates

    Translation


    38/43

    HBase - Problems

    Early versions (before HBase 0.19.0!):

    Data loss

    Migration nightmares

    Slow performance

    Current version:

    Read the HBase Wiki!!!

    Single point of failure (name node only!)


    39/43

    HBase - Notes

    RTF(ine)M: HBase Wiki, IRC channel

    Personal Experience:

    Max. file handles (32k+)

    Hadoop xceiver limits (NIO?)

    Redundant meta data (on name node)

    RAM (4GB+)

    Deployment strategy

    Garbage collection (use CMS, G1?)

    Maybe not mix batch and interactive?
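The file-handle and xceiver notes above usually translate into an OS limit and an HDFS setting. A hedged config sketch; the values are illustrative, not tuned recommendations, and the property name carries Hadoop's historical misspelling "xcievers":

```xml
<!-- hdfs-site.xml: raise the datanode transceiver ceiling.
     Also raise the OS open-file limit for the Hadoop/HBase user,
     e.g. "hadoop - nofile 32768" in /etc/security/limits.conf. -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>
```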


    40/43

    Graphing

    Use the supplied Ganglia context or the JMX bridge to enable Nagios and Cacti

    JMXToolkit: Swiss Army knife for JMX-enabled servers: http://github.com/larsgeorge/jmxtoolkit

    41/43

    HBase - Roadmap

    HBase 0.20.x: Performance

    New Key Format: KeyValue

    New File Format: HFile

    New Block Cache: Concurrent LRU

    New Query and Result API

    New Scanners

    Zookeeper Integration: No SPOF in HBase

    New REST Interface

    Contrib: Transactional Tables, Secondary Indexes, Stargate


    42/43

    HBase - Roadmap (cont.)

    HBase 0.21.x: Advanced Concepts

    Master Rewrite: More Zookeeper

    New RPC Protocol (Avro)

    Multi-DC Replication

    Intra Row Scanning

    Further optimizations on algorithms and data structures

    Discretionary Access Control

    Coprocessors


    43/43

    Questions?

    Email: [email protected]

    [email protected]

    Blog: www.larsgeorge.com

    Twitter: larsgeorge

    mailto:[email protected]:[email protected]:[email protected]://www.larsgeorge.com/http://www.larsgeorge.com/mailto:[email protected]:[email protected]:[email protected]