AsterixDB and the Open Source Big Data Landscape – Chen Li · 3/7/15
TRANSCRIPT
AsterixDB and the Open Source Big Data Landscape
Michael Carey
Information Systems Group, CS Department
UC Irvine
#AsterixDB
Rough Topical Plan
• Background and motivation
• A dual-universe history of Big Data
• Big Data landscape (from satellite images)
• AsterixDB: a next-generation BDMS
  – AsterixDB viewed from the outside
  – Internal architecture & software stack
• Initial performance & case studies (if time)
• Project status and Q&A
Everyone’s Talking About Big Data
• Driven by unprecedented growth in data being generated and its potential uses and value
  – Tweets, social networks (statuses, check-ins, shared content), blogs, click streams, various logs, …
  – Facebook: > 845M active users, > 8B messages/day
  – Twitter: > 140M active users, > 340M tweets/day
Big Data / Web Warehousing
So what went on – and why? What’s going on right now?
What’s going on…?
Big Data in the Database World
• Enterprises needed to store and query historical business data (data warehouses)
  – 1980's: Parallel database systems based on "shared-nothing" architectures (Gamma/GRACE, Teradata)
  – 2000's: Netezza, Aster Data, DATAllegro, Greenplum, Vertica, ParAccel ("Big $" acquisitions!)
• OLTP is another category (a source of Big Data)
  – 1980's: Tandem's NonStop SQL system
Parallel Database Software Stack
Notes:
• One storage manager per machine in a parallel cluster
• Upper layers orchestrate their shared-nothing cooperation
• One way in/out: through the SQL door at the top
Big Data in the Systems World
• Late 1990's brought a need to index and query the rapidly exploding content of the Web
  – DB technology tried but failed (e.g., Inktomi)
  – Google, Yahoo! et al. needed to do something
• Google responded by laying a new foundation
  – Google File System (GFS)
    • OS-level byte stream files spanning 1000's of machines
    • Three-way replication for fault-tolerance (availability)
  – MapReduce (MR) programming model
    • User functions: Map and Reduce (and optionally Combine)
    • "Parallel programming for dummies" – MR runtime does the heavy lifting via partitioned parallelism
(Figure: MapReduce word count example – input splits (distributed) flow to mapper outputs, through the SHUFFLE PHASE (based on keys) to reducer inputs, and finally to reducer outputs (distributed). Partitioned parallelism!)
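The map → shuffle → reduce flow of the word count example can be sketched in a few lines of plain Python. This is a single-process illustration of the programming model only, not Hadoop's actual API; the function names are made up for the sketch.

```python
from collections import defaultdict

def map_phase(line):
    """Mapper: emit (word, 1) for every word in an input line."""
    for word in line.split():
        yield (word, 1)

def shuffle(pairs):
    """Shuffle phase: group mapper outputs by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reducer: sum the counts for one word."""
    return (key, sum(values))

def word_count(lines):
    pairs = [kv for line in lines for kv in map_phase(line)]
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())

counts = word_count(["to be or not to be"])
print(counts)  # {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```

In a real MR runtime each phase runs partitioned across machines, with the shuffle moving data between them; that partitioned parallelism is exactly the "heavy lifting" the slide refers to.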
Soon a Star Was Born…
• Yahoo!, Facebook, and friends read the papers
  – HDFS and Hadoop MapReduce now in wide use for indexing, clickstream analysis, log analysis, …
• Higher-level languages subsequently developed
  – Pig (Yahoo!) – rel. algebra; Hive (Facebook) – SQL
  – Pig & Hive both now strongly preferred to bare MR
• Key-value ("NoSQL") stores are another category
  – Used to power scalable social sites, online games, …
  – BigTable → HBase, Dynamo → Cassandra, MongoDB, …
  – Roughly: distributed B+ tree(s) with get/put client API
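The "get/put client API" shape can be illustrated with a toy hash-partitioned store. This is a sketch of the general idea behind stores like HBase or Cassandra, not any real client library; the class and method layout is invented for illustration.

```python
import hashlib

class KVCluster:
    """Toy key-value 'cluster': keys hash-partitioned across nodes,
    exposing only get/put (no joins, no ad hoc queries)."""

    def __init__(self, num_nodes=4):
        # Each node's storage, here just an in-memory dict per node.
        self.nodes = [dict() for _ in range(num_nodes)]

    def _node_for(self, key):
        # Route a key to its owning node by hashing.
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return self.nodes[h % len(self.nodes)]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key):
        return self._node_for(key).get(key)

kv = KVCluster()
kv.put("user:42", {"status": "checked in"})
print(kv.get("user:42"))  # {'status': 'checked in'}
```

Real systems add persistence, replication, and range-partitioned B+ tree-like storage per node, but the narrow get/put interface is the point of contrast with full query languages.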
Open Source Big Data Stack
Notes:
• Giant byte sequence files at the bottom
• Map, sort, shuffle, reduce layer in middle
• Possible storage layer in middle as well
• Now at the top: HLL's (Huh…?)
Apache Pig (Pig Latin)
• Scripting language inspired by the relational algebra
  – Compiles down to a series of Hadoop MR jobs
  – Relational operators include LOAD, FOREACH, FILTER, GROUP, COGROUP, JOIN, ORDER BY, LIMIT, ...
Apache Hive (HiveQL)
• Query language inspired by an old favorite: SQL
  – Compiles down to a series of Hadoop MR jobs
  – Supports various HDFS file formats (text, columnar, ...)
  – Numerous contenders appearing that take a non-MR-based runtime approach (duh!) – these include Impala, Stinger, Spark SQL, ...
Other Up-and-Coming Platforms (I)
• Spark for in-memory cluster computing – for doing repetitive data analyses, iterative machine learning tasks, ...
(Figure: one-time processing loads input into distributed memory and then serves query 1, query 2, query 3, ...; iterative processing reuses that in-memory data across iter. 1, iter. 2, ...)
(Especially gaining traction for scaling Machine Learning)
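Why in-memory reuse matters for iterative jobs can be shown with a toy gradient-descent loop. This is plain Python mimicking the idea of a cached ("persisted") dataset, not the Spark API; the data and function names are invented for the sketch.

```python
def load_input():
    # Stand-in for an expensive distributed file scan. An MR-based
    # engine would redo this read on every iteration; a Spark-style
    # engine does it once and keeps the partitions in memory.
    return [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]

def gradient_step(data, w, lr=0.01):
    # One least-squares gradient step for the model y ~ w * x.
    grad = sum(2 * x * (w * x - y) for x, y in data) / len(data)
    return w - lr * grad

cached = load_input()      # read once, keep in memory ("persist")
w = 0.0
for _ in range(200):       # each iteration reuses the cached data
    w = gradient_step(cached, w)
print(round(w, 1))         # converges to roughly 2.0, the true slope
```

The per-iteration savings (no rereading, no re-shuffling of static input) is the core reason Spark gained traction for iterative machine learning workloads.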
Other Up-and-Coming Platforms (II)
• Bulk Synchronous Programming (BSP) platforms, e.g., Pregel, Giraph, GraphLab, ..., for Big Graph analytics
  ("Big" is the platform's concern)
• "Think Like a Vertex"
  – Receive messages
  – Update state
  – Send messages
• Quite a few BSP-based platforms available
  – Pregel (Google)
  – Giraph (Facebook, LinkedIn, Twitter, Yahoo!, ...)
  – Hama (Sogou, Korea Telecomm, ...)
  – Distributed GraphLab (CMU, Washington)
  – GraphX (Berkeley)
  – Pregelix (UCI)
  – ...
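The receive/update/send loop above can be sketched as a minimal BSP computation. This is an illustrative single-process sketch, not the Pregel or Giraph API; it runs the classic example where every vertex converges to the maximum value in its connected component, with a global barrier between supersteps.

```python
def bsp_max(values, edges, max_supersteps=20):
    """values: vertex -> initial value; edges: vertex -> neighbor list."""
    state = dict(values)
    # Superstep 0: every vertex announces its value to its neighbors.
    inbox = {v: [] for v in state}
    for v in state:
        for nbr in edges.get(v, []):
            inbox[nbr].append(state[v])
    for _ in range(max_supersteps):
        outbox = {v: [] for v in state}
        changed = False
        for v, msgs in inbox.items():          # "each vertex, in parallel"
            new = max(msgs, default=state[v])  # receive messages
            if new > state[v]:
                state[v] = new                 # update state
                changed = True
                for nbr in edges.get(v, []):   # send messages
                    outbox[nbr].append(new)
        if not changed:                        # every vertex votes to halt
            break
        inbox = outbox                         # barrier: next superstep
    return state

graph = {1: [2], 2: [1, 3], 3: [2]}
print(bsp_max({1: 7, 2: 3, 3: 5}, graph))  # {1: 7, 2: 7, 3: 7}
```

A real BSP platform distributes the vertices across machines and implements the barrier and message delivery for you; the per-vertex compute function is all the user writes.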
Also: Today's Big Data Tangle
(Figure: a tangle of interconnected Big Data systems, Pig among them)
AsterixDB: "One Size Fits a Bunch"
Semistructured Data Management + Parallel Database Systems + Data-Intensive Computing
BDMS Desiderata:
• Flexible data model
• Efficient runtime
• Full query capability
• Cost proportional to task at hand (!)
• Designed for continuous data ingestion
• Support today's "Big Data data types"
• ...
Project Goals
• Build a new Big Data Management System (BDMS)
  – Run on large commodity clusters
  – Handle mass quantities of semistructured data
  – Openly layered, for selective reuse by others
  – Share with the community via open source
• Conduct scalable information systems research, e.g.,
  – Large-scale query processing and workload management
  – Highly scalable storage and index management
  – Fuzzy matching, spatial data, date/time data (all in parallel)
  – Novel support for "fast data" (both in and out)
• Train next generation of "Big Data" graduates
ASTERIX Data Model (ADM)

Highlights include:
• JSON++ based data model
• Rich type support (spatial, temporal, …)
• Records, lists, bags
• Open vs. closed types

create dataverse TinySocial;
use dataverse TinySocial;

create type MugshotUserType as {
  id: int32,
  alias: string,
  name: string,
  user-since: datetime,
  address: {
    street: string,
    city: string,
    state: string,
    zip: string,
    country: string
  },
  friend-ids: {{ int32 }},
  employment: [EmploymentType]
}

create type EmploymentType as open {
  organization-name: string,
  start-date: date,
  end-date: date?
}

create dataset MugshotUsers(MugshotUserType)
  primary key id;
create type MugshotMessageType as closed {
  message-id: int32,
  author-id: int32,
  timestamp: datetime,
  in-response-to: int32?,
  sender-location: point?,
  tags: {{ string }},
  message: string
}

create dataset MugshotMessages(MugshotMessageType)
  primary key message-id;
Ex: MugshotUsers Data

{ "id": 1,
  "alias": "Margarita",
  "name": "MargaritaStoddard",
  "address": {
    "street": "234 Thomas Ave",
    "city": "San Hugo",
    "zip": "98765",
    "state": "CA",
    "country": "USA"
  },
  "user-since": datetime("2012-08-20T10:10:00"),
  "friend-ids": {{ 2, 3, 6, 10 }},
  "employment": [{
    "organization-name": "Codetechno",
    "start-date": date("2006-08-06")
  }]
}
{ "id": 2,
  "alias": "Isbel",
  "name": "IsbelDull",
  "address": { "street": "345 James Ave", "city": "San Hugo", "zip": "98765", "state": "CA", "country": "USA" },
  "user-since": datetime("2011-01-22T10:10:00"),
  "friend-ids": {{ 1, 4 }},
  "employment": [{ "organization-name": "Hexviafind", "start-date": date("2010-04-27") }]
}
{ "id": 3,
  "alias": "Emory",
  "name": "EmoryUnk",
  "address": { "street": "456 Jose Ave", "city": "San Hugo", "zip": "98765", "state": "CA", "country": "USA" },
  "user-since": datetime("2012-07-10T10:10:00"),
  "friend-ids": {{ 1, 5, 8, 9 }},
  "employment": [{ "organization-name": "geomedia", "start-date": date("2010-06-17"), "end-date": date("2010-01-26") }]
}
...
Other DDL Features

create index msUserSinceIdx on MugshotUsers(user-since);
create index msTimestampIdx on MugshotMessages(timestamp);
create index msAuthorIdx on MugshotMessages(author-id) type btree;
create index msSenderLocIndex on MugshotMessages(sender-location) type rtree;
create index msMessageIdx on MugshotMessages(message) type keyword;

create type AccessLogType as closed {
  ip: string,
  time: string,
  user: string,
  verb: string,
  path: string,
  stat: int32,
  size: int32
};

create external dataset AccessLog(AccessLogType)
  using localfs
  (("path"="{hostname}://{path}"),
   ("format"="delimited-text"),
   ("delimiter"="|"));

create feed socket_feed
  using socket_adaptor
  (("sockets"="{address}:{port}"),
   ("addressType"="IP"),
   ("type-name"="MugshotMessageType"),
   ("format"="adm"));
connect feed socket_feed to dataset MugshotMessages;

External data highlights:
• Equal opportunity access
• "Keep everything!"
• Data ingestion, not streams
• Queries unchanged
ASTERIX Query Language (AQL)
• Ex: List the user name and messages sent by those users who joined the Mugshot social network in a certain time window:

for $user in dataset MugshotUsers
where $user.user-since >= datetime('2010-07-22T00:00:00')
  and $user.user-since <= datetime('2012-07-29T23:59:59')
return {
  "uname": $user.name,
  "messages":
    for $message in dataset MugshotMessages
    where $message.author-id = $user.id
    return $message.message
};
Nested Opposite of SQL (NJSQL)
• Ex: The same query – user names and messages for users who joined in a certain time window – in a more SQL-like from/select syntax:

from $user in dataset MugshotUsers
where $user.user-since >= datetime('2010-07-22T00:00:00')
  and $user.user-since <= datetime('2012-07-29T23:59:59')
select {
  "uname": $user.name,
  "messages":
    from $message in dataset MugshotMessages
    where $message.author-id = $user.id
    select $message.message
};
AQL (cont.)
• Ex: Identify active users and group/count them by country:

let $end := current-datetime()
let $start := $end - duration("P30D")
for $user in dataset MugshotUsers
where some $logrecord in dataset AccessLog satisfies
      $user.alias = $logrecord.user
  and datetime($logrecord.time) >= $start
  and datetime($logrecord.time) <= $end
group by $country := $user.address.country with $user
return {
  "country": $country,
  "active users": count($user)
}

AQL highlights:
• Lots of other features (see website!)
• Spatial predicates and aggregation
• Set-similarity matching (next slide!)
• And plans for more…
Fuzzy Matching in AQL
• Ex: Find messages with similar content (tags):

set simfunction "jaccard";
set simthreshold "0.3";

for $msg in dataset MugshotMessages
let $msgsSimilarTags := (
  for $m2 in dataset MugshotMessages
  where $m2.tags ~= $msg.tags
    and $m2.message-id != $msg.message-id
  return $m2.message)
where count($msgsSimilarTags) > 0
return {
  "message": $msg.message,
  "similarly tagged": $msgsSimilarTags
};

Fuzzy matching highlights:
• Not cross-product based!
• Indexes provided/exploited
• Use on text, sets, lists
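What the "jaccard" similarity function computes for the ~= predicate above is the ratio of shared to distinct elements between two sets, compared against the configured threshold. A plain-Python sketch of the measure (illustrative only; AsterixDB's actual implementation avoids the pairwise cross product by using set-similarity indexes, as noted above):

```python
def jaccard(a, b):
    """Jaccard similarity: |A ∩ B| / |A ∪ B| over two collections."""
    a, b = set(a), set(b)
    if not a | b:
        return 0.0          # two empty sets: define similarity as 0
    return len(a & b) / len(a | b)

t1 = {"bigdata", "nosql", "asterixdb"}
t2 = {"bigdata", "asterixdb", "hadoop"}
print(jaccard(t1, t2))      # 2 shared out of 4 distinct tags -> 0.5
```

With simthreshold "0.3", these two tag sets (similarity 0.5) would match, while sets sharing no tags (similarity 0.0) would not.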
Updates and Transactions
• Key-value store-like transaction semantics (record-level ACIDity)
• Insert/delete ops, index-consistent
• 2PL concurrency
• WAL no-steal, no-force and LSM shadowing
• Ex: Add a new user to Mugshot.com:

insert into dataset MugshotUsers (
  { "id": 11,
    "alias": "John",
    "name": "JohnDoe",
    "address": {
      "street": "789 Jane St",
      "city": "San Harry",
      "zip": "98767",
      "state": "CA",
      "country": "USA"
    },
    "user-since": datetime("2010-08-15T08:10:00"),
    "friend-ids": {{ 5, 9, 11 }},
    "employment": [{
      "organization-name": "Kongreen",
      "start-date": date("2012-06-05")
    }]
  }
);
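The "LSM shadowing" idea behind the storage layer can be sketched in miniature: writes go to an in-memory component, which is periodically flushed as an immutable sorted run; newer components shadow older ones on reads. This is a bare-bones illustration of the LSM concept, not AsterixDB's actual storage code; the class and threshold are invented for the sketch.

```python
class LSMTree:
    """Toy log-structured merge tree: memory component + immutable runs."""

    def __init__(self, memory_limit=2):
        self.memtable = {}        # in-memory (mutable) component
        self.runs = []            # flushed immutable runs, newest first
        self.memory_limit = memory_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memory_limit:
            self.flush()

    def flush(self):
        # Write the memory component out as an immutable sorted run.
        self.runs.insert(0, sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        # Check the memory component first, then runs newest to oldest,
        # so newer values shadow older (flushed) ones.
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:
            for k, v in run:
                if k == key:
                    return v
        return None

lsm = LSMTree()
lsm.put("id:1", "Margarita")
lsm.put("id:2", "Isbel")      # hits the limit and triggers a flush
lsm.put("id:1", "Marge")      # newer value shadows the flushed one
print(lsm.get("id:1"))        # 'Marge'
```

Real LSM implementations add write-ahead logging, Bloom filters, binary search within runs, and background merges of runs, but shadowing via newest-first reads is the essential mechanism.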
AsterixDB System Overview
(Figure: data loads and feeds, AQL queries and results, and data publishing all enter the AsterixDB cluster through a Cluster Controller, which coordinates a metadata (MD) Node Controller and additional Node Controllers.)
AsterixDB System Overview (cont.)
(Figure: load, AQL, and feed clients connect to the Asterix Client Interface, backed by the AQL Compiler, Metadata Manager, and Job Execution layer on the Cluster Controller. Each Node Controller – one of them also hosting the metadata – runs a Hyracks Dataflow layer on top of an LSM Tree Manager that manages the node's data partitions.)
ASTERIX Software Stack
(Figure: AQL (AsterixDB), HiveQL (Hivesterix), and XQuery (Apache VXQuery) compile through the Algebricks algebra layer; Pregel jobs (Pregelix) and Hadoop M/R jobs (M/R layer) are also supported. Everything runs as Hyracks jobs on the Hyracks data-parallel platform.)
The New Kid on the Block!
http://asterixdb.ics.uci.edu
A Peek at AsterixDB Performance
Small 10-node IBM cluster with:
- 40 cores
- 40 disks (30 data, 10 log)
- GB Ethernet switch
and similar schema/queries as used in the examples earlier.
A Peek at Performance (cont.)

Some AsterixDB Use Cases
• Recent/projected use case areas include
  – Behavioral science
  – Social data analytics
  – Cell phone event analytics
  – Education
  – Health care
  – Power usage monitoring
• Let's take a quick peek at the first two…
  – Time permitting!
Behavioral Science (HCI)
• First study to use logging and biosensors to measure stress and ICT use of college students in their real-world environment (Gloria Mark, UCI Informatics)
  – Focus: Multitasking and stress among "Millennials"
• Multiple data channels
  – Computer logging
  – Heart rate monitors
  – Daily surveys
  – General survey
  – Exit interview

Learnings for AsterixDB:
• Nature of their analyses
• Extended binning support
• Data format(s) in and out
• Bugs and pain points
Social Data Analysis (based on 2 pilots)

Learnings for AsterixDB:
• Nature of their analyses
• Real vs. synthetic data
• Parallelism (grouping)
• Avoiding materialization
• Bugs and pain points

The underlying AQL query is:

use dataverse twitter;
for $t in dataset TweetMessagesShifted
let $region := create-rectangle(create-point(…, …), create-point(…, …))
let $keyword := "mind-blowing"
where spatial-intersect($t.sender-location, $region)
  and $t.send-time > datetime("2012-01-02T00:00:00Z")
  and $t.send-time < datetime("2012-12-31T23:59:59Z")
  and contains($t.message-text, $keyword)
group by $c := spatial-cell($t.sender-location, create-point(…), 3.0, 3.0) with $t
return {
  "cell": $c,
  "count": count($t)
}
Current Status
• 4-year initial NSF project (250+ KLOC)
• AsterixDB BDMS is now here! (@ June 6th, 2013)
  – Semistructured "NoSQL" style data model
  – Declarative parallel queries, inserts, deletes, …
  – LSM-based storage/indexes (primary & secondary)
  – Internal and external datasets both supported
  – Rich set of data types (including text, time, location)
  – Fuzzy and spatial query processing
  – NoSQL-like transactions (for inserts/deletes)
  – Data feeds and external indexes are waiting in the wings
Collaborations
• Facebook
• Yahoo! Research
• Pivotal (Greenplum)
• Apache Software Foundation
• Oracle Labs
• HTC
• Microsoft Research
• UC Riverside
• UC San Diego
• UC Santa Cruz
• Rice University
• IIT Mumbai
• HKUST
• UCI Informatics
For More Info
AsterixDB project page: http://asterixdb.ics.uci.edu
Open source code base:
• ASTERIX: http://code.google.com/p/asterixdb/
• Hyracks: http://code.google.com/p/hyracks
• (Pregelix: http://hyracks.org/projects/pregelix/)
Whoops – I Almost Forgot!
• Ex: Here's the ever-popular Word Count example in AQL:

create type LineType as open { text: string }

create external dataset TextDataset(LineType)
  using hdfs
  (("hdfs"="hdfs://mjcarey-desktop.ics.uci.edu:54310"),
   ("path"="/user/raman/input/text/small/textFiles"),
   ("input-format"="sequence-input-format"),
   ("format"="delimited-text"));

for $line in dataset TextDataset
let $tokens := word-tokens($line.text)
for $token in $tokens
group by $tok := $token with $token
return { "word": $tok, "count": count($token) };