AsterixDB and the Open Source Big Data Landscape – Chen Li · 3/7/15
TRANSCRIPT
AsterixDB and the Open Source Big Data Landscape
Michael Carey
Information Systems Group, CS Department
UC Irvine
#AsterixDB
Rough Topical Plan
• Background and motivation
• A dual-universe history of Big Data
• Big Data landscape (from satellite images)
• AsterixDB: a next-generation BDMS
  – AsterixDB viewed from the outside
  – Internal architecture & software stack
• Initial performance & case studies (if time)
• Project status and Q&A
Everyone’s Talking About Big Data
• Driven by unprecedented growth in data being generated and its potential uses and value
  – Tweets, social networks (statuses, check-ins, shared content), blogs, click streams, various logs, …
  – Facebook: > 845M active users, > 8B messages/day
  – Twitter: > 140M active users, > 340M tweets/day
Big Data / Web Warehousing
So what went on – and why? What’s going on right now?
What’s going on…?
Big Data in the Database World
• Enterprises needed to store and query historical business data (data warehouses)
  – 1980's: Parallel database systems based on "shared-nothing" architectures (Gamma/GRACE, Teradata)
  – 2000's: Netezza, Aster Data, DATAllegro, Greenplum, Vertica, ParAccel ("Big $" acquisitions!)
• OLTP is another category (a source of Big Data)
  – 1980's: Tandem's NonStop SQL system
Parallel Database Software Stack
Notes:
• One storage manager per machine in a parallel cluster
• Upper layers orchestrate their shared-nothing cooperation
• One way in/out: through the SQL door at the top
Big Data in the Systems World
• Late 1990's brought a need to index and query the rapidly exploding content of the Web
  – DB technology tried but failed (e.g., Inktomi)
  – Google, Yahoo! et al. needed to do something
• Google responded by laying a new foundation
  – Google File System (GFS)
    • OS-level byte stream files spanning 1000's of machines
    • Three-way replication for fault-tolerance (availability)
  – MapReduce (MR) programming model
    • User functions: Map and Reduce (and optionally Combine)
    • "Parallel programming for dummies" – MR runtime does the heavy lifting via partitioned parallelism
(Figure: MapReduce word count example – input splits (distributed) flow to mapper outputs, through the SHUFFLE PHASE (based on keys) to reducer inputs, and finally to reducer outputs (distributed). Partitioned parallelism!)
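The map → shuffle → reduce flow of the word count example can be sketched in a few lines of plain Python. This is a single-process illustration of the programming model only, not Hadoop's actual API; the function names are made up for the sketch.

```python
from collections import defaultdict

def map_phase(line):
    """Mapper: emit (word, 1) for every word in an input line."""
    for word in line.split():
        yield (word, 1)

def shuffle(pairs):
    """Shuffle phase: group mapper outputs by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reducer: sum the counts for one word."""
    return (key, sum(values))

def word_count(lines):
    pairs = [kv for line in lines for kv in map_phase(line)]
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())

counts = word_count(["to be or not to be"])
print(counts)  # {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```

In a real MR runtime each phase runs partitioned across machines, with the shuffle moving data between them; that partitioned parallelism is exactly the "heavy lifting" the slide refers to.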
Soon a Star Was Born…
• Yahoo!, Facebook, and friends read the papers
  – HDFS and Hadoop MapReduce now in wide use for indexing, clickstream analysis, log analysis, …
• Higher-level languages subsequently developed
  – Pig (Yahoo!) – rel. algebra; Hive (Facebook) – SQL
  – Pig & Hive both now strongly preferred to bare MR
• Key-value ("NoSQL") stores are another category
  – Used to power scalable social sites, online games, …
  – BigTable → HBase, Dynamo → Cassandra, MongoDB, …
  – Roughly: distributed B+ tree(s) with get/put client API
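The "get/put client API" shape can be illustrated with a toy hash-partitioned store. This is a sketch of the general idea behind stores like HBase or Cassandra, not any real client library; the class and method layout is invented for illustration.

```python
import hashlib

class KVCluster:
    """Toy key-value 'cluster': keys hash-partitioned across nodes,
    exposing only get/put (no joins, no ad hoc queries)."""

    def __init__(self, num_nodes=4):
        # Each node's storage, here just an in-memory dict per node.
        self.nodes = [dict() for _ in range(num_nodes)]

    def _node_for(self, key):
        # Route a key to its owning node by hashing.
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return self.nodes[h % len(self.nodes)]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key):
        return self._node_for(key).get(key)

kv = KVCluster()
kv.put("user:42", {"status": "checked in"})
print(kv.get("user:42"))  # {'status': 'checked in'}
```

Real systems add persistence, replication, and range-partitioned B+ tree-like storage per node, but the narrow get/put interface is the point of contrast with full query languages.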
Open Source Big Data Stack
Notes:
• Giant byte sequence files at the bottom
• Map, sort, shuffle, reduce layer in middle
• Possible storage layer in middle as well
• Now at the top: HLL's (Huh…?)
Apache Pig (Pig Latin)
• Scripting language inspired by the relational algebra
  – Compiles down to a series of Hadoop MR jobs
  – Relational operators include LOAD, FOREACH, FILTER, GROUP, COGROUP, JOIN, ORDER BY, LIMIT, ...
Apache Hive (HiveQL)
• Query language inspired by an old favorite: SQL
  – Compiles down to a series of Hadoop MR jobs
  – Supports various HDFS file formats (text, columnar, ...)
  – Numerous contenders appearing that take a non-MR-based runtime approach (duh!) – these include Impala, Stinger, Spark SQL, ...
Other Up-and-Coming Platforms (I)
• Spark for in-memory cluster computing – for doing repetitive data analyses, iterative machine learning tasks, ...
(Figure: one-time processing loads input into distributed memory and then serves query 1, query 2, query 3, ...; iterative processing reuses that in-memory data across iter. 1, iter. 2, ...)
(Especially gaining traction for scaling Machine Learning)
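Why in-memory reuse matters for iterative jobs can be shown with a toy gradient-descent loop. This is plain Python mimicking the idea of a cached ("persisted") dataset, not the Spark API; the data and function names are invented for the sketch.

```python
def load_input():
    # Stand-in for an expensive distributed file scan. An MR-based
    # engine would redo this read on every iteration; a Spark-style
    # engine does it once and keeps the partitions in memory.
    return [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]

def gradient_step(data, w, lr=0.01):
    # One least-squares gradient step for the model y ~ w * x.
    grad = sum(2 * x * (w * x - y) for x, y in data) / len(data)
    return w - lr * grad

cached = load_input()      # read once, keep in memory ("persist")
w = 0.0
for _ in range(200):       # each iteration reuses the cached data
    w = gradient_step(cached, w)
print(round(w, 1))         # converges to roughly 2.0, the true slope
```

The per-iteration savings (no rereading, no re-shuffling of static input) is the core reason Spark gained traction for iterative machine learning workloads.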
Other Up-and-Coming Platforms (II)
• Bulk Synchronous Programming (BSP) platforms, e.g., Pregel, Giraph, GraphLab, ..., for Big Graph analytics
  ("Big" is the platform's concern)
• "Think Like a Vertex"
  – Receive messages
  – Update state
  – Send messages
• Quite a few BSP-based platforms available
  – Pregel (Google)
  – Giraph (Facebook, LinkedIn, Twitter, Yahoo!, ...)
  – Hama (Sogou, Korea Telecomm, ...)
  – Distributed GraphLab (CMU, Washington)
  – GraphX (Berkeley)
  – Pregelix (UCI)
  – ...
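The receive/update/send loop above can be sketched as a minimal BSP computation. This is an illustrative single-process sketch, not the Pregel or Giraph API; it runs the classic example where every vertex converges to the maximum value in its connected component, with a global barrier between supersteps.

```python
def bsp_max(values, edges, max_supersteps=20):
    """values: vertex -> initial value; edges: vertex -> neighbor list."""
    state = dict(values)
    # Superstep 0: every vertex announces its value to its neighbors.
    inbox = {v: [] for v in state}
    for v in state:
        for nbr in edges.get(v, []):
            inbox[nbr].append(state[v])
    for _ in range(max_supersteps):
        outbox = {v: [] for v in state}
        changed = False
        for v, msgs in inbox.items():          # "each vertex, in parallel"
            new = max(msgs, default=state[v])  # receive messages
            if new > state[v]:
                state[v] = new                 # update state
                changed = True
                for nbr in edges.get(v, []):   # send messages
                    outbox[nbr].append(new)
        if not changed:                        # every vertex votes to halt
            break
        inbox = outbox                         # barrier: next superstep
    return state

graph = {1: [2], 2: [1, 3], 3: [2]}
print(bsp_max({1: 7, 2: 3, 3: 5}, graph))  # {1: 7, 2: 7, 3: 7}
```

A real BSP platform distributes the vertices across machines and implements the barrier and message delivery for you; the per-vertex compute function is all the user writes.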
Also: Today's Big Data Tangle
(Figure: a tangle of interconnected Big Data systems, Pig among them)
AsterixDB: "One Size Fits a Bunch"
Semistructured Data Management + Parallel Database Systems + Data-Intensive Computing
BDMS Desiderata:
• Flexible data model
• Efficient runtime
• Full query capability
• Cost proportional to task at hand (!)
• Designed for continuous data ingestion
• Support today's "Big Data data types"
• ...
Project Goals
• Build a new Big Data Management System (BDMS)
  – Run on large commodity clusters
  – Handle mass quantities of semistructured data
  – Openly layered, for selective reuse by others
  – Share with the community via open source
• Conduct scalable information systems research, e.g.,
  – Large-scale query processing and workload management
  – Highly scalable storage and index management
  – Fuzzy matching, spatial data, date/time data (all in parallel)
  – Novel support for "fast data" (both in and out)
• Train next generation of "Big Data" graduates
ASTERIX Data Model (ADM)

Highlights include:
• JSON++ based data model
• Rich type support (spatial, temporal, …)
• Records, lists, bags
• Open vs. closed types

create dataverse TinySocial;
use dataverse TinySocial;

create type MugshotUserType as {
  id: int32,
  alias: string,
  name: string,
  user-since: datetime,
  address: {
    street: string,
    city: string,
    state: string,
    zip: string,
    country: string
  },
  friend-ids: {{ int32 }},
  employment: [EmploymentType]
}

create type EmploymentType as open {
  organization-name: string,
  start-date: date,
  end-date: date?
}

create dataset MugshotUsers(MugshotUserType)
  primary key id;
create type MugshotMessageType as closed {
  message-id: int32,
  author-id: int32,
  timestamp: datetime,
  in-response-to: int32?,
  sender-location: point?,
  tags: {{ string }},
  message: string
}

create dataset MugshotMessages(MugshotMessageType)
  primary key message-id;
Ex: MugshotUsers Data

{ "id": 1,
  "alias": "Margarita",
  "name": "MargaritaStoddard",
  "address": {
    "street": "234 Thomas Ave",
    "city": "San Hugo",
    "zip": "98765",
    "state": "CA",
    "country": "USA"
  },
  "user-since": datetime("2012-08-20T10:10:00"),
  "friend-ids": {{ 2, 3, 6, 10 }},
  "employment": [{
    "organization-name": "Codetechno",
    "start-date": date("2006-08-06")
  }]
}
{ "id": 2,
  "alias": "Isbel",
  "name": "IsbelDull",
  "address": { "street": "345 James Ave", "city": "San Hugo", "zip": "98765", "state": "CA", "country": "USA" },
  "user-since": datetime("2011-01-22T10:10:00"),
  "friend-ids": {{ 1, 4 }},
  "employment": [{ "organization-name": "Hexviafind", "start-date": date("2010-04-27") }]
}
{ "id": 3,
  "alias": "Emory",
  "name": "EmoryUnk",
  "address": { "street": "456 Jose Ave", "city": "San Hugo", "zip": "98765", "state": "CA", "country": "USA" },
  "user-since": datetime("2012-07-10T10:10:00"),
  "friend-ids": {{ 1, 5, 8, 9 }},
  "employment": [{ "organization-name": "geomedia", "start-date": date("2010-06-17"), "end-date": date("2010-01-26") }]
}
...
Other DDL Features

create index msUserSinceIdx on MugshotUsers(user-since);
create index msTimestampIdx on MugshotMessages(timestamp);
create index msAuthorIdx on MugshotMessages(author-id) type btree;
create index msSenderLocIndex on MugshotMessages(sender-location) type rtree;
create index msMessageIdx on MugshotMessages(message) type keyword;

create type AccessLogType as closed {
  ip: string,
  time: string,
  user: string,
  verb: string,
  path: string,
  stat: int32,
  size: int32
};

create external dataset AccessLog(AccessLogType)
  using localfs
  (("path"="{hostname}://{path}"),
   ("format"="delimited-text"),
   ("delimiter"="|"));

create feed socket_feed
  using socket_adaptor
  (("sockets"="{address}:{port}"),
   ("addressType"="IP"),
   ("type-name"="MugshotMessageType"),
   ("format"="adm"));
connect feed socket_feed to dataset MugshotMessages;

External data highlights:
• Equal opportunity access
• "Keep everything!"
• Data ingestion, not streams
• Queries unchanged
ASTERIX Query Language (AQL)
• Ex: List the user name and messages sent by those users who joined the Mugshot social network in a certain time window:

for $user in dataset MugshotUsers
where $user.user-since >= datetime('2010-07-22T00:00:00')
  and $user.user-since <= datetime('2012-07-29T23:59:59')
return {
  "uname": $user.name,
  "messages":
    for $message in dataset MugshotMessages
    where $message.author-id = $user.id
    return $message.message
};
Nested Opposite of SQL (NJSQL)
• Ex: The same query – user names and messages for users who joined in a certain time window – in a more SQL-like from/select syntax:

from $user in dataset MugshotUsers
where $user.user-since >= datetime('2010-07-22T00:00:00')
  and $user.user-since <= datetime('2012-07-29T23:59:59')
select {
  "uname": $user.name,
  "messages":
    from $message in dataset MugshotMessages
    where $message.author-id = $user.id
    select $message.message
};
AQL (cont.)
• Ex: Identify active users and group/count them by country:

let $end := current-datetime()
let $start := $end - duration("P30D")
for $user in dataset MugshotUsers
where some $logrecord in dataset AccessLog satisfies
      $user.alias = $logrecord.user
  and datetime($logrecord.time) >= $start
  and datetime($logrecord.time) <= $end
group by $country := $user.address.country with $user
return {
  "country": $country,
  "active users": count($user)
}

AQL highlights:
• Lots of other features (see website!)
• Spatial predicates and aggregation
• Set-similarity matching (next slide!)
• And plans for more…
Fuzzy Matching in AQL
• Ex: Find messages with similar content (tags):

set simfunction "jaccard";
set simthreshold "0.3";

for $msg in dataset MugshotMessages
let $msgsSimilarTags := (
  for $m2 in dataset MugshotMessages
  where $m2.tags ~= $msg.tags
    and $m2.message-id != $msg.message-id
  return $m2.message)
where count($msgsSimilarTags) > 0
return {
  "message": $msg.message,
  "similarly tagged": $msgsSimilarTags
};

Fuzzy matching highlights:
• Not cross-product based!
• Indexes provided/exploited
• Use on text, sets, lists
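What the "jaccard" similarity function computes for the ~= predicate above is the ratio of shared to distinct elements between two sets, compared against the configured threshold. A plain-Python sketch of the measure (illustrative only; AsterixDB's actual implementation avoids the pairwise cross product by using set-similarity indexes, as noted above):

```python
def jaccard(a, b):
    """Jaccard similarity: |A ∩ B| / |A ∪ B| over two collections."""
    a, b = set(a), set(b)
    if not a | b:
        return 0.0          # two empty sets: define similarity as 0
    return len(a & b) / len(a | b)

t1 = {"bigdata", "nosql", "asterixdb"}
t2 = {"bigdata", "asterixdb", "hadoop"}
print(jaccard(t1, t2))      # 2 shared out of 4 distinct tags -> 0.5
```

With simthreshold "0.3", these two tag sets (similarity 0.5) would match, while sets sharing no tags (similarity 0.0) would not.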
Updates and Transactions
• Key-value store-like transaction semantics (record-level ACIDity)
• Insert/delete ops, index-consistent
• 2PL concurrency
• WAL no-steal, no-force and LSM shadowing
• Ex: Add a new user to Mugshot.com:

insert into dataset MugshotUsers (
  { "id": 11,
    "alias": "John",
    "name": "JohnDoe",
    "address": {
      "street": "789 Jane St",
      "city": "San Harry",
      "zip": "98767",
      "state": "CA",
      "country": "USA"
    },
    "user-since": datetime("2010-08-15T08:10:00"),
    "friend-ids": {{ 5, 9, 11 }},
    "employment": [{
      "organization-name": "Kongreen",
      "start-date": date("2012-06-05")
    }]
  }
);
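The "LSM shadowing" idea behind the storage layer can be sketched in miniature: writes go to an in-memory component, which is periodically flushed as an immutable sorted run; newer components shadow older ones on reads. This is a bare-bones illustration of the LSM concept, not AsterixDB's actual storage code; the class and threshold are invented for the sketch.

```python
class LSMTree:
    """Toy log-structured merge tree: memory component + immutable runs."""

    def __init__(self, memory_limit=2):
        self.memtable = {}        # in-memory (mutable) component
        self.runs = []            # flushed immutable runs, newest first
        self.memory_limit = memory_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memory_limit:
            self.flush()

    def flush(self):
        # Write the memory component out as an immutable sorted run.
        self.runs.insert(0, sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        # Check the memory component first, then runs newest to oldest,
        # so newer values shadow older (flushed) ones.
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:
            for k, v in run:
                if k == key:
                    return v
        return None

lsm = LSMTree()
lsm.put("id:1", "Margarita")
lsm.put("id:2", "Isbel")      # hits the limit and triggers a flush
lsm.put("id:1", "Marge")      # newer value shadows the flushed one
print(lsm.get("id:1"))        # 'Marge'
```

Real LSM implementations add write-ahead logging, Bloom filters, binary search within runs, and background merges of runs, but shadowing via newest-first reads is the essential mechanism.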
AsterixDB System Overview
(Figure: data loads and feeds, AQL queries and results, and data publishing all enter the AsterixDB cluster through a Cluster Controller, which coordinates a metadata (MD) Node Controller and additional Node Controllers.)
AsterixDB System Overview (cont.)
(Figure: load, AQL, and feed clients connect to the Asterix Client Interface, backed by the AQL Compiler, Metadata Manager, and Job Execution layer on the Cluster Controller. Each Node Controller – one of them also hosting the metadata – runs a Hyracks Dataflow layer on top of an LSM Tree Manager that manages the node's data partitions.)
ASTERIX Software Stack
(Figure: AQL (AsterixDB), HiveQL (Hivesterix), and XQuery (Apache VXQuery) compile through the Algebricks algebra layer; Pregel jobs (Pregelix) and Hadoop M/R jobs (M/R layer) are also supported. Everything runs as Hyracks jobs on the Hyracks data-parallel platform.)
The New Kid on the Block!
http://asterixdb.ics.uci.edu
A Peek at AsterixDB Performance
Small 10-node IBM cluster with:
- 40 cores
- 40 disks (30 data, 10 log)
- GB Ethernet switch
and similar schema/queries as used in the examples earlier.
A Peek at Performance (cont.)

Some AsterixDB Use Cases
• Recent/projected use case areas include
  – Behavioral science
  – Social data analytics
  – Cell phone event analytics
  – Education
  – Health care
  – Power usage monitoring
• Let's take a quick peek at the first two…
  – Time permitting!
Behavioral Science (HCI)
• First study to use logging and biosensors to measure stress and ICT use of college students in their real-world environment (Gloria Mark, UCI Informatics)
  – Focus: Multitasking and stress among "Millennials"
• Multiple data channels
  – Computer logging
  – Heart rate monitors
  – Daily surveys
  – General survey
  – Exit interview

Learnings for AsterixDB:
• Nature of their analyses
• Extended binning support
• Data format(s) in and out
• Bugs and pain points
Social Data Analysis (based on 2 pilots)

Learnings for AsterixDB:
• Nature of their analyses
• Real vs. synthetic data
• Parallelism (grouping)
• Avoiding materialization
• Bugs and pain points

The underlying AQL query is:

use dataverse twitter;
for $t in dataset TweetMessagesShifted
let $region := create-rectangle(create-point(…, …), create-point(…, …))
let $keyword := "mind-blowing"
where spatial-intersect($t.sender-location, $region)
  and $t.send-time > datetime("2012-01-02T00:00:00Z")
  and $t.send-time < datetime("2012-12-31T23:59:59Z")
  and contains($t.message-text, $keyword)
group by $c := spatial-cell($t.sender-location, create-point(…), 3.0, 3.0) with $t
return {
  "cell": $c,
  "count": count($t)
}
Current Status
• 4-year initial NSF project (250+ KLOC)
• AsterixDB BDMS is now here! (@ June 6th, 2013)
  – Semistructured "NoSQL" style data model
  – Declarative parallel queries, inserts, deletes, …
  – LSM-based storage/indexes (primary & secondary)
  – Internal and external datasets both supported
  – Rich set of data types (including text, time, location)
  – Fuzzy and spatial query processing
  – NoSQL-like transactions (for inserts/deletes)
  – Data feeds and external indexes are waiting in the wings
Collaborations
• Facebook
• Yahoo! Research
• Pivotal (Greenplum)
• Apache Software Foundation
• Oracle Labs
• HTC
• Microsoft Research
• UC Riverside
• UC San Diego
• UC Santa Cruz
• Rice University
• IIT Mumbai
• HKUST
• UCI Informatics
For More Info
AsterixDB project page: http://asterixdb.ics.uci.edu
Open source code base:
• ASTERIX: http://code.google.com/p/asterixdb/
• Hyracks: http://code.google.com/p/hyracks
• (Pregelix: http://hyracks.org/projects/pregelix/)
Whoops – I Almost Forgot!
• Ex: Here's the ever-popular Word Count example in AQL:

create type LineType as open { text: string }

create external dataset TextDataset(LineType)
  using hdfs
  (("hdfs"="hdfs://mjcarey-desktop.ics.uci.edu:54310"),
   ("path"="/user/raman/input/text/small/textFiles"),
   ("input-format"="sequence-input-format"),
   ("format"="delimited-text"));

for $line in dataset TextDataset
let $tokens := word-tokens($line.text)
for $token in $tokens
group by $tok := $token with $token
return { "word": $tok, "count": count($token) };