bleeding edge databases
DESCRIPTION
On Aerospike, AlgebraixData and Google BigQuery for BigDataCampLA
TRANSCRIPT
Bleeding Edge Databases
@LynnLangit
Unstructured Data
Live Tweets on a Building
What is Aerospike?
Real-time NoSQL
• Flash Optimized
• In-memory
• Exponentially Scalable
Super Fast
• 1M TPS on one server (reads)
• 40K TPS on one server (writes)
More
• ACID compliance
• Tunable Consistency
Benchmark Results
• 200,000 TPS (read-write) & 300,000 TPS (read-heavy)
• 10X Faster for R/W loads on SSDs
DEMO
More Benchmark Results
Config
• 10G network
• Aerospike 3
• Same hardware
• 4-node CentOS
Data
• 500GB
• 50M records
Each Record
• 100 bytes
• 23 byte key
• 10 fields
Aerospike Architecture
Example Architecture
How to try it out
• Bare metal or pick a Cloud, set up a VM
• Get the free community edition
• Go…
Linked Open Data Cloud
What is Algebraix Data?
IoT – Semantic Web
Super Powerful: 1 Billion Triples on 1 Node
Native Mathematical Engine
Triple store RDF (Graph)
SPARQL Server™
W3C & OGC compliant RDF / SPARQL Semantic Database
Natively built with proprietary Math
• Algebraix technology (and patents)
Runs on commodity hardware
• In the cloud (or on premises)
• Scales Up and Down
Significantly better benchmark performance
• over leading RDF databases
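To make the "triple store" idea concrete, here is a minimal sketch of RDF-style triples and pattern matching in plain Python. It only illustrates the data model that SPARQL queries; it is not how Algebraix's engine works. The sample triples and the helper names (`match`, `thermometers_in`) are hypothetical.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical sample data, not taken from the talk.
TRIPLES: List[Triple] = [
    ("sensor1", "type", "Thermometer"),
    ("sensor1", "locatedIn", "LosAngeles"),
    ("sensor2", "type", "Thermometer"),
    ("sensor2", "locatedIn", "Seattle"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples that fit the pattern; None acts as a wildcard."""
    for t in triples:
        if (s is None or t[0] == s) and (p is None or t[1] == p) and (o is None or t[2] == o):
            yield t


def thermometers_in(triples: List[Triple], city: str) -> List[str]:
    """Join two patterns on the shared subject, roughly like the SPARQL
    SELECT ?s WHERE { ?s type Thermometer . ?s locatedIn <city> }"""
    typed = {s for s, _, _ in match(triples, p="type", o="Thermometer")}
    located = {s for s, _, _ in match(triples, p="locatedIn", o=city)}
    return sorted(typed & located)


print(thermometers_in(TRIPLES, "Seattle"))
```

A real SPARQL server would parse the query text and plan the join itself; the set intersection here stands in for that step.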
Benchmark Results
• SP2Bench SPARQL Performance Benchmark
SP^2 Benchmark Visualized
DEMO
It’s the Math…
Patents
Runs on common hardware
• Any Cloud or
• On Premises
High Performance & Capacity
• Needs no indexes
• Works particularly well w/sparse data
Self-tuning
• Retains results & intermediate sets
• Supports point-in-time queries
SPARQL Server™
Algebraix Solution Stack
Data Algebra
Database
NoSQL Relational
RDF Semantic
Applications
Meaning
Organization
Optimization & Execution
Conceptual
Data Loaders Query Translators
• Modern abstract algebra
• Zermelo-Fraenkel set theory
• Mathematics-based data management platform
• Universal data language
• Collection of I.P.
• SPARQL Server – RDF
• A2DB - Relational
• Search
• Analytics
• Business Intelligence
• Data Integration
Algebraix Platform
How to try it out
• Sign up on their website
• Try out when notified (this July)
What is Google BigQuery?
QaaS – interactive
RESTful web service
SQL-like language
Queries data stored in Google cloud
Wide Column Tables
Uses OAuth for access control
Very Fast: 750M Rows in <10 secs
Easy & Fast
• Text or JSON
• Up to 100k inserts/sec (streaming)
Load it
• Supports core SQL query concepts
• SELECT, FROM, JOIN, WHERE, ORDER BY, GROUP BY
• Windowing functions (OVER / PARTITION)
• Common Aggregates (SUM, COUNT, MAX)
• Includes ‘analytic’ SQL
• STDDEV, VARIANCE, CORRELATION
• REGEXP_MATCH
Query it
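As a rough illustration of the SQL-like language and the RESTful interface, the sketch below builds (but does not send) a query request. It assumes the v2 REST endpoint for synchronous queries and the public `samples.shakespeare` table, neither of which is named in the slides. The project ID `my-project` and the helper `build_query_request` are hypothetical, and a real call would also need an OAuth bearer token.

```python
import json
from typing import Tuple

# Legacy BigQuery SQL combining WHERE, GROUP BY, ORDER BY, SUM and REGEXP_MATCH.
# The table name is an assumption (a public sample dataset).
SQL = """
SELECT word, SUM(word_count) AS total
FROM [publicdata:samples.shakespeare]
WHERE REGEXP_MATCH(word, r'^th')
GROUP BY word
ORDER BY total DESC
LIMIT 10
"""


def build_query_request(project_id: str, sql: str) -> Tuple[str, str]:
    """Return (url, json_body) for a synchronous query request.

    Nothing is sent over the network; an HTTP client would POST the body
    to the URL with an 'Authorization: Bearer <token>' header.
    """
    url = f"https://www.googleapis.com/bigquery/v2/projects/{project_id}/queries"
    body = {"query": sql, "useLegacySql": True}
    return url, json.dumps(body)


url, body = build_query_request("my-project", SQL)
print(url)
print(body)
```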
• Query is $5 per TB processed
• Storage is around $30 per TB per month
Pay (for) it
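The pricing above lends itself to a quick back-of-the-envelope estimate. The sketch below uses only the two rates quoted on the slide; the function name `monthly_cost` and the example volumes are made up for illustration, and actual pricing may differ.

```python
QUERY_USD_PER_TB = 5.0           # per TB processed by queries (from the slide)
STORAGE_USD_PER_TB_MONTH = 30.0  # approximate, per TB stored per month (from the slide)


def monthly_cost(stored_tb: float, scanned_tb: float) -> float:
    """Estimate one month's bill: storage plus data processed by queries."""
    return stored_tb * STORAGE_USD_PER_TB_MONTH + scanned_tb * QUERY_USD_PER_TB


# Hypothetical workload: 2 TB stored, 10 TB scanned in a month.
# 2 * 30 + 10 * 5 = 110
print(monthly_cost(2, 10))
```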
Benchmark Results
• TPC-H Benchmark
DEMO
Partners and BigQuery
Google Sheets, Tableau, QlikView, Bime, Excel
How to try it out
• Set up a Google Cloud account
• Upload or stream data
• Query
Google Cloud Starter Pack
Use code “gde-in”
Next steps
Try them out
@LynnLangit