Prof. Dr. Stefan Edlich … · Oracle NoSQL Database ConsHash config ACID no single PF DataC...

Prof. Dr. Stefan Edlich http://nosql-database.org


Post on 04-Apr-2018


Page 1: Prof. Dr. Stefan Edlich … · Oracle NoSQL Database ConsHash config ACID no single PF DataC Replication Top Admin “BerkleyDB reloaded

Prof. Dr. Stefan Edlich http://nosql-database.org

Page 2:

2011: The NoSQL Year!

Page 3:

1. HTML5

2. MongoDB

3. iOS

4. Android

5. Mobile app

6. Puppet

7. Hadoop

8. jQuery

9. PaaS

10. Social Media

Page 6:

CouchDB and Membase merger!

1 year ago!

Page 7:

CouchDB + Membase = ??

Roadmap?

Page 9:

No Apache → git trouble and more

Less Erlang → Code more C / C++

No CouchDB → CouchBase Server

→

Damien is leaving CouchDB

Community has to take care of it

no upward compatibility

Page 10:

“And I'm dead serious about making it the easiest, fastest and most reliable NoSQL database. Easy for developers to use, easy to deploy, reliable on single machines or large clusters, and fast as hell.”

Page 11:

UnQL

a successful standard?

unstructured

Damien Katz & Richard Hipp (SQLite)

Page 14:

Richard Hipp (SQLite):

“Damien Katz intends to provide an UnQL interface to CouchDB in the near future, yes.”

Page 15:

jaql!

Query Language for JavaScript Object Notation

source → dest

Page 16:

Oracle ?

MS-SQL ?

IBM DB2 ?

Sybase ?

Page 17:

Ted Neward: “Well, the buzz certainly grew, and it surprised me that the big storage guys (Microsoft, IBM, Oracle) didn't do more to address it; I was expecting features to emerge in their database products to address some of the features present in MongoDB or CouchDB or some of the others, such as "schemaless" or map/reduce-style queries. Even just incorporating JavaScript into the engine somewhere would've generated a reaction.”

Page 19:

“The NoSQL databases are beginning to feel like an ice cream store that entices you with a new flavor of the month,” the white paper read. “[But] you shouldn’t get too attached to any of the flavors because it may not be around for too long.”

white paper: “debunking the (NoSQL) hype”, summer 2011

Page 20:

Oracle NoSQL Database

consistent hashing (ConsHash)

configurable ACID

no single point of failure

data center replication

Top Admin

“BerkeleyDB reloaded”

Hadoop + Manager
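
Note on "ConsHash": the slide only names the technique, presumably consistent hashing. The following is a generic, minimal sketch of a consistent-hash ring with virtual nodes; it is not Oracle's implementation, and the class name `HashRing`, the node names and the choice of MD5 are assumptions made purely for illustration.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a ring position (MD5 is used only for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring with virtual nodes (hypothetical helper)."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each physical node is placed on the ring at `vnodes` positions.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def lookup(self, key):
        """Return the node owning `key`: first ring position at or after its hash."""
        if not self._ring:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}
ring.remove("node-b")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Only keys that lived on the removed node change owner; all other keys stay where they were, which is the property that makes the scheme attractive for clusters that grow and shrink.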

Page 22:

user defined functions in C++ & Java

→ 10x faster than SQL or Stored Procs

UDF connector for Hadoop → ☺

C++ APIs for Map Reduce → ☺

Page 23:

Greenplum, Pervasive

and 100 others too…

Page 27:

Storage configurable

• round-robin automatic load balancing

• replicas

• gateway

Page 28:

- performance

SSD? RAM+DataCenter

+ scale + configure

Attacking: Mongo & Riak & Cassandra

Page 29:

bad things too?

Page 30:

NoSQL = No Security?

less sensitive info?

Page 31:

Key brute force

Array injection: /login.php?username=admin&password[$ne]=1

View injection

REST injection

JSON injection: db.foo.find({$or: [{a:1}, {b:2}, {c:/.*/}]})

HTTP attacks on listeners

wrong cache / proxy configs

Thrift / Avro security
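
Note on the array injection item: the login URL works as an attack because PHP-style parameter parsing turns `password[$ne]=1` into a nested map, which a careless application then passes straight into a MongoDB-style query document. The sketch below imitates that behaviour in plain Python without any database; `parse_php_style`, `build_login_filter` and `build_login_filter_safe` are hypothetical helpers written for this illustration, not part of any driver.

```python
from urllib.parse import parse_qsl


def parse_php_style(query: str) -> dict:
    """Imitate PHP-style expansion of `name[key]=value` into nested maps."""
    params = {}
    for name, value in parse_qsl(query, keep_blank_values=True):
        if name.endswith("]") and "[" in name:
            base, sub = name[:-1].split("[", 1)
            params.setdefault(base, {})[sub] = value
        else:
            params[name] = value
    return params


def build_login_filter(params: dict) -> dict:
    """Naive: attacker-controlled structures end up inside the query document."""
    return {"username": params.get("username"), "password": params.get("password")}


def build_login_filter_safe(params: dict) -> dict:
    """Accept only plain strings so operators such as $ne cannot be smuggled in."""
    username, password = params.get("username"), params.get("password")
    if not isinstance(username, str) or not isinstance(password, str):
        raise ValueError("credentials must be plain strings")
    return {"username": username, "password": password}


attack = parse_php_style("username=admin&password[$ne]=1")
naive_filter = build_login_filter(attack)
# naive_filter now asks for "password not equal to '1'", which matches the admin user.

try:
    build_login_filter_safe(attack)
    rejected = False
except ValueError:
    rejected = True
```

Checking the type of every user-supplied value before it reaches the query is one simple defence; it is shown here only as an example of the idea, not as a complete mitigation.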

Page 34:

2007 SIGOPS

Page 35:

• 15 years of experience from Dynamo, SimpleDB and S3

• ultra scalable and reliable

• uses SSD (!)

• fully managed & no maintenance window!

• multiple synchronous availability zone replication = durability

• provisioned throughput configurable per table

• no fixed schema, any number of attributes & multi value attributes

• consistency and performance tradeoffs possible

• conditional writes & atomic counters

• index: simple hash or composite hash + key/range

• define a table => make a rw capacity reservation

• backup & restore (tables) into S3

• Cloud Watch & Alarms

• 40 million requests per month free
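
Note on "conditional writes & atomic counters": the idea can be shown with a small in-memory model. This is a sketch of the semantics only, not the DynamoDB API; `ToyTable`, `ConditionFailed` and the attribute names are assumptions made for illustration.

```python
import threading


class ConditionFailed(Exception):
    """Raised when a stored attribute does not match the expected value."""


class ToyTable:
    """In-memory stand-in for a table (hypothetical, for illustration only)."""

    def __init__(self):
        self._items = {}
        self._lock = threading.Lock()

    def put(self, key, item, expected=None):
        """Write `item` only if every attribute in `expected` matches the stored item."""
        with self._lock:
            current = self._items.get(key, {})
            for attr, want in (expected or {}).items():
                if current.get(attr) != want:
                    raise ConditionFailed(attr)
            self._items[key] = dict(item)

    def add(self, key, attr, delta=1):
        """Atomically add `delta` to a numeric attribute and return the new value."""
        with self._lock:
            item = self._items.setdefault(key, {})
            item[attr] = item.get(attr, 0) + delta
            return item[attr]

    def get(self, key):
        with self._lock:
            return dict(self._items.get(key, {}))


table = ToyTable()
table.put(101, {"Price": 42})
table.put(101, {"Price": 40}, expected={"Price": 42})  # condition holds, write succeeds

try:
    table.put(101, {"Price": 35}, expected={"Price": 42})  # stale expectation
    rejected = False
except ConditionFailed:
    rejected = True


def bump():
    for _ in range(1000):
        table.add(101, "Views")


threads = [threading.Thread(target=bump) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The conditional write prevents a lost update when two writers start from the same old value, and the counter stays exact under concurrent increments because the read-modify-write happens in one step.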

Page 36:

2ms read 6-8ms

Page 37:

$1.00 per GB-month of storage

$0.01 per hour for every 10 writes/sec (items up to 1 KB)

$0.01 per hour for every 50 reads/sec (items up to 1 KB)

Eventually consistent reads = double the read throughput
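
A worked example of this price list, reading the slide as $1 per GB-month, $0.01 per hour per 10 provisioned writes/sec and $0.01 per hour per 50 provisioned reads/sec. The 730-hour month, the rounding up to whole blocks and the function name `monthly_cost` are assumptions of this sketch, not statements from the slide.

```python
import math

STORAGE_USD_PER_GB_MONTH = 1.00
WRITES_PER_BLOCK = 10       # one cent per hour buys 10 writes/sec
READS_PER_BLOCK = 50        # one cent per hour buys 50 reads/sec
HOURS_PER_MONTH = 730       # assumption: average month length in hours


def monthly_cost(storage_gb, writes_per_sec, reads_per_sec, eventually_consistent=False):
    """Estimate the monthly bill in USD from the slide's price points."""
    if eventually_consistent:
        # Eventually consistent reads go twice as far per provisioned unit.
        reads_per_sec = reads_per_sec / 2
    write_blocks = math.ceil(writes_per_sec / WRITES_PER_BLOCK)  # assumption: round up
    read_blocks = math.ceil(reads_per_sec / READS_PER_BLOCK)     # assumption: round up
    hourly_cents = write_blocks + read_blocks
    throughput_usd = hourly_cents * HOURS_PER_MONTH / 100
    return round(storage_gb * STORAGE_USD_PER_GB_MONTH + throughput_usd, 2)


strong = monthly_cost(storage_gb=20, writes_per_sec=100, reads_per_sec=500)
eventual = monthly_cost(storage_gb=20, writes_per_sec=100, reads_per_sec=500,
                        eventually_consistent=True)
```

With 20 GB, 100 writes/sec and 500 reads/sec this gives 166.00 USD per month for consistent reads and 129.50 USD when eventually consistent reads are acceptable.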

Page 42:

{
  Id = 101
  ProductName = "NoSQL Book"
  ISBN = "978-3446427532"
  Authors = [ "Author 1", "Author 2" ]
  Price = -42
  Dimensions = "8.5 x 11.0 x 0.5"
  PageCount = 500
  InPublication = 1
  ProductCategory = "Book"
}

db x tables x items x attributes

uses JSON as serialized transport format!

Page 43:

REST API

Table → create, describe, list, update

Data → put (create/update), get, update, delete, query, scan, batch

// This header is abbreviated.
// For a sample of a complete header, see link.
POST / HTTP/1.1
x-amz-target: DynamoDB_20111205.PutItem
content-type: application/x-amz-json-1.0

{"TableName":"Table1",
  "Item":{
    "AttributeName1":{"S":"AttributeValue1"},
    "AttributeName2":{"N":"AttributeValue2"}
  },
  "Expected":{"AttributeName3":{"Value":{"S":"AttributeValue"}, "Exists":Boolean}},
  "ReturnValues":"ReturnValuesConstant"}

HTTP/1.1 200
x-amzn-RequestId: 8966d095-71e9-11e0-a498-71d736f27375
content-type: application/x-amz-json-1.0
content-length: 85

{"Attributes":{
    "AttributeName3":{"S":"AttributeValue3"},
    "AttributeName2":{"SS":"AttributeValue2"},
    "AttributeName1":{"SS":"AttributeValue1"}
  },
  "ConsumedCapacityUnits":1
}
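
A small sketch of how a PutItem request body in this JSON format could be assembled. It only builds the two headers named on the slide and the payload; request signing and sending are left out. The helpers `to_attribute_value` and `build_put_item` are hypothetical, and the type tags are limited to the ones visible in the slide's request and response (S, N, SS) plus NS for number sets.

```python
import json


def to_attribute_value(value):
    """Wrap a Python value in a typed attribute value: S, N, SS or NS."""
    if isinstance(value, bool):
        raise TypeError("booleans are not covered by this sketch")
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, (list, tuple, set)) and value:
        items = list(value)
        if all(isinstance(v, str) for v in items):
            return {"SS": items}
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in items):
            return {"NS": [str(v) for v in items]}
    raise TypeError(f"unsupported value: {value!r}")


def build_put_item(table, item, expected=None):
    """Return (headers, json_body) for a PutItem call; nothing is sent."""
    body = {
        "TableName": table,
        "Item": {name: to_attribute_value(v) for name, v in item.items()},
    }
    if expected:
        body["Expected"] = {
            name: {"Value": to_attribute_value(v)} for name, v in expected.items()
        }
    headers = {
        "x-amz-target": "DynamoDB_20111205.PutItem",
        "content-type": "application/x-amz-json-1.0",
    }
    return headers, json.dumps(body)


headers, payload = build_put_item(
    "Table1",
    {"Id": 101, "ProductName": "NoSQL Book",
     "Authors": ["Author 1", "Author 2"], "Price": -42},
)
```

In practice one of the AWS SDKs would do this wrapping, signing and transport; the sketch is only meant to make the wire format easier to read.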

Page 44:

AWS SDK for Java, .NET, PHP

// Java get (client and printItem are assumed to be defined elsewhere in the class)
private static void getBook(String id, String tableName) {
    // Look the item up by its numeric hash key and fetch only four attributes.
    GetItemRequest getItemRequest = new GetItemRequest()
        .withTableName(tableName)
        .withKey(new Key().withHashKeyElement(new AttributeValue().withN(id)))
        .withAttributesToGet(Arrays.asList("Id", "ISBN", "Title", "Authors"));

    GetItemResult result = client.getItem(getItemRequest);

    System.out.println("Printing item after retrieving it....");
    printItem(result.getItem());
}

Page 45:

64 KB Data Limit

string + int

multi value string-ints

multiKV, references, schemachecks, …

API → DSLs ☺

Page 47:

Here are the six urban myths that Mr. Stonebraker says NoSQL advocates incorrectly perpetuate:

• Myth #1: SQL is too slow, so use a lower level interface

• Myth #2: I like a K-V interface, so SQL is a non-starter

• Myth #3: SQL systems don’t scale

• Myth #4: There are no open source, scalable SQL engines

• Myth #5: ACID is too slow, so avoid using it

• Myth #6: in CAP, choose AP over CA

Page 48:

strikes back

Page 49:

© 451 Group Report / 5.4.2011

Page 50:

Overview

Page 52:

Java Stored Procedures!

Page 53:

RAM with 100,000 ops/s per node

“VoltDB claims to be 100 times faster than MySQL, up to 13 times faster than Cassandra, and 45 times faster than Oracle, with near-linear scaling.” (highscalability blog)

ACID with partitioned tables

Nearly SQL-99 and ALTER & DROP

schema changes require shutdown

static query parametrization

Page 54:

Source: Percona MySQL Performance Blog

Page 56:

SSD optimized and disk

C’t: 10-100 TB ok then weaker

10x faster

scaling across cores

random access read pattern

Page 57:

QPS on SSD: 84,426 vs. 14,763

5.5x faster

− memcached API more soon

− no structured data

− horizontal scaling for nodes

Page 58:

• "terabytes of data, billions of objects, and 200K plus transactions per second per node, with sub-millisecond latency."

• e.g. real-time bidding

• transactions / ACID

• linear & elastic horizontal scalability

• flash/SSD support

RTA™

• data expiration

• append list

Page 59:

• API: C, C#, Java, Ruby, Python & PHP

• no master node

• 200k ops/sec per node read, 50k ops/sec per node write

Page 61:

Check hybrid solutions!

easier & better than memcache + RDBMS

Page 62:

Problem: privilege checks, cache queries, connection pooling / thread creation, parsing SQL, open, lock, exec plans, concurrency control, unlock, close, …

© fromdual.com

Page 63:

Source: Yoshinori Matsunobu

keep tables open & simple protocol

Page 64:

Performance

Transactions

Concurrent Access

No Cache / Crash-Safe

no SQL, but more than K/V: ranges, LIMIT, CRUD, multi_get, …

no Security

new API

Page 65:

© percona.com


Page 66:

Conclusion #1

Page 67:

Conclusion #2

There is no “one perfect solution”

Check hybrid solutions and NewSQL DBs too!

Page 68:

© geekandpoke.com