Beware of your hype value stores

Post on 28-Nov-2014

DESCRIPTION

Key value stores are popping up all over the web, describing themselves as the best fit for your webapp… but beware of the hype! Think you’ll have 10000+ writes a sec? There are a lot of unspoken “special features” that you have to know about. Don’t trust benchmarks, even your own. Here’s how to choose the right key value store for your app! The video is also available on YouTube: http://www.youtube.com/watch?v=YZD8-EzozKQ

TRANSCRIPT

Beware of your Hype – Value store !

Ignite Velocity 2009

Jérémie BORDIER -

@ahfeel / jeremie.bordier@exalead.com

Get the Hype!

• Relational DBMS are so 1990…
• So simple, so fast, looks powerful!

Tokyo Tyrant, Redis, Scalaris, Project Voldemort, Hypertable, Cassandra, MemcacheDB, MongoDB, Persevere, BerkeleyDB, CouchDB, Dynomite, SimpleDB, Dynamo

10000+ writes/sec!

• RDBMS only performs ±300 w/sec… wtf?

Let’s do some simple math !

State of the art mathematics

• Average SATA II disk seek time: ±6 ms
• Very good SCSI disk: ±3 ms

1 second / 3ms = ±300 REAL writes / sec
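
The arithmetic on this slide can be checked with a quick sketch. It assumes each synced write costs exactly one disk seek and nothing else; the function name `max_synced_writes_per_sec` is made up for illustration.

```python
def max_synced_writes_per_sec(seek_ms: float) -> int:
    """Upper bound on writes/sec if every write pays one full disk seek."""
    return int(1000 // seek_ms)


# Seek times are the ones quoted on the slide.
print(max_synced_writes_per_sec(6))  # average SATA II disk
print(max_synced_writes_per_sec(3))  # very good SCSI disk
```

Under these assumptions the 3 ms disk tops out at 333 writes/sec, which the slide rounds to ±300.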

Eventual Persistency

• Doesn’t sync the filesystem
• Keep that in mind!

Your servers WILL crash, you WILL lose data
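
What "doesn't sync the filesystem" means in practice can be sketched as follows. This is a minimal illustration, not how any particular store is implemented; `append_record` and its `durable` flag are hypothetical names. Without `os.fsync`, the record may still be sitting in the OS cache when the server crashes.

```python
import os
import tempfile


def append_record(path: str, record: bytes, durable: bool) -> None:
    """Append one record; optionally force it to disk before returning."""
    with open(path, "ab") as f:
        f.write(record)
        if durable:
            f.flush()             # push Python's buffer to the OS
            os.fsync(f.fileno())  # ask the OS to push its cache to the disk


path = os.path.join(tempfile.mkdtemp(), "store.log")
append_record(path, b"key1=a\n", durable=False)  # fast, may be lost on a crash
append_record(path, b"key2=b\n", durable=True)   # slower, pays the disk round trip
```

The durable path is the one bounded by disk seek time; the non-durable path is where the very high write figures come from.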

Don’t care about writes…

• If it can do 30000+ lookups/sec!

Not for long !

Eventual memory

• Relies on B+ Trees
• Map all the data structures in memory

What if your data grows too big?

Lookup…

• get(“elevator”)
  – Visit 8 nodes
  – Read the data from disk

[Diagram: a tree with one letter per node; the lookup follows E, L, E, V, A, T, O, R and ends at a leaf holding “Data: 42”]
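
The lookup on this slide can be mimicked with a toy character trie. This is only a sketch of the counting argument: the slides mention B+ trees, whose real node layout and fanout differ, and the `Node`, `put`, and `get` names are illustrative. The model assumes the root stays in memory and every other node visit is a random disk read.

```python
class Node:
    """One trie node; in this toy model, visiting a node costs one disk I/O."""

    def __init__(self):
        self.children = {}
        self.value = None


def put(root, key, value):
    """Insert a key, creating one node per character."""
    node = root
    for ch in key:
        node = node.children.setdefault(ch, Node())
    node.value = value


def get(root, key):
    """Return (value, io_count): one I/O per node visited, plus one for the data."""
    node, ios = root, 0
    for ch in key:
        node = node.children.get(ch)
        if node is None:
            return None, ios
        ios += 1                 # visiting a node = one random read
    if node.value is None:
        return None, ios
    return node.value, ios + 1   # final read of the data itself


root = Node()
put(root, "elevator", 42)
put(root, "east", 7)
value, ios = get(root, "elevator")
print(value, ios)
```

For "elevator" this counts 8 node visits plus 1 data read, so 9 random I/Os in total.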

Why elevator?

• American elevators are wayyyy too fast!
• Feels like a NASA training

Makes me want to throw up :(

Hardware limit!

• On very good hardware!

Up to 9 RANDOM I/Os per lookup !

1 second / (9 * 3ms) = 37 get / sec

We did that too :)

• We had similar algorithms
• Encountered horrible perf decreases…

REDUCE I/O !

Ensure only 1 I/O per lookup.
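
One common way to get down to a single I/O per lookup is to keep a small in-memory index that maps each key to the position of its value in a data file. The slides do not say which approach the author used, so the sketch below is an assumption; `OneSeekStore` is a hypothetical class, not an API of any store named here.

```python
import os
import tempfile


class OneSeekStore:
    """Append-only data file plus an in-memory key -> (offset, length) index."""

    def __init__(self, path: str):
        self.index = {}
        self.file = open(path, "a+b")

    def put(self, key: str, value: bytes) -> None:
        self.file.seek(0, os.SEEK_END)
        offset = self.file.tell()
        self.file.write(value)
        self.file.flush()
        self.index[key] = (offset, len(value))

    def get(self, key: str):
        entry = self.index.get(key)   # memory only, no disk access
        if entry is None:
            return None
        offset, length = entry
        self.file.seek(offset)        # the single random I/O
        return self.file.read(length)

    def close(self) -> None:
        self.file.close()


store = OneSeekStore(os.path.join(tempfile.mkdtemp(), "data.bin"))
store.put("elevator", b"42")
store.put("stairs", b"7")
print(store.get("elevator"))
```

The trade-off in this design is that the index has to fit in RAM and must be rebuilt or persisted separately after a restart.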

Don’t trust benchmarks

• A few million entries isn’t enough
• You’re benchmarking your Disk/OS cache :)

(Come on, DON’T BENCHMARK IN RUBY…)
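
A rough way to check whether a benchmark can even leave the cache is to compare the dataset size with the machine's RAM. The sketch below assumes a flat average entry size and ignores store overhead; the function name and the example figures (200-byte entries, 8 GiB of RAM) are illustrative, not measurements.

```python
def dataset_fits_in_ram(entries: int, avg_entry_bytes: int, ram_bytes: int) -> bool:
    """True if the whole dataset could sit in the OS cache, so reads may never hit disk."""
    return entries * avg_entry_bytes <= ram_bytes


GIB = 1024 ** 3
print(dataset_fits_in_ram(5_000_000, 200, 8 * GIB))      # a "few million" entries
print(dataset_fits_in_ram(5_000_000_000, 200, 8 * GIB))  # a thousand times more
```

If the first case describes your benchmark, the numbers mostly reflect memory speed rather than the store's behavior on disk.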

Compare what’s comparable

• Distributed column stores
  – BigTable-like systems

• Key value stores
  – Tokyo, Dynamo-like systems

Distributed column stores

[Diagram placing stores by property]
Stores: Hypertable, Cassandra, SimpleDB, HBase, Google Megastore
Properties: Distributed, Replicated, Persistent, Eventually Persistent, Mature

Key–value stores

[Diagram placing stores by property]
Stores: Tokyo Cabinet, Redis, Scalaris, Voldemort, MemcacheDB, MongoDB, Persevere, BerkeleyDB, CouchDB, Dynomite
Properties: Mutable, Immutable, Embedded, Distributed, Replicated, Persistent, Eventually Persistent, Mature

How to choose?

• Maturity is priceless
• Most suitable stores:
  – Persistent: BerkeleyDB, MySQL
  – Ev. Persistent: Tokyo Cabinet
  – Ev. Persistent + Distributed + …: Voldemort
  – Distributed column store: Cassandra

You are not Google

• Build something
• Make it work
• Think about scaling
• Think about being hyped

(Well, not all of you)

Why ?

Key value pain…

• You will end up crossing data.
• Doing joins with KV stores?

Only for ninjas !

Query time joins

• Coding what RDBMS are made for?…
• Slow!

(you shouldn’t do this…)
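
A query time join against a key value store boils down to code like the following. A plain dict stands in for the store, and the `user:`/`order:` key scheme is invented for the example. Each joined row costs an extra lookup, which is the work an RDBMS would otherwise do for you.

```python
# A dict stands in for a key value store; keys and records are made up.
store = {
    "user:1": {"name": "alice"},
    "user:2": {"name": "bob"},
    "order:10": {"user": "user:1", "item": "book"},
    "order:11": {"user": "user:2", "item": "lamp"},
    "order:12": {"user": "user:1", "item": "pen"},
}


def orders_with_user_names(store, order_keys):
    """Join orders to users by hand: one lookup per order plus one per user."""
    rows, lookups = [], 0
    for key in order_keys:
        order = store[key]
        lookups += 1
        user = store[order["user"]]
        lookups += 1
        rows.append((order["item"], user["name"]))
    return rows, lookups


rows, lookups = orders_with_user_names(store, ["order:10", "order:11", "order:12"])
print(rows, lookups)
```

Three orders already need six lookups here; against a disk-bound store each of those can be a random I/O.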

Build time schema flattening

• Not flexible!
• Needs MapReduce (Hadoop…) to scale

Think before doing this :)
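
Build time flattening does the join once, ahead of time, and stores the joined record under its own key. In this sketch dicts stand in for the store and all key names are hypothetical. At scale this batch step is what the slide says needs MapReduce; here it is a plain loop.

```python
def flatten_orders(users, orders):
    """Precompute one denormalized record per order, ready for a single get."""
    flat = {}
    for order_id, order in orders.items():
        user = users[order["user"]]
        flat["order_view:" + order_id] = {
            "item": order["item"],
            "user_name": user["name"],
        }
    return flat


users = {"1": {"name": "alice"}, "2": {"name": "bob"}}
orders = {"10": {"user": "1", "item": "book"}, "11": {"user": "2", "item": "lamp"}}
views = flatten_orders(users, orders)
print(views["order_view:10"])
```

Reads become a single lookup, but a new query shape or a changed user name means rebuilding the views, which is the inflexibility the slide warns about.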

And finally..

• HRD hard drive: 160000 random IO/sec!!!
• Yahoo announced open sourcing Sherpa!
• Resources
  – http://developer.yahoo.net/blog/archives/2009/06/nosql_meetup.html
  – http://metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores/
  – http://www.ryanpark.org/2008/04/top-10-avoid-the-simpledb-hype.html
  – http://project-voldemort.com/blog/2009/06/building-a-1-tb-data-cycle-at-linkedin-with-hadoop-and-project-voldemort/

Thanks ! Contact: @ahfeel :)
