VMware NoSQL


Upload: murat-cakal

Post on 10-May-2015


Page 1: Wmware NoSQL

© 2011 VMware Inc. All rights reserved

NoSQL / Spring Data

Polyglot Persistence – An introduction to Spring Data

Pronam Chatterjee

[email protected]

Page 2: Wmware NoSQL

Presentation goal

How Spring Data simplifies the development of NoSQL applications

Page 3: Wmware NoSQL

Agenda

• Why NoSQL?
• Overview of NoSQL databases
• Introduction to Spring Data
• Database APIs
- MongoDB
- HyperSQL
- Neo4J

Page 4: Wmware NoSQL

Relational databases are great

• SQL = Rich, declarative query language
• Database enforces referential integrity
• ACID semantics
• Well understood by developers
• Well supported by frameworks and tools, e.g. Spring JDBC, Hibernate, JPA
• Well understood by operations
- Configuration
- Care and feeding
- Backups
- Tuning
- Failure and recovery
- Performance characteristics
• But…

Page 5: Wmware NoSQL

The trouble with relational databases

• Object/relational impedance mismatch
- Complicated to map rich domain model to relational schema
• Relational schema is rigid
- Difficult to handle semi-structured data, e.g. varying attributes
- Schema changes = downtime or $$
• Extremely difficult/impossible to scale writes:
- Vertical scaling is limited/requires $$
- Horizontal scaling is limited or requires $$
• Performance can be suboptimal for some use cases

Page 6: Wmware NoSQL

NoSQL databases have emerged…

Each one offers some combination of:
• High performance
• High scalability
• Rich data model
• Schema-less

In return for:
• Limited transactions
• Relaxed consistency
• …

Page 7: Wmware NoSQL

… but there are few commonalities

• Everyone and their dog has written one
• Different data models
- Key-value
- Column
- Document
- Graph
• Different APIs – No JDBC, Hibernate, JPA (generally)
• “Same sorry state as the database market in the 1970s before SQL was invented” http://queue.acm.org/detail.cfm?id=1961297

Page 8: Wmware NoSQL

NoSQL databases have emerged…

• NoSQL usage small by comparison…
• But growing…

Page 9: Wmware NoSQL

Agenda

• Why NoSQL?
• Overview of NoSQL databases
• Introduction to Spring Data
• Database APIs
- MongoDB
- HyperSQL
- Neo4J

Page 10: Wmware NoSQL

Redis

• Advanced key-value store
- Think memcached on steroids (the good kind)
- Values can be binary strings, Lists, Sets, Sorted Sets, Hash maps, …
- Operations for each data type, e.g. appending to a list, adding to a set, retrieving a slice of a list, …
- Provides pub/sub-based messaging
• Very fast:
- In-memory operations
- ~100K operations/second on entry-level hardware
• Persistent
- Periodic snapshots of memory OR append commands to log file
- Limits are size of keys retained in memory.
• Has “transactions”
- Commands can be batched and executed atomically

[Figure: keys K1, K2, K3 mapped to values V1, V2, V2]

Page 11: Wmware NoSQL

Redis use cases

• Use in conjunction with another database as the system of record (SOR)
• Drop-in replacement for Memcached
- Session state
- Cache of data retrieved from SOR
- Denormalized datastore for high-performance queries
• Hit counts using INCR command
• Randomly selecting an item – SRANDMEMBER
• Queuing – Lists with LPOP, RPUSH, …
• High score tables – Sorted sets

Notable users: github, guardian.co.uk, …
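The hit-count, queuing and high-score patterns listed for Redis can be pictured with a small JDK-only sketch. This is not Redis client code: RedisPatternsSketch and its methods are hypothetical stand-ins that imitate what INCR, RPUSH/LPOP and a sorted set do, so the semantics can be read without a running server.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// JDK-only stand-in that imitates three Redis usage patterns; not a Redis client.
class RedisPatternsSketch {
    private final Map<String, Long> counters = new HashMap<>();
    private final Deque<String> queue = new ArrayDeque<>();
    private final Map<String, Double> scores = new HashMap<>();

    // Like INCR: a missing key counts as 0; returns the value after incrementing.
    long incr(String key) {
        return counters.merge(key, 1L, Long::sum);
    }

    // Like RPUSH: append to the tail of the list.
    void rpush(String value) {
        queue.addLast(value);
    }

    // Like LPOP: remove from the head; returns null when the list is empty.
    String lpop() {
        return queue.pollFirst();
    }

    // Like ZADD: set (or overwrite) the score of a member.
    void zadd(String member, double score) {
        scores.put(member, score);
    }

    // High-score table: the n best members, highest score first.
    // Tie ordering is left unspecified in this sketch.
    List<String> top(int n) {
        List<String> members = new ArrayList<>(scores.keySet());
        members.sort(Comparator.comparingDouble((String m) -> scores.get(m)).reversed());
        return members.subList(0, Math.min(n, members.size()));
    }
}
```

RPUSH plus LPOP gives first-in, first-out behaviour, which is why the pair works as a simple queue.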

Page 12: Wmware NoSQL

vFabric Gemfire - Elastic data fabric

• High performance data grid
• Enhanced parallel disk persistence
• Non-disruptive up/down scalability
• Session state
- Cache of data retrieved from SOR
- Denormalized datastore for high-performance queries
• Heterogeneous data sharing
- Java
- .NET
- C++
• Co-located transactions

Page 13: Wmware NoSQL

Gemfire - Use Cases

• Ultra-low-latency, high-throughput applications
• As an L2 cache in Hibernate
• Distributed batch processing
• Session state
- Tomcat
- tc Server
• Wide-area replication

Page 14: Wmware NoSQL

Neo4j

• Graph data model
- Collection of graph nodes
- Typed relationships between nodes
- Nodes and relationships have properties
• High performance traversal API from roots
- Breadth first/depth first
• Query to find root nodes
- Indexes on node/relationship properties
- Pluggable - Lucene is the default
• Graph algorithms: shortest path, …
• Transactional (ACID) including 2PC
• Deployment modes
- Embedded – written in Java
- Server with REST API
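To illustrate what "typed relationships" and "breadth first traversal from a root" mean in practice, here is a minimal in-memory sketch. It does not use the Neo4j API; GraphSketch, Rel and the sample data are hypothetical, and node/relationship properties are left out to keep it short.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// In-memory stand-in for a graph with typed relationships; not the Neo4j API.
class GraphSketch {
    // A typed, directed relationship between two node ids.
    static final class Rel {
        final String from;
        final String type;
        final String to;

        Rel(String from, String type, String to) {
            this.from = from;
            this.type = type;
            this.to = to;
        }
    }

    private final List<Rel> rels = new ArrayList<>();

    void relate(String from, String type, String to) {
        rels.add(new Rel(from, type, to));
    }

    // Breadth-first traversal from a root node, following only one relationship type.
    List<String> breadthFirst(String root, String relType) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        visited.add(root);
        queue.add(root);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (Rel r : rels) {
                // visited.add returns false for nodes already seen, so cycles are safe.
                if (r.from.equals(current) && r.type.equals(relType) && visited.add(r.to)) {
                    queue.add(r.to);
                }
            }
        }
        return new ArrayList<>(visited);
    }

    // Hypothetical sample data for the usage note below.
    static GraphSketch sample() {
        GraphSketch g = new GraphSketch();
        g.relate("alice", "KNOWS", "bob");
        g.relate("alice", "KNOWS", "carol");
        g.relate("bob", "KNOWS", "dave");
        g.relate("carol", "WORKS_WITH", "erin");
        return g;
    }
}
```

With the sample graph, GraphSketch.sample().breadthFirst("alice", "KNOWS") visits alice, bob, carol, dave in that order; erin is skipped because she is only reachable through a WORKS_WITH relationship.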

Page 15: Wmware NoSQL

Neo4j Data Model

Page 16: Wmware NoSQL

Neo4j Use Cases

• Use Cases
- Anything social
- Cloud/Network management, i.e. tracking/managing physical/virtual resources
- Any kind of geospatial data
- Master data management
- Bioinformatics
- Fraud detection
- Metadata management
• Who is using it?
- StudiVZ (the largest social network in Europe)
- Fanbox
- The Swedish military
- And big organizations in datacom, intelligence, and finance that wish to remain anonymous

Page 17: Wmware NoSQL

MongoDB

• Document-oriented database
- JSON-style documents: Lists, Maps, primitives
- Documents organized into collections (~table)
• Full or partial document updates
- Transactional update in place on one document
- Atomic Modifiers
• Rich query language for dynamic queries
• Index support – secondary and compound
• GridFS for efficiently storing large files
• Map/Reduce

Page 18: Wmware NoSQL

Data Model = Binary JSON documents

• Sequence of bytes on disk = fast I/O
- No joins/seeks
- In-place updates when possible => no index updates
• Transaction = update of single document

{
  "name" : "Ajanta",
  "type" : "Indian",
  "serviceArea" : [
    "94619",
    "94618"
  ],
  "openingHours" : [
    {
      "dayOfWeek" : "Monday",
      "open" : 1730,
      "close" : 2130
    }
  ],
  "_id" : ObjectId("4bddc2f49d1505567c6220a0")
}

One document = one DDD aggregate

Page 19: Wmware NoSQL

MongoDB query by example

• Find a restaurant that serves the 94619 zip code and is open at 6pm on a Monday

{
  serviceArea: "94619",
  openingHours: {
    $elemMatch: {
      "dayOfWeek": "Monday",
      "open": {$lte: 1800},
      "close": {$gte: 1800}
    }
  }
}

// collection is a com.mongodb.DBCollection; qbeObject is a DBObject holding the query document above
DBCursor cursor = collection.find(qbeObject);
while (cursor.hasNext()) {
    DBObject o = cursor.next(); // one matching restaurant document per iteration
}
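The point of $elemMatch in the query above is that a single openingHours element must satisfy all three conditions at once. The following plain-Java sketch spells that rule out over nested Maps and Lists. It does not use the MongoDB driver; ElemMatchSketch and its helper methods are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of what the query asks for; not the MongoDB driver.
class ElemMatchSketch {
    // Builds one openingHours entry.
    static Map<String, Object> hours(String day, int open, int close) {
        Map<String, Object> h = new HashMap<>();
        h.put("dayOfWeek", day);
        h.put("open", open);
        h.put("close", close);
        return h;
    }

    // Builds a restaurant document with just the two fields the query inspects.
    static Map<String, Object> restaurant(List<String> serviceArea,
                                          List<Map<String, Object>> openingHours) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("serviceArea", serviceArea);
        doc.put("openingHours", openingHours);
        return doc;
    }

    // True when the document serves zipCode and ONE openingHours element
    // has the given day and satisfies open <= time <= close.
    @SuppressWarnings("unchecked")
    static boolean matches(Map<String, Object> doc, String zipCode, String day, int time) {
        List<String> area = (List<String>) doc.get("serviceArea");
        if (!area.contains(zipCode)) {
            return false;
        }
        List<Map<String, Object>> all = (List<Map<String, Object>>) doc.get("openingHours");
        for (Map<String, Object> h : all) {
            boolean sameDay = day.equals(h.get("dayOfWeek"));
            boolean opened = (Integer) h.get("open") <= time;
            boolean notClosed = (Integer) h.get("close") >= time;
            if (sameDay && opened && notClosed) {
                return true;
            }
        }
        return false;
    }

    // The sample document from the data model slide, reduced to the queried fields.
    static Map<String, Object> ajanta() {
        return restaurant(Arrays.asList("94619", "94618"),
                Arrays.asList(hours("Monday", 1730, 2130)));
    }
}
```

A restaurant that is open Monday 0900 to 1100 and Tuesday 1700 to 2200 does not match a Monday 1800 request, even though each condition is met by some element, because no single element meets all of them.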

Page 20: Wmware NoSQL

MongoDB use cases

• Use cases
- Real-time analytics
- Content management systems
- Single document partial update
- Caching
- High volume writes
• Who is using it?
- Shutterfly, Foursquare
- Bit.ly, Intuit
- SourceForge, NY Times
- GILT Groupe, Evite
- SugarCRM

Copyright (c) 2011 Chris Richardson. All rights reserved.

Page 21: Wmware NoSQL

Other NoSQL databases

• SimpleDB – “key-value”
• Cassandra – column-oriented database
• CouchDB – document-oriented
• Membase – key-value
• Riak – key-value + links
• HBase – column-oriented
• …

http://nosql-database.org/ has a list of 122 NoSQL databases

Page 22: Wmware NoSQL

Agenda

• Why NoSQL?

• Overview of NoSQL databases

• Introduction to Spring Data

• Database APIs

- MongoDB

- HyperSQL

- Neo4J

Page 23: Wmware NoSQL

NoSQL Java APIs

Database   Libraries
Redis      Jedis, JRedis, JDBC-Redis, RJC
Neo4j      Vendor-provided
MongoDB    Vendor-provided Java driver
Gemfire    Pure Java map API, Spring-Gemfire templates

But
• Usage patterns
• Tedious configuration
• Repetitive code
• Error-prone code
• …

Page 24: Wmware NoSQL

Spring Data Project Goals

• Bring classic Spring value propositions to a wide range of NoSQL databases:
- Productivity
- Programming model consistency: E.g. <NoSQL>Template classes
- “Portability”

Page 25: Wmware NoSQL

Spring Data sub-projects

• Commons: Polyglot persistence
• Key-Value: Redis, Riak
• Document: MongoDB, CouchDB
• Graph: Neo4j
• GORM for NoSQL

http://www.springsource.org/spring-data

Page 26: Wmware NoSQL

Many entry points to use

• Auto-generated repository implementations

• Opinionated APIs (Think JdbcTemplate)

• Object Mapping (Java and GORM)

• Cross Store Persistence Programming model

• Productivity support in Roo and Grails

Page 27: Wmware NoSQL

Cloud Foundry supports NoSQL

MongoDB and Redis are provided as services

=> Deploy your MongoDB and Redis applications in seconds

Page 28: Wmware NoSQL

Agenda

• Why NoSQL?

• Overview of NoSQL databases

• Introduction to Spring Data

• Database APIs

- MongoDB

- HyperSQL

- Neo4J

Page 29: Wmware NoSQL

Three databases for today’s talk

Document database

Relational database

Graph database

Page 30: Wmware NoSQL

Three persistence strategies for today’s talk

• Lower level template approach

• Conventions based persistence (Hades)

• Cross-Store persistence using JPA and a NoSQL datastore

Page 31: Wmware NoSQL

Spring Template Patterns

• Resource Management

• Callback methods

• Exception Translation

• Simple Query API
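The four ingredients of the template pattern can be sketched in plain Java. The names below (TemplateSketch, Connection, ConnectionCallback, DataAccessFailure) are hypothetical and only show the shape of the pattern; they are not Spring's classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; shows the shape of the pattern, not Spring's classes.
class TemplateSketch {
    // Stand-in for a store-specific connection whose calls throw checked exceptions.
    static class Connection implements AutoCloseable {
        private final List<String> log;

        Connection(List<String> log) {
            this.log = log;
            log.add("open");
        }

        String fetch(String key) throws Exception {
            if (key.isEmpty()) {
                throw new Exception("driver error: empty key");
            }
            return "value-of-" + key;
        }

        @Override
        public void close() {
            log.add("close");
        }
    }

    // Callback method: the only code the caller has to supply.
    interface ConnectionCallback<T> {
        T doWith(Connection connection) throws Exception;
    }

    // Unchecked exception that callers see instead of driver-specific ones.
    static class DataAccessFailure extends RuntimeException {
        DataAccessFailure(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Records open/close events so the resource handling is observable.
    final List<String> log = new ArrayList<>();

    <T> T execute(ConnectionCallback<T> action) {
        try (Connection connection = new Connection(log)) { // resource management
            return action.doWith(connection);               // callback
        } catch (Exception e) {                             // exception translation
            throw new DataAccessFailure("data access failed", e);
        }
    }

    // Simple query API built on top of execute().
    String findOne(String key) {
        return execute(connection -> connection.fetch(key));
    }
}
```

The caller writes only the lambda passed to execute(); opening, closing and error mapping stay in one place, so the connection is closed whether or not the callback fails.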

Page 32: Wmware NoSQL

Repository Implementation

Page 33: Wmware NoSQL

HyperSQL

• Also known as HSQLDB or Hypersonic SQL
• Relational Database
• Table oriented data model
• SQL used for queries
• … you know the rest…

Page 34: Wmware NoSQL

Spring Data Repository Support

• Eliminate boilerplate code – only finder methods
• findByLastName – Specifications for type safe queries
• JPA CriteriaBuilder integration, QueryDSL
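One way to picture "only finder methods" is an interface whose implementation is generated at runtime from the method name. The sketch below does this with a JDK dynamic proxy over an in-memory list. It is an illustration under that assumption, not how Spring Data is implemented; RepositorySketch, Person and PersonRepository are hypothetical names.

```java
import java.lang.reflect.Field;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of name-derived finders; not Spring Data's implementation.
class RepositorySketch {
    static class Person {
        final String firstName;
        final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // The developer declares only the finder method; no implementation class is written.
    interface PersonRepository {
        List<Person> findByLastName(String lastName);
    }

    // Builds an implementation at runtime by parsing "findBy<Property>" from the method name.
    static PersonRepository create(List<Person> store) {
        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();
            if (!name.startsWith("findBy")) {
                throw new UnsupportedOperationException(name);
            }
            // "findByLastName" -> property "lastName"
            String property = Character.toLowerCase(name.charAt(6)) + name.substring(7);
            Field field = Person.class.getDeclaredField(property);
            List<Person> result = new ArrayList<>();
            for (Person p : store) {
                if (args[0].equals(field.get(p))) {
                    result.add(p);
                }
            }
            return result;
        };
        return (PersonRepository) Proxy.newProxyInstance(
                PersonRepository.class.getClassLoader(),
                new Class<?>[] {PersonRepository.class},
                handler);
    }
}
```

Calling RepositorySketch.create(people).findByLastName("Smith") filters the list on the lastName field without any hand-written query code.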

Page 35: Wmware NoSQL

• Type safe queries for multiple backends including JPA, SQL and MongoDB in Java
• Generate Query classes using Java APT
• Code completion in IDE
• Domain types and properties can be referenced safely
• Adapts better to refactoring changes in domain types

http://www.querydsl.com

Page 36: Wmware NoSQL

QueryDSL

• Repository Support
• Spring Data JPA
• Spring Data Mongo
• Spring Data JDBC extensions
• QueryDslJdbcTemplate

Page 37: Wmware NoSQL

Spring Data Neo4J

• Using AspectJ support providing a new programming model

• Use annotations to define POJO entities

• Constructor advice automatically handles entity creation

• Entity field state persisted to graph using aspects

• Leverage graph database APIs from POJO model

• Annotation-driven indexing of entities for search

Page 38: Wmware NoSQL

Spring Data Graph Neo4J cross-store

• JPA data and “NOSQL” data can share a data model

• Separate the persistence provider by using annotations

– could be the entire Entity

– or, some of the fields of an Entity

• We call this cross-store persistence

– One transaction manager to coordinate the “NOSQL” store with the JPA relational database

– AspectJ support to manage the “NOSQL” entities and fields

• holds on to changed values in “change sets” until the transaction commits for non-transactional data stores
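The "change set" idea for non-transactional stores can be sketched as a write buffer that is flushed only when the surrounding transaction commits. ChangeSetSketch below is a hypothetical, simplified illustration of that idea and not Spring Data's implementation; a HashMap stands in for the NoSQL store.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of buffering writes in a change set; not Spring Data's code.
class ChangeSetSketch {
    // Stand-in for a store without transactions: every write is immediately visible.
    final Map<String, Object> nosqlStore = new HashMap<>();

    // Pending field changes, held in memory until the transaction ends.
    private final Map<String, Object> changeSet = new LinkedHashMap<>();

    // Field writes go to the change set, never straight to the store.
    void set(String field, Object value) {
        changeSet.put(field, value);
    }

    // Reads see pending changes first, then fall back to the store.
    Object get(String field) {
        return changeSet.containsKey(field) ? changeSet.get(field) : nosqlStore.get(field);
    }

    // To be called once the relational transaction has committed: flush the buffer.
    void commit() {
        nosqlStore.putAll(changeSet);
        changeSet.clear();
    }

    // On rollback nothing has been written yet, so dropping the buffer is enough.
    void rollback() {
        changeSet.clear();
    }
}
```

Because the store is touched only in commit(), a rollback of the JPA transaction leaves the NoSQL side unchanged.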

Page 39: Wmware NoSQL

A cross-store scenario ...

You have a traditional web app using JPA to persist data to a relational database ...

Page 40: Wmware NoSQL

JPA Data Model

8/3/11 Slide 46

Page 41: Wmware NoSQL

Cross-Store Data Model