couchconf-sf-designing-couchbase-documents
TRANSCRIPT
Designing Couchbase Documents
Benjamin Young @bigbluehat
2
SCHEMA-LESS DATABASE
• ad hoc data store
  – no need to define schema before adding data
• document structure comes into play
  – but at the query level, not at the entry level
3
HOWEVER! There are constraints.
4
INHERENT SCHEMA
sort of
5
UNIQUENESS
• Document ID is the only (DB-side) way to make something unique
  – UUIDs don’t cut it for this
• App could de-dup from map/reduce
  – but that can be tricky
• Be prepared to handle conflicting IDs
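A minimal sketch of the uniqueness point, assuming a contact document keyed by a normalized email address. The `contactId` helper, the `contact::` prefix, and the in-memory `store` are hypothetical stand-ins, not Couchbase APIs; a real server would answer the second write with its own conflict error.

```javascript
// Hypothetical helper: derive a deterministic document ID from a natural key.
function contactId(email) {
  return "contact::" + email.trim().toLowerCase();
}

// In-memory stand-in for the database: refuses to create the same ID twice.
function createDoc(store, doc) {
  if (Object.prototype.hasOwnProperty.call(store, doc._id)) {
    return { ok: false, error: "conflict", id: doc._id };
  }
  store[doc._id] = doc;
  return { ok: true, id: doc._id };
}

const store = {};
const first = createDoc(store, { _id: contactId("Ann@Example.com"), type: "contact" });
// Same person, different spelling: the derived ID collides, so the app must handle it.
const second = createDoc(store, { _id: contactId(" ann@example.com "), type: "contact" });
```

With random UUIDs both writes would succeed and the duplicate would only surface later, at query time.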
6
ONE DOC OR MULTIPLE DOCS?
7
DECISION MAKERS
• what does this document look like in real life?
• how often will I update this?
• does this need its own revision/transaction path?
  – does all this data need updating together?
  – or rolled back together?
• Side Note:
  – revisions should never be used for versioning
  – compaction will remove them, and you’ll be sad
JSON DOCUMENTS
8
{
  "json": "key / value pairs",
  "_id": "some uuid",
  "_rev": "mvcc key",
  "string keys": [1, 2, 3, "four", null],
  "schema free": true
}
9
KEY NAMES
• JSON Object restrictions
  – they’re all strings
• Couchbase reserves these prefixes on top-level keys
  – “_” underscore - also reserved by CouchDB
  – “$” dollar signs
• Consider how you’ll be using it in your app
  – template system constraints?
  – objects vs. arrays
10
VALUES
• JSON restrictions
  – objects, arrays, strings, numbers
• Be careful of numbers as strings
  – running _stats (or _sum) on strings will ruin your day
  – might use Number() if you’re unsure
• Date formats
  – unix timestamps
  – output as an array for grouping reductions
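A rough sketch of both value tips in one map function. The `sale` type and the `amount` and `created_at` field names are assumptions for illustration, `created_at` is assumed to be a unix timestamp in seconds, and `emit` is stubbed here because the view engine normally provides it.

```javascript
// Stand-in for the view engine's emit(); collects rows so the sketch runs anywhere.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: key by [year, month, day] so a reduce can be grouped by
// year, month, or day; coerce the amount so a numeric reduce sees a number.
function map(doc) {
  const amount = Number(doc.amount); // "12.50" becomes 12.5; non-numeric becomes NaN
  if (doc.type !== "sale" || isNaN(amount)) {
    return; // skip docs that would poison a numeric reduce
  }
  const d = new Date(doc.created_at * 1000); // assumes unix seconds
  emit([d.getUTCFullYear(), d.getUTCMonth() + 1, d.getUTCDate()], amount);
}

map({ type: "sale", amount: "12.50", created_at: 1316563200 });
map({ type: "sale", amount: "n/a", created_at: 1316563200 });
```

Only the first doc produces a row; the array key is what lets a grouped reduce roll totals up per day, month, or year.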
11
QUERYING
can I get at the doc’s data easily?
12
UPDATING
• When things change, do I want to update the doc?
  – or put in a new doc and “collapse” things later
    • the accounting model
• Frequently (re)written docs might make replication harder
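One way to picture the accounting model, as a sketch only: every change is a new document, and a reduce-style sum collapses them later. The `txn` type, the account names, and the `balances` helper are hypothetical.

```javascript
// Each change is a new, never-rewritten document (hypothetical "txn" docs).
const docs = [
  { _id: "txn-1", type: "txn", account: "acct-42", amount: 100 },
  { _id: "txn-2", type: "txn", account: "acct-42", amount: -30 },
  { _id: "txn-3", type: "txn", account: "acct-7", amount: 5 }
];

// "Collapse" the entries: sum amounts per account, as a grouped reduce would.
function balances(txns) {
  const totals = {};
  for (const t of txns) {
    if (t.type !== "txn") continue;
    totals[t.account] = (totals[t.account] || 0) + t.amount;
  }
  return totals;
}

const result = balances(docs);
```

Because no document is ever rewritten, two replicas adding entries at the same time do not produce conflicting revisions of one shared doc.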
13
REPLICATION
• The biggie!
• Avoid conflicts (if possible)
• Leverage small pieces where possible/sensible
• Keep uniqueness and conflicts in balance
14
TOOLS
15
VALIDATE_DOC_UPDATE
• function(newDoc, storedDoc, userCtx)
• optionally enforced schema
• throw errors to prevent save
• cannot modify newDoc
• can enforce field types, values
• can prevent docs or fields from being updated again (created_at, user)
• runs every time a document is updated
  – even during replication
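A small sketch of what such a function might look like. The `{ forbidden: ... }` error shape follows the CouchDB convention; the specific field rules and the `check` helper are assumptions added so the function can be tried outside the database.

```javascript
// CouchDB-style validation function; it may throw but cannot change newDoc.
function validate(newDoc, storedDoc, userCtx) {
  if (newDoc._deleted) {
    return; // allow deletions without field checks
  }
  if (typeof newDoc.type !== "string") {
    throw { forbidden: "type must be a string" };
  }
  if (typeof newDoc.created_at !== "number") {
    throw { forbidden: "created_at must be a number" };
  }
  // storedDoc is null on first save; afterwards created_at is frozen.
  if (storedDoc && storedDoc.created_at !== newDoc.created_at) {
    throw { forbidden: "created_at cannot be changed" };
  }
}

// Hypothetical helper for exercising the function locally.
function check(newDoc, storedDoc) {
  try {
    validate(newDoc, storedDoc, { name: "demo", roles: [] });
    return "ok";
  } catch (e) {
    return e.forbidden;
  }
}
```

In a design document the function body would be stored under the `validate_doc_update` key; since it also runs during replication, a rule that rejects older docs will block them from replicating in.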
16
?INCLUDE_DOCS=TRUE
• super handy for “joining” map/reduce results
• can help you “accept” using multiple, smaller docs
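A plain-JavaScript simulation of the "join", not the server's implementation. It assumes the CouchDB convention that a row whose value carries an `_id` pulls in that linked doc when `include_docs=true`; the documents and the `mapPage` and `includeDocs` helpers are made up for illustration.

```javascript
// Tiny in-memory "database" keyed by document ID (illustrative data only).
const db = {
  "page-home": { _id: "page-home", type: "page", items: ["item-1", "item-2"] },
  "item-1": { _id: "item-1", type: "html", title: "Welcome" },
  "item-2": { _id: "item-2", type: "contact", title: "Call us" }
};

// Map step: one row per referenced item; the value names the doc to pull in.
function mapPage(page) {
  return page.items.map((id, i) => ({ key: [page._id, i], value: { _id: id } }));
}

// What include_docs does conceptually: attach the full doc to each row.
function includeDocs(rows, database) {
  return rows.map((row) =>
    Object.assign({}, row, { doc: database[row.value._id] || null })
  );
}

const joined = includeDocs(mapPage(db["page-home"]), db);
```

The page doc stays small and stable, while each item can be edited on its own and still arrives in the same query result.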
SAMPLE DOCS (IN 2.0)
HANDY FOR QUICK DOC “SCHEMA” REFERENCING
17
18
MORE TOOLS
• Update handlers
• Output functions
  – _show/{show_function_name}/{doc_id}
    • runs a single doc through an additional “display” function
  – _list/{list_function_name}/{view_name}
    • same as _show, but for map/reduce results
• research these later
  – or join me in the Lounge after this talk
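For a feel of the _show side, here is a hedged sketch in the CouchDB style, where a show function receives `(doc, req)` and returns a response object. The `showContact` name, the `title` field, and the `escapeHtml` helper are assumptions; inside a real design document the helper would need to live within the function itself.

```javascript
// Minimal HTML escaping so doc content cannot inject markup.
function escapeHtml(s) {
  return String(s)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// CouchDB-style show function: turns one stored doc into a response.
function showContact(doc, req) {
  if (!doc) {
    return { code: 404, body: "not found" }; // no doc with that ID
  }
  return {
    headers: { "Content-Type": "text/html" },
    body: "<h1>" + escapeHtml(doc.title) + "</h1>"
  };
}
```

A _list function plays the same role for view rows, reading them one at a time instead of receiving a single doc.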
19
CONVENTIONS & GOOD HABITS
• “type”: “contact”
• “created_at”: timestamp
• “status”: some status for this doc (published)
• “tags”: [“couch”, “db”, “nosql”]
20
MORE CONVENTIONS
• “created_by”: username (from _users typically)
• “profile”: CouchApp profile contents (from _users) stored on doc for convenience
21
EXAMPLES
22
BLUEINK
• “page” documents reference IDs (UUIDs) in various page areas
• map/reduce aggregates general page data, site-wide settings, the chosen template, and page items
• _list applies the template, builds final page, all in one GET
23
BLUEINK (CONT)
• content item docs (type “html”, “contact”, etc.) hold content separate from the page doc for independent updates and reuse
• other special docs use non-UUIDs: site, sitemap, template docs
24
BLUEINK PAGE
[Diagram: a page document containing page areas, each holding numbered content items]
• page document contains: a page_items key holding a multi-dimensional array of page areas and their associated content items
COMPLETE PAGE DOC
{
  "_id": "a9c276de2a064836ab306b095f000f8a",
  "_rev": "174-5cf651f7b944b1a352bc10103e018652",
  "type": "page",
  "title": "Home", "nav_label": "Home", "url": "home",
  "page_items": [
    [ { "_id": "13a54b6e52123745cced243d620003e0", "display_title": true },
      { "_id": "8dd982de76e8b5959e10e6d4360067ce", "display_title": false },
      { "_id": "8dd982de76e8b5959e10e6d43600709a", "display_title": true } ],
    [ { "_id": "90cb972de2a11045be18a3a88c001bad" } ]
  ]
}
25
PAGE ITEMS SECTION
OF PREVIOUS “PAGE” DOC
26
{
  "page_items": [
    [ { "_id": "13a54b6e52123745cced243d620003e0", "display_title": true },
      { "_id": "8dd982de76e8b5959e10e6d4360067ce", "display_title": false } ],
    [ { "_id": "90cb972de2a11045be18a3a88c001bad" } ]
  ]
}
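To make the nesting concrete, a sketch of how an app (or a _list function) might walk page_items and pull in the referenced content items. The short IDs, the item contents, and the `assemble` helper are hypothetical; this is not BlueInk's actual code.

```javascript
// A page doc in the shape shown above, with shortened hypothetical IDs.
const page = {
  page_items: [
    [ { _id: "item-a", display_title: true },
      { _id: "item-b", display_title: false } ],
    [ { _id: "item-c" } ]
  ]
};

// Content item docs, looked up by ID (illustrative data only).
const items = {
  "item-a": { _id: "item-a", type: "html", title: "Welcome", content: "<p>a</p>" },
  "item-b": { _id: "item-b", type: "html", title: "Hidden", content: "<p>b</p>" },
  "item-c": { _id: "item-c", type: "contact", title: "Contact", content: "<p>c</p>" }
};

// Outer array = page areas, inner array = content items within that area.
function assemble(pageDoc, itemsById) {
  return pageDoc.page_items.map((area) =>
    area.map((ref) => {
      const item = itemsById[ref._id];
      return {
        // Show the title only when the page asks for it and the item exists.
        title: ref.display_title && item ? item.title : null,
        content: item ? item.content : ""
      };
    })
  );
}

const areas = assemble(page, items);
```

Display options such as display_title stay on the page doc, so the same content item can appear on another page with different settings.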
27
“HTML” ITEM DOC
“INCLUDED” & FORMATTED VIA _LIST FUNCTION
{
  "_id": "8dd982de76e8b5959e10e6d43600615d",
  "_rev": "174-aa7546e6308324d4b3ad84469fcbd773",
  "type": "html",
  "created": "2008-06-18 15:03:45",
  "updated": "2009-08-08 13:28:58",
  "title": "Welcome",
  "content": "<p>The <a href=\"http://blueinkcms.com/\"><strong>BlueInk Content Management System (CMS)</strong></a> gives clients hassle-free control over their content. Through a simple interface you'll be able to edit and organize text, photos, contact information, and other components-all while looking right at your website! The software is easy to learn, but don't take our word for it-sign up for a <a href=\"http://demo.blueinkcms.com/\">free demo</a>!</p>"
}
MAP/REDUCE OUTPUT
28
29
ANY QUESTIONS?
• catch me in the lounge
• or online:
  • @bigbluehat
  • bigbluehat on IRC (freenode)
  • [email protected]