mongodc 2012: how mongodb powers doodle or die
DESCRIPTION
Mongo or Die! How MongoDB powers Doodle or DieTRANSCRIPT
Aaron Silverman
Mongo or Die!How MongoDB powers Doodle or Die
(@Zugwalt)
Doodle or Die
@DoodleOrDie
What is Doodle or Die?
Telephone
I'd like some beer!
I'd like some deer
I'd like some deer
I'd like some deer
Phrase
Phrase
Phrase
Phrase
Doodle or Die
Shining Apple
Doodle
Phrase
Phrase Eat your fruit or DIE!
Doodle
What Powers Doodle or Die?
Started very, very small
4 Cores128MB RAM
Node Server
MongoDB Server
Small - 2GB
Got serious about our servers
MongoDB Server8 Cores256MB RAM
Node Server
Large - 5GB
Called in some Reinforcements
MongoDB Server12 Cores1GB RAM
Node Server
In the last 30 Days:
• 2,500,000 page views
• 100,000 uniques
• 35,000 active player accounts
• 2,000,000 new doodles and descriptions
Data InformationMongo DB3.5 GB data size0.5 GB index size~10 queries/sec
Amazon170 GB data size8 GB in/month200 GB out/month
MongoHQ: $67• Amazon S3: $70
• Player Info• Chain Info (minus
doodles)• Group Info• Game State• Logs
• Doodles!• Static Content• Compressed
Database Backups
MongoDB - $65 / month
Daily backups to S3: $16/mo
Amazon S3 - $70 / month
Node - $62/ month
Total Cost To Host: $197/monthMongoDB $65Amazon S3 $70Node $62
Total $197
PAAS Provides Easy Upgrade Path
General Principles
Custom _idPartially random string generated using ShortId node module
ObjectIdObjectId("4fd02d5d78315a502d15cdde")ObjectId("4fd02d5a78315a502d15cddd")ObjectId("4fd02d5878315a502d15cddc")
ShortId"8rOIwh2VD""1qyY61Lu1""5GQnbx-1"
• Shorter, less cumbersome in code / queries
db.players.findOne({_id: ‘58mwYlTKV’});
db.chains.update({_id: ‘58mwYlTKV’}, {$set: activePlayer_id: ‘88ueYaL6V’});
http://doodleordie.com/c/5ONtvvSGH
• Randomness will be better for sharding and makes it harder to cheat
<span class="doodle" data-jsonp="http://doodles.s3.amazonaws.com/d2/Eh8-Po2R5/1Em5kj3LY.js">
Question Oriented Subdocuments
What chain is this player working on right now?
What are this player’s stats?
Which players are not eligible to be assigned this chain?
Goal is for most “Questions” to be able to be answered in one query from one sub document
Related “Questions” will share common ancestors
db.players.findOne({_id: ‘58mwYlTKV’}, {‘game.recentSkips’: 1});
db.players.findOne({_id: ‘58mwYlTKV’}, {game: 1});
Indexes are designed to make answering questions easy!
What chain is this player working on right now?
db.players.ensureIndex({‘game.activeChain_id’: 1});
What chains are recently awaiting a new doodle?
db.chains.ensureIndex({inUse: -1, activeState: 1 lastModified" : -1});
Doodle or Die Collections
players
chains
groups
sessions
log
Primary
Support
players
account
info
login
chainHistory
game
stats QueryFrequency
Often
Rarely
players
account
info
login
chainHistory
stats
game • activeState• activeChain_id• activeStepIndex• recentSkips
Answers question: “What is this player working on right now?”db.players.findOne({_id: ‘58mwYlTKV’}, {game: 1});
players
account
info
login
stats
game
chainHistory
Chain1• datePlayed• dateViewed
ChainN
Answers the question: “What has the player worked on?”db.players.findOne({_id: ‘58mwYlTKV’}, {chainHistory: 1});
players
account
info
login
game
chainHistory
Answers the question: “What has the player worked on?”
stats
• totalSteps• drawSteps• phraseSteps• numSkips• numLikes
db.players.findOne({_id: ‘58mwYlTKV’}, {stats: 1});
chains
• activePlayer_id• activeState• numSteps• lastModified
ineligiblePlayer_idssteps
chains are searched (not as many questions)db.chains.findOne({inUse: false, activeState : player.game.activeState, ineligiblePlayer_ids: {$ne: player._id}, lastModified: { $gte: timeRange}});
chains
• activePlayer_id• activeState• numSteps• lastModified
*
steps
ineligiblePlayer_ids
[player_id1 player_id2, player_id3 … player_idN]
Note: $addToSet and $pull work great in maintaining this array
chains
• activePlayer_id• activeState• numSteps• lastModified
*ineligiblePlayer_ids
steps
[
]
Step1• player_id• state• date
StepN
content
• phrase
Description Step Content:
Doodle Step Content:• url (points to S3)• time• numStrokes• numDistinctColors
How Does it all Work?
players queried for users who are associated with the authenticated twitter account
chain history used to load thumbnails, complete active and previous chain loaded
db.players.find({‘login.twitter.uid’: ‘XXXXXXX’}, {game: 1, chainHistory: 1});
db.chains.find({_id: {$in: [player.game.activeChain_id, player.game.lastChain_id]}});
Chains collection searched and updated for unclaimed eligible chain to be given to player
db.chains.update({inUse: false, activeState : player.game.activeState, ineligiblePlayer_ids: {$ne: player._id}, lastModified: { $gte: timeRange}}, {$set: {inUse: true, activePlayer_id: player._id, $addToSet: {ineligiblePlayer_ids: player._id}});
db.chains.find({activePlayer_id: player._id});
Content saved to chain
db.chains.update({_id: chain._id, activePlayer_id: player._id}}, {$inc: {numSteps: 1}, $set: inUse: false, lastModified: datePlayed} $unset: {activePlayer_id: 1}, $push: {steps: { player_id: player._id, state: chain.activeState, content: content, date: datePlayed}}});
Player updated and given new state
db.players.update({_id: player._id, 'game.activeChain_id': chain._id}, {$inc: {‘stats.totalSteps’: 1}, $set: {'game.activeState': nextState,
'game.lastChain_id’: chain._id}, $unset: {'game.activeChain_id': 1, 'game.activeStepIndex': 1}});
Active chain and chain history reloaded
Doodle strokes will be saved to S3(but url to S3 and metadata saved to chain)
{"color":"#000000","size":15,"path":[307,66,308,66,308,68,308,69,308,70,308,71,308,72,308,73,306,76,305,79,305,81,302,83,302,84,302,85,302,86,301,87,300,89,300,90,300,91,300,92,300,95,300,97,300,102,300,103,300,104,300,108,300,109,300,110,301,114,303,116,304,116,305,119,305,121,307,124,309,127,310,130,310,131,313,132,314,133,316,137,317,138,320,140,321,140,323,143,326,143,328,143,333,144,337,144,341,144,343,144,347,143,352,143,354,141,357,140,358,140,359,139,362,138,363,138,365,137,368,135,369,132,370,131,371,130,374,128,375,128,376,125,378,125,379,124,380,123,381,123,382,120,385,119,386,118,388,115,391,112,394,109,394,107,394,106,397,104,397,103,397,102,397,101,397,100,397,98,397,96,397,94,397,93,397,91,397,90,397,88,397,87,397,86,397,83,395,80,395,79,395,77,394,74,393,72,393,70,393,67,392,65,391,63,389,62,388,59,386,57,383,54,383,52,381,49,379,48,378,47,376,46,375,44,374,43,372,43,370,42,369,42,368,42,364,41,362,41,359,41,355,41,351,42,349,42,347,42,346,44,343,44,342,45,339,46,337,47,336,48,333,49,332,51,330,51,328,51,327,52,326,53,323,53,322,53,321,54,320,55,317,55,316,56,314,58,309,59,306,61,306,62,305,62,304,63]}
Player’s chainHistory retrieved and filtered/sorted on server; applicable chains retrieved
db.chains.find({$in: chain_idArr});
db.players.find({‘login.urlSafe’: urlSafeUid}, {chainHistory: 1});
Retrieved chains ordered based on pre-determined sorting.
Some Additional Doodle or Die Tricks
Build assumptions into queries
Eve tries to join despite being banned!
groups: { _id: ‘8rOIwh2VD’, members: [ {name: ‘Alice’, dateJoined: ‘2012-05-24’}, {name: ‘Bob’, dateJoined: ‘2012-05-25’}
] banned: [ {name: ‘Eve’, dateBanned: ‘2012-05-25’}
]}
Alice and Bob are in an awesome group
groups.update( {_id: ‘8rOIwh2VD’}, {$push: {members: {name: ‘Eve’, dateJoined: new Date()}
}, }, function(err) {
if (err) throw err;callback();
});
Bugs in our code fail to detect Eve’s trickery and the update query is run!
groups: { _id: ‘8rOIwh2VD’, members: [ {name: ‘Alice’, dateJoined: ‘2012-05-24’}, {name: ‘Bob’, dateJoined: ‘2012-05-25’}, {name: ‘Eve’, dateJoined: ‘2012-05-26’}
] banned: [ {name: ‘Eve’, dateRequested: ‘2012-05-25’}
]}
Blast! She got through! How can we help prevent this?
Bake in our assumptions!
groups.update( {_id: ‘8rOIwh2VD’, ‘members.name’: {$ne: ‘Eve’}, ‘banned.name’: {$ne: ‘Eve’}}, {$push: {members: {name: ‘Eve’, dateJoined: new Date()}
}, }, function(err, updateCount) { if (err) throw err; if (updateCount !== 1) throw new Error(‘bugs!’); callback(updateCount === 1); });
Always Specify Fields
To assign a player a new chain, we only need their _id and game information
players
account
info
login
chainHistory
stats
gamegame
To load a player’s profile content, we just need their history, stats, and profile info
players
account
info
login
chainHistory
stats
gamechainHistory
stats
info
We need to assign Bob a new chain, lets figure out what state he needs next
players.find({name: ‘Bob’});
This will fetch and send back the ENTIRE player object!
Lets specify that we only want our “game” subdocument
players.find({name: ‘Bob’}, {fields: [‘game’]});
Hooray! Much less to send over the series of tubes.
Some Pain Points
Less obvious database structure
mysql> DESC players;+----------------+---------+------+-----+---------+-------+| Field | Type | Null | Key | Default | Extra |+----------------+---------+------+-----+---------+-------+| id | int(11) | NO | PRI | NULL | || activeStateId | int(11) | YES | | NULL | || activeStepId | int(11) | YES | | NULL | |+----------------+---------+------+-----+---------+-------+
Data integrity up to application
Mysql> ALTER TABLE players ADD CONSTRAINT fk_players_chains FOREIGN KEY (activeChainId) REFERENCES chains(id);
/* Oh no! This is going to mess things up! */
players.update({_id: ‘c58D4’}, {$set: {‘game.activeChain_id’: ‘bad_id’}});
No mature GUI query development tools
If we could start over would we still use MongoDB?
Absolutely!
• NoSQL in general is great for rapid prototyping
• Fantastic performance
• Intuitive language keeps it easy to run one-off queries
• Excellent (and now officially supported) Node driver
Questions?
@DoodleOrDie@Zugwalt