Optimizing Slow Queries with Indexes and Creativity

Posted on 15-Jan-2015


TRANSCRIPT

Optimizing Slow Queries with Indexes and Creativity

Chris Winslett
chris@mongohq.com

My Background

For the past year, I’ve looked at MongoDB logs at least once every day.

We routinely answer the question “how can I improve performance?”

Who’s this talk for?

New to MongoDB

Seeing some slow operations, and need help debugging

Running database operations on a sizeable deployment

I have a MongoDB deployment, and I’ve hit a performance wall


What should you learn?

Know where to look on a running MongoDB to uncover slowness, and discuss solutions.

MongoDB has performance “patterns”.

How to think about improving performance.

And . . .

Schema Design

Design with the end in mind.

First, a Simple One

query  getmore  command  res   faults  locked db    ar|aw  netIn  netOut  conn  time
  129        4        7  126m       2  my_db:0.0%     3|0    27k    445k    42   15:36:54
   64        4        3  126m       0  my_db:0.0%     5|0    12k    379k    42   15:36:55
   65        7        8  126m       0  my_db:0.1%     3|0    15k    230k    42   15:36:56
   65        3        3  126m       1  my_db:0.0%     3|0    13k    170k    42   15:36:57
   66        1        6  126m       1  my_db:0.0%     0|0    14k    262k    42   15:36:58
   32        8        5  126m       0  my_db:0.0%     5|0     5k    445k    42   15:36:59

a truncated mongostat

Alerted due to high CPU

log

Example 1

[conn73454] query my_db.my_collection query: { $query: { publisher: "US Weekly" }, orderby: { publishedAt: -1 } } ntoreturn:5 ntoskip:0 nscanned:33236 scanAndOrder:1 keyUpdates:0 numYields: 21 locks(micros) r:317266 nreturned:5 reslen:3127 178ms

Solution

Example 1

{ $query: { publisher: "US Weekly" }, orderby: { publishedAt: -1 } }

db.my_collection.ensureIndex({"publisher": 1, "publishedAt": -1}, {background: true})

We are fixing this query

With this index

I would show you the logs, but now they are silent.

The Pattern

Example 1

Inefficient read queries doing in-memory table scans cause high CPU load.

Caused by not matching indexes to queries.
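The fix generalizes: a compound index whose key order matches the query’s equality field and then its sort field lets MongoDB read results already in order. A rough plain-JavaScript illustration (not the mongo shell; the sample documents are made up) of why the scanAndOrder step disappears:

```javascript
// A hypothetical handful of documents, using the field names from Example 1.
const docs = [
  { publisher: "US Weekly", publishedAt: 3 },
  { publisher: "Other",     publishedAt: 9 },
  { publisher: "US Weekly", publishedAt: 7 },
  { publisher: "US Weekly", publishedAt: 1 },
];

// Without the index: scan every document, then sort in memory.
// This is the nscanned:33236 scanAndOrder:1 case from the log.
const scanned = docs
  .filter(d => d.publisher === "US Weekly")       // full scan
  .sort((a, b) => b.publishedAt - a.publishedAt)  // in-memory sort
  .slice(0, 5);                                   // ntoreturn:5

// With the index: entries are pre-sorted by (publisher asc, publishedAt desc),
// so the first matching entries are already in the right order.
const index = [...docs].sort((a, b) =>
  a.publisher < b.publisher ? -1 :
  a.publisher > b.publisher ?  1 :
  b.publishedAt - a.publishedAt);
const viaIndex = index
  .filter(d => d.publisher === "US Weekly")
  .slice(0, 5);                                   // no sort step needed

console.log(JSON.stringify(scanned) === JSON.stringify(viaIndex)); // true
```

Both paths return the same five newest documents; the index just makes the sort free.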

Example 2

query  delete  res   faults  locked db      idx miss %  qr|qw  ar|aw  netIn  netOut  conn
   25       6  346m       0  my_db:188.1%            0    0|0    0|2    25k     45k   117
   24       6  346m       0  my_db:188.6%            0    0|0    0|1    27k     44k   117
   24       6  346m       0  my_db:184.3%            0    0|0    0|1    21k     36k   117
   24       6  346m       0  my_db:190.9%            0    0|0    0|1    20k     33k   117
   19       4  346m       0  my_db:191.5%            0    0|0    0|0    21k     41k   117

a truncated mongostat

tail

[conn72593] remove my_db.my_collection query: { status: "some chunk of text" } keyUpdates:0 numYields: 15 locks(micros) w:213415 210ms

Example 2

Solution

Example 2

db.my_collection.remove({status: "some chunk of text"})

db.my_collection.ensureIndex({status: 1})

This is the slow query

With this index

The Pattern

Example 2

Inefficient write queries cause high lock contention.

Caused by losing track of your indexes / queries.

Example 3

insert  query  update  delete  getmore  command  faults  locked %  idx miss %  qr|qw  ar|aw
    *0     *0      *0      *0        0      1|0    1422         0           0    0|0   50|0
    *0      6      *0      *0        0      6|0     575         0           0    0|0   51|0
    *0      3      *0      *0        0      1|0    1047         0           0    0|0   50|0
    *0      2      *0      *0        0      3|0    1660         0           0    0|0   50|0

a truncated mongostat

Alerted on high CPU

tail

[initandlisten] connection accepted from ....
[conn4229724] authenticate: { authenticate: ....
[initandlisten] connection accepted from ....
[conn4229725] authenticate: { authenticate: .....
[conn4229717] query ..... 102ms
[conn4229725] query ..... 140ms

amazingly quiet

Example 3

currentOp

> db.currentOp()
{
  "inprog" : [
    {
      "opid" : 66178716,
      "lockType" : "read",
      "secs_running" : 760,
      "op" : "query",
      "ns" : "my_db.my_collection",
      "query" : {
        keywords: { $in: ["keyword1", "keyword2"] },
        tags: { $in: ["tags1", "tags2"] }
      },
      orderby: { "created_at": -1 },
      "numYields" : 21
    }
  ]
}

Example 3

Solution

> db.currentOp().inprog.filter(function(row) {
    return row.secs_running > 100 && row.op == "query"
  }).forEach(function(row) {
    db.killOp(row.opid)
  })
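To see what that filter selects, the same predicate can be run outside the shell against a hypothetical inprog array (plain JavaScript; the opids and values below are made up):

```javascript
// A fabricated currentOp().inprog snapshot.
const inprog = [
  { opid: 66178716, op: "query",  secs_running: 760 },  // long-running query: kill
  { opid: 66178717, op: "query",  secs_running: 12  },  // fast query: leave alone
  { opid: 66178718, op: "insert", secs_running: 300 },  // not a query: leave alone
];

// Same predicate as the shell one-liner above.
const toKill = inprog.filter(row => row.secs_running > 100 && row.op === "query");

console.log(toKill.map(row => row.opid)); // [ 66178716 ]
```

In the real shell, db.killOp(row.opid) would then be called for each match.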

Example 3

Return Stability to Database

Disable query, and refactor schema.

Refactoring

I have one word for you: “Schema”

Example 4

A map reduce has gradually run slower and slower.

Finding Offenders

Example 4

Find the time of the slowest query of the day:

grep '[0-9]\{3,100\}ms$' $MONGODB_LOG | awk '{print $NF}' | sort -n
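The same pipeline can be exercised end-to-end on a fabricated sample log (the temp file, connection ids, and durations below are made up for illustration):

```shell
# Build a tiny fake mongod log: only durations of 3+ digits should match.
MONGODB_LOG=$(mktemp)
printf '%s\n' \
  '[conn1] query my_db.a ... 102ms' \
  '[conn2] query my_db.b ... 421185ms' \
  '[conn3] query my_db.c ... 99ms' > "$MONGODB_LOG"

# Slowest duration of the day, as on the slide (grep -o keeps just the match):
slowest=$(grep -o '[0-9]\{3,100\}ms$' "$MONGODB_LOG" | sort -n | tail -1)
echo "$slowest"   # 421185ms

# Then pull the full log line for that duration to see the offending query:
grep " ${slowest}\$" "$MONGODB_LOG"
```

The second grep is the useful follow-up: the duration alone tells you when, the full line tells you what.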

Slowest Map Reduce

Example 4

my_db.$cmd command: { mapreduce: "my_collection", map: function() {}, query: { $or: [ { object.type: "this" }, { object.type: "that" } ], time: { $lt: new Date(1359025311290), $gt: new Date(1358420511290) }, object.ver: 1, origin: "tnh" }, out: "my_new_collection", reduce: function(keys, vals) { .... } } ntoreturn:1 keyUpdates:0 numYields: 32696 locks(micros) W:143870 r:511858643 w:6279425 reslen:140 421185ms

Solution

Example 4

Problem

The query is slow because it has multiple multi-value operators: $or, $gt, and $lt.

Solution

Change the schema to use an “hour_created” attribute:

hour_created: "%Y-%m-%d %H"

Create an index on “hour_created” followed by the “$or” values. Query using the new “hour_created”.
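A sketch of computing that attribute at write time (plain JavaScript; the hourCreated helper is hypothetical, the "%Y-%m-%d %H" format comes from the slide):

```javascript
// Bucket a timestamp into the hour it belongs to, formatted "%Y-%m-%d %H" (UTC).
// Stored on every document at insert time as hour_created.
function hourCreated(date) {
  const pad = n => String(n).padStart(2, "0");
  return `${date.getUTCFullYear()}-${pad(date.getUTCMonth() + 1)}` +
         `-${pad(date.getUTCDate())} ${pad(date.getUTCHours())}`;
}

console.log(hourCreated(new Date(Date.UTC(2013, 0, 24, 9, 41)))); // "2013-01-24 09"
```

Because every document created in the same hour shares one bucket value, the $lt/$gt range collapses to an equality (or a short $in list of hour strings), which an index led by hour_created serves efficiently.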

Words of caution

2 / 4 solutions were to add an index.

New indexes as a solution scale poorly.

Sometimes . . .

It is best to do nothing, except add shards / add hardware.

Go back to the drawing board on the design.

Bad things happen to good databases?

ORMs

Manage your indexes and queries.

Constraints will set you free.


Road Map for Refactoring

Measure, measure, measure.

Find your slowest queries and determine if they can be indexed

Rephrase the problem you are solving by asking “How do I want to query my data?”


Thank you!

Questions?

E-mail me: chris@mongohq.com

