PuppetDB: One Year Faster - PuppetConf 2014
PuppetDB: One Year Faster - Deepak Giridharagopal, Puppet Labs
PuppetDB (ノ°ヮ°)ノ*:・゚✧2014✧
RUNPDB
Deepak Giridharagopal
deepak@puppetlabs.com
@grim_radical
[Diagrams: facts, catalogs, and reports pass between the Puppet agent and the Puppet master and are stored in PuppetDB. "Yum!"]
[Diagrams: catalogs moving between the Puppet master and PuppetDB.]
Software should be self-regulating
[Diagrams: nodes foo.com, bar.com and goo.com writing to STORAGE]
Self-regulating catalog & fact storage!
Self-regulating report storage!
Self-regulating node storage!
Deduplication of catalogs!
[Diagram: command processing. Labels: /commands, MQ, Parse, Process, Delayed, Dead Letter Office, UUID]
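One way to read the labels on that diagram is as a queue-backed pipeline: a command is accepted and given a UUID, queued, parsed, and processed, with failures retried after a delay and finally parked in a dead letter office. The sketch below illustrates that reading only. The class, the method names, and the retry limit are hypothetical and are not taken from PuppetDB's code.

```python
import json
import uuid
from collections import deque

MAX_ATTEMPTS = 3  # assumption: the slide does not state a retry policy


class CommandPipeline:
    """Toy in-memory model of submit -> queue -> parse -> process."""

    def __init__(self, handler):
        self.handler = handler    # callable that applies a parsed command
        self.queue = deque()      # stands in for the message queue (MQ)
        self.delayed = deque()    # commands waiting for another attempt
        self.dead_letters = []    # the "dead letter office"
        self.processed = []       # ids of commands that succeeded

    def submit(self, payload):
        """Accept a raw command and return the UUID assigned to it."""
        command_id = str(uuid.uuid4())
        self.queue.append({"id": command_id, "payload": payload, "attempts": 0})
        return command_id

    def run_once(self):
        """Re-queue delayed commands, then drain the queue once."""
        self.queue.extend(self.delayed)
        self.delayed.clear()
        while self.queue:
            message = self.queue.popleft()
            try:
                command = json.loads(message["payload"])
            except json.JSONDecodeError:
                # Unparseable input will never succeed, so do not retry it.
                self.dead_letters.append(message)
                continue
            try:
                self.handler(command)
                self.processed.append(message["id"])
            except Exception:
                message["attempts"] += 1
                if message["attempts"] >= MAX_ATTEMPTS:
                    self.dead_letters.append(message)
                else:
                    self.delayed.append(message)
```

Returning the UUID at submission time lets a caller refer to a command later without waiting for it to be processed.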
Querying
[Diagrams: four nodes (foo.com, bar.com, baz.com, goo.com), each with the resources File[/foo], File[/bar], Package[foo], Package[bar], Service[foo], Service[bar], Exec[echo foo] and Exec[echo bar]. Each slide narrows the selection with a more specific endpoint:]
/v4/resources
/v4/resources/Service
/v4/resources/Service/foo
/v4/nodes/foo.com/resources
/v4/nodes/foo.com/resources/Package
/v4/nodes/foo.com/resources/Package/foo
• Can query facts, nodes, resources, reports, events, metrics
• Advanced queries via AST-based language
• Aggregates, ordering, paging, streaming, subqueries
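The endpoint paths and the AST-based query language can be combined in a single request. The sketch below only builds the URL and sends nothing over the network. The base URL is an assumption, the helper name is made up for this example, and passing the AST as a JSON-encoded `query` parameter follows the PuppetDB query API as I understand it, so check it against the documentation for your version.

```python
import json
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8080"  # assumption: host and port depend on your install


def resources_url(node=None, resource_type=None, title=None, query=None):
    """Build a /v4 resources URL, narrowing by node, type and title."""
    parts = ["v4"]
    if node is not None:
        parts += ["nodes", node]
    parts.append("resources")
    if resource_type is not None:
        parts.append(resource_type)
        if title is not None:
            parts.append(title)
    # Percent-encode each path segment on its own so "/" inside a title is escaped.
    url = BASE_URL + "/" + "/".join(quote(part, safe="") for part in parts)
    if query is not None:
        # The AST query is a JSON array such as ["=", "type", "Service"].
        url += "?" + urlencode({"query": json.dumps(query)})
    return url


all_services = resources_url(resource_type="Service")
foo_packages = resources_url(node="foo.com", resource_type="Package")
filtered = resources_url(query=["=", "type", "Service"])
```

Here `all_services` and `filtered` ask for the same kind of data in two ways: the first through the path, the second through an AST query on the broadest endpoint.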
A long time ago in a galaxy far, far away…
uh, like here a year ago…
Just released 1.4.0, Around 11k deployments, Basic streaming support, Other stuff…
We’ve been busy!
• Soft write failures
• Paging support
• Resource containment paths in events
• Event aggregates
• Differential fact storage
• Differential edge storage
• Pervasive streaming of query results
• 3dfx Voodoo2 and Soundblaster AWE32 support
• Improved de-duplication
• Resource parameter caching
• PostgreSQL Hot Standby support for faster reads
• Debugging of de-duplication algorithm
• Full director’s commentary
• Compressed responses
• A pony
• Certificate chain support
• Support for puppet environments
• Event subqueries
• Prepared statement caching
• Direct POST of json data in terminus
• Brings Aeris from Final Fantasy VII back to life
• Profiling support for terminus
• Faster message parsing
• Structured fact storage and querying
• Unified query subsystem in v4 API
• Riker’s beard
docs.puppetlabs.com/puppetdb/latest/release_notes.html
We can’t get through that entire list, but I’ll try to highlight a few shiny bits
1. Differential storage
[Diagrams: STORAGE holds bar.com's catalog: File[/foo], File[/bar], Package[foo], Package[bar], Service[foo], Service[bar], Exec[echo foo], Exec[echo bar]. bar.com submits a new catalog that differs in one resource: Service[zzz] in place of Service[bar].]
• Edges and facts, too
• Trades reads for writes, which is a good tradeoff.
• PostgreSQL’s heap-only tuples help a lot.
~90% fewer writes
Thanks to the folks at Spotify, the community, etc. that helped us with this!
2. More effective de-duplication
Order matters!
{"foo" => "goo", "bar" => "baz"}
{"bar" => "baz", "foo" => "goo"}
904d4d…
11c05d…
• Restructuring data prior to hashing results in far fewer false negatives
• The fastest way to persist data is to already have it persisted!
~60-70% boost for users with ordinarily low dedupe rates
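The ordering problem is easy to reproduce. In the sketch below, two maps with the same content but a different key order hash differently when serialized as-is, and identically once the keys are sorted first. SHA-1 over JSON is an assumption chosen for illustration; the slides do not say which hash or serialization PuppetDB uses.

```python
import hashlib
import json


def naive_hash(data):
    """Hash the data in whatever key order it happens to arrive in."""
    serialized = json.dumps(data)
    return hashlib.sha1(serialized.encode("utf-8")).hexdigest()


def canonical_hash(data):
    """Sort keys before hashing so equal content always hashes equally."""
    serialized = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(serialized.encode("utf-8")).hexdigest()


first = {"foo": "goo", "bar": "baz"}
second = {"bar": "baz", "foo": "goo"}

false_negative = naive_hash(first) != naive_hash(second)        # same data, different hash
deduplicated = canonical_hash(first) == canonical_hash(second)  # same data, same hash
```

With the canonical form, a catalog that only differs in ordering is recognized as a duplicate and does not need to be stored again.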
Thanks to the folks at CERN that helped us debug this!
3. Hot standby
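[Diagram, before: PuppetDB sends both WRITE and READ traffic to a single PostgreSQL.]
[Diagram, after: PuppetDB sends WRITE traffic to PostgreSQL and READ traffic to a PostgreSQL standby, kept up to date by REPLICATION.]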
Less I/O contention for reads and writes yields better throughput
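The split can be pictured as a small routing decision: writes always go to the primary, reads go to the standby when one is configured. The class below is a hypothetical sketch of that decision, with database targets reduced to plain strings. It is not how PuppetDB is configured or implemented.

```python
class DatabaseRouter:
    """Pick a database target for an operation."""

    def __init__(self, primary, standby=None):
        self.primary = primary
        self.standby = standby  # optional hot standby used for reads

    def target_for(self, operation):
        if operation not in ("read", "write"):
            raise ValueError(f"unknown operation: {operation!r}")
        # Reads move to the standby when there is one; writes never do.
        if operation == "read" and self.standby is not None:
            return self.standby
        return self.primary


single = DatabaseRouter("postgresql-primary")
split = DatabaseRouter("postgresql-primary", standby="postgresql-standby")
```

With asynchronous replication, a read served by the standby may be slightly behind the primary, which is the price paid for taking read I/O off the node that handles writes.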
4. Environment support
[Diagrams: STORAGE holds File[/foo], File[/bar], Package[foo], Package[bar], Service[foo], Service[bar], Exec[echo foo] and Exec[echo bar] for two nodes: foo.com in TEST and bar.com in PROD.]
All Files, please!
Anything for you! [returns File[/foo] and File[/bar]]
WTF?!
I’m trying my best!
All Files for env TEST, plz!
Anything for you! [returns File[/foo]]
Thanks!
We’re friends again! ❤️
• Any reception or transmission of data now includes the environment where possible
• Queries can be isolated to a single environment
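The exchange above can be mimicked in memory: once every stored resource carries its environment, a query can ask for one environment only. The data below is one reading of the diagram (foo.com in TEST managing File[/foo], bar.com in PROD managing File[/bar]) and is an assumption, as are the record layout and the function name.

```python
# Assumed sample data: which node owns which resource is a guess from the slide.
RESOURCES = [
    {"certname": "foo.com", "environment": "TEST", "type": "File", "title": "/foo"},
    {"certname": "bar.com", "environment": "PROD", "type": "File", "title": "/bar"},
    {"certname": "foo.com", "environment": "TEST", "type": "Package", "title": "foo"},
    {"certname": "bar.com", "environment": "PROD", "type": "Package", "title": "bar"},
]


def find_resources(resources, resource_type, environment=None):
    """Return resources of one type, optionally limited to one environment."""
    return [
        resource
        for resource in resources
        if resource["type"] == resource_type
        and (environment is None or resource["environment"] == environment)
    ]


all_files = find_resources(RESOURCES, "File")                       # mixes TEST and PROD
test_files = find_resources(RESOURCES, "File", environment="TEST")  # TEST only
```

Without the environment filter the answer mixes TEST and PROD data, which is the "WTF?!" moment on the slide.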
5. Unified query engine
[Diagram, before: inside PuppetDB, each of Facts, Resources, Nodes, Reports and Events had its own pipeline in front of PostgreSQL: Parse, Map valid fields, Map valid operators, Compile to SQL.]
[Diagram, after: QUERY enters one shared pipeline in PuppetDB (Parse to AST, Term rewriting, Apply operators, Compile to SQL) and the result goes to PostgreSQL.]
• Common query engine underpinning v4 API
• Operators are available uniformly across all v4 endpoints
• We can add new endpoints and fields much faster
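A shared engine means the operator handling is written once and each endpoint only contributes its list of valid fields. The sketch below compiles a small AST query into a parameterized SQL fragment in that spirit. The field table, the operator set shown, and the `?` placeholder style are assumptions for illustration and do not reflect PuppetDB's real schema or compiler.

```python
# Hypothetical per-endpoint field lists: adding an endpoint means adding a row here.
VALID_FIELDS = {
    "resources": {"certname", "environment", "type", "title"},
    "nodes": {"certname"},
}
COMPARISONS = {"=": "=", ">": ">", "<": "<", "~": "~"}


def compile_query(endpoint, ast):
    """Compile an AST query into (sql_fragment, parameters)."""
    operator, *args = ast
    if operator in ("and", "or"):
        fragments, params = [], []
        for sub_query in args:
            sql, sub_params = compile_query(endpoint, sub_query)
            fragments.append(sql)
            params.extend(sub_params)
        return "(" + f" {operator.upper()} ".join(fragments) + ")", params
    if operator == "not":
        sql, params = compile_query(endpoint, args[0])
        return f"NOT ({sql})", params
    if operator in COMPARISONS:
        field, value = args
        if field not in VALID_FIELDS[endpoint]:
            raise ValueError(f"{field!r} is not queryable on {endpoint}")
        # Only whitelisted field names reach the SQL text; values stay parameters.
        return f"{field} {COMPARISONS[operator]} ?", [value]
    raise ValueError(f"unknown operator: {operator!r}")


sql, params = compile_query(
    "resources", ["and", ["=", "type", "Service"], ["=", "title", "foo"]]
)
```

Because `and`, `or` and `not` are handled in one place, every endpoint gets them for free, which matches the point about operators being available uniformly.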
6. Structured/Trusted fact support
{
  "cpus": {
    "cpu1": { "bogomips": 6000 }
  },
  "networking": {
    "eth0": {
      "ipaddresses": [ "1.1.1.5" ],
      "macaddresses": [ "aa:bb:cc:dd:ee:00" ]
    }
  }
}
["=", "path", ["networking", "eth0", "macaddresses", 0]]
["~>", "path", ["networking", "eth.*", "macaddresses", ".*"]]
• PostgreSQL’s pg_trgm index
• Trusted facts
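To make the two path queries concrete, the sketch below flattens a structured fact into (path, value) pairs and matches paths either exactly (`=`) or segment by segment with regular expressions (`~>`). It runs in memory only. Requiring each pattern to match a whole segment is an assumption about the semantics, and the function names are made up for this example.

```python
import re


def flatten(value, path=()):
    """Yield (path, leaf_value) pairs for nested dicts and lists."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, path + (key,))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, path + (index,))
    else:
        yield path, value


def match_paths(facts, operator, pattern):
    """Return the (path, value) pairs whose path matches the pattern."""
    if operator not in ("=", "~>"):
        raise ValueError(f"unknown operator: {operator!r}")
    matches = []
    for path, value in flatten(facts):
        if len(path) != len(pattern):
            continue
        if operator == "=":
            hit = list(path) == list(pattern)
        else:
            # Assumption: every regex must match its whole path segment.
            hit = all(
                re.fullmatch(str(regex), str(segment))
                for regex, segment in zip(pattern, path)
            )
        if hit:
            matches.append((path, value))
    return matches


facts = {
    "cpus": {"cpu1": {"bogomips": 6000}},
    "networking": {
        "eth0": {
            "ipaddresses": ["1.1.1.5"],
            "macaddresses": ["aa:bb:cc:dd:ee:00"],
        }
    },
}

exact = match_paths(facts, "=", ["networking", "eth0", "macaddresses", 0])
by_regex = match_paths(facts, "~>", ["networking", "eth.*", "macaddresses", ".*"])
```

Both queries select the same single MAC address here. The regex form would also pick up other interfaces whose names start with "eth" if the fact contained them.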
So where are we now?
More features, but also more speed
Language bindings: Ruby, Python, JavaScript, C/C++, Go, Clojure, Java…
Trapperkeeper https://github.com/puppetlabs/trapperkeeper
Puppetboard https://github.com/nedap/puppetboard
Puppet Explorer https://github.com/spotify/puppetexplorer
A new deployment every 10 minutes
Coming soon: more efficient GC
Coming soon: simplified query syntax
Coming soon: historical data
Thanks!