[cb16]...
TRANSCRIPT
ISAAC DAWSON,
AROUND THE WEB IN 80 HOURS: SCALABLE FINGERPRINTING WITH CHROMIUM AUTOMATION
VERACODE
ABOUT ME:
▸ Previously at @stake, Symantec (10 years)
▸ Moved into research role at Veracode, Inc. (6 years)
▸ Living in Japan for 12 years
▸ I <3
IT ALL STARTED IN 2012…
SECURITY HEADER SCANNING HISTORY
▸ All scanners use the Alexa Top 1 Million URLs
▸ Galexa (November 2012 - March 2014)
▸ Golexa (March 2014 - February 2016)
▸ Creeper v0-v1 (February 2016 - July 2016)
▸ Creeper v2 (July 2016 - …)
SUMMARY OF SYSTEMS & COMPONENTS
▸ Admin (x1) - Manages jobs
▸ Agents (x50) - Analyzes URLs
▸ DB Writers (x4) - Feeds analysis data into the DB & S3
▸ Database (x1) - PostgreSQL 9.5 DB
▸ NSQ - A message queue for URLs, reports and responses
▸ S3 - Stores serialized DOM and HTML/JS
THE MESSAGE QUEUE - NSQD, NSQLOOKUPD
▸ NSQ is an easy to deploy message queue
▸ JSON messages between all systems
▸ All agents point to Admin service running NSQLookupd
HELPFUL NSQ FEATURES
// Create consumer
c.urlConsumer, err = nsq.NewConsumer(job.Topics["url"], creeper_types.UrlChannel, cfg)

// Process numBrowsers messages concurrently (7)
c.urlConsumer.AddConcurrentHandlers(nsq.HandlerFunc(c.processUrls), numBrowsers)

// Job taking too long to handle/process a message?
msg.Touch() // notify we are still working on this message

// Need to requeue because chrome crashed?
msg.RequeueWithoutBackoff(-1)

// Need to change max # of inflight messages?
c.urlConsumer.ChangeMaxInFlight(c.getInflightCount())
DATA STORAGE
DATAFLOW
[Diagram boxes: ADMIN, AGENT (x3), WRITER (x3), DB, S3]
CREEPER AGENTS: GETTING THE DATA
BROWSER AUTOMATION REQUIREMENTS
▸ Automatable
▸ Fast
▸ Capture network
▸ Capture various browser events (CSP violations)
▸ Inject JavaScript
CHOSE CHROME, FOR OBVIOUS REASONS…
▸ Each agent runs 3-6 tabs concurrently
▸ Headless, uses Xvfb
▸ Can get full read access to network response data
▸ Easily inject JavaScript
▸ Can subscribe to console messages
AGENT DESIGN
[Diagram boxes: CREEPER AGENT, BROWSER MANAGER, ANALYZER, REPORTER, APP LOGIC]
GOOGLE CHROME REMOTE DEBUGGER
▸ Huge definition files: browser_protocol.json and js_protocol.json
{
  "version": { "major": "1", "minor": "1" },
  "domains": [{
    "domain": "Inspector",
    "hidden": true,
    "types": [],
    "commands": [{
      "name": "enable",
      "description": "Enables inspector domain...",
      "handlers": ["browser", "renderer"]
    }],
    "events": [{
      "name": "evaluateForTestInFrontend",
      "parameters": [ … ]
    }]
  }]
}
GCD
▸ GCD generates Go code using templates
▸ Remote access to debugger events, functions, types.
▸ Can be updated easily as the protocol files change
GCD WAS GOOD BUT…
▸ Needed something better
▸ Built autogcd to automate:
  ▸ Trapping console messages
  ▸ Intercepting network data
  ▸ Injecting JS
▸ Took some inspiration from WebDriver
GETTING CSP EVENTS
func (b *Browser) StartIntercepting() error {
    b.tab.GetConsoleMessages(b.cspHandler())
    return nil
}

func (b *Browser) cspHandler() autogcd.ConsoleMessageFunc {
    return func(tab *autogcd.Tab, message *gcdapi.ConsoleConsoleMessage) {
        // only console messages from the "security" source carry CSP reports
        if message.Source != "security" {
            return
        }
        parseCsp(b.creeperData.CspResults, b.creeperData.ReportOnlyCspResults, message.Text)
    }
}
TRAPPING NETWORK RESPONSES
func (b *Browser) StartIntercepting() error {
    b.tab.GetNetworkTraffic(nil, b.responseHandler(), b.respFinishedHandler())
    return nil
}

func (b *Browser) responseHandler() autogcd.NetworkResponseHandlerFunc {
    return func(tab *autogcd.Tab, response *autogcd.NetworkResponse) {
        // creeperResponse setup is elided on the slide
        creeperResponse.Url = response.Response.Url
        // wait for this request's body (see respFinishedHandler)
        b.networkContainer.WaitFor(response.RequestId)
        creeperResponse.ResponseBody, _ = b.encodeBody(response.RequestId, creeperResponse.MimeType, creeperResponse.Url)
        b.networkContainer.AddReady(creeperResponse)
    }
}

// mark the body as ready
func (b *Browser) respFinishedHandler() autogcd.NetworkFinishedHandlerFunc {
    return func(tab *autogcd.Tab, requestId string, dataLength, timeStamp float64) {
        b.networkContainer.BodyReady(requestId)
    }
}
INJECTING JAVASCRIPT
▸ Extract JS libraries and versions
▸ Retire.js and Wappalyzer have some good pointers
▸ Created a JSON file with 86 frameworks
▸ Must wait for the page to be fully loaded
INJECTING JAVASCRIPT - THE QUERIES
{
  "libraries": [
    {
      "url": "http://jquery.com/",
      "key": "jquery",
      "statement": "jQuery.fn.jquery"
    },
    {
      "url": "https://jquerymobile.com/",
      "key": "jquery-mobile",
      "statement": "jQuery.mobile.version"
    },
    {
      "url": "http://www.embeddedjs.com/",
      "key": "embeddedjs 1.0",
      "statement": "(typeof EJS === \"function\" && typeof EJS.Buffer === \"function\") ? \"ejs 1.0\":\"\""
    },
    {
      "url": "http://www.embeddedjs.com/",
      "key": "embeddedjs 0.x",
      "statement": "(typeof EJS === \"function\" && typeof EjsScanner === \"function\") ? \"ejs 0.x\":\"\""
    }
  ]
}
INJECTING JAVASCRIPT - INJECTING
for _, library := range JsLibs.Libraries {
    // run each detection statement in the page; a non-empty result is the version
    res, err := b.ExecuteScript(library.Statement)
    if err == nil && string(res) != "" {
        log.Printf("%s library result was: %s\n", library.Key, string(res))
        report.JavaScriptLibraries[library.Key] = string(res)
    }
}
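The loop above ranges over JsLibs.Libraries and reads Key and Statement from each entry. A minimal sketch of how the query file could be decoded into that shape with encoding/json follows. The type names Library and LibraryFile, the parseLibraries helper, and the sample constant are assumptions for illustration, not taken from the Creeper source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Library mirrors one entry of the query file (hypothetical type name).
type Library struct {
	URL       string `json:"url"`
	Key       string `json:"key"`
	Statement string `json:"statement"`
}

// LibraryFile is the top-level document: {"libraries": [...]}.
type LibraryFile struct {
	Libraries []Library `json:"libraries"`
}

// parseLibraries decodes the JSON query file into Go structs.
func parseLibraries(data []byte) (*LibraryFile, error) {
	var f LibraryFile
	if err := json.Unmarshal(data, &f); err != nil {
		return nil, err
	}
	return &f, nil
}

// sample holds the first two entries from the slide.
const sample = `{"libraries": [
  {"url": "http://jquery.com/", "key": "jquery", "statement": "jQuery.fn.jquery"},
  {"url": "https://jquerymobile.com/", "key": "jquery-mobile", "statement": "jQuery.mobile.version"}
]}`

func main() {
	libs, err := parseLibraries([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, l := range libs.Libraries {
		fmt.Printf("%s -> %s\n", l.Key, l.Statement)
	}
}
```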
INJECTING JAVASCRIPT - WHEN IS A PAGE DONE?
▸ DOMContentLoaded doesn’t handle dynamically loaded JS
▸ Listen for DOM change events
▸ Page loaded if no DOM change events occur for > 2 seconds
▸ Timeout after 5 seconds
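The rule above (quiet for 2 seconds means loaded, give up after 5 seconds) can be sketched with two timers. This is an illustration only: waitForStable and simulate are hypothetical names, the DOM change events are modelled as a plain channel rather than real debugger events, and the 5 second timeout is assumed to cover the whole wait.

```go
package main

import (
	"fmt"
	"time"
)

// waitForStable blocks until no event arrives on domChanges for `quiet`,
// or until `timeout` has elapsed overall. It returns true if the page
// went quiet and false if the overall timeout fired first.
func waitForStable(domChanges <-chan struct{}, quiet, timeout time.Duration) bool {
	deadline := time.After(timeout)
	quietTimer := time.NewTimer(quiet)
	defer quietTimer.Stop()
	for {
		select {
		case <-domChanges:
			// A DOM change arrived: restart the quiet period.
			if !quietTimer.Stop() {
				select {
				case <-quietTimer.C:
				default:
				}
			}
			quietTimer.Reset(quiet)
		case <-quietTimer.C:
			return true
		case <-deadline:
			return false
		}
	}
}

// simulate sends n events spaced gapMs apart and returns the verdict of
// waitForStable. It exists only to exercise the sketch.
func simulate(n, gapMs, quietMs, timeoutMs int) bool {
	events := make(chan struct{})
	done := make(chan struct{})
	defer close(done)
	go func() {
		for i := 0; i < n; i++ {
			select {
			case events <- struct{}{}:
			case <-done:
				return
			}
			time.Sleep(time.Duration(gapMs) * time.Millisecond)
		}
	}()
	quiet := time.Duration(quietMs) * time.Millisecond
	timeout := time.Duration(timeoutMs) * time.Millisecond
	return waitForStable(events, quiet, timeout)
}

func main() {
	// Scaled-down durations; the slide uses 2s quiet and 5s timeout.
	fmt.Println("short burst settled:", simulate(3, 10, 100, 1000))
	fmt.Println("endless changes settled:", simulate(1000, 10, 100, 300))
}
```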
CHALLENGES - CONTAMINATION
[Timeline diagram, labels in order: StartCapture, LoadURL, DocumentLoaded, StopCapture]
CHALLENGES - CONTAMINATION - SOLUTION
[Timeline diagram, labels in order: BorrowBrowser, StartCapture, LoadURL, DocumentLoaded, StopCapture, KillBrowser, Start/AddPool]
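One way to read the solution is that every URL borrows a freshly started browser from a pool, and the used browser is killed and replaced instead of being reused. The sketch below illustrates that reading only. The Browser and Pool types are hypothetical stand-ins, and no real Chrome process is started.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Browser stands in for a real Chrome process. The ID shows that every
// URL gets a fresh instance.
type Browser struct{ ID int64 }

// Pool hands out pre-started browsers and never takes a used one back.
type Pool struct {
	ready  chan *Browser
	nextID int64
}

// NewPool starts `size` browsers up front.
func NewPool(size int) *Pool {
	p := &Pool{ready: make(chan *Browser, size)}
	for i := 0; i < size; i++ {
		p.add()
	}
	return p
}

// add starts a fresh browser and places it in the pool (Start/AddPool).
func (p *Pool) add() {
	p.ready <- &Browser{ID: atomic.AddInt64(&p.nextID, 1)}
}

// Borrow blocks until a clean browser is available (BorrowBrowser).
func (p *Pool) Borrow() *Browser { return <-p.ready }

// Kill discards a used browser and replaces it with a new one, so state
// from one URL cannot leak into the next (KillBrowser, Start/AddPool).
func (p *Pool) Kill(b *Browser) {
	// A real implementation would terminate the Chrome process here.
	p.add()
}

// Analyze runs one capture cycle and returns the ID of the browser used.
func (p *Pool) Analyze(url string) int64 {
	b := p.Borrow()
	defer p.Kill(b)
	// StartCapture, LoadURL, DocumentLoaded and StopCapture would go here.
	return b.ID
}

func main() {
	pool := NewPool(2)
	for _, u := range []string{"http://veracode.com", "http://codeblue.jp", "http://google.jp"} {
		fmt.Printf("%s analyzed by browser %d\n", u, pool.Analyze(u))
	}
}
```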
CHALLENGES - CHROME BUG #1
▸ Opening tabs excessively can cause tabs to stop responding to the debugger protocol
CHALLENGES - CHROME BUG #1 - SOLUTION
▸ Mark tabs as ‘dead’
▸ If max dead tab count is reached, drain active URLs and kill chrome
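The dead tab bookkeeping can be sketched as a small counter guarded by a mutex. DeadTabTracker and its methods are hypothetical names, and the limit of 3 in main is an arbitrary value chosen for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// DeadTabTracker counts tabs that stopped answering the debugger protocol.
type DeadTabTracker struct {
	mu      sync.Mutex
	dead    int
	maxDead int
}

// NewDeadTabTracker creates a tracker with the given dead tab limit.
func NewDeadTabTracker(maxDead int) *DeadTabTracker {
	return &DeadTabTracker{maxDead: maxDead}
}

// MarkDead records one unresponsive tab and reports whether the limit
// has been reached, meaning Chrome should be drained and restarted.
func (t *DeadTabTracker) MarkDead() bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.dead++
	return t.dead >= t.maxDead
}

// Reset clears the count after Chrome has been restarted.
func (t *DeadTabTracker) Reset() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.dead = 0
}

func main() {
	tracker := NewDeadTabTracker(3)
	for i := 1; i <= 3; i++ {
		if tracker.MarkDead() {
			// Here the agent would drain active URLs and kill Chrome.
			fmt.Printf("dead tab %d: limit reached, restarting browser\n", i)
			tracker.Reset()
		} else {
			fmt.Printf("dead tab %d: still under the limit\n", i)
		}
	}
}
```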
CHALLENGES - CHROME BUG #2 - CRASHSAFARI.COM
▸ Would completely kill chrome *and* agent
▸ Lost all active tabs
▸ This site cost me about 2-3 weeks development time
CHALLENGES - CRASHSAFARI.COM - SOLUTION
▸ Created killface package
▸ Sends a notification to stop active work
▸ Worker count dynamically adjusted to 1
▸ Pauses queue, runs all unfinished URLs again
▸ Once active count is 0, restart normally
OTHER CHALLENGES
✘ NSQ messages too large, zipping ineffective
✓ Split response data/report data
✘ Sites block AWS IP ranges (craigslist.com etc.)
☹ Timeout…
✘ Concurrency issues
✓ Very careful use of goroutines, channels and timers
✘ Site analysis failures/timeouts
✓ Try 3 times, keep track of retry state
✓ During retry, open a new browser and work on an additional URL
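The "try 3 times, keep track of retry state" rule might look like the sketch below. RetryState, analyzeWithRetry, and the flaky helper are hypothetical names. Opening a new browser and working on another URL during a retry is not modelled here.

```go
package main

import (
	"errors"
	"fmt"
)

// RetryState records how a URL fared across attempts.
type RetryState struct {
	URL      string
	Attempts int
	LastErr  error
}

// maxAttempts follows the "try 3 times" rule from the slide.
const maxAttempts = 3

// analyzeWithRetry calls analyze until it succeeds or maxAttempts is
// reached, and returns the final state.
func analyzeWithRetry(url string, analyze func(url string, attempt int) error) RetryState {
	state := RetryState{URL: url}
	for state.Attempts < maxAttempts {
		state.Attempts++
		state.LastErr = analyze(url, state.Attempts)
		if state.LastErr == nil {
			break
		}
	}
	return state
}

// flaky returns an analyze function that fails on its first n attempts.
// It is a stand-in for a real page analysis.
func flaky(n int) func(string, int) error {
	return func(url string, attempt int) error {
		if attempt <= n {
			return errors.New("timeout")
		}
		return nil
	}
}

func main() {
	ok := analyzeWithRetry("http://veracode.com", flaky(2))
	fmt.Printf("%s: attempts=%d err=%v\n", ok.URL, ok.Attempts, ok.LastErr)

	failed := analyzeWithRetry("http://codeblue.jp", flaky(5))
	fmt.Printf("%s: attempts=%d err=%v\n", failed.URL, failed.Attempts, failed.LastErr)
}
```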
DB WRITERS: STORING THE DATA
PREVIOUSLY…
▸ Creeper v0 had many problems
▸ RDS did not support PostgreSQL 9.5
▸ Duplicate data
▸ For v1, wrote to disk, SHA1 of contents:
▸ /job/files/5/a/b/c/5abcfbe73e39e0572a939b09f1eb16d7.html
▸ v1 did not shard database tables
▸ Database tables were normalized
▸ Lock contention
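The v1 on-disk layout shown above places each file under directories taken from the leading characters of its content hash, so identical bodies map to the same path. A minimal sketch of that layout follows, assuming one directory level per character for the first four hex characters. shardedPath is a hypothetical helper. The filename on the slide is shorter than a full SHA-1 hex digest, so the exact naming scheme is an assumption.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"path"
)

// shardedPath returns root/<c0>/<c1>/<c2>/<c3>/<digest><ext>, where
// digest is the hex SHA-1 of contents and c0..c3 are its first four
// characters. Identical contents always produce the same path.
func shardedPath(root string, contents []byte, ext string) string {
	sum := sha1.Sum(contents)
	digest := hex.EncodeToString(sum[:])
	return path.Join(root, digest[0:1], digest[1:2], digest[2:3], digest[3:4], digest+ext)
}

func main() {
	fmt.Println(shardedPath("/job/files", []byte("<html></html>"), ".html"))
}
```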
DATABASE REFRESHER - NORMALIZING
FLATTENED:
url                  header_name       header_value
http://veracode.com  x-xss-protection  1; mode=block
http://codeblue.jp   x-xss-protection  1; mode=block
http://google.jp     x-xss-protection  1; mode=block report-uri …

NORMALIZED:
url                  header_name_id  header_value_id
http://veracode.com  0               0
http://codeblue.jp   0               0
http://google.jp     0               1

header_name_id  header_name
0               x-xss-protection

header_value_id  header_value
0                1; mode=block
1                1; mode=block report-uri …
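The mapping from the flattened rows to the normalized form can be sketched in memory as follows. The type and function names are hypothetical, and the real system does this work in PostgreSQL rather than in Go maps.

```go
package main

import "fmt"

// FlatRow is one row of the flattened table.
type FlatRow struct{ URL, HeaderName, HeaderValue string }

// NormRow is the same row with strings replaced by lookup IDs.
type NormRow struct {
	URL           string
	HeaderNameID  int
	HeaderValueID int
}

// lookup assigns sequential IDs, starting at 0, to distinct strings.
type lookup struct {
	ids    map[string]int
	values []string
}

func newLookup() *lookup { return &lookup{ids: map[string]int{}} }

// id returns the existing ID for s or allocates the next one.
func (l *lookup) id(s string) int {
	if id, ok := l.ids[s]; ok {
		return id
	}
	id := len(l.values)
	l.ids[s] = id
	l.values = append(l.values, s)
	return id
}

// normalize converts flattened rows into ID rows plus two lookup tables.
func normalize(rows []FlatRow) ([]NormRow, *lookup, *lookup) {
	names, values := newLookup(), newLookup()
	out := make([]NormRow, 0, len(rows))
	for _, r := range rows {
		out = append(out, NormRow{r.URL, names.id(r.HeaderName), values.id(r.HeaderValue)})
	}
	return out, names, values
}

// slideRows reproduces the example rows from the slide.
var slideRows = []FlatRow{
	{"http://veracode.com", "x-xss-protection", "1; mode=block"},
	{"http://codeblue.jp", "x-xss-protection", "1; mode=block"},
	{"http://google.jp", "x-xss-protection", "1; mode=block report-uri …"},
}

func main() {
	rows, names, values := normalize(slideRows)
	for _, r := range rows {
		fmt.Println(r.URL, r.HeaderNameID, r.HeaderValueID)
	}
	fmt.Println(len(names.values), "distinct names,", len(values.values), "distinct values")
}
```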
CHALLENGES - GETTING THE DATA IN QUICKLY
▸ Get the data out of the DB writers as soon as possible
▸ Careful to not overload the database with many connections
▸ Reduce lock contention for writing
SOLUTION #1 - GETTING THE DATA IN QUICKLY
▸ DB Writers batch up reports and responses
▸ Inserted every 2.5-3.5 seconds
▸ Reduces number of required DB connections
SOLUTION #1 BATCHER
func (b *Batcher) AddReport(r *creeper_types.CreeperReport) {
    select {
    case b.reportPool <- r:
        atomic.AddInt32(&b.reportCount, 1)
    }
}

func (b *Batcher) EmptyReports() []*creeper_types.CreeperReport {
    reports := make([]*creeper_types.CreeperReport, 0)
    // drain everything currently buffered, then return without blocking
    for {
        select {
        case report := <-b.reportPool:
            reports = append(reports, report)
        default:
            return reports
        }
    }
    return nil
}
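The batcher above only collects and drains reports. The flush loop that inserts a batch every 2.5-3.5 seconds is not shown on the slides. The sketch below is one way it could work, assuming the range means a randomized interval per flush. Reports are simplified to strings, and Run, flushInterval, and the insert callback are hypothetical names.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

const (
	minFlush = 2500 * time.Millisecond
	maxFlush = 3500 * time.Millisecond
)

// Batcher buffers reports in a channel until the next flush.
type Batcher struct {
	pool chan string
}

// NewBatcher creates a batcher that can buffer up to capacity reports.
func NewBatcher(capacity int) *Batcher {
	return &Batcher{pool: make(chan string, capacity)}
}

// Add queues one report and blocks if the buffer is full.
func (b *Batcher) Add(report string) { b.pool <- report }

// Empty drains everything currently buffered without blocking.
func (b *Batcher) Empty() []string {
	var batch []string
	for {
		select {
		case r := <-b.pool:
			batch = append(batch, r)
		default:
			return batch
		}
	}
}

// flushInterval picks a random duration between minFlush and maxFlush.
func flushInterval() time.Duration {
	return minFlush + time.Duration(rand.Int63n(int64(maxFlush-minFlush)+1))
}

// Run flushes non-empty batches to insert until stop is closed.
func (b *Batcher) Run(insert func([]string), stop <-chan struct{}) {
	for {
		select {
		case <-time.After(flushInterval()):
			if batch := b.Empty(); len(batch) > 0 {
				insert(batch)
			}
		case <-stop:
			return
		}
	}
}

func main() {
	b := NewBatcher(100)
	b.Add("report-1")
	b.Add("report-2")
	fmt.Println("next flush in", flushInterval())
	fmt.Println("batch size:", len(b.Empty()))
}
```

Randomizing the interval would keep the four DB writers from hitting the database at the same instant, which is one plausible reason for a range rather than a fixed period.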
SOLUTION #2 - GETTING THE DATA IN QUICKLY
▸ Insert into temporary table using COPY FROM
▸ Extracted from temporary table and INSERTed into final table. This allows for UPSERTS:
INSERT INTO header_names (header_name)
    SELECT responses_tmp.header_name FROM responses_tmp
    ON CONFLICT DO NOTHING;
CHALLENGES - LARGE TABLES
▸ INSERT INTO … FROM SELECT … on a table with 80,000,000 rows
▸ As tables got bigger, db writers slowed down
▸ This is not scalable
SOLUTION - TABLE SHARDING
▸ Much like sharding for the file system
▸ Requires a key:
▸ URL ID. (Ex: 1,google.com 2,microsoft.com etc)
▸ Only large tables require sharding
TABLE SHARDING
[Diagram: WRITER routes rows by shard key (shardKey = 1, shardKey = 2, shardKey = 3) to the matching sharded tables in the DB]
CREATING A SHARD KEY
▸ Choose the number of times to shard your tables:
  ▸ shardKey = input_id % 32
▸ Created PL/pgSQL functions:
  ▸ EXECUTE merge_headers(job, shardKey)
create unlogged table if not exists job_0_responses (
    response_id serial primary key,
    input_id integer not null,
    body_hash varchar(64) not null,
    resp_url bytea not null,
    resp_uuid varchar(64) unique not null,
    resp_type_id integer references resp_types (resp_type_id) not null,
    status_id integer references status_lines (status_id) not null,
    status_code integer,
    mime_type_id integer references mime_types (mime_type_id) not null,
    response_time bigint
);
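On the writer side, choosing a shard comes down to a modulo on the input ID. The sketch below shows the arithmetic. The table naming pattern job_<job>_<table>_<shard> is an assumption made for illustration, since the slides only show a table called job_0_responses, and shardTable is a hypothetical helper.

```go
package main

import "fmt"

// numShards matches the "input_id % 32" rule from the slide.
const numShards = 32

// shardKey maps an input (URL) ID to one of numShards shards.
func shardKey(inputID int) int {
	return inputID % numShards
}

// shardTable builds a per-shard table name. The naming pattern is an
// assumption, not the scheme used by Creeper.
func shardTable(job int, table string, inputID int) string {
	return fmt.Sprintf("job_%d_%s_%d", job, table, shardKey(inputID))
}

func main() {
	for _, id := range []int{1, 2, 32, 33} {
		fmt.Printf("input_id=%d -> shardKey=%d table=%s\n", id, shardKey(id), shardTable(0, "responses", id))
	}
}
```

Because the key depends only on the input ID, all rows for one URL land in the same shard.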
CONS WITH SHARDING
▸ Added complexity for querying
▸ Best to create a new table with all data for reporting
▸ In the future, may use Citus for sharding across multiple databases
MOVING TO S3
▸ S3 limits to 100 rps, but we were pushing 200-2000 rps
▸ Had to contact support
▸ Exponential backoff, retry 10 times
▸ Hash is stored in response table
▸ HeadObject first to check existence, then PutObject
▸ HeadObjects are way cheaper
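The check-then-upload pattern with exponential backoff can be sketched against a small interface instead of the real AWS SDK. ObjectStore, withBackoff, storeIfMissing, and flakyStore are hypothetical names. A real writer would call the SDK's HeadObject and PutObject behind the interface, and the base delay is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ObjectStore is a stand-in for the two S3 calls named on the slide.
type ObjectStore interface {
	Head(key string) (bool, error)     // does the object exist?
	Put(key string, body []byte) error // upload the object
}

// maxAttempts follows the "retry 10 times" bullet.
const maxAttempts = 10

// withBackoff runs op up to maxAttempts times and doubles the delay
// after every failure.
func withBackoff(base time.Duration, op func() error) error {
	delay := base
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = op(); err == nil {
			return nil
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

// storeIfMissing uploads body under its hash only if it is not already
// stored, using the cheaper Head call first.
func storeIfMissing(s ObjectStore, hash string, body []byte, base time.Duration) error {
	var exists bool
	if err := withBackoff(base, func() error {
		var e error
		exists, e = s.Head(hash)
		return e
	}); err != nil {
		return err
	}
	if exists {
		return nil
	}
	return withBackoff(base, func() error { return s.Put(hash, body) })
}

// flakyStore is an in-memory store whose first failPuts uploads fail.
type flakyStore struct {
	objects     map[string][]byte
	failPuts    int
	heads, puts int
}

func (f *flakyStore) Head(key string) (bool, error) {
	f.heads++
	_, ok := f.objects[key]
	return ok, nil
}

func (f *flakyStore) Put(key string, body []byte) error {
	f.puts++
	if f.failPuts > 0 {
		f.failPuts--
		return errors.New("throttled")
	}
	f.objects[key] = body
	return nil
}

func main() {
	store := &flakyStore{objects: map[string][]byte{}, failPuts: 2}
	err := storeIfMissing(store, "5abcfbe7", []byte("<html></html>"), time.Millisecond)
	fmt.Println("error:", err, "heads:", store.heads, "puts:", store.puts)
}
```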
LASTLY…
▸ Created unlogged tables
▸ Modified PostgreSQL configuration:
▸ Set checkpoints 5 minutes (max) instead of 1
▸ Enabled fsync
▸ Set max_wal_size 256
THE RESULTS: A LOOK AT DATA
SCAN STATISTICS
Responses 72,193,155
Headers 525,385,900
JS Results 1,943,925
URLs w/Errors 67,315
Redirected to HTTPS 145,268
URLS w/CSP Violations 740
Scan Time 15 Hours
Cost 343$ / 35063円
CSP VIOLATIONS
▸ 722 out of 4965 sites using CSP had violations
▸ Security sites:
▸ https://www.globalsign.com/en/, http://secunia.com/,
▸ https://lastpass.com/, https://www.avant.com/, http://www.veracode.com/
▸ Well known organizations:
▸ http://www.alibaba.com, https://www.doubleclickbygoogle.com
▸ https://mozillians.org/en-US/
SUM OF CSP VIOLATION TYPES
[Bar chart, y-axis 0 to 3000. Categories: SCRIPTSRC, IMGSRC, FRAMESRC, FONTSRC, STYLESRC, CONNECTSRC, MEDIASRC, CHILDSRC, OBJECTSRC, BASEURI, FORMACTION, MANIFESTSRC]
TOP JAVASCRIPT LIBRARIES > 3000
[Bar chart, y-axis 0 to 800000. Libraries: JQUERY, JQUERY-UI, MODERNIZR, JQUERY-UI-DIALOG, YEPNOPE, JQUERY-UI-AUTOCOMPLETE, JQUERY-UI-TOOLTIP, BOOTSTRAP, HTML5SHIV, UNDERSCORE, JQUERY.PRETTYPHOTO, PROTOTYPEJS, DRUPAL, MOOTOOLS, MEJS, BACKBONE.JS, ANGULARJS, FOUNDATION, JWPLAYER, REQUIREJS, HANDLEBARS.JS, HAMMERJS, JPLAYER, MUSTACHE.JS, SCRIPTACULOUS, SHADOWBOX, ZEROCLIPBOARD, YUI, RAPHAEL, DATATABLES, KNOCKOUT]
JAVASCRIPT ‘NEXTGEN’ FRAMEWORKS > 100
[Bar chart, y-axis 0 to 18000. Frameworks: BACKBONE.JS, ANGULARJS, FOUNDATION, YUI, KNOCKOUT, DOJO, REACTJS, MARIONETTEJS, VUEJS, EMBER, METEOR, MITHRIL, EXTJS, POLYMER]
VULNERABILITY COUNTS
[Bar chart, y-axis 0 to 80000. Libraries: JQUERY, JQUERY-UI-DIALOG, JQUERY.PRETTYPHOTO, ANGULARJS, JQUERY-UI-TOOLTIP, JPLAYER, HANDLEBARS.JS, ZEROCLIPBOARD, MUSTACHE.JS, YUI, PROTOTYPEJS, MEJS, JWPLAYER, DOJO, EMBER, TINYMCE, PLUPLOAD, JQUERY-MOBILE, CKEDITOR]
LONGEST SECURITY HEADER AWARD - HTTPS://WWW.INSIGHTGUIDES.COM/
Content-Security-Policy: default-src 'self' http://tagmanager.google.com https://tagmanager.google.com https://*.doubleclick.net http://*.doubleclick.net https://*.google-analytics.com http://*.google-analytics.com https://*.livechatinc.com http://*.livechatinc.com https://*.cloudfront.net http://*.cloudfront.net https://*.googleusercontent.com http://*.googleusercontent.com https://www.bugherd.com http://www.bugherd.com https://*.braintreegateway.com http://*.braintreegateway.com https://www.biblioimages.com http://www.biblioimages.com https://fonts.gstatic.com http://fonts.gstatic.com https://*.googleapis.com http://*.googleapis.com https://tripadvisor.com http://tripadvisor.com https://*.gstatic.com http://*.gstatic.com https://www.tripadvisor.com http://www.tripadvisor.com https://www.insightguides.com http://www.insightguides.com https://rum-static.pingdom.net http://rum-static.pingdom.net https://rum-collector.pingdom.net http://rum-collector.pingdom.net https://*.youtube.com http://*.youtube.com https://www.googleadservices.com http://www.googleadservices.com https://connect.facebook.net http://connect.facebook.net https://googleads.g.doubleclick.net http://googleads.g.doubleclick.net https://www.facebook.com http://www.facebook.com https://cdn.inspectlet.com http://cdn.inspectlet.com https://hn.inspectlet.com http://hn.inspectlet.com https://*.apa.yoda.site http://*.apa.yoda.site https://www.preprod.apa.yoda.site http://www.preprod.apa.yoda.site https://www.test.apa.yoda.site http://www.test.apa.yoda.site https://www.google.com http://www.google.com https://www.google.pl http://www.google.pl https://www.google.co.uk http://www.google.co.uk https://google.com http://google.com https://google.pl http://google.pl https://google.co.uk http://google.co.uk https://ethn.io http://ethn.io https://stats.g.doubleclick.net http://stats.g.doubleclick.net https://platform.instagram.com
http://platform.instagram.com https://instagram.com http://instagram.com https://www.instagram.com http://www.instagram.com https://*.amazonaws.com http://*.amazonaws.com blob:; script-src 'self' http://www.googletagmanager.com https://www.googletagmanager.com http://tagmanager.google.com https://tagmanager.google.com https://*.doubleclick.net http://*.doubleclick.net https://*.google-analytics.com http://*.google-analytics.com https://*.livechatinc.com http://*.livechatinc.com https://*.cloudfront.net http://*.cloudfront.net https://*.googleusercontent.com http://*.googleusercontent.com https://www.bugherd.com http://www.bugherd.com https://*.braintreegateway.com http://*.braintreegateway.com https://www.biblioimages.com http://www.biblioimages.com https://fonts.gstatic.com http://fonts.gstatic.com https://*.googleapis.com http://*.googleapis.com https://tripadvisor.com http://tripadvisor.com https://*.gstatic.com http://*.gstatic.com https://www.tripadvisor.com http://www.tripadvisor.com https://www.insightguides.com http://www.insightguides.com https://rum-static.pingdom.net http://rum-static.pingdom.net https://rum-collector.pingdom.net http://rum-collector.pingdom.net https://*.youtube.com http://*.youtube.com https://www.googleadservices.com http://www.googleadservices.com https://connect.facebook.net http://connect.facebook.net https://googleads.g.doubleclick.net http://googleads.g.doubleclick.net https://www.facebook.com http://www.facebook.com https://cdn.inspectlet.com http://cdn.inspectlet.com https://hn.inspectlet.com http://hn.inspectlet.com https://*.apa.yoda.site http://*.apa.yoda.site https://www.preprod.apa.yoda.site http://www.preprod.apa.yoda.site https://www.test.apa.yoda.site http://www.test.apa.yoda.site https://www.google.com http://www.google.com https://www.google.pl http://www.google.pl https://www.google.co.uk http://www.google.co.uk https://google.com http://google.com https://google.pl http://google.pl https://google.co.uk 
http://google.co.uk https://ethn.io http://ethn.io https://stats.g.doubleclick.net http://stats.g.doubleclick.net https://platform.instagram.com http://platform.instagram.com https://instagram.com http://instagram.com https://www.instagram.com http://www.instagram.com https://*.amazonaws.com http://*.amazonaws.com 'unsafe-eval' 'unsafe-inline' https://apis.google.com blob:; connect-src * 'self' http://tagmanager.google.com https://tagmanager.google.com https://*.doubleclick.net http://*.doubleclick.net https://*.google-analytics.com http://*.google-analytics.com https://*.livechatinc.com http://*.livechatinc.com https://*.cloudfront.net http://*.cloudfront.net https://*.googleusercontent.com http://*.googleusercontent.com https://www.bugherd.com http://www.bugherd.com https://*.braintreegateway.com http://*.braintreegateway.com https://www.biblioimages.com http://www.biblioimages.com https://fonts.gstatic.com http://fonts.gstatic.com https://*.googleapis.com http://*.googleapis.com https://tripadvisor.com http://tripadvisor.com https://*.gstatic.com http://*.gstatic.com https://www.tripadvisor.com http://www.tripadvisor.com https://www.insightguides.com http://www.insightguides.com https://rum-static.pingdom.net http://rum-static.pingdom.net https://rum-collector.pingdom.net http://rum-collector.pingdom.net https://*.youtube.com http://*.youtube.com https://www.googleadservices.com http://www.googleadservices.com https://connect.facebook.net http://connect.facebook.net https://googleads.g.doubleclick.net http://googleads.g.doubleclick.net https://www.facebook.com http://www.facebook.com https://cdn.inspectlet.com http://cdn.inspectlet.com https://hn.inspectlet.com http://hn.inspectlet.com https://*.apa.yoda.site http://*.apa.yoda.site https://www.preprod.apa.yoda.site http://www.preprod.apa.yoda.site https://www.test.apa.yoda.site http://www.test.apa.yoda.site https://www.google.com http://www.google.com https://www.google.pl http://www.google.pl 
https://www.google.co.uk http://www.google.co.uk https://google.com http://google.com https://google.pl http://google.pl https://google.co.uk http://google.co.uk https://ethn.io http://ethn.io https://stats.g.doubleclick.net http://stats.g.doubleclick.net https://platform.instagram.com http://platform.instagram.com https://instagram.com http://instagram.com https://www.instagram.com http://www.instagram.com https://*.amazonaws.com http://*.amazonaws.com blob:; img-src data: 'self' http://tagmanager.google.com https://tagmanager.google.com https://*.doubleclick.net http://*.doubleclick.net https://*.google-analytics.com http://*.google-
SOME OF MY FAVORITE HTTP STATUS LINES
▸ HTTP 500 access denied ("java.io.FilePermission" "D:\home\XXXXXXXXX.com\ori\ModelGlue\unity\eventrequest\EventRequest.cfc" "read")
▸ HTTP 500 "Duplicate entry '1473335051' for key 'timestamp' SQL=INSERT INTO `#__zt_visitor_counter` (`id`,`timestamp`,`visits`,`guests`,`ipaddress`,`useragent`) VALUES (null, '1473335051', 1 , 1 , '54.208.81.16', 'chrome')"
▸ HTTP 500 "Server Made Big Boo"
CONCLUSION
▸ Use NSQ, seriously.
▸ Concurrency can be difficult
▸ Batch data before inserting to DB
▸ If DB rows > a few million, consider sharding
▸ Test different types of table schema for performance
▸ Treat browsers like garbage and handle appropriately
QUESTIONS?
▸ twitter: @_wirepair
▸ github: wirepair
▸ gcd: https://github.com/wirepair/gcd
▸ autogcd: https://github.com/wirepair/autogcd
▸ killface: https://github.com/wirepair/killface
▸ Thanks to all my coworkers supporting and listening to my daily rants!