Planning for a high-performance web application
DESCRIPTION
These slides were prepared for Beijing Open Party (a monthly unconference in Beijing, China). They cover some important points to consider when building scalable web sites. A few pages of the original deck are in Chinese.
TRANSCRIPT
Planning for performance
For web developers, open discussion
Tin@Beijing Open Party
The teacher need not be stronger than I am; I need not be inferior to the teacher.
Agenda
Basic programming practice
Hardware platform
Software platform
System essentials
Optimizations
Load Balancing
Basic practices
Use a proper SCM
CVS
SVN
Mercurial
Git
Basic practices
Use an automated build system
Shell scripts
Make
Ant, NAnt
Rake
Basic practices
Use a Continuous Integration tool
So first you need a lot of tests
Add automated test and compile jobs as daily tasks
Use CI tools to monitor the health of your code base
CruiseControl, Luntbuild, Continuum, Hudson
Cruise, TeamCity, Bamboo
Use the CCTray and CCMenu desktop widgets
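A minimal sketch of what "first you need a lot of tests" can look like in practice. The `slugify` helper and the `run_suite` wrapper are hypothetical, invented here for illustration; a scheduled CI job would run something similar and report red or green.

```python
import re
import unittest


def slugify(title):
    """Hypothetical helper under test: turn a title into a URL slug."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


class SlugifyTest(unittest.TestCase):
    def test_basic(self):
        self.assertEqual(slugify("Hello, World!"), "hello-world")

    def test_empty(self):
        self.assertEqual(slugify(""), "")


def run_suite():
    """Run the tests the way a daily job might; True means the build is green."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

A CI tool only needs the pass/fail result of such a run to show the health of the code base.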
Basic practices
Use an issue tracker
Trac (SVN only)
Bugzilla, Mantis Bug Tracker
Jira
Mingle
BugFree
Voice from Twitter
You must test! You must test early! Otherwise you are doomed.
Test every part.
Leave performance testing to the users; only then is it meaningful. So do your logging well.
Basic practices
Lifecycle control
Develop -> Test -> Deploy
Release management
Trunk, Branch, Tag
Milestone, Release candidate
Basic practices
Use Agile methodologies
XP practices
TDD
Pair programming
Scrum
Hybrid agile
Hardware platform
Use economical hardware
CPU and memory
Disk and disk I/O (RAID)
NIC
Power and fans
1U, 2U, 3U, 4U?
Hardware platform
Brand
Dell, IBM, HP, Lenovo, Asus?
Service quality
Hardware redundancy
Part redundancy
Availability and lead time (critical parts)
Capacity redundancy
Future plan?
Network & hosting
VPS, virtual hosting
Co-located hardware (colo), server colocation
Bandwidth, dual lines, air conditioning
Geo-location
Self-hosting
How to choose network hardware (switch/router)?
Cisco, Huawei, Foundry
Software platform
Use pre-compiled OS and software
Choose an OS
CentOS, Red Hat, SUSE
FreeBSD
Solaris
No Ubuntu Server (from Nicholas Ding)
Software platform
Choose a language (a scripting language is better)
PHP
Python
Perl
Ruby
Java
Many, many, many... but not C...
Software platform
Choose a database (or data provider)
MySQL
PostgreSQL
Bigtable implementation?
Now, let's go
System essentials
Web server
Apache
Lighttpd
Nginx
Tux, Cherokee, LiteSpeed
Tomcat, Jetty
Mongrel, Thin
System essentials
Different deployment styles (Python/Ruby)
Apache + mod_python (mod_rails, Passenger)
FastCGI, SCGI, CGI
Proxy (load balancing) + multiple server instances
Thread? Process?
System essentials
Monitoring your system
Web server logs
Webalizer, Report Magic
Beacon (separate static file server tracker)
Error log analysis
AWStats & Google Analytics
System essentials
Monitoring your system
Monit (RubyWorks uses runit)
Monitor process status
Automatically restart your important processes
Better than cron for monitoring
Munin & Nagios
Distributed monitoring of all your systems
The administrator's eyes, the developer's friends
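The "automatically restart your important processes" idea can be sketched in a few lines. This is a toy stand-in for what Monit or runit do, not their actual mechanism; the `Watchdog` class and its fields are assumptions made for illustration.

```python
import subprocess
import sys


class Watchdog:
    """Restart a command whenever it is found dead (a toy stand-in for Monit)."""

    def __init__(self, argv):
        self.argv = argv      # command line to keep alive
        self.proc = None      # currently supervised process, if any
        self.restarts = 0     # how many times we had to bring it back

    def check(self):
        """Start the process if it is not running; return True if it was (re)started."""
        if self.proc is None or self.proc.poll() is not None:
            if self.proc is not None:
                self.restarts += 1
            self.proc = subprocess.Popen(self.argv)
            return True
        return False
```

A supervisor would call `check()` on a short interval, for example `Watchdog([sys.executable, "worker.py"])` with a hypothetical `worker.py`. A real tool also handles pid files, resource limits and alerting, which is why it beats a cron job.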
System essentials
Munin & Nagios, continued
Munin has a server and nodes; it generates pages that report your servers' statistics (at intervals)
Munin and Nagios can be integrated
Memory usage, CPU, processes, disk usage
Services: HTTP, SMTP, POP3, NNTP, Ping
Hardware temperature and other data
Network statistics
Custom scripts (plugins): database-related metrics, number of users
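A custom "number of users" plugin might look like the sketch below. It assumes the usual Munin plugin convention: called with `config` it prints graph metadata, otherwise it prints `field.value N`. The `count_users` function is a hypothetical placeholder; a real plugin would query the database.

```python
import sys


def count_users():
    """Hypothetical data source; a real plugin would query the database."""
    return 42


def munin_plugin(argv, out=sys.stdout):
    """Print graph metadata for 'config', otherwise the current value."""
    if len(argv) > 1 and argv[1] == "config":
        out.write("graph_title Registered users\n")
        out.write("graph_vlabel users\n")
        out.write("users.label users\n")
    else:
        out.write("users.value %d\n" % count_users())


if __name__ == "__main__":
    munin_plugin(sys.argv)
```

The node runs the script at each polling interval and the server graphs the returned value over time.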
System essentials
Protect your system (management is more important than tools)
SSH brute-force attack protection
SSH key login
blockhost (scripts + pf/iptables)
Audit: SELinux...
Firewall (port blocking and audit)
Use a safe OS? (NetBSD, FreeBSD)
Network safety (but no hardware firewall for websites)
System essentials
SNA (Shared Nothing Architecture) (this is a relative term)
All static files and rsync
Database-centric SNA
Memcached + database persistence
Server hash, cluster, partition
Amazon/Blogger/Craigslist/Facebook/Google/LiveJournal/Slashdot/Wikipedia/Yahoo/YouTube
Sticky sessions
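One way to read "server hash" is consistent hashing: each key maps to one cache server, and removing a server only remaps the keys that lived on it. The sketch below is an assumption about the technique, not something the slides spell out; `HashRing`, the server names and the `replicas` count are illustrative.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent hash ring with virtual nodes for each server."""

    def __init__(self, servers, replicas=100):
        # Place `replicas` points per server on the ring, sorted by position.
        self._ring = sorted(
            (_hash("%s#%d" % (server, i)), server)
            for server in servers
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def server_for(self, key):
        """Return the server owning the first ring point after the key's hash."""
        idx = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

For example, `HashRing(["cache1", "cache2", "cache3"]).server_for("user:42")` always returns the same server for the same key, so no shared state is needed between web servers.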
System essentials
Make your modules independent
Layers, packages
Easy to replace a module
Easy to deploy
Easy to profile and improve
Optimizations
Split your static content and dynamic content servers
Use a lightweight web server to serve static content
Use a different domain for each server
Caching
Memcached
Query results, domain objects, sessions
Page tiles, template tiles
Everything that you need
Optimizations
Caching
Optimize your code (lazy evaluation, cached results)
Cache and asynchronous update (cron update)
Target: a cache hit rate above 90%!
But cache invalidation is a critical problem!
Asynchronous messaging keeps the cache valid
No blocking!
ActiveMQ, RabbitMQ, DRb (for Ruby)
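A minimal sketch of cache-aside reads with invalidate-on-write, plus the hit-rate counter needed to check the 90% target. Plain dicts stand in for Memcached and the database; `CachedStore` and its method names are hypothetical.

```python
class CachedStore:
    """Cache-aside reads with invalidation on write."""

    def __init__(self, db):
        self.db = db        # stand-in for the persistent database
        self.cache = {}     # stand-in for a Memcached client
        self.hits = 0
        self.misses = 0

    def get(self, key):
        """Serve from the cache when possible, otherwise load and remember."""
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        value = self.db.get(key)
        self.cache[key] = value
        return value

    def set(self, key, value):
        """Write to the database and drop the stale cache entry."""
        self.db[key] = value
        self.cache.pop(key, None)

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

In a larger system the `cache.pop` step is what an asynchronous message would trigger on every cache node, so that writers never block on invalidation.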
Optimizations
Caching
Better client-side caching
Use expiry headers: max-age, Expires
ETag? (Not recommended, IE doesn't support it)
Use the HEAD method and 304 (Not Modified) to detect changes (for Squid or other proxy scenarios)
Compress (concatenate JS, CSS)
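The expiry headers can be generated as in this sketch. The `cache_headers` function and the `public` directive are assumptions for illustration; how the headers are attached to a response depends on the web server or framework.

```python
from email.utils import formatdate


def cache_headers(max_age_seconds, now):
    """Return Cache-Control and Expires headers for a static asset.

    `now` is a Unix timestamp in seconds.
    """
    return {
        "Cache-Control": "public, max-age=%d" % max_age_seconds,
        # HTTP dates must be expressed in GMT.
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
    }
```

Calling `cache_headers(3600, time.time())` tells browsers and proxies that the asset can be reused for one hour without contacting the server.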
Optimizations
SQL optimizations
Add indexes (especially on the columns in the WHERE clause)
Denormalized SQL
Useful redundancy (use duplication to avoid joins)
Don't rely on the ORM, no matter whether it is Data Mapper, Active Record, or Unit of Work
Don't use the database's full-text search
Use a separate search engine module (Lucene)
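The effect of indexing a WHERE column can be checked from the query plan. The sketch below uses SQLite from the Python standard library as a stand-in, since it needs no server; the slides are about MySQL, where `EXPLAIN` plays the same role. The table and index names are made up.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [("user%d@example.com" % i, "User %d" % i) for i in range(1000)],
)

lookup = "SELECT name FROM users WHERE email = ?"
# Without an index the lookup scans the whole table.
plan_before = query_plan(conn, lookup, ("user7@example.com",))
conn.execute("CREATE INDEX idx_users_email ON users (email)")
# With the index the lookup becomes an index search.
plan_after = query_plan(conn, lookup, ("user7@example.com",))
```

Printing `plan_before` and `plan_after` shows the switch from a full scan to a search using `idx_users_email`.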
Optimizations
Choose a proper database storage engine
MySQL: MyISAM? InnoDB? BDB? Heap?
Accelerator
PHP: APC, Zend Optimizer, XCache, eAccelerator, ionCube PHP Accelerator, Turck MMCache
Python: Psyco
Ruby: Joyent accelerator
But the most important thing:
Find the bottleneck before you start to optimize your application.
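Finding the bottleneck first usually means profiling. A minimal sketch with Python's built-in `cProfile`; `handle_request`, `slow_part` and `fast_part` are hypothetical functions standing in for application code.

```python
import cProfile
import io
import pstats


def slow_part():
    return sum(i * i for i in range(200000))


def fast_part():
    return 1 + 1


def handle_request():
    slow_part()
    fast_part()


def profile_report(func, top=5):
    """Run func under the profiler and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(top)
    return out.getvalue()
```

`print(profile_report(handle_request))` lists the functions that account for most of the time, which is where optimization effort should go.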
Next: scaling,
if time permits
What is scaling?
Three basic properties:
Usable capacity increases
Data capacity increases
The system stays maintainable
Scaling, 2 ways
Vertical scaling
Upgrade your hardware
More CPU, memory...
Horizontal scaling
Buy more of the same hardware, deploy more server instances
Distribute your system
But this way (generally) requires you to modify your code
Scaling-Load Balancing
DNS-GSLB
Use DNS round-robin to randomize the returned IP
xBayDNS
Can't deal with failure (TTL)
Hard to manage precisely
CDN (content delivery network)
Transparent service provided by some companies
Expensive, and not suitable for dynamic content
Scaling-Load Balancing
Hardware LB
Citrix: NetScaler, Foundry: ServerIron, F5 (4-7)
Expensive
Software LB
Perlbal (4), Pound (7)
LVS (4)
Scaling-Load Balancing
Layer 2, Layer 4 and Layer 7 LB
Layer 2: link aggregation; provides redundancy and fault tolerance, improves access speed
Layer 4: round-robin on TCP (with port info)
Layer 7
Sticky sessions enabled
Easy to write complicated hashing logic
Good for Squid (Squid cluster enabled)
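The difference between the two layers can be sketched as two backend pickers: one that ignores request content and one that hashes a session identifier. Both classes and the backend names are illustrative assumptions, not the behaviour of any particular product.

```python
import itertools
import zlib


class RoundRobinBalancer:
    """Layer 4 style: hand out backends in turn, ignoring request content."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def pick(self):
        return next(self._cycle)


class StickyBalancer:
    """Layer 7 style: hash a session id so a session stays on one backend."""

    def __init__(self, backends):
        self._backends = list(backends)

    def pick(self, session_id):
        index = zlib.crc32(session_id.encode("utf-8")) % len(self._backends)
        return self._backends[index]
```

Because a Layer 7 balancer can read cookies and URLs, the hash input can be anything in the request, which is also why it pairs well with a Squid cluster keyed by URL.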
Scaling-Load Balancing
Huge-scale LB
GSLB -> DNS round-robin
Virtual IP -> L4 or L7 LB (SNAT)
Example
Level 1 LB uses GSLB to give geo-located DNS results
The VIP is dispatched by F5
F5 -> Squid, reverse proxy
Squid delegates to the real dynamic or static server
Scaling-Proxy Cache
Reverse proxy
Squid
Use the HTTP HEAD method to validate content
Use memory to cache content - light speed
Mature, fast, industry standard
Scaling-Database
Scaling MySQL
MySQL replication/duplication (failure, lag)
Master/Slave
Tree replication
Data partition
MySQL Proxy
Data shard
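Two of these ideas fit in a short sketch: routing reads to replicas and writes to the master, and picking a shard by user id. `ReplicatedRouter` and `shard_for` are hypothetical names; the sketch assumes integer user ids and a fixed shard count, and it ignores replication lag, which the slide lists as a real concern.

```python
import itertools


class ReplicatedRouter:
    """Send SELECTs to replicas in turn and everything else to the master."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas)

    def route(self, sql):
        if sql.lstrip().upper().startswith("SELECT"):
            return next(self._replicas)
        return self.master


def shard_for(user_id, shard_count):
    """Pick a shard by user id; assumes integer ids and a fixed shard count."""
    return user_id % shard_count
```

A read that must see its own write should still go to the master, since a replica may lag behind.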
Scaling-File System
Single disk (array)
RAID 1, RAID 0, RAID 5
Partition table type (GPT, MBR)
Partition format (ext2, ext3, ReiserFS, XFS, ZFS)
Cluster
A single disk has limits, but a cluster has none
NetApp Filer (NAS - network-attached storage)
Many, many choices
Scaling-File System
Sharing
Hardware-based sharing: NAS (previous page)
NFS - the simplest way to share a file system
Samba - almost the same as NFS, worth a try
MogileFS (for the web, no cursor-based random access)
GFS, Hadoop FS (chunk-based)
We've come a long way, baby
Thanks!