Alexander Nemsadze - Release It!
DESCRIPTION
Alexander Nemsadze - Release It! Student hackathon - http://macs.unihack.ge/ video: http://www.youtube.com/watch?v=bfcvJF7qCnw&feature=share&list=UUKyScn6UfDhuQ7UINKhyEfQ&index=2
TRANSCRIPT
Release It! Design and Deploy Production-Ready Software
Alexander Nemsadze, Software Developer
“Pragmatic Programmer” defined
Typical Development Life Cycle
Conception
Initiation
Analysis/Req.Def.
Design
Development
Integ. Testing / QA
Deployment
Maintenance
Quality of Life
Release 1.0 is the beginning of your software’s life, not the end of the project.
Your quality of life after Release 1.0 depends on choices you make long before that vital milestone.
Quality of life degraded, what now? Get fired!
Gain experience, learn lessons!
To stay happy and self-confident after release 1.0, you must remember from the very beginning that you are developing software for production use.
Stay happy
Dev
• Frequent server restarts
• Professional users
• Favorite browser
• No concurrent load on server
• Emulated integration points
• No financial risk in case of malfunction
Production
• Server is up for months
• All kinds of users
• Different browsers (at least those you have promised to support)
• Real load
• Real integration with external services
• Negative financial impact
• No assumptions that something won’t happen: if there is even a tiny chance, it will definitely happen
Environment differences
Design Production-Ready System
• Choosing an appropriate technology
• Performance and capacity analysis
• Development team culture
• Architectural decisions
• Effective stress tests
• Deployment checklist
• Monitoring and effective incident investigation
Choosing an appropriate technology
• Investigate
• Evaluate
• Use cases
• Documentation
• Community
• Stability
• Tooling
• Storage
• Experience of your team, learning curve
• Use force
Your early decisions make the biggest impact on the eventual shape of your system.
The earliest decisions you make can be the hardest ones to reverse later.
Performance and capacity analysis
• Define capacity according to the specifics of the system
• For a public web site, capacity can be the maximum number of concurrent active sessions served before the allowable response time is exceeded
• For a financial transaction processing system, it can be the maximum transaction throughput
• Or any other measure, or a combination
• Memory usage analysis
• Horizontal scalability
• Hardware topology
• Consider a cloud vendor that offers an elastic capacity service
• Server responsibilities
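The session-based capacity definition above can be illustrated with a minimal Python sketch. The function name, the measurement format, and all numbers are hypothetical; they stand in for whatever a real load test of your system would produce.

```python
from typing import Dict, Optional


def max_sessions_within_limit(
    measurements: Dict[int, float], allowable_response_s: float
) -> Optional[int]:
    """Return the highest tested session count reached before the
    allowable response time is first exceeded, or None if even the
    lowest tested load is already too slow."""
    capacity = None
    for sessions in sorted(measurements):
        if measurements[sessions] > allowable_response_s:
            break  # limit exceeded; heavier loads are beyond capacity
        capacity = sessions
    return capacity


# Hypothetical load-test results:
# concurrent active sessions -> measured response time in seconds
results = {100: 0.21, 200: 0.35, 400: 0.80, 800: 2.40, 1600: 9.50}

print(max_sessions_within_limit(results, allowable_response_s=1.0))  # prints 400
```

For a transaction processing system the same idea applies with throughput in place of session count.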
Development team culture
• Team members must understand and respect the project management methodology, whether it’s agile, extreme or whatever
• Note every detail
• Respect the coding quality standard
• Unit tests are mandatory
• Integration tests are desired for critical modules
• Don’t resolve an issue hoping that someone will test it for you
• The worst and most annoying thing is returning to an issue that you declared resolved
• Review each other’s code
• Refactor as many times as needed until you are satisfied looking at your own code
• Log effectively
• Ask for advice if you are not sure you are doing it right
Architectural decisions
• Gather use cases and constraints before thinking about a solution
• There are no straightforward rules or patterns for success. All of these templates, together with experience, just help you make correct decisions
• Do not confuse architecture with tools and frameworks
• Design the domain model with performance and reporting requirements in mind
• Rethink and refactor before it’s too late
• Avoid radical decisions when the milestone is nearing
• Think pragmatically
Effective stress tests
• Try to build a realistic environment
• Compose a scenario of real-world usage
• Simulate exceptional situations
• Analyze chains of failure
• Analyze the impact of increased memory usage
• Analyze the overall performance impact
• Simulate malfunctioning integration points
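One way to simulate a malfunctioning integration point is to wrap the call in a test double that fails on purpose. The sketch below is only an illustration: `FlakyIntegrationPoint`, `fake_rate_service`, and `get_rate_with_fallback` are hypothetical names, and failing on every third call is an arbitrary choice that keeps the run deterministic.

```python
class IntegrationPointError(Exception):
    """Raised by the test double to simulate a remote failure."""


class FlakyIntegrationPoint:
    """Wraps a callable and fails on every `fail_every`-th call."""

    def __init__(self, real_call, fail_every):
        self._real_call = real_call
        self._fail_every = fail_every
        self.calls = 0

    def __call__(self, *args, **kwargs):
        self.calls += 1
        if self.calls % self._fail_every == 0:
            raise IntegrationPointError(f"simulated failure on call {self.calls}")
        return self._real_call(*args, **kwargs)


def fake_rate_service(currency):
    """Stand-in for an external service (hypothetical data)."""
    return {"USD": 1.0, "EUR": 1.1}[currency]


def get_rate_with_fallback(fetch_rate, currency, fallback):
    """Code under test: must survive a failing integration point."""
    try:
        return fetch_rate(currency)
    except IntegrationPointError:
        return fallback


flaky = FlakyIntegrationPoint(fake_rate_service, fail_every=3)
outcomes = [get_rate_with_fallback(flaky, "EUR", fallback=None) for _ in range(6)]
print(outcomes)  # prints [1.1, 1.1, None, 1.1, 1.1, None]
```

The same wrapper could be extended to inject delays instead of exceptions, so that timeout handling gets exercised as well.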
Deployment checklist
• Make sure production parameters are set correctly
• Check memory usage configuration
• Check database connection pool size
• Check thread pools configuration
• Check timeouts configuration
• Check logging configuration
• Check security configurations
• Other server configurations
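Part of such a checklist can be automated. The following is a minimal sketch, not a complete validator: every parameter name (`max_heap_mb`, `db_pool_size`, and so on) and every rule is an assumption made for illustration, and a real system would substitute its own settings and thresholds.

```python
# Hypothetical parameter names; replace with your system's real settings.
NUMERIC_PARAMS = ("max_heap_mb", "db_pool_size", "worker_threads", "request_timeout_s")
REQUIRED_PARAMS = NUMERIC_PARAMS + ("log_level",)


def check_production_config(config):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for key in REQUIRED_PARAMS:
        if key not in config:
            problems.append(f"missing parameter: {key}")
    for key in NUMERIC_PARAMS:
        if key not in config:
            continue  # already reported as missing
        value = config[key]
        if not isinstance(value, (int, float)):
            problems.append(f"{key} must be a number, got {value!r}")
        elif value <= 0:
            problems.append(f"{key} must be positive, got {value}")
    # Assumed rule: verbose logging is a development leftover.
    if config.get("log_level") == "DEBUG":
        problems.append("log_level is DEBUG; looks like a development setting")
    return problems


# A configuration that still carries development values (hypothetical).
dev_like = {"max_heap_mb": 512, "db_pool_size": 0, "worker_threads": 8, "log_level": "DEBUG"}
for problem in check_production_config(dev_like):
    print(problem)
```

Running a check like this as a deployment step makes it harder for a development value to slip into production unnoticed.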
Monitoring and effective incident investigation
Use monitoring tools and continuously analyze logs. If you notice suspicious behavior, don’t hesitate to react before it brings down the whole system.
After an incident, don’t immediately destroy the facts that would help you investigate the issue.
If something happened once, it will definitely happen again.
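Continuous log analysis can start very small. The sketch below counts ERROR lines per minute and flags minutes that reach a threshold. The log line format (`date time LEVEL message`), the sample lines, and the threshold are all assumptions for illustration, not the output of any particular logging framework.

```python
from collections import Counter


def errors_per_minute(log_lines):
    """Count ERROR lines per minute.

    Assumes each line looks like 'YYYY-MM-DD HH:MM:SS LEVEL message';
    lines that do not fit this shape are skipped."""
    counts = Counter()
    for line in log_lines:
        parts = line.split(" ", 3)
        if len(parts) < 4:
            continue
        date, time, level, _message = parts
        if level == "ERROR":
            counts[f"{date} {time[:5]}"] += 1  # keep only HH:MM
    return counts


def suspicious_minutes(log_lines, threshold):
    """Return, in sorted order, the minutes with at least `threshold` errors."""
    counts = errors_per_minute(log_lines)
    return sorted(minute for minute, n in counts.items() if n >= threshold)


# Hypothetical log excerpt
sample = [
    "2024-01-01 12:00:03 INFO request served",
    "2024-01-01 12:00:41 ERROR payment gateway timeout",
    "2024-01-01 12:01:02 ERROR payment gateway timeout",
    "2024-01-01 12:01:15 ERROR payment gateway timeout",
    "2024-01-01 12:01:48 ERROR connection pool exhausted",
]
print(suspicious_minutes(sample, threshold=3))  # prints ['2024-01-01 12:01']
```

Note that the function only reads the log lines; the original log stays intact for later investigation.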
Thanks!
Q/A
Alexander Nemsadze, Software Developer