Alexander Nemsadze - Release It

Release It! Design and Deploy Production-Ready Software

Alexander Nemsadze, Software Developer

“Pragmatic Programmer” defined

Typical Development Life Cycle

Conception

Initiation

Analysis / Requirements Definition

Design

Development

Integration Testing / QA

Deployment

Maintenance

Quality of Life


Release 1.0 is the beginning of your software’s life, not the end of the project.

Your quality of life after Release 1.0 depends on choices you make long before that vital milestone.

Quality of life degraded, what now? Get fired!

Gain experience, learn lessons!

To stay happy and self-confident after Release 1.0, remember from the very beginning that you are developing software for production use.

Stay happy

Dev

• Frequent server restarts

• Professional users

• Favorite browser

• No concurrent load on server

• Emulated integration points

• No financial risk in case of malfunction

Production

• Server is up for months

• All kinds of users

• Different browsers (at least those you have promised to support)

• Real load

• Real integration with external services

• Negative financial impact

• Never assume that something won’t happen: if there is even a tiny chance, it will happen

Environment differences

Design Production-Ready System

• Choosing an appropriate technology

• Performance and capacity analysis

• Development team culture

• Architectural decisions

• Effective stress tests

• Deployment checklist

• Monitoring and effective incident investigation

Choosing an appropriate technology

• Investigate

• Evaluate

• Use cases

• Documentation

• Community

• Stability

• Tooling

• Storage

• Experience of your team, learning curve

• Use force

Your early decisions make the biggest impact on the eventual shape of your system.

The earliest decisions you make can be the hardest ones to reverse later.

Performance and capacity analysis

Define capacity in terms specific to your system

For a public website, capacity can be the maximum number of concurrent active sessions the system serves without exceeding the allowable response time.

For a financial transaction processing system, it can be the maximum transaction throughput.

It can also be any other measure, or a combination of several.

• Memory usage analysis

• Horizontal scalability

• Hardware topology

• Consider a cloud vendor that offers an elastic capacity service

• Server responsibilities
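The session-based definition of capacity above can be made concrete with a small calculation. The sketch below is only an illustration: the function name, the load test figures, and the 500 ms limit are all hypothetical, and it assumes you already have response times measured at several load levels.

```python
def max_sessions_within_limit(measurements, max_response_ms):
    """Return the largest session count whose response time stays within the limit.

    measurements: (concurrent_sessions, response_time_ms) pairs from a load test.
    The scan stops at the first load level that breaks the limit, because
    anything above that level is already past capacity.
    """
    capacity = 0
    for sessions, response_ms in sorted(measurements):
        if response_ms > max_response_ms:
            break
        capacity = sessions
    return capacity


# Hypothetical load test results: (concurrent sessions, response time in ms)
load_test = [(100, 120), (200, 150), (400, 310), (800, 950), (1600, 4200)]

capacity = max_sessions_within_limit(load_test, max_response_ms=500)
print(f"Capacity under a 500 ms limit: {capacity} sessions")
```

The same shape works for a throughput-based definition: replace the session count with transactions per second and keep the response time (or error rate) as the limiting condition.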

Development team culture

• Team members must understand and respect the project management methodology, whether it’s agile, extreme, or anything else

• Note every detail

• Respect the coding quality standard

• Unit tests are mandatory

• Integration tests are desired for critical modules

• Don’t resolve an issue hoping that someone will test it for you

• The worst and most annoying thing is returning to an issue that you declared resolved

• Review each other’s code

• Refactor as many times as needed until you are satisfied with your own code

• Log effectively

• Ask for advice if you are not sure you are doing it right

Architectural decisions

Gather use cases and constraints before thinking about a solution

There are no straightforward rules or patterns for success. Templates and patterns, together with experience, only help you make correct decisions

Do not confuse architecture with tools and frameworks

Design domain model from the performance and reporting requirements point of view

Rethink and refactor before it’s too late

Avoid radical decisions when a milestone is nearing

Think pragmatically

Effective stress tests

• Try to build a realistic environment

• Compose scenarios of real-world usage

• Simulate exceptional situations

• Analyze chains of failure

• Analyze the impact of increased memory usage

• Analyze the overall performance impact

• Simulate malfunctioning integration points
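One way to simulate a malfunctioning integration point is to replace the real service with a stand-in that can be made slow or broken on demand, then check that the caller degrades gracefully. The sketch below assumes a thread-based caller with a timeout; the function names, the delays, and the fallback strings are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def fake_integration_point(delay_s=0.0, fail=False):
    """Hypothetical stand-in for an external service that can be slow or broken."""
    time.sleep(delay_s)
    if fail:
        raise ConnectionError("simulated remote failure")
    return "ok"


def call_with_timeout(executor, timeout_s, **kwargs):
    """Call the integration point and return a fallback instead of hanging or crashing."""
    future = executor.submit(fake_integration_point, **kwargs)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # The worker thread keeps running until the slow call returns;
        # only the caller stops waiting for it.
        return "fallback: timed out"
    except ConnectionError:
        return "fallback: remote error"


def run_scenarios():
    """Exercise a healthy, a slow, and a broken integration point."""
    with ThreadPoolExecutor(max_workers=3) as executor:
        return {
            "healthy": call_with_timeout(executor, 0.5),
            "slow": call_with_timeout(executor, 0.05, delay_s=0.3),
            "broken": call_with_timeout(executor, 0.5, fail=True),
        }


if __name__ == "__main__":
    for name, outcome in run_scenarios().items():
        print(f"{name}: {outcome}")
```

In a real stress test the slow and broken scenarios would run under load, so that the effect on thread pools and memory becomes visible as well.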

Deployment checklist

Make sure production parameters are set correctly

• Check memory usage configuration

• Check database connection pool size

• Check thread pools configuration

• Check timeouts configuration

• Check logging configuration

• Check security configurations

• Other server configurations
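Part of such a checklist can be automated so that a wrong parameter is caught before deployment rather than in production. The sketch below is only an assumption about how that might look: the configuration keys, the allowed log levels, and the validation rules are hypothetical and would need to match your own system.

```python
# Hypothetical numeric settings that must be present and positive.
REQUIRED_POSITIVE = ["max_heap_mb", "db_pool_size", "thread_pool_size", "request_timeout_s"]

# Assumed policy for this sketch: verbose levels such as DEBUG are rejected.
ALLOWED_LOG_LEVELS = {"INFO", "WARNING", "ERROR"}


def check_production_config(config):
    """Return a list of human-readable problems found in a config dict."""
    problems = []
    for key in REQUIRED_POSITIVE:
        value = config.get(key)
        if not isinstance(value, (int, float)) or value <= 0:
            problems.append(f"{key} must be a positive number, got {value!r}")
    log_level = config.get("log_level")
    if log_level not in ALLOWED_LOG_LEVELS:
        problems.append(f"log_level {log_level!r} is not allowed in production")
    return problems


# Hypothetical configuration with two mistakes: no timeout and DEBUG logging.
candidate = {
    "max_heap_mb": 2048,
    "db_pool_size": 20,
    "thread_pool_size": 50,
    "log_level": "DEBUG",
}

for problem in check_production_config(candidate):
    print("CONFIG PROBLEM:", problem)
```

Running a check like this as a deployment step turns the checklist into something that fails loudly instead of relying on memory.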

Monitoring and effective incident investigation

Use monitoring tools and continuously analyze logs. If you notice suspicious behavior, don’t hesitate to react before it brings down the whole system.
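Continuous log analysis can start very simply, for example by counting errors per minute and flagging the minutes that exceed a threshold. The sketch below assumes a hypothetical log format of `date time LEVEL message`; the function name, the sample lines, and the threshold are illustrative only.

```python
from collections import Counter


def suspicious_minutes(log_lines, max_errors_per_minute):
    """Return the minutes ('YYYY-MM-DD HH:MM') whose ERROR count exceeds the threshold."""
    errors = Counter()
    for line in log_lines:
        # Assumed format: "<date> <time> <LEVEL> <message>"
        parts = line.split(" ", 3)
        if len(parts) >= 3 and parts[2] == "ERROR":
            minute = f"{parts[0]} {parts[1][:5]}"
            errors[minute] += 1
    return sorted(m for m, count in errors.items() if count > max_errors_per_minute)


# Hypothetical log excerpt
sample_log = [
    "2024-01-01 12:00:01 INFO started",
    "2024-01-01 12:00:05 ERROR db timeout",
    "2024-01-01 12:01:02 ERROR db timeout",
    "2024-01-01 12:01:10 ERROR db timeout",
    "2024-01-01 12:01:30 ERROR pool exhausted",
]

for minute in suspicious_minutes(sample_log, max_errors_per_minute=2):
    print("Suspicious error rate at", minute)
```

A dedicated monitoring tool does this far better, but even a small script like this makes a rising error rate visible before it takes the system down.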

After an incident, don’t immediately destroy the evidence that would help you investigate the issue.

If something happened once, it will happen again.

Thanks!

Q/A

Alexander Nemsadze, Software Developer
