SRE in startup

Post on 21-Jan-2017

SRE in startup

Zonky, 17.1.2017

Ladislav Prskavec, Apiary

ladislav@apiary.io

@abtris

What is SRE?

"What happens when a software engineer is tasked with what used to be called operations."

» Ben Treynor Sloss, Vice President, Google Engineering, founder of Google SRE

"Our work is like being part of the world's most intense pit crew. We change the tires of a race car as it's going 100 mph."

» Andrew Widdowson, Site Reliability Engineer, Mountain View

In general, an SRE team is responsible for:

» availability

» latency

» performance

» efficiency

» change management

» monitoring

» emergency response

» capacity planning

If the team agrees on a 99.9% SLA, that gives them an error budget of 0.1%.
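The arithmetic behind the error budget can be sketched as below. This is a minimal illustration, not something from the slides: the function names and the 30-day window are assumptions chosen for the example.

```python
def error_budget(sla_percent: float) -> float:
    """Return the error budget as a fraction (0.001 for a 99.9% SLA)."""
    return (100.0 - sla_percent) / 100.0


def allowed_downtime_minutes(sla_percent: float, window_days: int = 30) -> float:
    """Minutes of full outage the budget allows within the window.

    The 30-day default window is an assumption for illustration.
    """
    window_minutes = window_days * 24 * 60
    return error_budget(sla_percent) * window_minutes


if __name__ == "__main__":
    # A 99.9% SLA over 30 days leaves roughly 43.2 minutes of downtime.
    print(f"{allowed_downtime_minutes(99.9):.1f} minutes per 30 days")
```

Read this way, the 0.1% budget is something concrete the team can spend on launches, experiments, or outages.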

Rule

» If the service is within its SLA, launch away: clearly the DEV team is doing a good job

» If the service is not within its SLA, launch freeze until you earn back enough error budget
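The rule above could be expressed as a simple launch gate. This is a hedged sketch: `launch_decision` and its parameters are hypothetical names, and it assumes availability is measured as a percentage over the same window as the SLA.

```python
def launch_decision(sla_percent: float, measured_availability_percent: float) -> str:
    """Return "launch" while error budget remains, otherwise "freeze"."""
    budget = 100.0 - sla_percent                    # e.g. about 0.1 for a 99.9% SLA
    spent = 100.0 - measured_availability_percent   # unavailability observed so far
    remaining = budget - spent
    return "launch" if remaining >= 0 else "freeze"


if __name__ == "__main__":
    print(launch_decision(99.9, 99.95))  # within SLA, budget left
    print(launch_decision(99.9, 99.8))   # SLA missed, budget overspent
```

The decision depends only on measured data, so neither SRE nor DEV has to argue about whether a launch is "safe enough".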

Error budget

» removes the SRE-DEV conflict

» DEV teams police themselves

Common staffing pool

» one more SRE = one less Dev

SRE hires only coders

» they get bored easily

» they speak the same language as Dev

50% cap on ops work

» if you succeed, work scales with traffic

» coding reduces the work / traffic ratio
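One way to keep an eye on the 50% cap is to track how an SRE's hours split between ops work and project (coding) work. The sketch below is illustrative only: the function names and the idea of measuring in hours are assumptions, not something the slides prescribe.

```python
OPS_CAP = 0.5  # the 50% cap on ops work from the slide


def ops_fraction(ops_hours: float, project_hours: float) -> float:
    """Share of total time spent on ops work (0.0 when nothing is logged)."""
    total = ops_hours + project_hours
    if total == 0:
        return 0.0
    return ops_hours / total


def over_cap(ops_hours: float, project_hours: float, cap: float = OPS_CAP) -> bool:
    """True when ops work exceeds the cap and the excess needs addressing."""
    return ops_fraction(ops_hours, project_hours) > cap


if __name__ == "__main__":
    # 25 ops hours out of a 40-hour week is 62.5%, above the cap.
    print(over_cap(ops_hours=25, project_hours=15))
```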

Keep Dev in rotation

» 5% of ops work handled by devs

Speaking of Dev and Ops work

» excess operations load (tickets, oncall, etc.)

SRE portability

» no requirement to stick with a project or with SRE

Outages

» minimize impact

» prevent recurrence

Minimize damage

» no NOC

» good diagnostic information

» practice, practice, practice

Prevent recurrence

1. Handle the event

2. Write a post-mortem

3. Reset

Post-mortem philosophy

» blameless, focus on process and technology

» create a timeline

» get all the facts

» create bugs for all follow-up work
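The post-mortem points above map naturally onto a small record: facts collected as timestamped events, a timeline derived from them, and a list of follow-up bugs. This is only a sketch under assumptions; the `PostMortem` class, its fields, and the sample incident are hypothetical and not taken from any specific tool.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Tuple


@dataclass
class PostMortem:
    """Blameless record: what happened and when, not who did it."""
    title: str
    events: List[Tuple[datetime, str]] = field(default_factory=list)
    follow_up_bugs: List[str] = field(default_factory=list)

    def record(self, when: datetime, what: str) -> None:
        """Add a fact; facts can arrive in any order."""
        self.events.append((when, what))

    def timeline(self) -> List[Tuple[datetime, str]]:
        """Return the collected facts in chronological order."""
        return sorted(self.events, key=lambda event: event[0])


if __name__ == "__main__":
    pm = PostMortem("API latency spike")  # made-up incident for illustration
    pm.record(datetime(2017, 1, 10, 14, 20), "Rollback completed")
    pm.record(datetime(2017, 1, 10, 14, 5), "Pager alert fired")
    pm.follow_up_bugs.append("Add latency alert on the deploy pipeline")
    for when, what in pm.timeline():
        print(when.isoformat(), what)
```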

What is specific to SRE in a startup?

1:10

Horizontal team

SaaS oriented

Oncall culture

It's cool work

"May the Queries Flow, And the Pagers Remain Silent"

SRE Benediction
