Site Reliability Engineering and performance testing with Stephen Townshend (k6 Office Hours #32)

What's the difference between Site Reliability Engineering and performance testing? SRE and performance testing/engineering have a lot of overlap. Where does performance testing end and SRE begin? And what does this mean for people in those roles? To discuss this topic, we have three people with varying areas of expertise: Nicole van der Hoeven (k6) is a performance engineer, Stephen Townshend (IAG) has recently changed careers from performance to SRE, and Daniel González Lopes (k6) is an SRE who learned about performance testing when he joined k6.

Error Economics - How to avoid breaking the budget

At SLOConf 2021 I talked about how we can use error budgets to add pass/fail criteria to the reliability tests we run as part of our CI pipelines. As Site Reliability Engineers, one of our primary goals is to reduce manual labor, or toil, to a minimum while keeping the systems we manage as reliable and available as possible. To do this safely, it's really important that we can easily inspect the state of the system.
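
As a rough illustration of the idea, here is a minimal sketch of an error-budget check that a CI step could run after a reliability test. It is not the approach from the talk itself: the `TestRun` type, the function names, and the 99.9% target are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class TestRun:
    """Hypothetical summary of one reliability test run."""
    total_requests: int
    failed_requests: int


def error_budget(slo_target: float, total_requests: int) -> float:
    """Number of requests that may fail while still meeting the SLO target."""
    return (1.0 - slo_target) * total_requests


def within_budget(run: TestRun, slo_target: float) -> bool:
    """Pass/fail criterion: the run passes if failures stay within the budget."""
    return run.failed_requests <= error_budget(slo_target, run.total_requests)


if __name__ == "__main__":
    import sys

    # Assumed example values: 10,000 requests, 5 failures, 99.9% SLO target.
    run = TestRun(total_requests=10_000, failed_requests=5)
    # A non-zero exit code fails the pipeline step.
    sys.exit(0 if within_budget(run, slo_target=0.999) else 1)
```

With a 99.9% target and 10,000 requests, the budget is roughly 10 failed requests, so a run with 5 failures passes and a run with 20 fails the build.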

Resilience Is an R&D Problem, Not Just an SRE Problem

Imagine that you’re at your company’s all-hands meeting and one of the sellers is proudly ringing the office gong to celebrate closing a big deal with a client who’s on the other side of the world. It’s a big deal because it’s a major project. Their logo is going to look sleek on your website, and you are finally breaking into a new region of the world. But two months after the project kicks off, the situation isn’t looking as rosy.