An overview of the similarities and differences between Site Reliability Engineering and Platform Engineering, including from a career perspective.
A comparison of the two main SRE team models: Embedded SREs vs. standalone SRE teams.
Best practices for “SRE pioneers” – meaning engineers who are the very first SREs hired at an organization.
Our co-founder JJ reflects on building the fastest-growing incident management platform and the surprising learnings.
Incident severity levels measure the impact an incident has on the business. Classifying the severity of an issue is critical to deciding how quickly and efficiently problems get resolved.
Does it always make sense to stick to your playbooks? There’s no clear answer, but it’s still something you should think about.
An explanation of the meaning of SLA, SLO and SLI, and how SREs should use each concept to manage reliability.
A comparison of EKS, AKS, GKE, Rancher and OpenShift from an SRE’s perspective.