A 5-minute introduction to SLA, SLO, and SLI
This is not the first time we are building critical financial services applications, but these days we are building for a high efficiency ratio, and we are building lots of “mini” applications that leverage distributed and elastic environments.
Whether we call it “breaking the monolith”, “digital”, or “not to micro-services”, these applications are no longer a single blob.
I usually refer people to this great introductory article, but generally, I get a few more follow-ups.
The theme behind SLAs, SLOs, and SLIs is that you can only improve what you can measure.
Historically, we measured SLAs by the number of 9s.
Examples: 99.9%, 99.99%, and 99.995% (3, 4, and 4.5 nines environments). This was a theoretical exercise at best.
I will take a quick detour here to give the math behind these 9’s. We will start with legacy SLAs.
Hypothetically, say you are going to use an AWS EC2 machine to build a web server, and you want to make sure that this server’s static content is served 99.99% of the time.
The EC2 SLA documentation says, “AWS will use commercially reasonable efforts to make the Included Services each available for each AWS region with a Monthly Uptime Percentage of at least 99.99%”.
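To put a monthly uptime percentage in perspective, here is a minimal sketch that converts an availability figure into the downtime it permits. The function name `allowed_downtime_minutes` and the 30-day month are assumptions made for illustration; they do not come from the AWS documentation, which defines its own measurement period.

```python
def allowed_downtime_minutes(availability_pct: float, period_days: float = 30) -> float:
    """Return the minutes of downtime permitted in a period at a given availability.

    period_days defaults to a 30-day month, which is an assumption for this sketch.
    """
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    # The unavailable fraction of the period is (1 - availability).
    return total_minutes * (1 - availability_pct / 100)


# The availability levels mentioned earlier in the article.
for pct in (99.9, 99.99, 99.995):
    minutes = allowed_downtime_minutes(pct)
    print(f"{pct}% -> {minutes:.2f} minutes of downtime per 30-day month")
```

Under these assumptions, 99.99% leaves roughly 4.32 minutes of downtime in a 30-day month, while 99.9% leaves about 43.2 minutes.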