Lydia Leong

A member of the Gartner Blog Network

Lydia Leong
Research VP
11 years at Gartner
19 years IT industry

Lydia Leong is a research vice president in the Technology and Service Providers group at Gartner. Her primary research focus is cloud computing, together with Internet infrastructure services, such as Web hosting, content delivery networks…

Hope is not engineering

by Lydia Leong  |  June 7, 2010

My enterprise clients frequently want to know why fill-in-the-blank-cloud-IaaS only has a 99.95% SLA. “That’s more than four hours of downtime a year!” they cry. “More than twenty minutes a month! I can’t possibly live with that! Why can’t they offer anything better than that?”
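The client's arithmetic holds up. As a quick check, here is a minimal sketch that turns an SLA percentage into a downtime budget. The function name and the 8,760-hour year (no leap day, twelve equal months) are simplifying assumptions of mine, not anything a provider publishes.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; ignores leap years (assumption)


def allowed_downtime_hours(sla_percent, period_hours):
    """Hours of downtime permitted by an availability SLA over a period."""
    return (1 - sla_percent / 100) * period_hours


# A 99.95% SLA over one year, and over an average month (1/12 of a year).
yearly_hours = allowed_downtime_hours(99.95, HOURS_PER_YEAR)
monthly_minutes = allowed_downtime_hours(99.95, HOURS_PER_YEAR / 12) * 60

print(f"{yearly_hours:.2f} hours per year")        # 4.38 hours per year
print(f"{monthly_minutes:.1f} minutes per month")  # 21.9 minutes per month
```

So "more than four hours a year" and "more than twenty minutes a month" are both accurate readings of 99.95%.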

The answer to that is simple: There is a significant difference between engineering and hope. Many internal IT organizations, for instance, set service-level objectives that are based on what they hope to achieve, rather than the level that the solution is engineered to achieve, and can be mathematically expected to deliver, based on calculated mean time between failures (MTBF) of each component of the service. Many organizations are lucky enough to achieve service levels that are higher than the engineered reliability of their infrastructure. IaaS providers, however, are likely to base their SLAs on their engineered reliability, not on hope.
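To make "mathematically expected" concrete, here is a rough sketch of one common way to estimate engineered availability. It assumes the textbook steady-state formula MTBF / (MTBF + MTTR) and a service whose components must all be up at once, so their availabilities multiply. The mean time to repair (MTTR) is my addition, since MTBF alone does not give an availability figure, and every component name and number below is hypothetical.

```python
from math import prod


def availability(mtbf_hours, mttr_hours):
    """Steady-state availability of one component."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical components as (MTBF hours, MTTR hours). Illustrative only.
components = {
    "power":   (50_000, 4),
    "network": (20_000, 2),
    "server":  (10_000, 8),
    "storage": (30_000, 6),
}

# With no redundancy, the service is up only when every component is up,
# so the engineered availability is the product of the individual values.
service_availability = prod(availability(*c) for c in components.values())

print(f"Engineered availability: {service_availability:.4%}")
```

The product is always lower than the weakest component's figure, which is why a chain of individually reliable parts can still fall short of a hoped-for service level.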

If a service provider is telling you the SLA is 99.95%, it usually means they’ve got a reasonable expectation, mathematically, of delivering a level of availability that’s 99.95% or higher.

My enterprise client, with his data center that has a single UPS and no generator (much less dual power feeds, multiple carriers and fiber paths, etc.), with a single, non-HA, non-load-balanced server (which might not even have dual power supplies, dual NICs, etc.), will tell me that he’s managed to have 100% uptime on this application in the past year, so fie on you, Mr. Cloud Provider.

I believe that uptime claim. He’s gotten lucky. (Or maybe he hasn’t gotten lucky, but he’ll tell me that the power outage was an anomaly and won’t happen again, or that incident happened during a maintenance window so it doesn’t count.)

A service provider might be willing to offer you a higher SLA. It’s going to cost you, because once you get past a certain point, mathematically improving your reliability starts to get really, really expensive.
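One way to see why the cost climbs is to look at redundancy. The sketch below assumes identical replicas that fail independently and fail over perfectly, which real systems rarely manage, and it uses a hypothetical 99% single-unit availability. Under those optimistic assumptions, n replicas give an availability of 1 - (1 - a)^n.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; ignores leap years (assumption)


def parallel_availability(single, replicas):
    """Availability of N independent replicas where any one can serve."""
    return 1 - (1 - single) ** replicas


single = 0.99  # hypothetical availability of one unit

previous_downtime = None
for n in (1, 2, 3):
    a = parallel_availability(single, n)
    downtime = (1 - a) * HOURS_PER_YEAR
    line = f"{n} replica(s): {a:.6f} availability, {downtime:.3f} h/yr down"
    if previous_downtime is not None:
        # Hours of downtime removed by adding this one extra replica.
        line += f", saves {previous_downtime - downtime:.3f} h/yr"
    print(line)
    previous_downtime = downtime
```

Each extra replica costs about as much as the first unit, yet the downtime it removes shrinks by roughly a factor of a hundred at every step in this example. Correlated failures (shared power, shared software bugs) make the real picture worse than this model suggests.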

Now, that said, I’m not necessarily a fan of all cloud IaaS providers’ SLAs. But I encourage anyone looking at them (or at a traditional hosting SLA, for that matter) to ponder the difference between engineering and hope.
