McKinsey is claiming, in a report called Clearing the Air on Cloud Computing, that cloud infrastructure (specifically Amazon EC2) is as much as 150% more expensive than in-house data center infrastructure (specifically a set of straw-man assumptions given by McKinsey).
In my opinion, McKinsey’s report lacks analytical rigor. They’ve crunched down all data center costs to a “typical” cost of assets, but in reality, these costs vary massively depending upon the size of one’s IT infrastructure. They’ve reduced the cloud to the specific example of Amazon. They seem to have an inconsistent definition of what a compute core actually is. And they’ve simply assumed that cloud infrastructure gets you a 10% labor savings. That’s one heck of an assumption, given that it underpins the whole analysis. The presentation is full of very pretty charts, but they are charts founded on what appears to be a substantial amount of guesswork.
Interestingly, McKinsey also talks about enterprises setting their internal SLAs at 99.99%, vs. Amazon’s 99.95% on EC2. However, most businesses meet those SLAs through luck. Most enterprise data centers have mathematical uptimes below 99.99% (i.e., the availability you would calculate from mean time between failures), and a single server sitting in one of those data centers certainly has a mathematical uptime below that point. There is a vast gulf between engineering for reliability and just trying to avoid attracting the evil eye. (Of course, sometimes cloud providers die at the hands of their own engineering safeguards.) Everyone wants 99.99% availability, but they often decide against paying for it once they find out what it actually costs to engineer for it so that the math reliably works out.
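To put those two SLA figures in perspective, here is a minimal sketch of the arithmetic. It converts an availability percentage into allowed downtime per year, and uses the textbook steady-state formula, availability = MTBF / (MTBF + MTTR), to estimate a "mathematical uptime". The function names and the single-server MTBF and repair-time figures are my own illustrative assumptions, not numbers from McKinsey, Amazon, or Gartner.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; ignores leap years


def annual_downtime_minutes(availability_pct):
    """Downtime per year permitted by an availability target given in percent."""
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR


def steady_state_availability(mtbf_hours, mttr_hours):
    """Textbook steady-state availability: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    for target in (99.99, 99.95):
        minutes = annual_downtime_minutes(target)
        print(f"{target}% allows about {minutes:.1f} minutes of downtime per year")

    # Hypothetical single server (assumed figures): one failure per year
    # on average, four hours to restore service.
    single = steady_state_availability(mtbf_hours=8760, mttr_hours=4)
    print(f"Illustrative single-server availability: {single:.4%}")
```

A 99.99% target leaves roughly 53 minutes of downtime a year, while 99.95% leaves a bit under four and a half hours. Under the assumed figures, one four-hour outage a year is already enough to put a lone server below 99.99%.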
In my December note, Dataquest Insight: a Service Provider Roadmap to the Cloud Infrastructure Transformation, I wrote that Gartner’s Key Metrics data for servers (fully-loaded, broken-out costs for running data centers of various sizes) showed that for larger IT infrastructure bases, cloud infrastructure offered only limited cost savings on a TCO basis, but that it was highly compelling for small and mid-sized infrastructures. (Note that business size and infrastructure size don’t correlate; infrastructure size depends on how heavily the business relies on IT.) Our Key Metrics numbers, a database gathered from examining the costs of thousands of businesses and broken down into hardware, software, data center facilities, labor, and more, show internal costs far higher than McKinsey cites, even for larger, more efficient organizations.
The primary cost savings from cloud infrastructure do not come from the hard assets. If your analysis assumes that this is where the cloud saves you money, it will be flawed. Changing capex to opex, and taking advantage of the greater purchasing power of a cloud provider, can and will drive significant financial benefits for small to mid-size IT organizations that use the cloud. However, a substantial chunk of the benefit comes from reducing labor costs. You cannot analyze the cost of the cloud and simply handwave the labor differences. Labor costs on a per-CPU basis also vary widely; for instance, a larger IT organization with substantial automation is going to have much lower per-CPU labor costs than a small business with a network admin who does everything by hand.
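To illustrate why the labor term can't be handwaved, here is a rough sketch of a per-CPU annual cost model for in-house infrastructure. Everything in it is a placeholder: the function names, the cost breakdown, and every dollar figure are invented for illustration and are not Key Metrics data or McKinsey's numbers. The only point is to show how the admin-to-CPU ratio can swamp the hardware line.

```python
def labor_cost_per_cpu(admin_cost_per_year, cpus_per_admin):
    """Annual labor cost attributed to each CPU."""
    return admin_cost_per_year / cpus_per_admin


def in_house_cost_per_cpu(hardware_per_cpu, years_of_service,
                          facilities_per_cpu, admin_cost_per_year,
                          cpus_per_admin):
    """Annual per-CPU cost: amortized hardware + facilities + labor."""
    amortized_hardware = hardware_per_cpu / years_of_service
    labor = labor_cost_per_cpu(admin_cost_per_year, cpus_per_admin)
    return amortized_hardware + facilities_per_cpu + labor


if __name__ == "__main__":
    # All inputs below are made-up placeholders, identical except for
    # how many CPUs one administrator can look after.
    common = dict(hardware_per_cpu=3000, years_of_service=3,
                  facilities_per_cpu=500, admin_cost_per_year=100_000)

    small_manual_shop = in_house_cost_per_cpu(cpus_per_admin=20, **common)
    large_automated_shop = in_house_cost_per_cpu(cpus_per_admin=500, **common)

    print(f"Small, manual shop:    {small_manual_shop:,.0f} per CPU per year")
    print(f"Large, automated shop: {large_automated_shop:,.0f} per CPU per year")
```

With identical hardware and facilities inputs, the two hypothetical organizations differ only in labor, and that one term dominates the gap. An analysis that fixes labor savings at a flat percentage hides exactly this variation.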
I’ve been planning to publish some research analyzing the cost of cloud infrastructure vs. the internal data center, based on our Key Metrics data. I’ve also been planning to write, along with one of my colleagues with a finance background, an analysis of cloud financial benefits from a cost of capital perspective. I guess I should get on that…