Monday, February 18, 2008

Measuring "Useful Work" of a Data Center

I've been seeing this topic arise again and again everywhere that data center efficiency is discussed. Essentially, how do we measure the "useful output" of a data center, and then compare that against the Watts that went in? I'm constantly amazed at the opinions on the topic -- mostly driven by equipment vendors pushing their own particular CPU metric. But more on that later.

First off, there are differing views even on how much energy "goes in" - energy into the data center as a whole (which includes cooling), energy that reaches the server, or just the energy that reaches the CPU. For a quick breakdown, there's this excellent paper from Emerson that analyzes the levels.
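To see why the boundary matters, here's a toy illustration (Python, with numbers I've invented purely for illustration) of how different the same facility looks depending on where you draw the line:

```python
# Hypothetical wattage at each measurement boundary (all numbers made up).
facility_watts = 1_000_000   # utility feed: includes cooling, UPS losses, lighting
it_watts       = 550_000     # what actually reaches the servers, storage, network
cpu_watts      = 180_000     # what the CPUs alone consume

# The same "useful work" divided by each boundary gives very different ratios,
# which is why agreeing on the denominator is step one.
for label, watts in [("facility", facility_watts), ("IT load", it_watts), ("CPU", cpu_watts)]:
    print(f"{label:>8}: {watts / facility_watts:.0%} of the utility feed")
```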

More important (to me) is: what's the "Work Output"? Vendors would have us believe that the way to look at it is CPU output - like the proprietary "M-Value" from Sun, or MIPS, or somewhat more generic metrics such as those from SPEC. But these performance benchmarks don't mean much to the average Joe, vary considerably from machine to machine, and are biased toward different types of compute loads. Using such granular metrics is akin to describing the performance of a car as (displacement) x (compression ratio) x (fuel injection factor). It's absurd and just doesn't help, especially when all I care about is acceleration. Ergo, every attempt to define data center efficiency using granular numbers is doomed to fail... or worse, to get bogged down in politics.

A better approach is to treat the server (and, in fact, the entire data center!) as a "black box". Let's not care what's inside, i.e. servers, networking, storage (ahem, what's "under the hood"). There are just too many variables.

Instead, let's describe -- in the language of IT professionals and data center managers -- what the output of that black box is (e.g. the car's acceleration), which is what they really care about. They use the language of the "SLA", or Service-Level Agreement. Generically, an SLA is composed of something like:
- Type of application (e.g. Exchange)
- How many users (e.g. mailboxes)
- How many transactions (e.g. emails)
- How many files (e.g. archives)
- What response rate/time
- What level of availability

Now we have something useful. The above 6 (or so) pieces of data describe everything we need to know - and then we can measure them against Watts.
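To make that concrete, here's a rough sketch (Python, with field names I've invented for illustration - this isn't any formal standard) of the SLA-as-black-box idea and one possible per-Watt figure of merit:

```python
from dataclasses import dataclass

# A minimal sketch of the "black box" contract, expressed in IT language
# rather than hardware language. Field names are illustrative only.
@dataclass
class SLA:
    application: str            # type of application, e.g. "Exchange"
    users: int                  # e.g. mailboxes
    transactions_per_hour: int  # e.g. emails
    files_stored: int           # e.g. archives
    max_response_ms: int        # response rate/time
    availability_pct: float     # e.g. 99.95

def users_per_kilowatt(sla: SLA, avg_watts: float) -> float:
    """One possible figure of merit: SLA-defined users delivered per kW drawn.
    Any of the SLA fields could serve as the numerator; Watts is the denominator."""
    return sla.users / (avg_watts / 1000.0)
```

With something like this, two wildly different architectures can be compared on identical terms, without ever opening the hood.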

For example, there might be two different server/network/storage implementations for this Exchange installation, but each signs up to the same SLA. Who cares how many MIPS the servers are capable of? If one approach draws fewer Watts than the other while meeting the identical SLA, it's more efficient. "But what about the network hardware?", "But what about storage and backup?" I hear you cry. Well, those issues are covered in the SLA under response rate and availability.
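In code form, the comparison is almost embarrassingly simple (the wattage figures below are invented for the sake of the example):

```python
# Two hypothetical implementations, both signed up to the identical Exchange SLA.
implementations = {
    "blade farm A": 42_000,        # average Watts at the meter (made-up number)
    "virtualized pool B": 31_000,  # made-up number
}

# Since both deliver the same SLA, the comparison collapses to Watts consumed.
more_efficient = min(implementations, key=implementations.get)
print(f"More efficient for this SLA: {more_efficient}")
```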

Here's an interesting illustration: "But I demand N+2 redundancy," you cry. Well, that would be covered under the SLA for availability. Now consider that there is more than one approach to getting N+2 redundancy inside the "black box". The first method is to have 2 dedicated machines for each application. The other is to have a pool of shared servers standing by. IMHO both will give you N+2, but the second approach is more energy-efficient while still delivering the SLA. Thus, for every SLA within the "black box", there are approaches with differing efficiencies. And yes, some vendors' hardware may be more efficient at some pieces of work than others... it's all part of the optimization IT management performs. Maybe one day vendor X will sell a server optimized for Exchange, while vendor Y will sell one optimized for SAP. Cool idea.
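A back-of-the-envelope comparison makes the point (the application count, pool size, and idle wattage here are all assumptions of mine, not measurements):

```python
# Two ways to deliver N+2 standby for 10 applications (all numbers hypothetical).
apps = 10
idle_watts_per_server = 300        # assumed idle draw of one standby server

dedicated_spares = apps * 2        # 2 dedicated spare machines per application
shared_pool = 4                    # assumed size of a shared failover pool covering all apps

print("Dedicated spares:", dedicated_spares * idle_watts_per_server, "W of idle standby")
print("Shared pool     :", shared_pool * idle_watts_per_server, "W of idle standby")
```

Same N+2 promise in the SLA, very different number of Watts burned keeping idle iron warm.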

I continue to see hand-wringing and debate at the Green Grid, DOE and EPA/Energy Star around measuring and comparing useful data center output. Seems to me they should just default to using the language of IT -- the application SLA -- and get out of their myopic conversations about hardware and architecture. They can't see the forest for the trees.
