Tuesday, January 27, 2009

Energy Management is a systems problem

Point-optimization alone will not fully minimize a data center's energy consumption. Rather, optimization is a "systems" problem, where all of the individual components, and their dynamics, have to be optimized together -- and usually continuously.

Probably the leading company in this space -- as of today -- is Cisco. It recently announced plans to launch "EnergyWise" software for certain lines of switches. This software will intelligently manage the energy consumption of devices, much the way laptops shut down subsystems when on battery, and the way PC power consumption is managed by companies like Verdiem and 1E. Driving this at Cisco are leaders like Paul Marcoux, who's been focused on these efforts for years, and Robert Aldrich, also at Cisco and a frequent evangelist/blogger.

Now granted, these solutions are "point" solutions... But Cisco went a step further today by acquiring privately-held Richards-Zeta. Take a gander:
Richards-Zeta's intelligent middleware transforms building operational data into an IT-friendly format that easily integrates with existing applications. Its scalable, open platform enables the convergence of building systems onto an IP network. This integrated solution provides more effective management of energy consumption across an organization.

Richards-Zeta's technologies will support innovative Cisco customer solutions such as Cisco Connected Real Estate and Cisco EnergyWise. EnergyWise, launched today in Barcelona, Spain, is a technology for Cisco Catalyst Switches that proactively measures, reports and reduces the energy consumption of IP devices such as phones, laptops and access points. Ultimately, Richards-Zeta's technology is expected to work together with EnergyWise and industry partner solutions to enable the management of power consumption for building and IT infrastructure.
This is clearly a play at measuring and optimizing the data center's power & cooling as a system embedded in the larger building/facility.

The other big players in this game -- though with much less experience in IT -- are APC/Schneider Electric and Emerson, known for their dominance in power distribution and cooling, respectively. But each has made bold moves into the other's space over the past few years. For example, Aperture (a leader in data center measurement software) was acquired by Emerson last year. And APC is expanding its InfraStruXure line of cooling, enclosure and power-management systems. Finally, look for advanced pure-play measurement, monitoring and analysis players like SynapSense to be cutting even more deals as the race to measure and control data center efficiency heats up.

Now, what's the holy grail? Conversations I've had with many of these vendors have yielded a number of practical examples:
  • As data center compute load falls (e.g. during evenings & weekends) servers are automatically consolidated, and unused machines are powered-down or hibernated. In concert, CRAC units, chillers and PDUs are also shut-down or re-configured to save power
  • When a "hot spot" in a data center is detected, either point-cooling is activated, or some of the workload is physically migrated to a server in a cooler area elsewhere in the data center
  • If one electric phase of power on a PDU is out-of-balance (drawing too much current) workloads running on servers on that phase can be migrated to servers on a different phase
  • Should a facility's power or cooling system partially fail, compute workloads can be temporarily minimized or migrated elsewhere
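Nothing off-the-shelf did all of this in one product at the time, but the decision logic in the first two bullets can be sketched as a simple control loop. Everything here -- the thresholds, the field names, the action verbs -- is a hypothetical illustration, not any vendor's actual API:

```python
# Hypothetical facilities-aware control loop: consolidate when overall load
# is low, and flag hot spots for workload migration. Illustrative only.

LOW_LOAD = 0.30      # below this average utilization, consolidate servers
HOT_SPOT_C = 32.0    # inlet temperature (Celsius) that counts as a hot spot

def plan_actions(servers):
    """Given per-server dicts with 'id', 'util' and 'inlet_temp_c',
    return a list of (action, target) decisions like those above."""
    actions = []
    avg_util = sum(s["util"] for s in servers) / len(servers)
    if avg_util < LOW_LOAD:
        # Low overall load: hibernate the nearly-idle machines, then
        # tell the cooling plant it can back off
        idle = [s for s in servers if s["util"] < 0.05]
        actions += [("hibernate", s["id"]) for s in idle]
        actions.append(("reconfigure_cooling", "zone-all"))
    for s in servers:
        if s["inlet_temp_c"] > HOT_SPOT_C:
            # Hot spot detected: migrate workload toward a cooler area
            actions.append(("migrate_workload", s["id"]))
    return actions
```

The point isn't the (trivial) code; it's that the inputs span facilities data (temperatures, cooling zones) and IT data (utilization), which is exactly why this is a systems problem.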
While most of these visions are well beyond what data center managers are worrying about today, I would expect to see products begin to evolve in this direction, particularly for larger facilities. I would also expect to see some blurring of the lines (as is already happening) between traditional facilities management firms and IT operations/management firms. These firms are finding important interrelationships between IT operations and building-facilities organizations/buyers -- and between the underlying systems as well.

Wednesday, January 21, 2009

Cisco, unified computing, and automated infrastructure

Unified computing. Infrastructure orchestration. Adaptive infrastructure. Converged networks.

If you haven't heard of these terms yet, you will. They're poised to dominate the post-VMware vocabulary. And soon.

While the IT industry has been fixated on hypervisors for the past year or two, a new realization is emerging -- and companies like Cisco, HP and Egenera are hot on its trail. Although VMs have radically improved software portability and machine utilization, a less visible, less sexy issue has been restraining them: the *physical* IT infrastructure remains static, limiting the real value of the VMs above it. For example, heavily consolidated servers may require 4 or more NICs, each sitting on different physical networks. To migrate (or fail over) the VMs, another physical server has to be exactly pre-provisioned to take over. That's still a manual process (read: bad & expensive) and ties up resources.

In a recent blog post, Padmasree Warrior, Cisco's CTO, hinted at Cisco's upcoming foray into "unified computing" and its expected announcement that it will begin selling servers with integrated and poolable VMs, processors, I/O and network. In this way, even I/O can be virtualized and instantly re-configurable. An excerpt:
"... the compute and storage platform is architecturally 'unified' with the network and the virtualization platform. What are the benefits in doing this? Virtualization architectures today are very much “assembly required” islands where the burden of systems integration is on the customer. This increases costs and deployment times while decreasing efficiency. Unified Computing eliminates this manual integration in favor of an integrated architecture and breaks down the silos between compute, virtualization, and connect."
Similarly, Egenera has been offering a relatively high-end product for some years now called PAN (processing area network) Manager. Now bundled with Dell servers, PAN Manager creates instantly-reconfigurable pools of diskless servers, virtualized I/O, and network fabrics. In this way, servers running native software -- and/or virtual hosts -- can be made to instantly scale, replicate, fail over, etc. without having to physically re-configure NICs or HBAs, and without having to re-cable a thing. All through converged network-fabric software and a management console.

Recently, HP jumped part-way into the fray with its Insight Orchestration and Insight Recovery components, which leverage HP's own hardware to provide a degree of physical configuration management and HA specifically for its ProLiant line. Like Egenera's, these products will be marketed to manage infrastructure for both physical and virtual payloads.

Observing these trends, James Staten of Forrester observed that "Adaptive infrastructure [is] no longer vision" as a result of the HP, Cisco and VMware announcements. For example,
"[Cisco] clearly sees the network convergence 10GbE will bring as a catalyst for its vision which speaks with a familiar ring about the orchestration and composability of resources. And with rumors spreading about a potential play on the server side, Cisco is garnering mindshare well ahead of its ability to deliver."
And this mindshare will absolutely accelerate the market maturity of unified, adaptive, orchestrated infrastructure. (Interestingly enough, Staten also observed that the visions put forward by Cisco, HP and VMware are all proprietary. The Egenera/Dell play, however, is not.)

Thomas Bittman of Gartner nearly simultaneously observed the same, and later placed this move toward infrastructure management in the context of cloud computing infrastructures. Speaking of the industry's direction,
"What is also apparent is there are many vendor attempts to achieve this, and they all bring their current strengths and products to bear to unify a portion of the fabric. I believe Cisco’s announcement may be “one large step for a vendor, one small step for vendor-kind”. It is safe to say this will be big for Cisco – and big for unifying networking and computing – but it may not be a huge state of the art shift for the industry. It is good to see Cisco aggressively joining the club of vendors pushing the state of the art in infrastructure forward, however."
Finally, you'll begin to hear more about some of the components making this possible: converged network adapters, such as those available from QLogic and Emulex. Some similar technologies are already embedded within certain blade architectures, as well as in software such as that available from Egenera.

But what is obvious is that managing (dare I say virtualizing) physical infrastructure components (I/O, networking, storage connectivity) will be the simplification step that provides much of the "break-out" strategy for IT Ops simplification and cost reduction. And these capabilities are the perfect complement to mixed virtual and physical infrastructure.

The terms may not be at front-of-mind today, but you'll see them sooner than you know.

Monday, January 19, 2009

The ultimate consolidation experiment?

Here's one that could be for the records.

I was speaking with Egenera's CTO last week, and he mentioned -- in passing, and without too much fanfare -- that a customer was experimenting with PAN Manager and VMware ESX to see how many guests a single BladeFrame blade could support.

In this case, the hardware was a single Egenera blade with 96GB of memory and four 6-core Intel chips. The customer was able to load up and run ~180 VMware guests, each with 512MB of memory. They then ran their own disk I/O tests (to generate load) at levels similar to the applications they run in-house. In practice, they found *180* to be the upper threshold, and the more reasonable number that could operate without significant delay was closer to *150*.
Yikes.

Similarly, on a 32GB blade, the upper limit was ~50 VMs, with ~40 being a reasonable operational number.

Obviously, other customers' mileage will vary, and this customer's tests weren't based on standard benchmarks. And clearly, with very large apps, each with high I/O, these may not be reasonable numbers. But they were for these folks.

Now here's an interesting follow-on thought experiment: PAN Manager plus its hardware is designed to manage up to 24 physical blades (or Dell servers), meaning it dynamically creates I/O, manages the network fabric, and makes appropriate storage connections (for either physical or virtual apps). Doing the math shows that it's not a pipe dream to consolidate a modest-sized data center into a single PAN-managed rack. Cool.
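The math in question is simple enough to show, using the customer's "reasonable" per-blade number from above:

```python
# Back-of-the-envelope check of the consolidation claim: if one 96GB blade
# comfortably runs ~150 guests, and one PAN-managed rack holds up to 24
# blades, how many VMs could a single rack host?
vms_per_blade = 150      # the "reasonable" number from the customer's test
blades_per_rack = 24     # PAN Manager's stated management limit
vms_per_rack = vms_per_blade * blades_per_rack
print(vms_per_rack)      # 3600 -- a modest data center's worth of workloads
```

Even at the more conservative 32GB-blade number (~40 VMs each), that's still nearly a thousand guests in one rack.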

Monday, January 12, 2009

IT's Big Blind Spots for 2009 (Volume 2)

Last week I wrote about the "Big Blind Spots" I've noticed that IT Operations -- and vendors -- suffer from. My opinion is that these blind spots are largely due to marketing hype around the glitzier products and technologies, to the detriment of data center operations. Many still may not recognize where the biggest unsolved problems lie.

Without being too provocative, I'll try to highlight some observations I've made during discussions with analysts, customers and end-users. Over the past few months, it's become clearer where the industry is still suffering from BBSs (Big Blind Spots), or at least from chronic myopia. Knowing where the blind spots are makes for better decision-making and, hopefully, better products.

1. The industry assumes “agility” = “virtualization”
This is plain misleading. True, virtualization of software & OSs (via hypervisors, containers, or what have you) yields significant mobility benefits. But this agility exists at the software level only, and is limited by the physical infrastructure beneath it.

Here's the Big Blind Spot: Virtualization vendors fail to mention the manual administration needed for physical infrastructure. Take, for example, a consolidated server that has a dozen VMs on it. It's probably been outfitted with 4 or more NICs, each of which could sit on different VLANs. So, if you want to have a failover or DR strategy for this server, or you want to migrate VMs off of this server, you're screwed unless you have another identical physical server pre-configured as a host... *including* the 4 or more identical NICs already inserted and ready to go. So the "agility" claim for virtualization comes with a caveat -- that your physical hardware, I/O and networking are agile, too. Hmmm.
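To make the caveat concrete, here's a minimal sketch of the compatibility check a failover plan implicitly depends on. The data shapes and function name are hypothetical, not any vendor's API -- the point is simply that a candidate host only qualifies if its physical NIC and VLAN configuration already matches:

```python
# Why VM "agility" depends on physical config: a candidate failover host
# only works if it already exposes the NICs and VLANs the guests depend on.

def can_host_failover(primary, candidate):
    """True only if the candidate has at least as many NICs as the primary
    and reaches every VLAN the primary's guests use -- i.e., it was
    manually pre-provisioned as described above."""
    return (candidate["nics"] >= primary["nics"]
            and set(primary["vlans"]) <= set(candidate["vlans"]))

primary = {"nics": 4, "vlans": {"prod", "dmz", "backup", "mgmt"}}
spare_a = {"nics": 4, "vlans": {"prod", "dmz", "backup", "mgmt"}}
spare_b = {"nics": 2, "vlans": {"prod", "dmz"}}

print(can_host_failover(primary, spare_a))  # True  -- identically provisioned
print(can_host_failover(primary, spare_b))  # False -- missing NICs and VLANs
```

I/O virtualization's pitch, in effect, is to make this check always return True by re-creating the NIC and VLAN configuration on demand instead of by hand.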

2. The industry assumes “virtualization” = “simplification”
We've heard all of this before. There is certainly simplification created in the ability to *create* new virtual servers. From a development & test perspective, this is a huge breakthrough for developers needing to build-up and tear-down resources.

But here are the Big Blind Spots, as many have begun to point out:
(a) virtualization creates more objects to lifecycle-manage, more to layer security onto, and more to simply account for. Sure, there are management tools out there, and automated tools are on the way. But nothing changes the growing "VM sprawl"
(b) consolidated servers require more I/O per physical server. As I pointed out above, you'll find that NICs, HBAs, and cabling density will probably also increase. So will your networking headaches.
(c) virtualization puts more VMs at risk if/when hardware fails. Yep, this can be solved for (see below), but it doesn't necessarily bolster the idea that virtualization simplifies.
(d) virtualization of part of your data center means that you now have at least two management silos... one for your virtual infrastructure, and one for your physical servers. That doesn't bolster the simplification argument, either.

3. The industry associates “provisioning” with “software”
This one really annoys me. The high profiles created by Opsware (now HP) and BladeLogic (now BMC) associated "automated" provisioning with software. True, there have been huge steps in the past few years that advanced configuration control and the provisioning of images to servers.

But consider this Big Blind Spot: you still have to provision the iron. In a virtualized environment, every time anything but a relatively minor change happens, the physical infrastructure has to change along with it. When will physical provisioning of NICs, HBAs, out-of-band management, network, and storage get the high profile that software provisioning gets? (Stay tuned regarding I/O virtualization.)

4. The industry assumes HA & DR are solved by VMs
True, the #1 or #2 reason virtualization is being adopted is its ability to provide HA... that is, if the software fails, another virtual host can be tapped for the job. The same goes for DR -- witness products like VMware SRM, and other virtualization providers aspiring to break into the DR space as well.

But take note of IT's Big Blind Spot: this presumes that (a) 100% of what you want to provide HA/DR for is virtualized, (b) all apps are virtualized on identical VM vendor technology, and (c) recovery equipment is pre-provisioned with VM software and pre-configured nearly identically to the primary servers. As an example, consider an SAP (or other composite application) implementation. You've got a bunch of servers/services, possibly including DBs. If they are certain Oracle database servers, you *can't* virtualize them (thanks to licensing restrictions). So, to cover this SAP app with HA, you're either screwed, or you need to use two or more HA products to cover it. Net-net: for certain environments, VM-based availability is certainly a help, but don't "drink the Kool-Aid" that it's a panacea.
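The coverage logic above is easy to make explicit. This is an illustrative sketch (the function and data are hypothetical, not drawn from any HA product): given a composite app, it reports which HA approaches you'd need once some components can't be virtualized:

```python
# Sketch of the coverage logic in point 4: VM-based HA only protects the
# components of a composite app that can legally be virtualized.

def ha_products_needed(components):
    """Return the set of HA approaches a composite app requires: one if
    everything is virtualizable, two (VM HA plus traditional clustering)
    as soon as any component is not."""
    needed = set()
    for c in components:
        needed.add("vm_ha" if c["virtualizable"] else "traditional_ha")
    return needed

sap_stack = [
    {"name": "app-server", "virtualizable": True},
    {"name": "web-dispatcher", "virtualizable": True},
    {"name": "oracle-db", "virtualizable": False},  # licensing restriction
]
print(ha_products_needed(sap_stack))  # needs both VM HA and traditional HA
```

One non-virtualizable database is all it takes to double the number of availability products (and operational procedures) in play.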

-----

Now, all is not lost. Looking across these "blind spots" it becomes pretty clear that the limiting factor is the infrastructure's ability to adapt to the changing software workloads placed on it.

Mobility and agility have been addressed "above" the hardware by VMs and containers. Now they have to be addressed "below" the hardware -- by virtualizing and/or orchestrating I/O, NICs, HBAs and the network. The market is beginning to produce point-products to solve these issues, and vendors like Egenera have been integrating them into orchestration products for some time now.

Stay tuned for a review of the technology market that will perfectly complement virtualization in 2009 and beyond: infrastructure virtualization and orchestration.

Monday, January 5, 2009

IT's Big Blind Spots for 2009 (Volume 1)

I've been spending the past 60 days or so surveying the industry, working with analysts, and speaking with IT users. And the more I look, the more I'm finding Big Blind Spots (BBSs) in lots of IT market areas. Depending on how you look at them, they're either big opportunities for the right players in '09, or brick walls fast approaching IT from all sides.

Savvy customers/users have picked up on some of them, but the mainstream hasn't -- yet. My theory is that parts of the market, like virtualization, are so white-hot (the "shiny metal objects" that transfix the gaze) that people simply miss the other components of reality. Plus, vendors' marketing tends to highlight their own products while downplaying (or omitting) harsher realities.

What follows are some industry observations, and the darker BBSs behind them:
Observation 1: Virtualization is booming! A huge percentage of enterprises have now dipped their toe in the water to experiment with virtualization - indeed it has helped consolidate physical servers, and provides a degree of HA. This must be good for the industry, right? BBS: One of virtualization's best-kept secrets (but known to IT Ops) is that VM tools still don't operate on the physical network - switches and VLANs still need to be hand-set. Worse, server I/O is still physical, and expensive to acquire. If you're going to put 20 VMs on a box, how many NICs and cables will you need to buy and provision? This is still a huge complaint from IT.

Observation 2: Management tools are booming! Never have I seen more enterprise management tools, with more capabilities, being layered on. Doesn't this mean an easier life? BBS: The complaints I hear from customers are that in the new world of virtualization, there are now "management ghettos". IT is still buying either tools for managing "legacy" (read: physical) infrastructure, or newer tools for virtual management. This isn't simplifying... it's adding silos to management.

Observation 3: HA/DR is mature technology... There are HA/DR solutions from hardware vendors, software vendors and virtualization vendors. Great, right? BBS: The truth is that traditional HA/DR systems are mostly still closely bound to hardware or applications... and they come at steep prices. While VM-based HA gives increased flexibility, it assumes that 100% of the protected infrastructure is already virtualized (from a single vendor, I might add) and is pre-configured with similar I/O and network connections. And finally, with either the dedicated or the virtual solution, you're tying up recovery resources by dedicating VM host licenses or duplicate application licenses. I'm already hearing that hardware savings from consolidation are being reduced by having to pre-provision recovery servers that sit around using watts and licenses.
As I said, the more savvy users I've spoken with already acknowledge these observations, and have taken action around the Big Blind Spots.

But in most instances, the BBS's are results of not paying attention to the *physical* infrastructure underlying most higher-level systems.

For example, we'll see network automation and compute fabrics solve many of the physical NIC, HBA and networking issues in the near future. We'll also see Infrastructure Orchestration tools provide a highly reliable/dynamic server, I/O, Network and storage infrastructure on which management tools can place physical or virtual instances. And, we'll see the same Physical Infrastructure tools provide HA and DR on entire P & V infrastructures (that might already be supporting VMware, Citrix/Xen and HyperV). These tools can already provide complete recovery environments on true bare-metal resources (servers, network, etc.).

For every panacea, there's usually a Gotcha. So true for the 3 "shiny metal objects" above. More revelations on other Big Blind Spots soon :)