Monday, May 20, 2013

What Clouds Will Form Around Data's Gravity?

The concept of Data Gravity posits that as data accumulates (whether it is stored, analyzed, or used) it tends to attract even more similar data. And as the data amasses, it becomes less likely to be moved or migrated elsewhere. If you're not already familiar with the concept, definitely check out Dave McCrory's excellent blogs, analyses and presentations on

I believe this data aggregation concept can also apply to attracting computing. As I've mentioned in "It will be a data-centric (cloudy) world," there are examples today where special-purpose compute clouds are already forming around special-use data sets... sometimes intentionally, sometimes organically. One example I frequently point out is the NYSE Capital Markets Community Platform - a special-purpose cloud computing environment formed with a massive market trading data set at its core.

Service providers and enterprises alike increasingly ask me: What other businesses and special-purpose clouds might form around data? What new clouds (and associated business models) might we build and monetize? How can we better serve vertical market needs in the Cloud?

Forming Community Clouds - Applied (vs. Theoretical) Data Gravitational Theory

After more thinking and conversations with experts on the topic, I wanted to offer some examples and ideas that I hope trigger further exploration by cloud and service providers. Perhaps there are (or will be) new businesses based on some of these ideas of attracting data and computing.

Financial Services Community Cloud: As I've mentioned, the NYSE CMCP has at its core a huge database of stock market history. It's a natural attractor for trading firms and hedge funds to co-locate their compute loads near this data as they test and refine trading algorithms and prediction methods. High-performance processors with low-latency connections to Wall Street don't hurt the model either. Perhaps there are other forms of gravitational financial data (other markets?) that could attract similar compute clouds?

Photography/Imagery Community Cloud: More and more companies (Shutterfly, SmugMug, EverPic, stock photography companies, etc.) are in the business of warehousing photos - mostly for simple monetization. But some innovative photo data collections might take advantage of this and provide a co-located compute platform for ISVs to provide higher-level photo identification, cataloging, enhancement and even geo-tagging services. Perhaps the compute services could take advantage of knowledge about the larger database of images that have been previously tagged or otherwise cataloged within the larger community. [Bonus thought experiment: create a shared medical imagery cloud].

CRM and Customer Insight Community Cloud: Consider the amount of customer data located on and others. Now consider the amount of consumer behavior information collected by systems like Marketo and others. What if one of these giants begins to acquire additional firms that house complementary marketing data - and begins to build valuable "big data" around customer behavior? Much like, the customer data would attract even more marketing and consumer behavior application workloads, again attracting more data and workloads.

The Retail Community Cloud: Start watching what Walmart Labs and Nielsen are doing in the Big Data and retail analytics space. It would be but a small jump for either to create a retail cloud - centered around a huge (but perhaps anonymized) database of consumer purchasing patterns, geographies, pricing and outlets. Monetize it by allowing co-location of analytics workloads from marketing firms seeking insights into better forms of micro-marketing, associative/recommendation sales, and other forms of retail analytics engines. All retail firms, great and small, would want a piece of that action.

The Energy Community Cloud: What would it be worth to amass data about energy consumption -- at the customer level -- across the country? Perhaps associate those users with industry/SIC codes, zip codes, electricity prices and/or electricity source renewability (or carbon footprint)? No single utility has this data, but firms such as EnerNOC monitor consumption data across the country. What if they developed a cloud that encouraged co-location of workloads and businesses that take advantage of this data - such as monitoring which businesses are really "greenest", which vertical industries are growing fastest, or where alternative energy sources would be most attractive? Add to the database information such as energy efficiency programs, or overlay it with data about alternative (wind, solar, geo) energy generation. The data at the core could attract compute workloads for use by other energy, efficiency, and economic monitoring businesses.

And more clouds: As I've mentioned before, I could see this transforming both the cloud service provider ecosystem and entire industry groups. Consider new Cloud Service Provider models: What if NOAA formed the Weather and Atmospherics Community Platform? If healthcare companies created federated Medical Records Community Platforms? If the USGS formed the World Geologic Community Platform? If other brokerages created equivalent capital markets platforms?

Building a Community Cloud with Gravity
The next natural question is how one might go about building a community cloud or "special purpose" data repository and associated compute cloud - be it around a vertical industry or a specialized data type. In my opinion there are a few necessary properties each cloud (business) would have:
  • Data sets that become more valuable as they grow and become more diverse - and of course which generate additional gravity of their own
  • Business models that monetize the data - and perhaps generate additional derivative data. (In some instances the data may need to be anonymized).
  • Workloads that need to be co-located near the large (gravitational) data sets because they access them frequently
  • Privacy, security and regulatory controls specific to the industry and/or data type and globally provided/reinforced
  • Industry-specific sales & marketing - presumably each community cloud would have appeal to specific verticals, markets or industry groups. Driving demand / awareness within these markets is of course critical.
If you know of community clouds based on data gravity, please share. In my opinion we'll see dozens of these special-purpose clouds form around data sets in the coming years.


Monday, May 13, 2013

A Tour of Switch's SuperNAP

I have to admit that I was prepared to be underwhelmed when I toured the Switch SuperNAP in Las Vegas. After all, how impressive can a co-location facility be? Just slab, cooling, ping, pipe and power. Right?

With thanks to Mark Thiele (EVP Data Center Technologies) and Jason Mendenhall (EVP Cloud), I was able to spend some quality time in and around the facility (even inside one of the AC units) and learned that mega data centers are much more than a structure to house servers.

First - let me give you some first impressions: Switch's SuperNAP is all about design and function. The structure, the architecture, even the color scheme are all there for a purpose - to communicate attention to detail. And while it might cost a bit more to color-code pipes and conduits, label every piece of equipment, architect custom lighting, or build with industrial design principles, it's clear that every item in the entire structure is there (and highlighted) intentionally.

And now for some observations and learnings:

Energy Efficiency
The SuperNAP averages a PUE (Power Usage Effectiveness) of around 1.24 over the year. Because PUE is total facility power divided by the power delivered to the IT equipment, this means less than 25% of the total facility power is used for "overhead" operations such as cooling, while the majority goes directly to the servers and equipment. Switch's ability to minimize energy loss and to maximize cooling efficiency is extraordinary compared to most competitors. That's a relatively rare feat these days, with an industry average of 2.9 and only 20% of data centers scoring better than 2.0, according to a Digital Realty Trust survey reported by the Data Center Journal.
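To make the PUE figures concrete, here is a minimal sketch of the arithmetic, assuming the standard definition of PUE as total facility power divided by IT equipment power. The function names are my own, purely for illustration; the three PUE values are the ones quoted above.

```python
def overhead_share_of_total(pue: float) -> float:
    """Fraction of total facility power NOT reaching IT equipment.

    Assumes PUE = total facility power / IT equipment power,
    so the IT share of the total is 1 / PUE.
    """
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return 1.0 - 1.0 / pue


def overhead_per_it_watt(pue: float) -> float:
    """Watts of overhead (cooling, distribution losses) per watt of IT load."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return pue - 1.0


if __name__ == "__main__":
    # PUE values quoted in the post; the labels are only descriptive.
    for label, pue in [("SuperNAP", 1.24),
                       ("survey threshold", 2.0),
                       ("survey average", 2.9)]:
        share = overhead_share_of_total(pue)
        print(f"{label}: PUE {pue:.2f} -> "
              f"{share:.1%} of facility power is overhead, "
              f"{overhead_per_it_watt(pue):.2f} W overhead per IT watt")
```

Read this way, a PUE of 1.24 puts overhead at a bit under a fifth of total facility power, while a PUE of 2.0 means half of the facility's power never reaches a server.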

Cooling Options
Arguably the key to the data center's design is Rob Roy's breakthrough thinking about cooling. Traditional data centers take a "diffusive" approach to cooling equipment. That is, they place AC units throughout the data center, diffuse cool air (via raised floor) over everything, and allow the hot air from the servers to re-mix with the cooler ambient air.

But if you think about it, server racks should really be viewed as radiators (think about the radiator in your car). To achieve the best heat transfer efficiency, pull cool air directly through the radiator, and channel the heated air directly back to the cooling device. And that's what Switch does with their T-SCIF (Thermal Separate Compartment in Facility). Cool ambient air is pulled through the server racks and channeled directly into a hot-air plenum. There is no mixing of the hot air with the cool. Think of this as hot-aisle/cold-aisle containment taken to the extreme.

Data Center Density
The notion of density makes sense along many dimensions. First, more server manufacturers are designing inherently dense equipment - a packed Cisco UCS or Dell blade enclosure can potentially pull 7kW or more. And there may be 4 or more of these units in a rack enclosure... that means a given rack might consume as much as 15-25kW. (The SuperNAP is designed to support about 1.5kW/square foot, easily enough to handle a packed cabinet.) Second, density creates a larger heat differential, which aids heat transfer efficiency.

Supporting power and cooling density ultimately helps Switch's customers, because a given customer needs less space - which equates to cost savings. Density also helps Switch itself through the obvious efficiencies.
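The density figures above can be turned into a quick back-of-the-envelope calculation. This is only a sketch: it assumes the 1.5kW/square foot design figure is an average over the floor area attributed to a rack (aisles included), which the tour did not spell out, and the function names are hypothetical.

```python
DESIGN_DENSITY_KW_PER_SQFT = 1.5  # design figure quoted in the post


def rack_power_kw(enclosures: int, kw_per_enclosure: float) -> float:
    """Total power drawn by a rack holding identical enclosures."""
    return enclosures * kw_per_enclosure


def floor_area_sqft(rack_kw: float,
                    density_kw_per_sqft: float = DESIGN_DENSITY_KW_PER_SQFT) -> float:
    """Floor area needed to support a rack at the given design density.

    Assumption: the density is averaged over all floor space attributed
    to the rack, not just the cabinet's own footprint.
    """
    if density_kw_per_sqft <= 0:
        raise ValueError("density must be positive")
    return rack_kw / density_kw_per_sqft


if __name__ == "__main__":
    # Rack loads quoted in the post (15-25kW), plus the fully packed case
    # of four enclosures at 7kW each.
    packed = rack_power_kw(4, 7.0)
    for kw in (15.0, 25.0, packed):
        print(f"{kw:.0f} kW rack -> {floor_area_sqft(kw):.1f} sq ft "
              f"at {DESIGN_DENSITY_KW_PER_SQFT} kW/sq ft")
```

Note that four enclosures at a full 7kW each would come to 28kW, slightly above the 15-25kW range quoted, so those figures are best read as rough estimates rather than exact limits.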

Co-Located Compute
Very early in the tour Jason pointed out the advantages of customers and partners co-locating their equipment within the SuperNAP. With the physics of bandwidth and latency hard at work, it became clear that certain customers gained an advantage by co-locating their private cloud infrastructure in the same data center as their public cloud partners (or other service provider customers). Apparently there were many examples of this co-located hybrid cloud approach at the SuperNAP.

Co-Located Networking
The SuperNAP is literally a nexus of multiple Network Access Points - that's the NAP part - originally built by Enron. (In fact, it was pretty cool to see the conduits come up out of the floor with the fibers inside!) This gives users advantages such as network redundancy, and it lets Switch broker bandwidth at wholesale prices. They market this as the Combined Ordering Retail Ecosystem (CORE).

Physical Security and Disaster Preparedness
This tour wouldn't be complete if I didn't mention the physical security that Switch provides... something slightly out of "24". While it's not appropriate to share details, suffice it to say that entry onto the campus, into the building, and around/within the cages was closely and carefully monitored and guarded. The facility itself is secured in multiple zones, as are the tenant areas.

But the other security blanket Switch offers is back-up power and cooling. Should there be an outage from the local utility, cooling is literally designed to "flywheel" for long enough for generators to kick in. And there is sufficient fuel on-site (and sources off-site) to maintain operations through even the fiercest unforeseen disaster.

Example: Content Hosting and Distribution
Without naming names, a major content streaming company chose Switch for nearly all of the reasons above. But what I found particularly fascinating and compelling is that they even co-located their physical broadcast operations at the SuperNAP. Picture aisles of million-dollar storage arrays pumping out movies and live TV shows 24x7 across the country. Co-located is the nerve center -- not unlike NASA mission control -- that monitors performance and delivery country-wide. Why locate at Switch? Besides all of the efficiency and security aspects, access to nationwide backbone networks ensures maximum video performance.

Going forward, Switch will be more than doubling its capacity in the next year or so, building into new expanded facilities. And from the sound of it, much of the space is already spoken for.

Who says the data center isn't important? I certainly wasn't underwhelmed by this visit.
