The Power Crunch in the Data Center

Back when I was at Excite, I would often pat myself on the back for co-founding a company with a low environmental impact that “simply pushed electrons around a network.” Granted, a company whose only “product” is HTML-based web pages is less resource-intensive than, say, an aluminum smelter or a strip-mining operation, but I wasn’t thinking critically enough about just how much energy a data center with thousands of servers can suck down.

And for years, computer manufacturers and CPU makers paid no attention to the profligate energy consumption of each successive generation of ever more powerful machines. Arguably, a driving factor in Apple’s switch to the Intel platform was the fact that IBM’s PowerPC chips were big power hogs that ran hot. Frustrated with IBM’s inability to deliver a version of the G5 chip that wouldn’t melt a laptop, Apple moved to Intel, whose chips crank out more MIPS per watt than IBM’s, though Intel still lags AMD in terms of compute mileage. And the increasing popularity of dense server configurations like blades has only made the problem worse as it has become possible to pack hundreds of CPUs into a single standard data center rack.

Fortunately, the technology world is waking up to this issue, not out of altruism but because it hits the bottom line of anyone who uses computers. And I’m glad we don’t have to rely on altruism alone, since market forces do a much better job of spurring action. Energy costs (for both the compute power and the HVAC required to cool down the computers) are the single biggest expense in any data center operation.

Google, owner of the biggest server farm on the planet, with hundreds of thousands of servers, feels this problem acutely and may well have the biggest electric bill on the planet. It has therefore gone to great lengths to reduce power consumption in its data centers and is pushing for more electrical efficiency in PCs. Meanwhile, Sun’s Jonathan Schwartz has been blogging regularly about power issues in the data center and Sun’s (very smart) focus on the energy efficiency of its servers. Certainly the performance relative to power consumption of the Niagara servers is quite compelling and can really cut down on the power density in a data center, provided your application isn’t heavy with floating-point operations, in which case the Niagara might not be the box for you. In fact, nearly a year ago, Google predicted that the lifetime cost of providing electricity to a server will eclipse the capital cost of the server itself.
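To get a feel for how a prediction like Google’s could come true, here is a back-of-the-envelope sketch. Every number in it (wattage, electricity price, service life, cooling overhead) is a made-up assumption for illustration, not a figure from Google or anyone else, and the function name is hypothetical.

```python
def lifetime_power_cost(server_watts, years, price_per_kwh, cooling_overhead):
    """Electricity cost of running one server around the clock.

    cooling_overhead is the extra HVAC power spent per watt of compute,
    so 1.0 means cooling doubles the total draw.
    """
    hours = years * 365 * 24
    total_kw = server_watts * (1 + cooling_overhead) / 1000
    return total_kw * hours * price_per_kwh


# Hypothetical inputs, chosen only to illustrate the arithmetic.
cost = lifetime_power_cost(
    server_watts=250,      # assumed average draw of one server
    years=4,               # assumed service life
    price_per_kwh=0.10,    # assumed electricity price in dollars
    cooling_overhead=1.0,  # assumed: one watt of cooling per watt of compute
)
print(f"Lifetime electricity: ${cost:,.0f}")
```

Plug in your own numbers and compare the result with what you paid for the box. As servers get cheaper and electricity gets pricier, the gap closes quickly.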

Still, there’s nothing like firsthand experience to really drive this issue home. I’m on the board of Technorati, and they recently finished the painful task of transitioning their entire server farm to a new data center. Technorati’s old data center was over-provisioned and out of power. As Technorati planned to expand and occupy more rack space, their old provider was able to offer only half as much power per new rack as it had been offering previously. Ouch. So the only option was to move to a new data center. But this is only a stop-gap measure, so Technorati is also busy evaluating more power-efficient servers (Opteron-based and the Niagara, for instance) in an effort to continually increase MIPS/watt and, therefore, increase the compute density per rack in their data centers. Another anecdote from my portfolio is Postini’s experience a few years ago when they established data centers in Europe. They were surprised to discover that most European data centers offered substantially less power density per rack than was available in the US.

Finally, a friend of mine who builds and configures data centers for a living observed that the fundamental issue for many data centers is not so much the ability to bring enough power to each rack, but rather the ability to keep the building cool enough with all the equipment kicking off so much heat. Many of the buildings that house data centers are converted warehouses or were simply not designed with thermal management in mind. Small design changes in buildings can have a huge impact on their power needs. So while the chipmakers and server makers have plenty of work to do, the folks who design data centers also need to do their homework to optimize the building for the application. The high (and growing) cost of electricity also makes me think that hosting providers like SolarHost might be on to something: investing in on-site power generation, be it photovoltaic, stationary fuel cells or whatever, could give a hosting facility a permanent operational cost advantage.

If you’re involved with a SaaS company or a Web 2.0 play (which is really just a SaaS company for consumers), make sure the engineering and operations teams are thinking hard about these issues. Many such companies have probably already been forced to deal with hosting cost increases, power rationing in their data centers, or downtime due to overheated machines. If it hasn’t happened yet, it certainly will soon, since the problem is only going to get worse over time. Increasing compute cycles per rack (while holding power consumption per rack steady or reducing it) has to be top of mind for the software and network operations teams of any such company.
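A rough way to reason about compute per rack when power, not floor space, is the limit: divide the rack’s power budget by the draw of each server, then multiply by what each server delivers. The sketch below uses entirely hypothetical numbers and a hypothetical function name, and it ignores physical slot limits and cooling.

```python
def rack_compute(rack_power_watts, server_watts, mips_per_server):
    """Return (servers that fit the rack's power budget, total MIPS they deliver)."""
    servers = rack_power_watts // server_watts  # whole servers only
    return servers, servers * mips_per_server


# Hypothetical figures: same per-server performance, different power draw.
budget = 4000  # assumed watts available per rack
hungry = rack_compute(budget, server_watts=400, mips_per_server=10_000)
frugal = rack_compute(budget, server_watts=250, mips_per_server=10_000)

print("power-hungry servers:", hungry)
print("power-efficient servers:", frugal)
```

With the power budget held fixed, the only way to get more cycles out of the rack is to raise MIPS per watt, which is exactly the number worth tracking.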
