The story of UCLA's "pod" began in January 2008. At that time, 18 high-performance computing clusters were being run from two primary locations: the Math Sciences data center and the Institute for Digital Research and Education (IDRE) data center. Between the two, researchers could run jobs on any combination of about 800 nodes. But capacity was fairly well tapped out, according to Bill Labate, director of UCLA's Academic Technology Services and managing director of IDRE.
IDRE works as a kind of shared service. "People 'buy' nodes and then we integrate them," said Labate. Technically speaking, users don't really buy the nodes and hand them over; IDRE does the purchasing, based on well-researched specs that provide high value for a low price. Operations are ultimately funded by the 175 research groups in 64 departments and centers on campus that partake of the center's services and equipment.
A Plan for Growth
Given that the IDRE center had physical space available, the Institute put together a proposal to increase computing capacity. It brought in a third-party data center engineering firm to do a quick estimate of what the project would take. The needs were substantial. The goal was to squeeze in 1,600 compute nodes, but the existing power infrastructure could only support about 400. Somehow, the center would have to be revamped to supply the dramatically increased wattage those additional nodes would draw and the cooling they'd require. The engineering firm's estimate came to $4.6 million. Based on that number, the university granted IDRE funds to do its build-out.
That's when IDRE knuckled down to sort out various configurations for the project. The center brought in another consulting firm to do a more detailed cost estimate. This time, however, the estimate for 1,600 nodes exceeded $7 million. A second proposal for an 800-node build-out came in at $4 million.
"We were stuck almost $3 million low in budget," Labate said. "We went through a whole series of [scenarios]: What if we only do this many nodes capacity? What if we go to these hot aisle containment systems? It got down to where it was almost ridiculous to do something with this brick and mortar data center."
Computing in a Box
At the same time, modular data centers, also called "containerized data centers," were getting media attention. Sun Microsystems had previewed its "Project Blackbox" in late 2006. In mid-2008 both HP and IBM began publicizing their modular data centers. All promised to reduce energy usage, a major consideration in data center expansion. According to a 2010 study by Gartner, energy-related costs account for about 12 percent of overall expenditures in the data center, a share that's expected to rise as energy costs themselves rise.
Labate began having conversations with people at other institutions that were trying these new kinds of containerized set-ups, including UC San Diego and Purdue University, both major research universities. "We started thinking, these modular data centers could be a viable alternative."
Due diligence led the university to settle on HP's pod for four compelling reasons: density, price, flexibility, and energy efficiency. The pod could contain 1,500 nodes, nearly the same count as the planned data center build-out; but the price would be only $2 million versus $7 million.
When IDRE was in the middle of its shopping, it discovered that not all modular data centers are alike. "Some of the modular data centers required you to have specific equipment--and as you can imagine, the equipment was specific to the vendor of that particular unit," noted Labate. "That would have forced us to standardize forever on that particular node, which is something we would never want to do. With the HP pod, we can put anything we want in there."
That flexibility is important. The various compute clusters on campus currently carry four brands of equipment in all kinds of configurations. But on a regular basis, IDRE goes out to bid on computing nodes to find the optimal combination of features for the price. The current choice happens to be HP, he explained. "But that's not to say there might not be another vendor in the future that comes along that meets that price/performance curve."
As of August 2010, the minimum standards for those computer nodes are:
- 1U, rack-mounted; half-width preferred (two nodes sit side by side in the slot);
- Dual six-core CPUs (2.6 GHz Intel Core i7 or 2.66 GHz Intel Xeon);
- 4 GB of memory per core;
- 160 GB to 250 GB hard drive per node;
- A Gigabit Ethernet port, DDR/QDR InfiniBand interconnect, and PCI-Express slot; and
- Three-year warranty.
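Taken together with the pod's roughly 1,500-node capacity mentioned earlier, the minimums above imply a substantial aggregate system. The following sketch is just that arithmetic spelled out; the per-node figures come from the list above, and the node count is the approximate pod capacity from the article:

```python
# Rough aggregate-capacity arithmetic for a fully loaded pod, based on
# the minimum node spec above. Illustrative arithmetic only, not an
# official HP or IDRE calculation.

NODES = 1500                 # approximate pod capacity
CORES_PER_NODE = 2 * 6       # dual six-core CPUs
MEM_PER_CORE_GB = 4          # 4 GB of memory per core
DISK_PER_NODE_GB = 160       # low end of the 160-250 GB range

total_cores = NODES * CORES_PER_NODE
total_mem_tb = total_cores * MEM_PER_CORE_GB / 1024
total_disk_tb = NODES * DISK_PER_NODE_GB / 1024

print(f"{total_cores} cores")          # prints "18000 cores"
print(f"{total_mem_tb:.1f} TB RAM")    # prints "70.3 TB RAM"
print(f"{total_disk_tb:.1f} TB disk")  # prints "234.4 TB disk"
```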
UCLA placed the order for an HP 40c pod in October 2010. The vendor could have shipped it within six weeks, but, as Labate pointed out, "You have to build a site to put this thing on." Rather than going through the exercise of accepting delivery of a 43,000-pound retrofitted cargo container, stashing it somewhere, then hauling it out again for final placement, the campus told HP to keep the pod in its Austin factory until the site preparation was done.
A crane lowers UCLA's pod into its new home, a former loading zone.
To support the 110,000 pounds the pod would weigh once the equipment was in place, workers poured a concrete slab that Labate estimated to be between two and three feet thick.
The portable data center in place. Loaded, it will weigh about 110,000 pounds.
Insulation in the Extreme
Once the site was done and the pod delivered, a crane lifted the pod off the flatbed and onto the slab, and it was ready to outfit. On the interior, the pod holds 22 racks, each 50U tall or about 88 inches, along the 40-foot wall. Opposite is a narrow aisle wide enough for accessing and moving equipment.
To control the climate, blowers on the ceiling force air conditioning downward; hot exhaust goes out the back and rises up, enters the coolers, gets cooled, and sinks down again. Equipment located on the "hot aisle" side of the pod, where all the hot exhaust blows, is actually accessible from the outside. The pod is outfitted with sensors for temperature, chilled water supply, and humidity, as well as leak detectors under the water mains and overhead condensate drip pan.
Describing the pod as "really solid," Labate observed, "When you walk inside, with all of this equipment running and 36 blowers--it's extremely loud. You go outside, and you can't hear a thing. You can't feel anything on the outside. You don't feel cold when you touch the metal."
He estimated that the highly optimized environment uses about a third less power than a brick-and-mortar data center would have required.
"It's very Spartan," Labate said. "It's a purpose-built data center with extremely tight engineering for the purpose of being highly energy efficient. It's not something you want 10 or 15 people to try to get into. It's strictly made for the equipment, not the people."
Proximity to the other data centers was fairly unimportant to the placement decision, Labate said. All of the centers are linked over the campus networking backbone for Ethernet connectivity and interconnected by wide area InfiniBand for input/output.
Because the pod's monitoring systems--both environmental and on the computing gear--can be managed remotely, Labate anticipated weekly visits to the pod to handle work that needs to be done. "The No. 1 thing is keep as many nodes up and running as possible. If we have a catastrophic failure, we're going to go fix it. But if a hard drive goes out or memory goes bad, we'll queue those up and send somebody out there once a week."
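That maintenance policy, immediate dispatch for catastrophic failures and weekly batching for single-node faults, can be sketched as a simple triage routine. This is purely illustrative; the failure categories are the ones Labate mentions, and everything else (names, the unknown-fault case) is invented for the example:

```python
# Hypothetical sketch of the triage policy described above: catastrophic
# failures get an immediate dispatch, while single-node faults (bad
# drive, bad memory) are queued for the weekly pod visit. Not IDRE's
# actual tooling.

from collections import deque

CATASTROPHIC = {"cooling_failure", "power_loss", "water_leak"}
DEFERRABLE = {"hard_drive", "memory"}

weekly_queue = deque()

def triage(fault: str) -> str:
    if fault in CATASTROPHIC:
        return "dispatch_now"       # go fix it immediately
    if fault in DEFERRABLE:
        weekly_queue.append(fault)  # batch for the weekly visit
        return "queued"
    return "investigate"            # unknown fault: assumed fallback

triage("hard_drive")
triage("memory")
print(triage("water_leak"))   # prints "dispatch_now"
print(list(weekly_queue))     # prints "['hard_drive', 'memory']"
```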
Mishaps and Skeptics
In the 10 weeks the pod has been in place, IDRE has loaded about 250 nodes into the racks--between 15 percent and 20 percent of capacity--which is just about where it needs to be right now. During a recent check, Labate said, the entire system was running at 95 percent--about 733 jobs--across all three data centers. "When you run on this system, you have no idea where you're running, nor do you need to know. That's all handled in the background with a scheduling system."
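The location transparency Labate describes, where a user submits a job and never learns which data center runs it, is the job of the scheduler. The article doesn't name UCLA's scheduling software, so the following is only a generic illustration of the idea, with made-up center names and capacities drawn from the article's three-center setup:

```python
# Generic illustration of location-transparent scheduling: a job is
# placed on whichever data center currently has the most free nodes,
# and the user never needs to know which one was chosen. Hypothetical
# figures; UCLA's actual scheduler is not named in the article.

centers = {
    "Math Sciences": {"free_nodes": 40},
    "IDRE": {"free_nodes": 12},
    "Pod": {"free_nodes": 85},
}

def submit(job: str) -> str:
    # Pick the center with the most free nodes and consume one slot.
    best = max(centers, key=lambda c: centers[c]["free_nodes"])
    centers[best]["free_nodes"] -= 1
    return best

placed = submit("simulation-001")
print(placed)  # prints "Pod" (most free capacity at submission time)
```

Real schedulers weigh far more than free-node counts (queues, priorities, interconnect locality), but the principle of hiding placement from the user is the same.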
But there's still a bit of "teething" going on with the new environment. It hasn't been hooked into the campus alarm or fire systems yet. And, Labate added, more people need to get inside to learn about the new layout. In its short time on campus, that newness has already caused mishaps: twice, the same vendor contracted to maintain the FM-200 fire suppression system has accidentally discharged the gas. "They know how to do FM-200. They're not familiar with the actual pod configuration."
As might be expected, the idea of a pod-based data center was off-putting to some of the data center technical crew, Labate said. "When the rubber meets the road, they have to be the ones to maintain it," he pointed out. "They have to work in it, put equipment in it."
IDRE sent a couple of people to the Austin factory to watch how the container was built. HP brought a pod to a local movie studio and members of IDRE took a piece of equipment there to see how it would fit. Eventually, the skeptics warmed up to the new approach. "They started to realize what it is," he said. "This isn't a building. It isn't a rehearsal studio. It isn't an office. It's a data center, period. That's all it is. It doesn't claim to be anything else."
So while most people were skeptical that this was actually a good decision, Labate noted, "in the end, it turned out to be probably the best decision we could have made."
Dian Schaffhauser is a writer who covers technology and business for a number of publications. Contact her at firstname.lastname@example.org.