
Nvidia's power-hungry chips could give a boost to this once-esoteric technology

By Therese Poletti

Cooling technology lowers temperatures of hot-running semiconductors - and improves energy efficiency

Liquid cooling, a term once only associated with the earliest mainframes and the most powerful supercomputers in the world, is quickly becoming an essential technology for data centers in the era of artificial intelligence.

As the number of semiconductors increases in each data-center server for AI applications, so does the amount of electric power consumed by each chip. The more powerful and high-performance the semiconductor, the more heat it generates as it processes training, inference or ChatGPT queries.

Earlier this month at Computex, a big trade show in Taiwan, Nvidia Corp. (NVDA) Chief Executive Jensen Huang showed off the company's new Blackwell system, with graphics processing units (GPUs) that use approximately 1,500 watts of electricity per chip, or the equivalent of a small heater in a home. But the Blackwell system needs at least four GPUs, and some configurations have eight.

"Think of four Blackwells, that's more than 6,000 watts, and think of the other content in the system, you are talking about a 7,000-watt server," said Peter Rutten, an IDC analyst. "Think of having a few of those in a rack, it quickly becomes a very power-consuming, heat-dissipating system."

So what many server makers and data centers are doing is going back to the concept of liquid cooling, in which water or other coolants are piped into the server room through tubing. The coolant passes over cold plates or heat sinks attached to the semiconductors, lowering the temperature of the devices, which saves energy and helps avoid performance problems and outages.

"When you get to a certain amount of power density, you just can't move enough air to cool the chip anymore," said Jon Lin, executive vice president and general manager of data-center services at Equinix Inc. (EQIX), which operates 260 data centers around the world. "It's not about the temperature of the chip. It's just how much heat is it actually emitting around that."

Super Micro Computer Inc. was one of the first server makers to talk about liquid cooling in the AI era. In a keynote this month at Computex, Super Micro (SMCI) Chief Executive Charles Liang again touted the company's direct liquid cooling (DLC) offerings, saying they could save customers up to 40% in operating expenses through lower power consumption while also lowering data centers' CO2 emissions.

In the most recent earnings season, other companies started chiming in on the topic. Hewlett Packard Enterprise (HPE), which has more than 300 patents in direct liquid cooling - thanks in part to its 2019 acquisition of supercomputing pioneer Cray - spent several minutes of its earnings call on the importance of liquid cooling and its own expertise in the arena.

"We have one of the largest manufacturing footprints for water cooling. And also we have one of the largest - if not the largest - services arm that knows how to maintain and run these systems at scale," HPE Chief Executive Antonio Neri told MarketWatch in a recent interview. "Why that's important is because what comes next in the never-ending acceleration of silicon is that when Blackwell comes online, which is 2025, you need 100% direct liquid cooling."

Also read: HP Enterprise may be an underrated AI play, and new deal with Nvidia could help.

Another IDC analyst, Sean Graham, said in an email that liquid cooling is a trending topic right now and is expected to see high growth, although he has not compiled the market data yet.

Some hardware companies like HPE and Super Micro offer their own solutions. This week, Super Micro said it was adding three campuses to its already sprawling San Jose, Calif., headquarters that are focused on developing liquid-cooling solutions, from systems to racks to water towers. The company expects liquid-cooled data centers to grow to between 15% and 30% of installations in the next two years, up from historically less than 1%.

Others, like Dell Technologies Inc. (DELL), offer their own options and work with companies like Green Revolution Cooling for immersion cooling. A mix of older companies and a few startups focus entirely on liquid cooling. Some started in the past 10 years or so, since bitcoin-mining and blockchain data centers are also very compute-intensive and power-hungry.

"People are thinking about investing in the cooling players as a picks-and-shovels opportunity around AI," said Clark O'Niell, managing director and partner at the Boston Consulting Group. "It is one of the derivative investments in AI."

Currently, the two main technologies are direct liquid cooling and immersion cooling, which is much more complex and requires immersing the entire server and its electronics in a bath of coolant. For example, privately held LiquidStack, based in Carrollton, Texas, and founded in 2012, offers both direct-to-chip liquid cooling and immersion-cooling options. In March 2023, it announced an undisclosed amount of Series B funding from Trane Technologies, a global HVAC company.

Another more recent startup, Ferveret, based in San Jose, Calif., is trying to reimagine liquid cooling, inspired by nuclear power-plant cooling. The small company, founded by two thermal engineers in 2021, received pre-seed funding from Y Combinator and a later round of $2.1 million led by Cathexis Ventures, according to CrunchBase.

But as Evercore ISI analysts wrote in a note last month, after hosting a webinar for clients on the topic of liquid cooling, "the underlying technology is harder to differentiate from vendor to vendor." They also noted that "services capabilities remain a key differentiator, both pre- and post-deployment for server OEMs." That's where server makers like HPE and Dell, which have services businesses, could have an advantage.

The liquid-cooling trend is also adding another necessary technology into the mix at commercial data centers around the world, raising building costs. In addition, O'Niell said it is getting harder and harder to find power sources. In March, Amazon.com Inc.'s (AMZN) Amazon Web Services unit reportedly bought a data center in Pennsylvania, powered by nuclear energy, for $650 million.

"We spent a tremendous amount of time in the last two years specifically focused around productization of liquid cooling in our data centers so that we know how to operationally support that on the AI front," said Lin of Equinix, which has offered liquid cooling for several years. He said all its new data centers are being built from the ground up to support liquid cooling. Oracle Corp. (ORCL) co-founder and Chief Technology Officer Larry Ellison said in his company's recent earnings call that its new data centers are also being built with liquid cooling.

"These modern data centers are moving from air-cooled to liquid-cooled and you have to engineer them from scratch and that's what we've been doing for some time and that's what we'll continue to do," Ellison said.

Adding liquid cooling to a data-center project is not cheap. But Evercore ISI analysts said customers have been realizing savings of about 15% in operating expenses.

Neri of HPE described liquid cooling as adding another layer of issues to manage in data centers, where, among other things, companies have to watch for corrosion in the water pipes and algae in the recirculating water.

"Think about it as an aquarium," Neri said. "It's a large aquarium. Just instead of fish, it has 60,000 GPUs."

-Therese Poletti

This content was created by MarketWatch, which is operated by Dow Jones & Co. MarketWatch is published independently from Dow Jones Newswires and The Wall Street Journal.

 

(END) Dow Jones Newswires

06-20-24 1505ET

Copyright (c) 2024 Dow Jones & Company, Inc.
