The next generation of AI cooling

Where cooling stands today

Data centers used to be cooled by circulating chilled air through the data hall. But air can't keep up with modern GPU densities, so today's deployments have moved to liquid cooling, more precisely single-phase direct-to-chip liquid cooling: water, or a water-glycol mix, is pumped through chambers called cold plates mounted directly on the GPU. It has been widely adopted over the past two years, cools the current generation of GPUs, and has caused a major change in the way data centers are built and operated.

Liquid cooling

Thermal imaging of liquid-cooled GPU rack

Single-phase direct-to-chip liquid cooling moves heat away from a GPU using water pumped through a cold plate mounted directly on the chip. Inside the cold plate, heat transfers from the chip into the fluid, the fluid leaves the rack warm, and a coolant distribution unit (CDU) exchanges that heat with a facility water loop. The fluid then returns to the rack, cooled, and the cycle repeats.

Water conducts heat roughly 23.5× better than air and stores on the order of 3,500× more heat per unit volume, which means it can pull heat away from a chip far faster, at a fraction of the volumetric flow rate. That gap is what makes today's AI rack densities possible: single-phase systems are deployed at around 130 kW per rack today.

The architecture is mature. But densities are climbing faster than single-phase can follow, and the next architecture has to be ready before the next generation of silicon arrives.

Where single-phase reaches its limit

Single-phase carries heat through sensible heating: water enters cool, leaves warmer, and that temperature difference is the cooling capacity. The limit is set by physics. Water can only carry so much heat per unit volume, and the temperature delta across the chip can only be pushed so far before the required flow rates run into pump, manifold, and cold plate constraints.
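The sensible-heat relationship above, Q = ṁ · c_p · ΔT, can be turned into a rough flow estimate. The sketch below is illustrative only: the water properties are approximate textbook values, the temperature rises are assumed, and the function name is hypothetical. Only the 130 kW rack figure comes from this page.

```python
# Approximate textbook properties of water near room temperature (assumed values).
WATER_DENSITY_KG_M3 = 997.0
WATER_CP_J_KG_K = 4180.0


def water_flow_lpm(heat_w: float, delta_t_k: float) -> float:
    """Litres per minute of water needed to carry heat_w watts
    with a coolant temperature rise of delta_t_k kelvin."""
    if heat_w < 0 or delta_t_k <= 0:
        raise ValueError("heat_w must be >= 0 and delta_t_k must be > 0")
    mass_flow_kg_s = heat_w / (WATER_CP_J_KG_K * delta_t_k)  # Q = m_dot * cp * dT
    volume_flow_m3_s = mass_flow_kg_s / WATER_DENSITY_KG_M3
    return volume_flow_m3_s * 1000.0 * 60.0  # m^3/s -> L/min


if __name__ == "__main__":
    # Assumed temperature rises across the rack, in kelvin.
    for delta_t in (5.0, 10.0, 15.0):
        flow = water_flow_lpm(130_000, delta_t)
        print(f"130 kW rack, dT = {delta_t:4.1f} K: {flow:6.1f} L/min")
```

Under these assumptions a 130 kW rack at a 10 K rise needs roughly 187 L/min, and halving the allowed rise doubles the flow. That scaling is what pushes against pump, manifold, and cold plate limits as rack power climbs.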

The GPU roadmap is moving toward TDPs (thermal design power, the heat a chip generates and that cooling must dissipate) where single-phase cannot keep up.

Generation	TDP	Cooling reality
NVIDIA H100	~700 W	Air at the limit
NVIDIA B200 / AMD MI300X	~1,000 W	Air physically insufficient at density
NVIDIA GB300 / AMD MI400-Class	~1,400 W	Liquid mandatory
NVIDIA Vera Rubin	Next gen	100% liquid-cooled racks
Rubin Ultra / Feynman / Next-Gen AMD	Future	Two-phase architecture

At the high end of this curve, the heat transfer coefficient of single-phase water cooling stops being enough.

The transition to two-phase

Two-phase direct-to-chip cooling solves the physics with a fluid that boils on the chip, in effect putting a small refrigerator inside the cold plate on top of every GPU. Instead of water, a two-phase cold plate uses a refrigerant whose boiling point sits at the operating temperature of the GPU, so it absorbs heat at near-constant temperature through phase change. Rather than heating the fluid up incrementally, the system uses the latent heat of vaporization, the energy a fluid absorbs when it changes from liquid to vapor.
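The difference between the two mechanisms can be sketched numerically by comparing Q = ṁ · c_p · ΔT (sensible) with Q = ṁ · h_fg · x (latent, where x is the fraction of fluid vaporized at the outlet). Every number below is an illustrative assumption rather than data for a specific chip or refrigerant. The latent heat in particular is a placeholder, and the function names are hypothetical.

```python
WATER_CP_J_KG_K = 4180.0  # approximate specific heat of liquid water


def sensible_mass_flow(heat_w: float, cp_j_kg_k: float, delta_t_k: float) -> float:
    """kg/s needed when heat is carried by warming the liquid: Q = m_dot * cp * dT."""
    return heat_w / (cp_j_kg_k * delta_t_k)


def latent_mass_flow(heat_w: float, h_fg_j_kg: float, exit_quality: float) -> float:
    """kg/s needed when heat is carried by boiling: Q = m_dot * h_fg * x_exit.

    exit_quality is the fraction of the fluid (0 < x <= 1) that has turned
    to vapor by the time it leaves the cold plate.
    """
    if not 0.0 < exit_quality <= 1.0:
        raise ValueError("exit_quality must be in (0, 1]")
    return heat_w / (h_fg_j_kg * exit_quality)


# Illustrative assumptions only, not data for any particular chip or fluid.
CHIP_HEAT_W = 2000.0          # a 2,000 W class chip
WATER_DELTA_T_K = 10.0        # assumed single-phase temperature rise
REFRIGERANT_H_FG = 150_000.0  # J/kg, placeholder latent heat
EXIT_QUALITY = 0.6            # assumed vapor fraction at the cold plate outlet

single_phase = sensible_mass_flow(CHIP_HEAT_W, WATER_CP_J_KG_K, WATER_DELTA_T_K)
two_phase = latent_mass_flow(CHIP_HEAT_W, REFRIGERANT_H_FG, EXIT_QUALITY)
print(f"single-phase water: {single_phase * 1000:.1f} g/s")
print(f"two-phase fluid:    {two_phase * 1000:.1f} g/s")
```

The comparison depends entirely on the assumed values. A real design would use measured fluid properties and account for effects this sketch ignores, such as subcooling at the inlet and pressure drop in the vapor return.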

Physical limit

Per-rack target envelope for next-generation AI silicon, where single-phase hits its physical limit.

Flow rates

Lower flow rates at the chip versus single-phase. Smaller pumps, less component wear, less plumbing under strain.

Heat absorption

More heat absorbed by two-phase versus single-phase. The physics that unlocks the density.

But getting there is an engineering problem with steep requirements…

Dielectric fluid

The coolant has to be electrically insulating, so it can contact GPU components directly without short-circuit risk. Water does not meet that requirement.

Matched boiling point

The fluid's boiling point has to land at the target GPU operating temperature. Too low and the system never reaches steady state. Too high and the chip overheats before phase change kicks in.

Chemical stability

The coolant has to survive thousands of boiling cycles without breaking down, fouling cold plates, or losing its thermal properties.

Redesigned hardware

Cold plates, manifolds, CDUs, and quick-disconnects all have to be engineered for a fluid that changes phase, not one that stays liquid throughout.

Where this is heading

The cooling architecture that comes next has to be ready before the silicon that needs it. Intel advised infrastructure buyers to "think two or three generations ahead" when investing in cooling.

Rubin Ultra, Feynman, and the next generation of AMD accelerators will push TDPs to 2,000 W+ per chip. At that thermal envelope, single-phase cooling stops being viable.

Read about Orbital IT's two-phase work

Talk to our team

Tell us your deployment timeline, site constraints, and target rack density. We'll come back with a technical scoping session and a proposal designed for your deployment.
