The next generation of AI cooling
Where cooling stands today
Data centers used to be cooled with cool air circulating through a data hall. But air can't keep up with modern GPU densities, so today's deployments have moved to liquid cooling, specifically single-phase direct-to-chip liquid cooling: water, or a water-glycol mix, is pumped through chambers called cold plates mounted directly on the GPU. It has been widely adopted over the past two years, cools the current generation of GPUs, and has caused a major change in the way data centers are built and operated.
Liquid cooling

Single-phase direct-to-chip liquid cooling moves heat away from a GPU using water pumped through a cold plate mounted directly on the chip. Inside the cold plate, heat transfers from the chip into the fluid, the fluid leaves the rack warm, and a coolant distribution unit (CDU) exchanges that heat with a facility water loop. The fluid then returns to the rack, cooled, and the cycle repeats.
Water conducts heat roughly 23.5× better than air and, per unit volume, stores on the order of 3,500× more heat per degree of temperature rise, which means it can pull heat away from a chip far faster, at a fraction of the volumetric flow rate. That gap is what makes today's AI rack densities possible (single-phase systems are deployed at ~130 kW per rack today).
The architecture is mature. But densities are climbing faster than single-phase can follow, and the next architecture has to be ready before the next generation of silicon arrives.
Where single-phase reaches its limit
Single-phase carries heat through sensible heating: water enters cool, leaves warmer, and that temperature difference is the cooling capacity. The limit is set by physics: water can only carry so much heat per unit volume, and the temperature delta across the chip can only be pushed so far, so higher heat loads demand higher flow rates that eventually run into pump, manifold, and cold plate constraints.
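The sensible-heat limit can be made concrete with the standard energy balance Q = ṁ · c_p · ΔT. The sketch below is an illustration, not a sizing tool: the 10 K coolant temperature rise is an assumed value, the function name is hypothetical, and the water properties are approximate textbook values near 25 °C (a water-glycol mix has a lower specific heat and would need more flow).

```python
# Sensible-heat energy balance: Q = m_dot * c_p * delta_T.
# Assumptions (not from the source): pure water near 25 °C and a
# 10 K temperature rise across the cold plates.

WATER_CP_J_PER_KG_K = 4186.0      # specific heat of water, approximate
WATER_DENSITY_KG_PER_M3 = 997.0   # density of water near 25 °C, approximate


def required_water_flow_lpm(heat_load_w: float, delta_t_k: float) -> float:
    """Return the water flow in litres per minute needed to carry
    heat_load_w watts with a coolant temperature rise of delta_t_k kelvin."""
    if heat_load_w < 0 or delta_t_k <= 0:
        raise ValueError("heat_load_w must be >= 0 and delta_t_k must be > 0")
    mass_flow_kg_s = heat_load_w / (WATER_CP_J_PER_KG_K * delta_t_k)
    volume_flow_m3_s = mass_flow_kg_s / WATER_DENSITY_KG_PER_M3
    return volume_flow_m3_s * 1000.0 * 60.0  # m^3/s -> L/min


if __name__ == "__main__":
    assumed_delta_t_k = 10.0  # illustrative assumption
    for label, watts in [("1,400 W chip", 1_400.0), ("130 kW rack", 130_000.0)]:
        lpm = required_water_flow_lpm(watts, assumed_delta_t_k)
        print(f"{label}: about {lpm:.1f} L/min at a {assumed_delta_t_k:.0f} K rise")
```

Under these assumptions a 130 kW rack needs roughly 187 L/min of water. Because flow scales linearly with heat load and inversely with the allowed temperature rise, halving the temperature rise or doubling the load doubles the flow the pumps and manifolds must deliver.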
The GPU roadmap is moving toward TDPs (thermal design power, the heat a chip generates and that cooling must dissipate) where single-phase cannot keep up.
| Generation | TDP | Cooling reality |
|---|---|---|
| NVIDIA H100 | ~700 W | Air at the limit |
| NVIDIA B200 / AMD MI300X | ~1,000 W | Air physically insufficient at density |
| NVIDIA GB300 / AMD MI400-Class | ~1,400 W | Liquid mandatory |
| NVIDIA Vera Rubin | Next gen | 100% liquid-cooled racks |
| Rubin Ultra / Feynman / Next-Gen AMD | Future | Two-phase architecture |
At the high end of this curve, the heat transfer coefficient of single-phase water cooling stops being enough.
The transition to two-phase
Two-phase direct-to-chip cooling solves the physics with a fluid that boils on the chip, in effect putting a small refrigerator inside the cold plate on top of every GPU. Instead of water, a two-phase cold plate uses a refrigerant: a liquid that boils at the operating temperature of the GPU, absorbing heat at near-constant temperature through phase change. Rather than heating the fluid up incrementally, the system uses the latent heat of vaporization, the energy a fluid absorbs when it changes from liquid to vapor.
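The difference between sensible and latent heat transport can be sketched numerically. This is an illustration under stated assumptions, not data for any particular coolant: the 150 kJ/kg latent heat and the 0.7 exit vapor fraction are placeholder values for a hypothetical dielectric refrigerant, the 10 K water temperature rise is likewise assumed, and all function names are hypothetical.

```python
# Heat absorbed per kilogram of coolant:
#   single-phase (sensible): q = c_p * delta_T
#   two-phase (latent):      q = h_fg * exit_vapor_fraction
# Every refrigerant number below is a hypothetical placeholder.

WATER_CP_J_PER_KG_K = 4186.0  # specific heat of water, approximate


def sensible_heat_per_kg(cp_j_per_kg_k: float, delta_t_k: float) -> float:
    """Energy (J) one kilogram of liquid absorbs while warming by delta_t_k."""
    return cp_j_per_kg_k * delta_t_k


def latent_heat_per_kg(h_fg_j_per_kg: float, exit_vapor_fraction: float) -> float:
    """Energy (J) one kilogram absorbs when the given fraction of it boils."""
    if not 0.0 < exit_vapor_fraction <= 1.0:
        raise ValueError("exit_vapor_fraction must be in (0, 1]")
    return h_fg_j_per_kg * exit_vapor_fraction


def mass_flow_kg_s(heat_load_w: float, heat_per_kg_j: float) -> float:
    """Coolant mass flow needed to remove heat_load_w watts."""
    return heat_load_w / heat_per_kg_j


if __name__ == "__main__":
    chip_heat_w = 1_400.0  # ~1,400 W TDP row from the table above
    water_q = sensible_heat_per_kg(WATER_CP_J_PER_KG_K, delta_t_k=10.0)
    refrigerant_q = latent_heat_per_kg(h_fg_j_per_kg=150_000.0,
                                       exit_vapor_fraction=0.7)
    print(f"water, sensible:     {mass_flow_kg_s(chip_heat_w, water_q):.4f} kg/s")
    print(f"refrigerant, latent: {mass_flow_kg_s(chip_heat_w, refrigerant_q):.4f} kg/s")
```

The ratio between the two results depends entirely on the assumed latent heat, vapor fraction, and water temperature rise, so it should be read as a demonstration of the mechanism, not as a performance figure. The structural point is that the latent term contains no temperature rise at all, which is why boiling absorbs heat at near-constant temperature.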
kW
Per rack target envelope for next-generation AI silicon, where single-phase hits its physical limit.
x
Lower flow rates at the chip versus single-phase. Smaller pumps, less component wear, less plumbing under strain.
x
More heat absorbed by two-phase versus single-phase. The physics that unlocks the density.
But getting there is an engineering problem with steep requirements…
Dielectric fluid
The coolant has to be electrically insulating, so it can contact GPU components directly without short-circuit risk. Water can't.
Matched boiling point
The fluid's boiling point has to land at the target GPU operating temperature. Too low and the system never reaches steady state. Too high and the chip overheats before phase change kicks in.
Chemical stability
The coolant has to survive thousands of boiling cycles without breaking down, fouling cold plates, or losing its thermal properties.
Redesigned hardware
Cold plates, manifolds, CDUs, and quick-disconnects all have to be engineered for a fluid that changes phase, not one that stays liquid throughout.
Where this is heading
The cooling architecture that comes next has to be ready before the silicon that needs it. Intel advised infrastructure buyers to "think two or three generations ahead" when investing in cooling.
Rubin Ultra, Feynman, and the next generation of AMD accelerators will push TDPs to 2,000 W+ per chip. At that thermal envelope, single-phase cooling stops being viable.
Read about Orbital's two-phase work
Talk to our team
Tell us your deployment timeline, site constraints, and target rack density. We'll come back with a technical scoping session and a proposal designed for your deployment.
