ToAIz

GB300 Cold-Plate Design Optimization

Prepared by ToAIz LLC · 2026

Overview

GB300-class accelerators dissipate on the order of 1.4 kW at die heat fluxes near 90 W/cm², and the microchannel cold plate that removes that heat is a primary thermal bottleneck. Its performance is governed by the channel geometry, which trades thermal resistance (R_th) against pressure drop (Δp). This study maps that trade-off for a GB300-class single-die cold plate using conjugate CFD inside a multi-objective optimizer, and identifies the optimal channel geometries along the resulting Pareto front.

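Two metrics are used throughout: R_th, the thermal resistance of the cold-plate stack (reported here in K), and Δp, the coolant pressure drop across the cold plate (reported in kPa).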

Parameters. The study uses a representative public GB300 specification — 1400 W over ~1600 mm² (87 W/cm²), a 25 °C inlet, and 20 L/min — together with a representative conjugate stack-up in which some layer thicknesses are placeholders rather than vendor values. The transferable results are therefore the relative improvements, the shape of the front, and the geometric and operating-point sensitivities; the absolute R_th firms up once the real specification is supplied.
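As a quick sanity check on these parameters, the sketch below recomputes the average die heat flux and estimates the coolant bulk temperature rise from a simple energy balance. The water density and specific heat are assumed textbook values near 25 °C, not values taken from the study.

```python
# Energy-balance check on the stated operating point.
HEAT_LOAD_W = 1400.0        # from the text
DIE_AREA_MM2 = 1600.0       # from the text (approximate)
FLOW_L_PER_MIN = 20.0       # from the text

RHO_WATER = 997.0           # kg/m^3, assumed (water near 25 °C)
CP_WATER = 4180.0           # J/(kg·K), assumed (water near 25 °C)

heat_flux_w_cm2 = HEAT_LOAD_W / (DIE_AREA_MM2 / 100.0)   # 100 mm^2 per cm^2
mass_flow_kg_s = RHO_WATER * FLOW_L_PER_MIN / 60000.0    # L/min -> m^3/s -> kg/s
bulk_rise_k = HEAT_LOAD_W / (mass_flow_kg_s * CP_WATER)  # inlet-to-outlet rise

print(f"average heat flux: {heat_flux_w_cm2:.1f} W/cm^2")
print(f"coolant bulk rise: {bulk_rise_k:.2f} K")
```

Under these assumptions the coolant warms by only about 1 K between inlet and outlet, so nearly all of the die temperature rise comes from conduction and convection rather than from coolant heating.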

The trade-off front

Because R_th and Δp oppose each other, the optimizer returns a Pareto front rather than a single optimum: for any pumping budget the front gives the lowest achievable R_th, and for any R_th target the lowest achievable Δp.

Pareto front: R_th vs. Δp, with the front colored by channel width

Each row below is one point on the front:

Δp (kPa)    R_th (K)    w_ch (mm)    h_ch (mm)    w_fin (mm)    N      ≤ 1.5 bar
2.6         49.3        0.581        5.60         0.399         73
2.9         49.3        0.595        5.98         0.589         60
3.0         47.9        0.526        5.74         0.388         78
3.4         47.7        0.528        5.96         0.539         67
3.6         45.8        0.453        5.92         0.346         90
4.0         44.9        0.380        5.97         0.203         123
4.9         44.3        0.338        5.66         0.151         147
5.2         44.2        0.402        5.89         0.417         87
6.5         42.3        0.319        5.98         0.254         125
9.3         40.9        0.273        5.21         0.187         156
16.6        39.7        0.257        4.00         0.251         141
19.9        38.8        0.205        4.04         0.131         214
28.9        38.3        0.213        3.91         0.297         141
38.2        37.5        0.175        2.79         0.102         260
47.3        36.8        0.157        2.85         0.101         278
74.4        36.0        0.139        2.52         0.111         287
108.5       35.2        0.120        2.33         0.100         327    ✓ (best in budget)
151.5       34.7        0.112        2.10         0.116         315
181.1       34.6        0.109        2.17         0.148         280
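The in-budget optimum can be recovered from the tabulated front with a simple filter. The sketch below uses only the Δp and R_th columns, takes the 1.5 bar budget as 150 kPa, and also estimates the ideal hydraulic pumping power as Δp times volumetric flow. That last step assumes the full 20 L/min passes through this cold plate and ignores pump efficiency.

```python
# (dp_kPa, R_th_K) pairs copied from the front table above.
FRONT = [
    (2.6, 49.3), (2.9, 49.3), (3.0, 47.9), (3.4, 47.7), (3.6, 45.8),
    (4.0, 44.9), (4.9, 44.3), (5.2, 44.2), (6.5, 42.3), (9.3, 40.9),
    (16.6, 39.7), (19.9, 38.8), (28.9, 38.3), (38.2, 37.5), (47.3, 36.8),
    (74.4, 36.0), (108.5, 35.2), (151.5, 34.7), (181.1, 34.6),
]

BUDGET_KPA = 150.0                 # 1.5 bar
FLOW_M3_S = 20.0 / 60000.0         # 20 L/min, from the stated operating point

# Keep designs within the pumping budget, then take the lowest R_th.
in_budget = [point for point in FRONT if point[0] <= BUDGET_KPA]
best_dp_kpa, best_rth_k = min(in_budget, key=lambda point: point[1])

# Ideal hydraulic power (no pump efficiency applied): P = dp * V_dot.
hydraulic_power_w = best_dp_kpa * 1e3 * FLOW_M3_S

print(f"best in budget: dp = {best_dp_kpa} kPa, R_th = {best_rth_k} K")
print(f"ideal hydraulic power: {hydraulic_power_w:.1f} W")
```

The same filter works for any other pumping budget by changing `BUDGET_KPA`.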

Inside the optimum: the conjugate temperature field

The figure is a streamwise (x–z) slice through the channel centerline of the in-budget optimum. Reading bottom to top, it passes through the conjugate stack: the heated silicon die at z = 0, the thermal interface material (TIM) bonding die to spreader, the copper integrated heat spreader (IHS), the copper cold-plate base, and the water channel. Flow is left (inlet) to right (outlet).

Temperature and velocity fields, optimized design

Two features are worth noting. The peak temperature occurs at the die face (≈ 60 °C here) and rises slightly downstream as the coolant warms. More importantly, the largest temperature gradient anywhere in the stack is across the ~0.1 mm TIM layer, which drops more temperature than the 3.5 mm of copper above it. Once the channel geometry is optimized, the TIM is the dominant term in the resistance budget. The velocity panel shows the laminar profile developing from the inlet toward the fully-developed core.
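A one-dimensional slab estimate illustrates why a thin TIM can outweigh several millimetres of copper. The sketch below applies ΔT = q''·t/k at the average die heat flux. The copper conductivity of 400 W/(m·K) and the example TIM conductivity are assumptions for illustration, since the text describes some stack values as placeholders, and the estimate ignores lateral heat spreading.

```python
# 1-D conduction estimate: temperature drop across a slab, dT = q'' * t / k.
HEAT_FLUX_W_M2 = 1400.0 / 1600e-6   # 1400 W over ~1600 mm^2, from the text

def slab_drop_k(heat_flux_w_m2, thickness_m, conductivity_w_mk):
    """Temperature drop across a uniform slab under uniform heat flux."""
    return heat_flux_w_m2 * thickness_m / conductivity_w_mk

K_COPPER = 400.0          # W/(m·K), assumed typical value for copper
T_COPPER = 3.5e-3         # m, copper thickness quoted in the text
T_TIM = 0.1e-3            # m, approximate TIM thickness quoted in the text
K_TIM_EXAMPLE = 5.0       # W/(m·K), hypothetical TIM conductivity

copper_drop_k = slab_drop_k(HEAT_FLUX_W_M2, T_COPPER, K_COPPER)
tim_drop_k = slab_drop_k(HEAT_FLUX_W_M2, T_TIM, K_TIM_EXAMPLE)

# TIM conductivity below which the TIM drops more than the copper (same flux).
k_tim_break_even = T_TIM * K_COPPER / T_COPPER

print(f"copper drop: {copper_drop_k:.1f} K, example TIM drop: {tim_drop_k:.1f} K")
print(f"TIM dominates below about {k_tim_break_even:.1f} W/(m·K)")
```

In this simplified picture, any TIM with a conductivity below roughly 11 W/(m·K) drops more temperature than the copper, which is consistent with the observation above.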

Where CFD is necessary, and where it is not

To establish where high-fidelity simulation is actually required, we built a closed-form model — a thermal-resistance network with Shah–London duct correlations and a fin-efficiency treatment — using the same geometry, conductivities, and fluid properties, and compared it against the CFD across all 40 designs.

Quantity                           Closed-form error vs. CFD                          Interpretation
Δp, fully-developed correlation    15 % (up to 40 % for short channels)               systematic under-prediction
Δp, with entrance correction       1.5 %                                              the gap is the hydrodynamic developing length
R_th, total                        6.5 %                                              agreement dominated by the shared conduction term
Convective resistance only         20 %, up to 75 % at wide, low-velocity channels    the regime that requires CFD

A corrected correlation is adequate for Δp and for the high-velocity designs. In the conjugate, low-velocity, entrance-dominated regime, however, the closed-form model errs by tens of percent: it would have predicted the wide low-Δp designs roughly 14 K hotter than they run, and so would have discarded viable candidates. The CFD is what distinguishes the two regimes.
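For readers who want to reproduce the hydraulic side of a closed-form model of this kind, the sketch below implements the Shah–London fully-developed f·Re polynomial for rectangular ducts and a pressure-drop function built on it. It is a minimal illustration, not the study's model: the water properties are assumed values near 25 °C, the flow is assumed to split evenly across channels, and `k_entrance` is a hypothetical lumped entrance-loss coefficient whose value the text does not give. The channel length is also not stated in the text, so it is left as an argument and no Δp value is quoted.

```python
RHO = 997.0       # kg/m^3, assumed (water near 25 °C)
MU = 0.89e-3      # Pa·s, assumed (water near 25 °C)

def f_re_rect(alpha):
    """Shah-London fully-developed Fanning f*Re for a rectangular duct.

    alpha is the aspect ratio (short side / long side), 0 <= alpha <= 1.
    """
    return 24.0 * (1.0 - 1.3553 * alpha + 1.9467 * alpha**2
                   - 1.7012 * alpha**3 + 0.9564 * alpha**4
                   - 0.2537 * alpha**5)

def channel_state(w, h, n, flow_m3_s):
    """Hydraulic diameter, mean velocity, and Reynolds number per channel."""
    d_h = 2.0 * w * h / (w + h)
    u = flow_m3_s / (n * w * h)          # even flow split assumed
    return d_h, u, RHO * u * d_h / MU

def pressure_drop_pa(w, h, n, flow_m3_s, length, k_entrance=0.0):
    """Laminar pressure drop: fully-developed friction plus an optional
    lumped entrance term k_entrance * (rho * u^2 / 2)."""
    d_h, u, re = channel_state(w, h, n, flow_m3_s)
    fanning_f = f_re_rect(min(w, h) / max(w, h)) / re
    return (4.0 * fanning_f * length / d_h + k_entrance) * 0.5 * RHO * u**2

# Flow regime of the in-budget optimum (geometry from the front table).
d_h, u, re = channel_state(w=0.120e-3, h=2.33e-3, n=327,
                           flow_m3_s=20.0 / 60000.0)
print(f"D_h = {d_h * 1e3:.3f} mm, u = {u:.2f} m/s, Re = {re:.0f}")
```

With these assumed properties the Reynolds number of the in-budget optimum comes out well below the usual laminar-turbulent transition range, which is consistent with the laminar profile described for the velocity field.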

CFD vs. closed-form resistance model

Reducing the simulation count

Because each CFD evaluation is expensive, the search is driven by Bayesian optimization: the surrogate model learns from each result and proposes the next design with the highest expected improvement. We further use the closed-form model as the surrogate's prior mean, so it learns only the residual between CFD and the analytical estimate rather than the full response. On this problem that cut the Δp-surrogate error from 59 % to 0.7 % at four evaluations and recovered roughly twice as much of the front early in the search — a usable front in about half the simulations.
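The residual-learning idea can be illustrated on a toy one-dimensional problem. The sketch below is not the study's surrogate: `expensive` and `closed_form` are made-up stand-ins for a CFD result and an analytical estimate, the kernel and its length scale are arbitrary choices, and only the Gaussian-process posterior mean is computed (no acquisition function). It fits a Gaussian process to four samples twice, once with a zero prior mean and once with the closed-form stand-in as the prior mean, and compares the worst-case relative error.

```python
import math

def rbf(a, b, length=0.3):
    """Squared-exponential kernel with unit variance."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            factor = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= factor * M[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_predict(x_train, y_train, x_query, prior_mean=lambda x: 0.0, jitter=1e-8):
    """GP posterior mean; the GP models only the residual y - prior_mean(x)."""
    resid = [y - prior_mean(x) for x, y in zip(x_train, y_train)]
    K = [[rbf(a, b) + (jitter if i == j else 0.0)
          for j, b in enumerate(x_train)] for i, a in enumerate(x_train)]
    alpha = solve(K, resid)
    return [prior_mean(q) + sum(rbf(q, t) * a for t, a in zip(x_train, alpha))
            for q in x_query]

def expensive(w):        # toy stand-in for a CFD evaluation (hypothetical)
    return 1.0 / w**2 + 0.2 / w

def closed_form(w):      # toy stand-in for the analytical estimate (hypothetical)
    return 1.0 / w**2

x_train = [0.3, 0.5, 0.75, 1.0]                  # four "evaluations"
y_train = [expensive(w) for w in x_train]
x_query = [0.3 + 0.7 * i / 50 for i in range(51)]

def max_rel_error(pred):
    return max(abs(p - expensive(q)) / expensive(q)
               for p, q in zip(pred, x_query))

err_plain = max_rel_error(gp_predict(x_train, y_train, x_query))
err_prior = max_rel_error(gp_predict(x_train, y_train, x_query, closed_form))
print(f"zero-mean GP: {err_plain:.1%}   closed-form prior mean: {err_prior:.1%}")
```

Because the prior mean already captures the steep part of the response, the Gaussian process only has to fit a small, smooth residual, which is why a handful of samples goes further.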

Surrogate accuracy with and without the physics-informed prior

Warm-water operation and facility power

Hyperscale liquid cooling is moving toward warm-water supply (~45 °C), because coolant that warm can frequently reject heat to ambient without a chiller (free cooling). The penalty is reduced margin to the die temperature limit. Re-evaluating the front at a 45 °C inlet, the 85 °C die limit imposes R_th ≤ 40 K, which retains 9 of the 19 front designs and eliminates the warm-running, low-Δp tail.
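The die-limit screen described above reduces to a one-line filter on the tabulated R_th values. The sketch below follows the text in treating R_th as the allowable die temperature rise above the inlet, and assumes each design's R_th is unchanged at the warmer inlet.

```python
# R_th values (K) of the 19 front designs, copied from the front table.
R_TH_K = [49.3, 49.3, 47.9, 47.7, 45.8, 44.9, 44.3, 44.2, 42.3, 40.9,
          39.7, 38.8, 38.3, 37.5, 36.8, 36.0, 35.2, 34.7, 34.6]

DIE_LIMIT_C = 85.0
WARM_INLET_C = 45.0

# Allowable rise above the inlet before the die limit is reached.
rth_limit_k = DIE_LIMIT_C - WARM_INLET_C

die_safe = [r for r in R_TH_K if r <= rth_limit_k]
print(f"R_th limit: {rth_limit_k:.0f} K, die-safe designs: "
      f"{len(die_safe)} of {len(R_TH_K)}")
```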

Among the designs that remain die-safe, moving from 25 °C chiller-cooled operation to 45 °C free-cooled operation reduces total cooling power from roughly 352 W/GPU to about 54 W/GPU, a saving on the order of 300 W/GPU, or ~21 kW per rack. The dominant term is the facility cooling coefficient of performance (COP); the contribution of this work is supplying the die-safe, low-pumping channel designs that make 45 °C operation feasible in the first place.
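The per-rack figure follows from simple arithmetic on the per-GPU numbers. The rack size is not stated in the text; the sketch below assumes 72 GPUs per rack purely for illustration, which reproduces a figure close to the quoted ~21 kW.

```python
CHILLED_W_PER_GPU = 352.0       # 25 °C chiller-cooled, from the text
FREE_COOLED_W_PER_GPU = 54.0    # 45 °C free-cooled, from the text
GPUS_PER_RACK = 72              # ASSUMPTION: rack size is not given in the text

saving_w_per_gpu = CHILLED_W_PER_GPU - FREE_COOLED_W_PER_GPU
saving_kw_per_rack = saving_w_per_gpu * GPUS_PER_RACK / 1000.0

print(f"saving: {saving_w_per_gpu:.0f} W/GPU, "
      f"{saving_kw_per_rack:.1f} kW per rack of {GPUS_PER_RACK}")
```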

Warm-water operating point

Convergence and validation

Scope

Next step

Given the actual die heat-flux map, TIM specification, and operating envelope, the same pipeline produces this analysis for the real part: the optimized channel geometry, the trade-off front, and the warm-water power case at the true operating point. We would be glad to run it.


Detailed method comparisons, convergence records, and the complete simulation dataset are available on request.

OpenFOAM® is a registered trademark of OpenCFD Ltd. This work is not endorsed by OpenCFD Ltd.

© 2026 ToAIz LLC. All rights reserved.
