What GPU waterblocks does FormulaMod carry?

FormulaMod stocks full-cover GPU waterblocks for over 500 graphics card models, covering RTX 5090, 4090, 3090, RX 9070 XT, and H100/H200 GPUs. We stock over 1,600 water cooling products with same-week worldwide shipping from Guangzhou.

How do I choose a waterblock for my GPU?

Match the waterblock to your exact GPU model and PCB variant. Reference-design cards (Founders Edition, reference Sapphire/PowerColor) use universal reference blocks. Non-reference cards from ASUS ROG Strix, MSI Gaming X Trio, Gigabyte AORUS, and EVGA FTW3 require model-specific full-cover blocks that match their unique PCB layouts. Always verify your GPU's exact model number before ordering.

Will a GPU waterblock reduce noise during AI workloads like Ollama or Stable Diffusion?

Yes. Stock GPU air coolers run at 45-65 dBA under sustained AI inference loads. A full-cover waterblock mounted to a 360mm radiator with a quiet fan curve typically reduces noise to under 30 dBA. VRAM temperatures also drop from 90°C+ to 60-70°C, which prevents throttling during long generation runs.

What components do I need for a complete custom water cooling loop?

A complete custom loop requires: (1) a full-cover GPU waterblock, (2) a radiator — 360mm recommended for single-GPU systems, 480mm for dual GPU or CPU+GPU loops, (3) a D5 or DDC pump with reservoir combo, (4) G1/4 threaded fittings — compression fittings for soft tubing or hard fittings for PETG/acrylic hard tubing, (5) tubing, and (6) premixed coolant. FormulaMod sells individual components and complete kits starting at $249 USD.

Does FormulaMod ship internationally?

Yes. FormulaMod ships worldwide from Guangzhou, China. Standard shipping to the US, EU, UK, Canada, and Australia typically takes 7-14 business days. Express DHL/FedEx options are available at checkout for 3-5 business day delivery. All orders include a tracking number.

How long has FormulaMod been operating?

FormulaMod has operated as an independent water-cooling specialist since 2013. Continuous online presence is publicly verifiable via the Internet Archive Wayback Machine. FormulaMod is a U.S. registered trademark (USPTO Reg. No. 6073949) and ships worldwide from our Guangzhou warehouse with full manufacturer warranty.

GPU Undervolting + Water Cooling: The Efficiency Combo for AI Workloads

By Liang Huang, FormulaMod Technical Team · Published Apr 21, 2026 · Updated Apr 22, 2026

The Power Problem with AI Workloads

An RTX 4090 at stock settings pulls 350-450W under sustained AI inference. An RTX 5090 pulls up to 575W. In a dual-GPU build, that is 700-900W for two 4090s and up to 1,150W for two 5090s, just for the GPUs and before counting CPU, RAM, storage, and fans.

That power draw creates three cascading problems:

Heat: Every watt your GPU consumes becomes heat. 450W of heat in a PC case requires serious cooling capacity to manage.
Electricity cost: At the US average of $0.16/kWh, a 450W GPU running 20 hours a day costs $526/year in electricity. A dual-GPU rig costs over $1,000/year.
Connector stress: The 12VHPWR connector on the RTX 4090 (and its 12V-2x6 successor on the RTX 5090) carries all that power through a small interface. Higher current means more heat at the connector, and the 12VHPWR connector has a documented history of thermal failures under sustained high load.

Undervolting addresses all three problems simultaneously. And water cooling makes undervolting work better.

What Undervolting Actually Does

GPUs ship with voltage-frequency curves that guarantee stability across all silicon samples — even the worst ones from a production batch. Your specific GPU chip almost certainly runs stable at lower voltages than the factory default. The factory has to set the voltage high enough for the weakest chips; yours is probably better than that.

Undervolting means telling the GPU to use less voltage at a given clock speed. The result:

Less power consumed (voltage reduction has a squared effect on power: cut voltage by 10%, power drops roughly 19%)
Less heat generated (directly proportional to power reduction)
Same clock speed = same performance (if the undervolt is stable for your chip)

This is not the same as underclocking. Underclocking reduces both voltage and clock speed, which reduces performance. Undervolting maintains your target clock speed but supplies it with less voltage. If your chip can handle it (most can), performance stays identical.
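
The squared relationship can be checked with a few lines of arithmetic. The sketch below assumes the simplified dynamic-power model P ~ V^2 x f and ignores leakage, memory, and VRM losses, so measured savings on a real card will differ. The function name is illustrative.

```python
def relative_power(voltage_scale: float, clock_scale: float = 1.0) -> float:
    """Dynamic power relative to stock under the simplified model P ~ V^2 * f."""
    return voltage_scale ** 2 * clock_scale


# Undervolt: 10% less voltage at the same clock speed.
undervolt = relative_power(0.90)
print(f"Power vs. stock: {undervolt:.2f} ({(1 - undervolt) * 100:.0f}% saved)")

# Underclock for comparison: 10% less voltage AND 10% lower clock speed.
underclock = relative_power(0.90, clock_scale=0.90)
print(f"Power vs. stock: {underclock:.2f}, but throughput drops with the clock")
```

The first case reproduces the "10% voltage, roughly 19% power" figure quoted above. The second saves more power only by giving up clock speed.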

Why Water Cooling Enables Deeper Undervolts

Here is where the two strategies compound each other.

GPU stability at a given voltage is temperature-dependent. As silicon heats up, its transistors switch more slowly and leak more current, so the chip typically needs more voltage to hold the same clock speed. This is why an undervolt that is stable at 50C might crash at 85C.

On air cooling, your GPU runs at 80-87C under sustained AI load. At those temperatures, your stable undervolt range is narrow — you might save 5-8% power before hitting instability.

On water cooling, the same GPU runs at 45-55C. At those temperatures, the stable undervolt range is significantly wider. You can push 10-15% lower voltage while maintaining the same clock speed, because the cooler silicon is inherently more stable at lower voltages.

The math compounds:

Configuration	GPU Temp	Stable Undervolt	Power Savings	Noise
Air cooled, stock voltage	82-87C	None	0%	50-65 dBA
Air cooled, mild undervolt	76-82C	-50mV to -75mV	5-8%	45-55 dBA
Water cooled, stock voltage	45-55C	None	0%	25-32 dBA
Water cooled, aggressive undervolt	40-48C	-100mV to -150mV	12-18%	22-28 dBA

The water-cooled + undervolted configuration uses 12-18% less power, runs 35-40C cooler, and is 20-35 dBA quieter than the stock air-cooled setup — with zero performance loss in AI inference throughput.

RTX 4090 Undervolt Guide (Step by Step)

This guide uses MSI Afterburner on Windows. Linux users can get comparable power savings with nvidia-smi, as described under Linux Undervolting.

Step 1: Establish Baseline

Before undervolting, record your current performance so you can verify nothing is lost.

Open HWiNFO64 and monitor: GPU Core Clock, GPU Power, GPU Temperature, VRAM Temperature
Run your typical AI workload for 10 minutes (Ollama inference, Stable Diffusion batch, etc.)
Record the average sustained clock speed, power draw, and temperatures
Typical RTX 4090 AI baseline: 2500-2610 MHz, 380-430W, 80-87C on air / 45-55C on water
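
If you prefer a logged baseline over reading values off a monitoring window, the same numbers can be pulled from nvidia-smi, which ships with the NVIDIA driver. This is a minimal sketch, not a replacement for HWiNFO64: the helper names are made up for illustration, it reads GPU 0 only, and it omits VRAM temperature because consumer cards often do not report that through nvidia-smi.

```python
import csv
import io
import statistics
import subprocess

# Core clock (MHz), board power (W), core temperature (C).
QUERY_FIELDS = "clocks.sm,power.draw,temperature.gpu"


def sample_gpu() -> str:
    """Take one reading from GPU 0 as a CSV line (needs an NVIDIA driver)."""
    return subprocess.run(
        ["nvidia-smi", "-i", "0", f"--query-gpu={QUERY_FIELDS}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout


def parse_samples(csv_text: str) -> list[tuple[float, ...]]:
    """Parse CSV rows into (clock_mhz, power_w, temp_c) tuples."""
    rows = csv.reader(io.StringIO(csv_text))
    return [tuple(float(cell) for cell in row) for row in rows if row]


def summarize(samples: list[tuple[float, ...]]) -> dict[str, float]:
    """Average clock and power, plus peak temperature, over all samples."""
    clocks, powers, temps = zip(*samples)
    return {
        "avg_clock_mhz": statistics.mean(clocks),
        "avg_power_w": statistics.mean(powers),
        "max_temp_c": max(temps),
    }
```

Call sample_gpu() once per second while the workload runs, concatenate the lines, and pass the result through parse_samples() and summarize(). Keep the summary so you can compare it against the same measurement after undervolting.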

Step 2: Open the Voltage-Frequency Curve

Open MSI Afterburner
Press Ctrl+F to open the voltage-frequency curve editor
You will see a curve with voltage (mV) on the X axis and clock speed (MHz) on the Y axis
Each point on the curve tells the GPU: "at this voltage, boost to this clock speed"

Step 3: Set Your Target

Identify your target clock speed from Step 1. For example: 2550 MHz.
Find the point on the curve where 2550 MHz intersects with a lower voltage than the current setting. For a starting point, try -100mV from the stock voltage at your target clock.
Click on that intersection point, then press L to lock it.
All points to the right of this voltage will be capped at your target clock speed.
Click the checkmark to apply.

Step 4: Test Stability

Run your AI workload for 30 minutes
Watch for: driver crashes, black screens, application errors, or unusual output from your AI model
Monitor clock speeds in HWiNFO64 — they should match your target consistently
If stable: try reducing voltage by another 25mV and test again
If unstable: increase voltage by 25mV until stable
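
The test-and-adjust loop above can be written down as a small search procedure. This is a sketch of the logic only: is_stable is a hypothetical callback standing in for "apply this offset, run the AI workload for 30 minutes, report whether it passed", and the -200mV floor is an assumed safety limit, not a figure from this guide.

```python
from typing import Callable


def find_stable_offset(
    is_stable: Callable[[int], bool],
    start_mv: int = -100,
    step_mv: int = 25,
    floor_mv: int = -200,
) -> int:
    """Return the deepest undervolt offset (mV, negative) that passed testing."""
    offset = start_mv
    if is_stable(offset):
        # Stable: keep lowering voltage until a test fails or the floor is hit.
        while offset - step_mv >= floor_mv and is_stable(offset - step_mv):
            offset -= step_mv
        return offset
    # Unstable: raise voltage one step at a time until a test passes.
    while offset < 0:
        offset = min(offset + step_mv, 0)
        if offset == 0 or is_stable(offset):
            return offset
    return 0  # fall back to stock voltage


# Example with a pretend chip that is stable down to -125 mV.
print(find_stable_offset(lambda mv: mv >= -125))
```

Each call to is_stable represents a full 30-minute run, so the search is slow by design. Stepping in 25mV increments keeps the number of runs small.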

Step 5: Save the Profile

Once stable, save the undervolt profile in MSI Afterburner (one of the numbered slots at the bottom)
Enable "Start with Windows" and "Apply overclocking at system startup" in Settings
Verify that the undervolt persists after a reboot

Typical Stable Undervolt Ranges for RTX 4090

Cooling Method	Starting Point	Aggressive (Silicon Lottery)	Conservative (Safe for All Chips)
Air cooled	-75mV	-100mV	-50mV
Water cooled	-100mV	-150mV	-75mV

These are general guidelines. Every GPU is different. Some chips are "golden samples" that can run -175mV on water; others struggle at -75mV. The process is: start conservative, test, push further, test again.

RTX 5090 Undervolt Notes

The RTX 5090 uses the Blackwell architecture and has a 575W TDP, which makes undervolting even more valuable than on the 4090. Early reports from the community suggest:

The voltage-frequency curve is wider, giving more room for undervolting
Typical stable undervolts on water cooling: -100mV to -175mV
Power savings of 15-22% are achievable with zero performance loss in AI inference
The 12V-2x6 (evolved 12VHPWR) connector benefits significantly from reduced current draw

For 5090 waterblock options and loop planning, see our RTX 5090 water cooling guide.

Power Savings Calculation

Here is the annual cost difference for common AI rig configurations, assuming US average electricity ($0.16/kWh) and 20 hours/day operation:

Configuration	Avg Power Draw	Annual Electricity Cost	Savings vs. Stock
RTX 4090 stock (air)	420W	$490	—
RTX 4090 undervolted (air)	385W	$449	$41/year
RTX 4090 undervolted (water)	350W	$409	$81/year
RTX 5090 stock (air)	545W	$636	—
RTX 5090 undervolted (water)	440W	$514	$122/year
Dual 3090 stock (air)	740W	$864	—
Dual 3090 undervolted (water)	600W	$700	$164/year

For a dual RTX 3090 NVLink build running 70B models (see our dual 3090 cooling guide), the water cooling + undervolt combo saves $164/year in electricity alone, nearly half the cost of the cooling loop itself. At that rate the loop pays for itself in electricity savings in a little over 2 years, before counting the performance and noise benefits.
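
To rerun the table with your own electricity rate or duty cycle, the calculation is a single multiplication. The sketch below uses the article's assumptions ($0.16/kWh, 20 hours/day) as defaults. The function name is illustrative.

```python
def annual_cost_usd(avg_watts: float, hours_per_day: float = 20.0,
                    usd_per_kwh: float = 0.16) -> float:
    """Yearly electricity cost for a constant average power draw."""
    kwh_per_year = avg_watts / 1000 * hours_per_day * 365
    return kwh_per_year * usd_per_kwh


# Dual RTX 3090 rows from the table above.
stock = annual_cost_usd(740)
tuned = annual_cost_usd(600)
print(f"stock ${stock:.0f}, undervolted ${tuned:.0f}, saves ${stock - tuned:.0f}/year")
```

Results can differ from the table by about a dollar depending on how each figure is rounded.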

The 12VHPWR Safety Bonus

The 12VHPWR connector on RTX 4090 (and the updated 12V-2x6 on RTX 5090) has a well-documented history of failures under high sustained loads. The failure mode is thermal: current flowing through the connector generates heat at the contact points, and sustained high current can cause the connector to melt or burn.

Undervolting reduces the current flowing through the connector by the same percentage as the power reduction. A 15% power reduction means 15% less current through the 12VHPWR connector. That translates directly to lower connector temperature and less stress on the contact pins.
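
A rough calculation shows why a modest power cut matters at the connector. The sketch below makes three simplifying assumptions that do not hold exactly on real hardware: all board power flows through the connector (the PCIe slot actually supplies part of it), current is shared evenly across the six +12V pins, and contact resistance stays constant.

```python
RAIL_VOLTS = 12.0
POWER_PINS = 6  # 12VHPWR / 12V-2x6 carries +12 V on six pins


def connector_current_amps(connector_watts: float) -> float:
    """Total current through the connector at a fixed 12 V rail."""
    return connector_watts / RAIL_VOLTS


def relative_contact_heating(power_scale: float) -> float:
    """Resistive heating at a contact scales with current squared (I^2 * R)."""
    return power_scale ** 2


stock_amps = connector_current_amps(450)   # 37.5 A in total
per_pin_amps = stock_amps / POWER_PINS     # 6.25 A per pin if shared evenly
heating = relative_contact_heating(0.85)   # 15% less current
print(f"{stock_amps} A total, {per_pin_amps} A per pin, "
      f"contact heating x{heating:.2f} after a 15% power cut")
```

Because heating at the contacts follows the square of the current, a 15% reduction in current lowers resistive heating by about 28% under these assumptions.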

For AI workloads that run 20+ hours per day, this is not paranoia — it is risk management. The 12VHPWR failures documented by GamersNexus and other outlets almost all occurred under sustained high loads, which is exactly the operating profile of AI rigs.

Water cooling gives you the thermal headroom to push undervolts further, which pulls more current off the connector. Read our 12VHPWR safer build guide for the complete approach to connector safety.

Linux Undervolting

Many AI builders run Linux (Ubuntu, Arch, etc.) for better CUDA compatibility and container support. Undervolting on Linux requires different tools.

nvidia-smi Method

NVIDIA's system management interface supports power limit capping (not direct voltage control):

nvidia-smi -pl 350 — sets the power limit to 350W (from a 4090's default 450W)
The GPU stays within the limit by dropping to lower points on its voltage-frequency curve, so clocks fall along with voltage whenever the cap is reached
This is less precise than MSI Afterburner's voltage curve but works reliably on Linux
Combine with nvidia-smi -lgc 300,2550 to lock the GPU clock range

CoreCtrl (GUI Method)

CoreCtrl is a Linux GUI tool for hardware control with per-application profiles. Its GPU controls target AMD cards, so confirm NVIDIA support in the project's documentation before relying on it for an RTX card. Install it through your distribution's package manager.

Persistence Across Reboots

Add your nvidia-smi commands to a systemd service or rc.local script to ensure the undervolt applies at boot. For 24/7 AI rigs, this is essential — you do not want a power outage or automatic reboot to reset your GPU to stock voltage settings.
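
One way to package the boot-time settings is a small script that a systemd service or rc.local entry can call. This is a sketch under stated assumptions: the 350W limit and the 300-2550 MHz range are the example values from this section, the persistence-mode command (-pm 1) is an addition not covered above, and the script defaults to a dry run that only prints the commands.

```python
import subprocess


def build_commands(power_limit_w: int, min_mhz: int, max_mhz: int) -> list[list[str]]:
    """Build the nvidia-smi invocations to run at boot."""
    return [
        ["nvidia-smi", "-pm", "1"],                      # keep the driver loaded
        ["nvidia-smi", "-pl", str(power_limit_w)],       # power limit in watts
        ["nvidia-smi", "-lgc", f"{min_mhz},{max_mhz}"],  # lock the GPU clock range
    ]


def apply_settings(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print the commands, or run them for real when dry_run is False."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)  # needs root privileges


if __name__ == "__main__":
    apply_settings(build_commands(350, 300, 2550))
```

Run it once as a dry run to confirm the commands, then switch to dry_run=False in the copy that your boot service executes as root.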

What If the Undervolt Is Unstable?

Instability from undervolting manifests differently in AI workloads than in gaming:

In gaming: Artifacts, screen flicker, driver crash, black screen
In AI inference: Subtle output corruption (wrong tokens, garbled text), CUDA errors in the terminal, or an application crash without visual symptoms

This makes AI workloads harder to validate. A gaming benchmark gives you obvious pass/fail. An LLM generating slightly wrong text looks normal until you compare carefully.

Validation Strategy for AI Workloads

Run a known-output test: Use a fixed prompt with temperature 0 (deterministic output) and compare the result to a reference output generated at stock settings. If they differ, the undervolt is causing computation errors.
Run for at least 2 hours: Undervolting instability often manifests only after the GPU reaches thermal equilibrium, not immediately.
Test under peak VRAM load: Fill VRAM to capacity (load the largest model that fits). VRAM operations are more voltage-sensitive than core compute.
Check dmesg for NVRM errors: On Linux, dmesg | grep NVRM will show any GPU errors that did not crash the application but indicate instability.
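
The known-output test and the dmesg check are easy to automate. In the sketch below, generate is a hypothetical callback that sends a prompt to your inference stack with temperature 0 and returns the text, so wire it to whatever client you use. One caveat: some inference stacks are not bit-for-bit deterministic even at temperature 0, so first confirm that repeated runs at stock settings produce zero mismatches.

```python
import hashlib
from typing import Callable


def fingerprint(text: str) -> str:
    """Stable hash of a model output, used for exact comparison."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def count_mismatches(generate: Callable[[str], str], prompt: str,
                     reference: str, runs: int = 20) -> int:
    """Count how many of `runs` generations differ from the stock reference."""
    expected = fingerprint(reference)
    return sum(fingerprint(generate(prompt)) != expected for _ in range(runs))


def nvrm_lines(dmesg_text: str) -> list[str]:
    """Return kernel log lines emitted by the NVIDIA driver (tagged NVRM)."""
    return [line for line in dmesg_text.splitlines() if "NVRM" in line]
```

Record the reference output at stock voltage, apply the undervolt, then call count_mismatches periodically over the 2-hour run. Any non-zero count, or any new NVRM line in the kernel log, is a reason to add voltage back.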

The Efficiency Stack

Water cooling and undervolting are two layers of the same strategy: getting maximum AI performance from minimum power input. Here is the full stack, in order of impact:

Water cooling — drops temperatures 30-40C, enables everything else on this list, eliminates thermal throttling
Undervolting — reduces power 12-18% with zero performance loss (on water)
Optimized fan curves — with water cooling headroom, fans at 600-800 RPM maintain adequate cooling while producing near-zero noise
Power limit capping — if you do not need maximum performance, capping at 80% power (nvidia-smi -pl) with a moderate undervolt gives the best efficiency per watt for inference

A water-cooled, undervolted RTX 4090 running at a 350W power limit with fans at 800 RPM produces the same AI inference throughput as a stock air-cooled 4090 at 430W — while running 35C cooler, 30 dBA quieter, and consuming $80+ less electricity per year.

Frequently Asked Questions

Will undervolting void my warranty?

NVIDIA and most AIB manufacturers do not officially support undervolting, but they also do not detect or flag it. Unlike overvolting (which can cause physical damage), undervolting reduces stress on the chip. We are not aware of any cases of warranty denial due to undervolting. That said, if you need to RMA your card, reset to stock settings first, which takes about 30 seconds in MSI Afterburner.

Can undervolting cause data corruption in AI workloads?

An unstable undervolt can cause computation errors that manifest as subtly wrong AI outputs — garbled tokens, incorrect image generation, or silent numerical errors. This is why the validation strategy above is important. A stable undervolt (tested properly) does not cause data corruption. The risk is in the "testing" phase, not in daily operation once you find a stable point.

Does undervolting reduce GPU lifespan?

The opposite. Lower voltage means less current, less heat, and less electromigration. Undervolting extends the useful life of a GPU compared to running at stock voltage. This is well-established in semiconductor reliability literature — Intel and AMD both acknowledge that lower operating temperatures extend chip longevity.

Should I undervolt on air cooling or only on water?

Undervolting on air cooling is absolutely worthwhile — it reduces heat output and allows the air cooler to keep up better with sustained loads. The difference with water cooling is that cooler silicon is more stable at lower voltages, so water cooling lets you push the undervolt further. On air, a conservative -50mV to -75mV is a safe starting point. On water, you can often push -100mV to -150mV for larger savings.

That is the pitch for water cooling reframed: it is not just a thermal solution. It is the foundation of an efficiency strategy. Start with a Bykski RTX 4090 waterblock or browse our AI Workstation Cooling collection for a complete loop. For help choosing between pumps, see our D5 vs DDC comparison.
