Methodology

How we measured this — and what it does not prove

Everyone claims AI power savings. Spark-XC proves them. That promise only means something if we are equally rigorous about the limits of our own evidence. Here is exactly how we measured the power deltas we publish, what an A/B run does and does not control for, and the boundary between what is established and what is not.

The protocol

A controlled A/B run on real hardware

The headline numbers come from a single, short A/B comparison run on production-class GPUs. We instrumented the run end to end, measured power at the GPU, and sealed the result into a tamper-evident Power Event Record.

Run parameters

Comparison design: A/B — same workload, same hardware, governed vs. baseline
Hardware: NVIDIA B200 and NVIDIA H100
Fleet: 8 GPUs
Duration: approximately 3 minutes
Telemetry: NVML / DCGM, sampled continuously
Measurement point: power measured at the GPU (board-level draw)
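For readers who want to see what board-level sampling looks like in practice, here is a minimal sketch using the pynvml bindings for NVML. It is an illustration under stated assumptions, not the instrumentation used in the run: the helper names (`nvml_power_reader`, `sample_watts`), the sample count, and the polling interval are all hypothetical, and the source only says telemetry was "sampled continuously".

```python
import time
from typing import Callable, List


def nvml_power_reader(gpu_index: int) -> Callable[[], float]:
    """Return a function that reads board power for one GPU, in watts.

    Needs the pynvml package and an NVIDIA driver. The import is lazy so
    the sampling loop below can be exercised without a GPU.
    """
    import pynvml

    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
    # nvmlDeviceGetPowerUsage reports milliwatts; convert to watts.
    return lambda: pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0


def sample_watts(read_watts: Callable[[], float],
                 n_samples: int,
                 interval_s: float) -> List[float]:
    """Poll a power reader n_samples times, sleeping interval_s between reads."""
    samples = []
    for i in range(n_samples):
        samples.append(read_watts())
        if i < n_samples - 1:
            time.sleep(interval_s)
    return samples
```

On a machine with a GPU, `sample_watts(nvml_power_reader(0), 180, 1.0)` would collect roughly three minutes of one-second readings for GPU 0. Passing the reader in as a function keeps the loop testable with a stand-in source.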

What we observed

B200: 38.8% delta — 949 W → 581 W per GPU
H100: 26.8% delta
Utilization essentially unchanged: 93.9% → 95.7%

Steady utilization matters: the GPUs were as busy in the governed arm as in the baseline arm, so the power reduction was not bought by simply letting them idle over the measurement window.
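The B200 percentage can be reproduced from the two wattage figures above. The short sketch below shows the arithmetic; the function name `percent_delta` is ours, chosen for illustration.

```python
def percent_delta(before_w: float, after_w: float) -> float:
    """Relative reduction from before_w to after_w, as a percentage."""
    return (before_w - after_w) / before_w * 100.0


b200 = percent_delta(949.0, 581.0)
print(f"B200 power delta: {b200:.1f}%")                  # B200 power delta: 38.8%
print(f"Utilization change: {95.7 - 93.9:+.1f} points")  # Utilization change: +1.8 points
```

The delta is expressed relative to the baseline arm, so 368 W saved on a 949 W baseline gives 38.8%.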

The delta, to scale

Before: 949 W
After: 581 W

↓ 38.8% measured power delta · utilization 93.9% → 95.7% held

Single ~3-minute, 8-GPU B200 A/B run — validation evidence, not a guaranteed steady-state saving.

What an A/B comparison controls for

Running the governed and baseline arms back to back, on the same physical GPUs, under the same workload, removes most of the confounders that make standalone power figures meaningless. The hardware lottery, firmware revision, and workload mix are identical across both arms, and cooling state and ambient conditions are as close as back-to-back runs allow. The delta is therefore attributable to the governed action rather than to differences in the test bed.
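One way to picture the pairing is to compare each GPU against itself across the two arms. The sketch below assumes per-GPU power samples are already collected into dictionaries keyed by GPU index. The function name, the data layout, and the sample values are illustrative assumptions, not measured data and not a description of the actual analysis pipeline.

```python
from statistics import mean
from typing import Dict, List


def paired_deltas(baseline: Dict[int, List[float]],
                  governed: Dict[int, List[float]]) -> Dict[int, float]:
    """Percent power reduction per GPU, pairing each GPU with itself."""
    if baseline.keys() != governed.keys():
        raise ValueError("both arms must cover the same physical GPUs")
    deltas = {}
    for gpu, base_samples in baseline.items():
        base_w = mean(base_samples)
        gov_w = mean(governed[gpu])
        deltas[gpu] = 100.0 * (base_w - gov_w) / base_w
    return deltas


# Made-up samples for two GPUs, for illustration only (not measured data).
baseline = {0: [1000.0, 1000.0], 1: [800.0, 800.0]}
governed = {0: [600.0, 600.0], 1: [600.0, 600.0]}
print(paired_deltas(baseline, governed))   # {0: 40.0, 1: 25.0}
```

Because each GPU serves as its own control, unit-to-unit variation in baseline draw does not leak into the reported delta.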

Bounds

What we controlled — and what we didn't

An A/B design is strong on attribution and deliberately narrow on scope. We are explicit about the narrowness.

Controlled

Identical hardware across both arms
Identical workload across both arms
Utilization held essentially constant
Continuous board-level telemetry on every GPU

Not controlled / not yet established

Short duration — roughly 3 minutes, not a sustained window
A specific workload at a specific operating point, not a representative production mix
A single run, not a distribution across many runs, days, or sites
Magnitude is observed evidence, not a steady-state average or a guaranteed figure
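The "single run" limit has a concrete statistical meaning: one run yields a point estimate with no measurable spread. The sketch below illustrates this with Python's standard library. The function name `summarize_runs` is hypothetical and shown only to make the point.

```python
from statistics import mean, stdev
from typing import List, Optional, Tuple


def summarize_runs(deltas_pct: List[float]) -> Tuple[float, Optional[float]]:
    """Return (mean, sample standard deviation) of per-run deltas.

    The spread is None with fewer than two runs, because a sample
    standard deviation is undefined for a single data point.
    """
    if not deltas_pct:
        raise ValueError("need at least one run")
    spread = stdev(deltas_pct) if len(deltas_pct) >= 2 else None
    return mean(deltas_pct), spread


print(summarize_runs([38.8]))   # (38.8, None)
```

With only the one B200 delta available, the spread comes back as `None`. A run-to-run distribution would need repeated runs that have not been published here.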

Transparency

Why we publish the limits

A proof company that hides the boundaries of its own data is just another vendor making claims. Publishing the limits is the point: if the only honest reading of our evidence is "Spark-XC measured and proved a real delta on real hardware in this run," then that is exactly what we will say — and nothing more.

We would rather understate and be verifiable than overstate and be unfalsifiable. Every number on this site traces back to a sealed Power Event Record you can inspect, and we invite scrutiny of the methodology, the telemetry, and the record itself. If you can break our chain of evidence, we want to know.

The line

What is established vs. what is not

Established

In this run, on this hardware, Spark-XC measured a real power delta with utilization held essentially constant — and sealed that measurement into a verifiable, SHA-256 hash-chained Power Event Record. The action was approved, safe, auditable, and financially real, and anyone can reconstruct and verify the record. Spark-XC can detect, attribute, and prove a power delta on production GPUs.
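To show what "SHA-256 hash-chained" means in general terms, here is a minimal sketch of building and verifying a hash chain with Python's standard library. It is not the Power Event Record format: the field names, the JSON canonicalization, and the all-zero genesis value are assumptions made for illustration.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def record_hash(payload: Dict, prev_hash: str) -> str:
    """SHA-256 over the previous hash plus a canonical JSON encoding."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def build_chain(payloads: List[Dict]) -> List[Dict]:
    """Link each payload to the hash of the entry before it."""
    chain, prev = [], GENESIS
    for payload in payloads:
        digest = record_hash(payload, prev)
        chain.append({"payload": payload, "prev_hash": prev, "hash": digest})
        prev = digest
    return chain


def verify_chain(chain: List[Dict]) -> bool:
    """Recompute every hash; any edited payload or broken link fails."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev:
            return False
        if record_hash(entry["payload"], prev) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True


chain = build_chain([
    {"arm": "baseline", "watts_per_gpu": 949},
    {"arm": "governed", "watts_per_gpu": 581},
])
print(verify_chain(chain))                    # True
chain[0]["payload"]["watts_per_gpu"] = 900    # tamper with the first entry
print(verify_chain(chain))                    # False
```

Because each hash covers the previous one, changing any earlier entry invalidates every entry after it, which is what makes such a record tamper-evident rather than tamper-proof.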

Not established

That any specific percentage — 38.8%, 26.8%, or any other — is a guaranteed, steady-state, production-average saving. A single ~3-minute run on one workload is validation evidence that the measurement and proof machinery works, not a forecast of long-run results in your environment. Your delta is something Spark-XC measures for you; it is not something we promise in advance.

Mission Control executes. Spark-XC validates. The validation is the product — and the honesty about its scope is part of the validation.

See and verify a Power Event Record → Request a Power Event Replay