How we measured this — and what it does not prove

Everyone claims AI power savings. Spark-XC proves them. That promise only means something if we are equally rigorous about the limits of our own evidence. Here is exactly how we measured the power deltas we publish, what an A/B run does and does not control for, and the boundary between what is established and what is not.

A controlled A/B run on real hardware

The headline numbers come from a single, short A/B comparison run on production-class GPUs. We instrumented the run end to end, measured power at the GPU, and sealed the result into a tamper-evident Power Event Record.

Run parameters

  • Comparison design: A/B — same workload, same hardware, governed vs. baseline
  • Hardware: NVIDIA B200 and NVIDIA H100
  • Fleet: 8 GPUs
  • Duration: approximately 3 minutes
  • Telemetry: NVML / DCGM, sampled continuously
  • Measurement point: power measured at the GPU (board-level draw)

What we observed

  • B200: 38.8% delta — 949 W → 581 W per GPU
  • H100: 26.8% delta
  • Utilization essentially unchanged: 93.9% → 95.7%

Holding utilization steady matters: the power reduction was not bought by simply doing less work over the measurement window.

The delta, to scale

Before
949 W
After
581 W
38.8% measured power delta · utilization 93.9% → 95.7% held

Single ~3-minute, 8-GPU B200 A/B run — validation evidence, not a guaranteed steady-state saving.

What an A/B comparison controls for

Running the governed and baseline arms back to back, on the same physical GPUs, under the same workload, removes most of the confounders that make standalone power figures meaningless — hardware lottery, cooling state, ambient conditions, workload mix, and firmware revision are held constant across both arms. The delta is therefore attributable to the governed action rather than to differences in the test bed.

What we controlled — and what we didn't

An A/B design is strong on attribution and deliberately narrow on scope. We are explicit about the narrowness.

Controlled

  • Identical hardware across both arms
  • Identical workload across both arms
  • Utilization held essentially constant
  • Continuous board-level telemetry on every GPU

Not controlled / not yet established

  • Short duration — roughly 3 minutes, not a sustained window
  • A specific workload at a specific operating point, not a representative production mix
  • A single run, not a distribution across many runs, days, or sites
  • Magnitude is observed evidence, not a steady-state average or a guaranteed figure

Why we publish the limits

A proof company that hides the boundaries of its own data is just another vendor making claims. Publishing the limits is the point: if the only honest reading of our evidence is "Spark-XC measured and proved a real delta on real hardware in this run," then that is exactly what we will say — and nothing more.

We would rather understate and be verifiable than overstate and be unfalsifiable. Every number on this site traces back to a sealed Power Event Record you can inspect, and we invite scrutiny of the methodology, the telemetry, and the record itself. If you can break our chain of evidence, we want to know.

What is established vs. what is not

Established

In this run, on this hardware, Spark-XC measured a real power delta with utilization held essentially constant — and sealed that measurement into a verifiable, SHA-256 hash-chained Power Event Record. The action was approved, safe, auditable, and financially real, and anyone can reconstruct and verify the record. Spark-XC can detect, attribute, and prove a power delta on production GPUs.

Not established

That any specific percentage — 38.8%, 26.8%, or any other — is a guaranteed, steady-state, production-average saving. A single ~3-minute run on one workload is validation evidence that the measurement and proof machinery works, not a forecast of long-run results in your environment. Your delta is something Spark-XC measures for you; it is not something we promise in advance.

Mission Control executes. Spark-XC validates. The validation is the product — and the honesty about its scope is part of the validation.