Validation Pipeline

Five validation paths. Anchored by one unit of proof.

SPARK-XC sits above existing GPU, workload, facility, grid, and finance systems to validate, authorize, and prove AI power actions. Every governed action travels these five paths and emits a single Power Event Record (PER) that answers whether the action was approved, safe, auditable, and financially real. Mission Control executes. SPARK-XC validates.


Validation Paths

Every path, in detail

Five validation paths → one Power Event Record.

GPU Telemetry Validation

The first validation path confirms the action took effect on real hardware. SPARK-XC captures pre- and post-action NVML and DCGM snapshots — power, clocks, and utilization — for the GPUs in scope, then compares them to what the action requested. The delta is the ground truth that a power action actually reached silicon.

Snapshots are read directly from the NVIDIA Management Library (NVML) and Data Center GPU Manager (DCGM), so the validation reflects the hardware's own reported state rather than what the controlling software believed it set.

Pre/post NVML & DCGM snapshots
Power, clocks, utilization captured
Confirms action reached real hardware
Hardware-reported ground truth

GPU_TELEMETRY
> PRE: 949W/GPU
> POST: 581W/GPU
> SOURCE: NVML+DCGM
> TELEMETRY: VERIFIED

✓ VERIFIED
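The pre/post comparison described in this section can be sketched in a few lines of Python. This is a minimal illustration, not SPARK-XC's implementation: the `Snapshot` type, the `verify_power_delta` function, the 600 W requested cap, the clock and utilization figures, and the 5% tolerance are all assumptions made for the example. In a live system the wattage could be read with NVML's `nvmlDeviceGetPowerUsage`, which reports milliwatts; here the values are supplied directly so the sketch runs without a GPU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical per-GPU reading: power, clocks, and utilization."""
    power_w: float
    sm_clock_mhz: int
    utilization_pct: float


def verify_power_delta(pre: Snapshot, post: Snapshot,
                       requested_cap_w: float, tolerance: float = 0.05) -> dict:
    """Compare hardware-reported power after the action with the requested cap.

    The action counts as verified only if post-action power sits at or below
    the cap (within an assumed tolerance) and power actually dropped.
    """
    delta_w = post.power_w - pre.power_w
    within_cap = post.power_w <= requested_cap_w * (1 + tolerance)
    dropped = delta_w < 0
    return {
        "pre_w": pre.power_w,
        "post_w": post.power_w,
        "delta_w": delta_w,
        "status": "VERIFIED" if (within_cap and dropped) else "DISCREPANCY",
    }


# Power values follow the sample output above; clocks and utilization are illustrative.
pre = Snapshot(power_w=949.0, sm_clock_mhz=1980, utilization_pct=97.0)
post = Snapshot(power_w=581.0, sm_clock_mhz=1410, utilization_pct=96.0)
result = verify_power_delta(pre, post, requested_cap_w=600.0)
print(result["status"], result["delta_w"])
```

If the post-action snapshot matched the pre-action one, the same function would return `DISCREPANCY`, which is the situation Scenario A describes.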

Workload / Scheduler Context

The second validation path attaches job and throughput context to the power action. SPARK-XC pulls workload state from the schedulers already running the infrastructure — Slurm, Kubernetes, and Run:ai — so every power action is interpreted against the jobs it affects rather than in isolation.

This is what lets an operator or auditor see that a power change held SLAs and throughput, not just that the number on the GPU moved. The action becomes legible in terms of the work it was governing.

Slurm · Kubernetes · Run:ai
Job & throughput context
Action interpreted against the work
SLA-aware

WORKLOAD_CONTEXT
> SCHEDULER: Slurm
> UTILIZATION: MAINTAINED
> THROUGHPUT: REQUIRES WORKLOAD A/B
> SLA: NOT MEASURED

CONTEXT
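One way to picture "interpreting an action against the jobs it affects" is a simple join between the GPUs in an action's scope and the GPUs each job holds. The sketch below is an assumption-laden illustration: the `Job` record and `affected_jobs` helper are hypothetical, and the job list is hard-coded rather than fetched from Slurm, Kubernetes, or Run:ai.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Hypothetical scheduler record; real fields depend on the scheduler."""
    job_id: str
    scheduler: str
    gpu_ids: frozenset


def affected_jobs(action_gpus: set, jobs: list) -> list:
    """Return the jobs that share at least one GPU with the power action."""
    return [job for job in jobs if job.gpu_ids & action_gpus]


# Illustrative job list; identifiers are made up for the example.
jobs = [
    Job("train-01", "Slurm", frozenset({"gpu0", "gpu1"})),
    Job("infer-07", "Kubernetes", frozenset({"gpu4"})),
]
context = affected_jobs({"gpu1", "gpu2"}, jobs)
print([job.job_id for job in context])
```

Only the jobs returned here would need their throughput and SLA state attached to the record for this action.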

Facility Power Correlation

The third validation path correlates GPU-side power with the facility. SPARK-XC reconciles what the GPUs reported against DCIM, BMS, PDU/UPS, and utility signals where those feeds are available — so a power action is confirmed not just at the card but at the rack, the room, and the meter.

This correlation is what makes a power action financially and physically real: the energy a GPU said it shed shows up downstream in facility and grid telemetry, where available, rather than living only in GPU counters.

DCIM · BMS · PDU/UPS · utility
Rack, room & meter correlation
Used where signals are available
Physically & financially real

FACILITY_CORRELATE
> SOURCE: DCIM+BMS
> RACK Δ: -2.9kW
> UTILITY: MATCHED
> RESULT: CORRELATED

✓ CORRELATED
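The reconciliation step can be illustrated as a tolerance check between the summed GPU-side change and the change seen at the rack. The sketch below is hypothetical: `correlate_rack`, the 10% tolerance, and the eight-GPU rack with a 368 W drop per GPU are assumptions chosen so the sum lands near the -2.9 kW rack delta in the sample output. A missing facility feed is reported as skipped rather than treated as a match.

```python
from typing import Optional


def correlate_rack(gpu_deltas_w: list, rack_delta_w: Optional[float],
                   tolerance: float = 0.10) -> str:
    """Check that the summed GPU-side change shows up at the rack meter."""
    if rack_delta_w is None:
        # No facility signal available: record the gap, do not assume a match.
        return "SKIPPED"
    expected_w = sum(gpu_deltas_w)
    if expected_w == 0:
        return "CORRELATED" if rack_delta_w == 0 else "MISMATCH"
    relative_error = abs(rack_delta_w - expected_w) / abs(expected_w)
    return "CORRELATED" if relative_error <= tolerance else "MISMATCH"


gpu_deltas = [-368.0] * 8                                # assumed 8 GPUs per rack
print(sum(gpu_deltas))                                   # -2944.0
print(correlate_rack(gpu_deltas, rack_delta_w=-2900.0))  # CORRELATED
print(correlate_rack(gpu_deltas, rack_delta_w=-400.0))   # MISMATCH
print(correlate_rack(gpu_deltas, rack_delta_w=None))     # SKIPPED
```

The `MISMATCH` case corresponds to Scenario C, where GPU counters report savings that the facility does not see.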

Policy / Approval Gates

The fourth validation path is where SPARK-XC authorizes the action. Every power action — whether from a user application, an operator, or an automated scheduler — passes through authority, rate, scope, and oscillation gates that can authorize, modify, defer, or block it before it is proven.

The gates encode who is permitted to act, how often, over what scope, and whether the action would induce unsafe oscillation. Every evaluation produces a record, so an authorization decision is itself part of the evidence.

Authority, rate, scope, oscillation gates
Authorize / modify / defer / block
Encodes who may act, and how
Every evaluation recorded

APPROVAL_GATES
> AUTHORITY: CONFIRMED
> RATE/SCOPE: WITHIN
> OSCILLATION: NONE
> RESULT: AUTHORIZED

✓ AUTHORIZED
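A gate evaluation of this kind can be sketched as a function that returns both a decision and the reason for it, so the evaluation itself can be recorded. Everything below is an assumption for illustration: the `Action` and `Policy` fields, the thresholds, and in particular the mapping of each gate to a specific outcome (authority and oscillation block, rate defers, scope modifies) are not taken from SPARK-XC.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    AUTHORIZE = "AUTHORIZED"
    MODIFY = "MODIFIED"
    DEFER = "DEFERRED"
    BLOCK = "BLOCKED"


@dataclass(frozen=True)
class Action:
    caller: str
    gpu_count: int
    seconds_since_last: float   # time since this caller's previous action
    reverses_previous: bool     # True if it undoes the previous action


@dataclass(frozen=True)
class Policy:
    allowed_callers: frozenset
    max_gpus: int
    min_interval_s: float


def evaluate(action: Action, policy: Policy) -> tuple:
    """Run the gates and return the first non-authorizing (decision, reason)."""
    if action.caller not in policy.allowed_callers:
        return Decision.BLOCK, "authority: caller not permitted"
    too_soon = action.seconds_since_last < policy.min_interval_s
    if too_soon and action.reverses_previous:
        return Decision.BLOCK, "oscillation: rapid reversal of previous action"
    if too_soon:
        return Decision.DEFER, "rate: too soon after previous action"
    if action.gpu_count > policy.max_gpus:
        return Decision.MODIFY, f"scope: trimmed to {policy.max_gpus} GPUs"
    return Decision.AUTHORIZE, "all gates passed"


policy = Policy(frozenset({"ops-team"}), max_gpus=8, min_interval_s=60.0)
decision, reason = evaluate(Action("ops-team", 8, 300.0, False), policy)
print(decision.value, "-", reason)
```

Returning the reason alongside the decision is what lets a rejection carry as much evidentiary weight as an approval.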

Tamper-Evident Evidence Chain

The fifth validation path commits the result as a Power Event Record — the atomic unit of proof. Telemetry, workload context, facility correlation, and the authorization decision are bundled into one PER that answers whether the action was approved, safe, auditable, and financially real.

Each PER is SHA-256 hash-chained to its predecessor on an append-only chain, so any insertion, deletion, or modification of a historical record breaks the chain and is detected when the chain is verified. Any single PER is independently replayable by an operator, auditor, or CFO.

Committed as a Power Event Record
SHA-256 hash-chained, append-only
Tamper-evident by design
Independently replayable

POWER_EVENT_RECORD
> SEQ: 14820
> PREV: 2e57...419f
> HASH: f33c...8b02
> PER: COMMITTED

PER COMMITTED
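The hash-chaining idea can be shown with Python's standard `hashlib`. This is a generic sketch of an append-only SHA-256 chain, not the actual PER format: the record layout, the canonical JSON encoding, and the all-zero `GENESIS` predecessor are assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder predecessor for the first record


def record_hash(seq: int, prev: str, payload: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the record's contents."""
    body = json.dumps({"seq": seq, "prev": prev, "payload": payload},
                      sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> dict:
    """Append a record whose hash covers its predecessor's hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    seq = len(chain)
    record = {"seq": seq, "prev": prev, "payload": payload,
              "hash": record_hash(seq, prev, payload)}
    chain.append(record)
    return record


def verify(chain: list) -> bool:
    """Recompute every hash and check each link to its predecessor."""
    prev = GENESIS
    for seq, record in enumerate(chain):
        if record["seq"] != seq or record["prev"] != prev:
            return False
        if record["hash"] != record_hash(seq, prev, record["payload"]):
            return False
        prev = record["hash"]
    return True


chain = []
append(chain, {"action": "cap", "result": "AUTHORIZED"})
append(chain, {"action": "uncap", "result": "AUTHORIZED"})
print(verify(chain))                        # True
chain[0]["payload"]["result"] = "BLOCKED"   # simulate tampering
print(verify(chain))                        # False
```

Because each hash covers the predecessor's hash, editing one record invalidates it and every link after it. Detecting removal of the most recent records additionally requires knowing the expected chain head, which this sketch does not model.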

Fault Scenarios

What the evidence shows when something goes wrong

SPARK-XC is built so that failures are still provable. Here is how the validation paths respond to representative fault conditions — and what the Power Event Record captures.

Scenario A

Action never reached hardware

A power action is issued, but the GPU never actually changed state. The GPU telemetry path compares pre- and post-action NVML/DCGM snapshots and finds the expected delta is missing — the action did not take effect on real hardware.

Telemetry validation flags the gap. PER records the discrepancy.

Scenario B

Action threatens a running job's SLA

A power action would cut throughput below a job's SLA. The workload / scheduler context path reads job state from Slurm, Kubernetes, or Run:ai and surfaces the impact before the action is proven, so the action is interpreted against the work it governs.

Workload context surfaces SLA risk. Captured in the PER.

Scenario C

GPU-reported savings don't reach the meter

GPU counters report a power reduction, but the facility doesn't show it. The facility power correlation path reconciles GPU-side power against DCIM, BMS, PDU/UPS, and utility signals where available — and the rack-level delta does not match.

Facility correlation flags the mismatch. Logged in the PER.

Scenario D

Unauthorized or out-of-scope action

An action arrives that exceeds the caller's authority, rate, or scope — or would induce unsafe oscillation. The policy / approval gates authorize, modify, defer, or block it before it is proven, and the decision itself becomes evidence.

Approval gates block the action. Rejection recorded in the PER.

Scenario E

Tampering with the evidence chain

An adversary attempts to delete or alter a Power Event Record. Because each PER is SHA-256 hash-chained to its predecessor on an append-only chain, the change breaks the chain: the predecessor hash stored in the following record no longer matches, and the violation is detected as soon as the chain is verified.

Chain integrity violation detected. PER is independently replayable.

Scenario F

A validation source is unavailable

One or more inputs — a facility feed, a scheduler, or a telemetry source — is unreachable when an action is governed. SPARK-XC still emits a Power Event Record, marking which paths were validated and which evidence was unavailable, so the gap is itself part of the proof.

PER records partial validation. Nothing is silently assumed.
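Partial validation can be pictured as a record assembler that marks missing evidence explicitly instead of omitting it. The sketch below is hypothetical: the `assemble_per` function, the block names, and the status strings are assumptions, not the PER schema. The authorization decision is a required argument, since the profile below lists approval gates as default-deny rather than optional.

```python
EVIDENCE_PATHS = ("telemetry", "workload", "facility")


def assemble_per(decision: str, evidence: dict) -> dict:
    """Bundle path results into one record, flagging unavailable sources."""
    blocks = {}
    for path in EVIDENCE_PATHS:
        data = evidence.get(path)
        if data is None:
            # The gap is written into the record rather than silently skipped.
            blocks[path] = {"status": "UNAVAILABLE"}
        else:
            blocks[path] = {"status": "VALIDATED", "data": data}
    missing = [p for p in EVIDENCE_PATHS if blocks[p]["status"] == "UNAVAILABLE"]
    return {
        "decision": decision,
        "blocks": blocks,
        "validation": "PARTIAL" if missing else "COMPLETE",
        "unavailable": missing,
    }


# Facility feed is unreachable in this example, so no "facility" entry is supplied.
per = assemble_per("AUTHORIZED", {
    "telemetry": {"delta_w": -368.0},
    "workload": {"scheduler": "Slurm"},
})
print(per["validation"], per["unavailable"])   # PARTIAL ['facility']
```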

Validation Characteristics

Validation path profile

| Validation Path | Evidence Source | What It Proves | When Source Is Unavailable | In the PER |
|---|---|---|---|---|
| 1 · GPU Telemetry: Pre/post NVML & DCGM | NVML · DCGM: Hardware-reported power, clocks, util | Took effect: Action reached real hardware | Marked partial: Snapshot gap recorded, not assumed | Telemetry block: Pre/post snapshots committed |
| 2 · Workload Context: Scheduler state | Slurm · K8s · Run:ai: Job & throughput context | SLA-aware: Action interpreted against the work | Marked partial: Context unavailable, flagged | Workload block: Job & throughput context committed |
| 3 · Facility Correlation: Facility & grid signals | DCIM · BMS · PDU/UPS · utility: Rack, room & meter | Physically real: GPU power confirmed downstream | Used where available: Correlation skipped, recorded | Facility block: Correlation result committed |
| 4 · Approval Gates: Authority · rate · scope · oscillation | Policy ruleset: Authorize / modify / defer / block | Authorized: Action was permitted & in scope | Default-deny: Unresolved authority blocks action | Decision block: Authorization decision committed |
| 5 · Evidence Chain: Power Event Record | SHA-256 chain: Append-only, hash-chained | Auditable: Tamper-evident, independently replayable | Always emitted: PER committed even on partial validation | The PER itself: All paths bundled into one record |

See the proof

Replay a Power Event Record for yourself

Walk through how a single AI power action is validated across all five paths and committed as one independently replayable Power Event Record. Mission Control executes. SPARK-XC validates.
