Benchmark Matrix

NTX keeps validation claims in a maintained benchmark matrix. The matrix maps each claim or monitored stress lane to:

  • literature anchors,

  • scripts,

  • tests,

  • committed artifacts,

  • manuscript figures,

  • and future work that is tracked but not yet promoted.

Generate the machine-readable artifact with:

python scripts/build_benchmark_matrix.py

The default output is:

docs/_static/benchmark_matrix.json

The matrix has four maturity levels:

  • positive-gate: a promoted validation or transfer claim.

  • stress-gate: a monitored physics or workflow stress test.

  • software-gate: a software-performance or maintenance gate.

  • planned-lane: a literature-motivated lane that is intentionally not yet a validation claim.
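As a minimal sketch of how the four levels could be consumed downstream, the snippet below groups matrix entries by maturity level and rejects any level outside the four listed above. The JSON schema is not documented here, so the field names `name` and `maturity` are hypothetical, and the level assigned to each sample entry is an assumption made for illustration only.

```python
# The four maturity levels named in the matrix description.
MATURITY_LEVELS = ("positive-gate", "stress-gate", "software-gate", "planned-lane")


def group_by_maturity(entries):
    """Group entry names by maturity level, rejecting unknown levels."""
    groups = {level: [] for level in MATURITY_LEVELS}
    for entry in entries:
        level = entry["maturity"]  # hypothetical field name
        if level not in groups:
            raise ValueError(f"unknown maturity level: {level!r}")
        groups[level].append(entry["name"])  # hypothetical field name
    return groups


# Illustrative entries only; the level assignments are assumptions.
sample_entries = [
    {"name": "Monoenergetic validation summary", "maturity": "positive-gate"},
    {"name": "Profile uncertainty propagation", "maturity": "stress-gate"},
    {"name": "Prepared-geometry reuse profile", "maturity": "software-gate"},
    {"name": "Hidden-symmetry and omnigenous families", "maturity": "planned-lane"},
]

groups = group_by_maturity(sample_entries)
for level in MATURITY_LEVELS:
    print(f"{level}: {len(groups[level])} entries")
```

Rejecting unknown levels up front keeps a typo such as `planned_lane` from silently creating a fifth category.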

Current machine-checked acceptance gates are:

| Gate | Scope | Primary artifact |
| --- | --- | --- |
| Monoenergetic validation summary | coefficient behavior, Onsager residuals, and Legendre convergence | docs/_static/validation_summary.json |
| Precise-QS Redl/SFINCS comparison | fixed-field Redl agreement on the interior benchmark window | docs/_static/bootstrap_current_fixed_field_validation.json |
| W7-X integrated transfer | imported workflow transfer on the rebuilt raw branch | docs/_static/bootstrap_current_reference_audit_w7x.json |
| Prepared derivative path | implicit-adjoint derivative agreement gate and timing evidence | docs/_static/derivative_path_benchmark.json |
| Geometry/boundary derivative agreement | finite-difference agreement on analytic, file-backed, boundary-projected, and explicit-relaxed derivative artifacts | docs/_static/*derivative_benchmark.json |
| Fixed-field NTX+NEOPAX closure stress | precise-QS total-current stress gate below 1e-1, scoped away from species-current parity | docs/_static/bootstrap_current_fixed_field_validation.json |
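Several of these gates reduce to the same pattern: compare a candidate value against a reference and require the relative gap to stay below a threshold. The sketch below illustrates that pattern with a centered finite-difference check of a hand-written derivative. The test function, step size, and the 1e-6 derivative tolerance are assumptions chosen for illustration and are not NTX's actual audit settings; only the 1e-1 stress threshold appears in the table.

```python
import math


def centered_difference(f, x, h=1e-5):
    """Second-order centered finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def relative_gap(candidate, reference, floor=1e-30):
    """Relative difference, guarded against a zero reference."""
    return abs(candidate - reference) / max(abs(reference), floor)


def passes_gate(candidate, reference, threshold):
    """True when the relative gap is strictly below the threshold."""
    return relative_gap(candidate, reference) < threshold


def objective(x):
    """Stand-in scalar objective (hypothetical, not an NTX quantity)."""
    return math.sin(x) * math.exp(-0.5 * x)


def objective_derivative(x):
    """Hand-derived derivative of the stand-in objective."""
    return (math.cos(x) - 0.5 * math.sin(x)) * math.exp(-0.5 * x)


x0 = 0.7
fd = centered_difference(objective, x0)
exact = objective_derivative(x0)
gap = relative_gap(fd, exact)

# Assumed tolerance for a derivative-agreement check.
derivative_ok = passes_gate(fd, exact, threshold=1e-6)
print(f"relative gap {gap:.2e}, passes: {derivative_ok}")
```

The same `passes_gate` helper covers a looser stress gate by passing `threshold=1e-1`.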

Current software gates are:

| Gate | Scope | Primary artifact |
| --- | --- | --- |
| CPU/GPU throughput and strong-scaling characterization | serial, device-parallel, multiprocess, CPU, and GPU scan scaling on committed smoke, heavier, production, and fixed-workload strong-scaling grids | docs/_static/performance_strong_scaling_production.png |
| Prepared-geometry reuse profile | direct repeated solves, prepared geometry reuse, and compiled prepared steady-state reuse on one fixed geometry | docs/_static/prepared_geometry_reuse_profile.json |
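Strong scaling holds the workload fixed and varies the worker count, so the usual summary quantities are speedup and parallel efficiency relative to the smallest run. The sketch below computes both from a mapping of worker count to wall time. The timings are made-up placeholders, not measurements from the committed artifacts, and the function name is hypothetical.

```python
def strong_scaling_table(timings):
    """Return (workers, speedup, efficiency) rows for one fixed workload.

    timings maps worker count to wall-clock seconds. Speedup and
    efficiency are measured relative to the smallest worker count.
    """
    base_workers = min(timings)
    base_time = timings[base_workers]
    rows = []
    for workers in sorted(timings):
        speedup = base_time / timings[workers]
        efficiency = speedup * base_workers / workers
        rows.append((workers, speedup, efficiency))
    return rows


# Made-up timings for illustration only; not measured data.
example_timings = {1: 80.0, 2: 42.0, 4: 23.0, 8: 14.0}

rows = strong_scaling_table(example_timings)
for workers, speedup, efficiency in rows:
    print(f"{workers:>2d} workers: speedup {speedup:5.2f}, efficiency {efficiency:5.2f}")
```

Efficiency near 1 indicates near-ideal scaling; a steady drop as workers increase points to serial overhead or communication cost on that grid.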

Current stress gates are:

| Gate | Current Non-Promoted Scope |
| --- | --- |
| Fixed-field species-current closure parity | the total current passes the scoped stress gate, but species-resolved current decomposition and broader closure defaults remain reduced-closure issues |
| Synthetic inverse-design recovery | useful differentiable workflow check, but too small to be a research-grade geometry claim |
| Three-harmonic geometry-control derivatives | direct AD/finite-difference audit is now machine checked on an owned surface; reusable geometry families remain open |
| File-backed geometry-control derivatives | sample Boozer and VMEC files now pass machine-checked AD/finite-difference thresholds, but reusable geometry-family controls remain open |
| Boundary forward-mode current derivatives | low-dimensional boundary controls now pass the machine-checked boundary-projected finite-difference audit; self-consistent shipping claims use the explicit-relaxed lane |
| Implicit-equilibrium forward-mode derivatives | retained as a non-shipping diagnostic: equilibrium volume matches centered finite differences, but residual contraction and Boozer/NTX tangent parity do not pass |
| Explicit-relaxed boundary current derivatives | committed QA and QH cases now pass the machine-checked self-consistent forward-mode audit, but additional families plus reverse-mode equilibrium sensitivities remain open |
| Artifact-backed geometry-family breadth summary | analytic, file-backed, boundary-projected, explicit-relaxed, and implicit-volume derivative artifacts are summarized in one figure, while retired implicit Boozer/transport diagnostics are excluded from promoted geometry-family claims |
| VMEC geometry-family transport convergence | public VMEC example families now have a committed production-grid D11/D31/D33 convergence stress artifact with D13/Onsager diagnostics retained; independent-code parity and radial/electric-field/collisionality promotion remain separate gates |
| Same-coordinate Boozer-file round trip | generated boozmn surfaces now reload on VMEC half-grid coordinates and reproduce the in-memory vmec_jax -> booz_xform_jax -> NTX transport coefficients; VMEC-harmonic versus Boozer-coordinate comparisons remain representation audits |
| Finite-beta finalized-wout Boozer transfer | optimized finite-beta wout magnetic channels now transform and reload through the direct boozmn backend to roundoff on the same VMEC half-grid surfaces; the fully differentiable finite-beta state path remains non-shipping for unsupported current-profile representations |
| Profile uncertainty propagation | three-term radial-basis covariance propagation and Fisher/HVP consistency are machine checked; cross-geometry profile families remain open |
| Bootstrap-current optimization | machine-checked weighted-current improvement on the committed W7-X study, but not yet broad enough for a stellarator-design claim |
| Robust bootstrap-current optimization | useful robust-design stress test, but not yet broad enough for a promoted physics claim |
| Primitive-profile force reconstruction | literature-profile audit, currently monitored rather than promoted |
| Owned finite-beta JAX-native NTX+NEOPAX dataset provenance | finite-beta input/wout scan generation, physical VMEC edge-flux normalization in the Boozer path, and interpolation-path control are now artifact-backed; optimized finite-beta QH/QI Boozer reconstruction remains an explicit geometry-backend blocker |
| Owned finite-beta SFINCS-JAX generation contract | same-grid finite-beta SFINCS-JAX input generation, six-point completed HDF5 ladder ingestion including the profile-current stress neighborhood, exact radial interpolation, PAS nuD bridge, coefficient-level NTX comparison, a 35 x 43 x 48 production stress-radius resolution/harmonic-cutoff probe, a completed six-point production radial/collisionality coefficient ladder, and an accepted high-Nxi RHSMode=1 pitch stress gap below 1.5e-1 are artifact-backed |
| Owned finite-beta Redl and NTX+NEOPAX bootstrap-current stress | same finite-beta VMEC wout, Boozer transform, normalized-radius B00(rho) field convention, analytic profile contract, production radial/collisionality ladder, adaptive physical nu/v support, D33_spitzer audit branch, Sonine-order convergence sidecar, coefficient/profile localization sidecar, profile-current observable sidecar, current-conditioning sidecar, closure-quadrature sidecar, source-channel sidecar, profile source-response sidecar, closure-target driver sidecar, and production SFINCS-JAX coefficient ladder sidecar are artifact-backed; this is closed as a reduced-closure stress benchmark with the current high-order source response classified as mixed density/electric and temperature-gradient physics, not as a broad full-collision parity claim |
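For the profile uncertainty propagation gate, the underlying idea can be sketched as first-order (delta-method) propagation: a covariance C on three basis coefficients maps to an observable variance g^T C g, where g is the gradient of the observable with respect to those coefficients. Everything concrete below is an assumption for illustration: the even polynomial basis, the stand-in observable, the coefficient values, and the covariance matrix. It is not NTX's implementation, and it does not cover the Fisher/HVP consistency check.

```python
def basis(rho):
    """Hypothetical three-term even polynomial radial basis."""
    return (1.0, rho**2, rho**4)


def profile(coeffs, rho):
    """Profile value as a linear combination of the basis functions."""
    return sum(c * phi for c, phi in zip(coeffs, basis(rho)))


def observable(coeffs, n=200):
    """Stand-in observable: midpoint-rule integral of profile**2 on [0, 1]."""
    h = 1.0 / n
    return sum(profile(coeffs, (i + 0.5) * h) ** 2 for i in range(n)) * h


def gradient(fun, coeffs, step=1e-6):
    """Centered finite-difference gradient with respect to each coefficient."""
    grad = []
    for k in range(len(coeffs)):
        up = list(coeffs)
        down = list(coeffs)
        up[k] += step
        down[k] -= step
        grad.append((fun(up) - fun(down)) / (2.0 * step))
    return grad


def propagated_variance(grad, cov):
    """First-order variance of the observable: g^T C g."""
    n = len(grad)
    return sum(grad[i] * cov[i][j] * grad[j] for i in range(n) for j in range(n))


# Assumed coefficients and a symmetric positive-definite covariance.
coeffs = [1.0, -0.8, 0.1]
cov = [
    [4e-4, 1e-4, 0.0],
    [1e-4, 9e-4, -1e-4],
    [0.0, -1e-4, 1e-4],
]

grad = gradient(observable, coeffs)
variance = propagated_variance(grad, cov)
print(f"observable {observable(coeffs):.6f}, std {variance ** 0.5:.3e}")
```

In a differentiable code the finite-difference gradient would normally be replaced by an autodiff gradient; the finite-difference version is used here only to keep the sketch dependency-free.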

Planned lanes are not release blockers. They stay visible so future work has clear promotion criteria instead of drifting into unsupported claims:

| Lane | Required Before Promotion |
| --- | --- |
| Full monoenergetic geometry-family reproduction | production-resolution independent-code parity for the available W7-X EIM/EJM, QI, QA/QH, and stellarator-family inputs; owned W7-X KJM input; radial/electric-field/collisionality ladders |
| Larger geometry-control autodiff | broaden the current analytic and file-backed audits into reusable geometry families; add direct autodiff, implicit-adjoint, and finite-difference agreement on that basis |
| Hidden-symmetry and omnigenous families | owned input families and convergence gates before adding research-grade figures |
| QI and piecewise-omnigenous low-bootstrap families | owned input families; D11, D31, D33, reduced bootstrap-current response, and radial-profile convergence; comparison to published qualitative ordering before any design claim |
| Implicit-equilibrium sensitivity transfer | restore only after the backend residual solve contracts and Boozer/NTX transport observables match centered finite differences |
| Performance and memory crossover maps | repeat the production grid on additional GPU nodes and add device-memory timelines for larger VMEC-family workloads |