Benchmarks
Benchmark reporting is meant to demonstrate discipline, not puffery. Public numbers are directional; reproduce them on your own hardware and workload before making any decision.
Public benchmark snapshot
- 5 canonical scenes: contact, constraints, vehicles, cloth, and fluid lanes
- 20% regression gate: current conservative CI threshold
- CSV / JSON / HTML: artifact trail published for review
- Reference machine noted: i7-12700K + RTX 3080 in the package overview
Canonical scene sweep
| Scene | Scale | Avg step | Max step | Why it matters |
|---|---|---|---|---|
| Sphere Stack (10K) | 10,000 bodies | 5.234 ms | 6.123 ms | contact stress |
| Ragdoll Stack | 100 ragdolls / 500 bodies | 3.891 ms | 4.234 ms | constraint complexity |
| Vehicle Scene | 10 vehicles / 50 bodies | 2.456 ms | 2.890 ms | mixed constraints + friction |
| Cloth + Wind | 500–1000 particles | 0.500 ms | 0.700 ms | soft-body deformation lane |
| Fluid Spray | 10,000+ particles | 1.200 ms | 1.500 ms | particle persistence |
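The avg/max step figures above can be collected with a simple timing harness. The sketch below is illustrative only: `measure_steps` and the dummy workload are hypothetical stand-ins for the engine's actual world-step call and instrumentation.

```python
import time

def measure_steps(step_fn, n_steps=1000):
    """Time n_steps calls to step_fn; return (avg, max) step cost in ms."""
    samples = []
    for _ in range(n_steps):
        t0 = time.perf_counter()
        step_fn()
        samples.append((time.perf_counter() - t0) * 1000.0)
    return sum(samples) / len(samples), max(samples)

# Dummy workload standing in for a physics world step.
avg_ms, max_ms = measure_steps(lambda: sum(i * i for i in range(10_000)))
print(f"avg {avg_ms:.3f} ms, max {max_ms:.3f} ms")
```

Reporting both average and worst-case step cost, as the table does, matters because a smooth average can hide frame-time spikes.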
Public figures are taken from the reference package snapshot; re-run them on the target buyer's workload before any decision.
Artifact trail
- Baseline file: proof/EXPECTED_OUTPUTS/world_step_benchmark_baseline.json
- Trend history: proof/RUNS/world_step_benchmark_history.csv
- Trend snapshots: proof/RUNS/world_step_benchmark_trend_<timestamp>.json
- Current policy: conservative 20% regression threshold until CI variance is better characterized
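The 20% regression gate can be expressed as a comparison of current results against the stored baseline. This is a minimal sketch under assumed data shapes: the scene names and the flat scene-to-milliseconds mapping are illustrative, not the actual schema of the baseline JSON.

```python
THRESHOLD = 0.20  # conservative 20% regression gate

def check_regressions(baseline, current, threshold=THRESHOLD):
    """Return scenes whose avg step cost grew past the threshold vs. baseline."""
    failures = []
    for scene, base_ms in baseline.items():
        cur_ms = current.get(scene)
        if cur_ms is not None and cur_ms > base_ms * (1 + threshold):
            failures.append((scene, base_ms, cur_ms))
    return failures

# In CI, `baseline` would be loaded from the baseline JSON artifact.
baseline = {"sphere_stack_10k": 5.234, "ragdoll_stack": 3.891}
current = {"sphere_stack_10k": 6.500, "ragdoll_stack": 3.900}
print(check_regressions(baseline, current))
# → [('sphere_stack_10k', 5.234, 6.5)]  (6.500 > 5.234 * 1.2)
```

A CI job would fail the build when the returned list is non-empty, which is what makes the threshold a gate rather than a report.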
How the page should be read
- What is measured: world-step cost, scene scaling, and subsystem behavior across named workloads.
- What is not claimed: universal performance leadership across every machine, build mode, and scene type.
- Why this matters: the key buyer signal is repeatability, named scenes, stored artifacts, and a visible regression policy.