Python vs Rust vs GPU — Performance Evidence¶

Benchmark data from three tiers of the wetSpring pipeline:

  1. Python (numpy/scipy) — industry-standard baseline
  2. Rust (sovereign CPU) — wetSpring barracuda crate
  3. GPU (barraCuda WGSL) — consumer RTX via ToadStool dispatch

All benchmarks run on ironGate (i9-14900K, 96 GB DDR5, RTX 4070 / RTX 3090).

Data sources: benchmarks/results/python_baseline_latest.json, experiments/results/015_pipeline_benchmark/, experiments/results/016_gpu_pipeline_parity/


For other springs: load your own benchmark JSONs. The Python baseline script (scripts/python_baseline.py) generates the same JSON schema for any domain.

Hardware: Intel(R) Core(TM) i9-14900K
  CPU cores: 32, RAM: 96,269 MB
  GPU: NVIDIA GeForce RTX 5070, VRAM: 12,227 MB
  OS kernel: 6.12.10-76061203-generic

Benchmark timestamp: 2026-05-07T00:43:26
Python phases: 27

Python Baseline Timings¶

Per-operation timings from Python/NumPy/SciPy across diversity metrics, distance matrices, and ordination at varying input sizes.

No description has been provided for this image

Rust vs Galaxy Pipeline¶

Full 16S pipeline comparison: sovereign Rust vs Galaxy/QIIME2. 22 samples, 3.9M reads through the complete pipeline.

Pipeline: wetSpring Pipeline — Rust CPU vs Galaxy/QIIME2
Date: 2026-02-19
Hardware: i9-12900K, 64 GB DDR5, RTX 4070, Pop!_OS 22.04

Rust Pipeline:
  Samples: 22
  Total reads: 3,912,846
  ASVs: 14,010
  Wall time: 34447.5s
  Energy: 1.195990 kWh

Galaxy/QIIME2 Pipeline:
  exp001: 20 samples, 124,249 reads, 71.5s
  exp002: 10 samples, 820,548 reads, 95.6s
  Per sample: 9.56s
  Energy: 0.007303 kWh
No description has been provided for this image

GPU Acceleration¶

CPU vs GPU parity on the 16S math pipeline. The GPU path delegates to barraCuda via ToadStool — zero local WGSL. 1,077x speedup for spectral cosine matching at production scale.

Experiment: 016_gpu_pipeline_parity
Tolerance: 1e-06
Samples: 10

CPU total: 7525.2 ms
GPU total: 3438.4 ms
Speedup:   2.19x

Note: This is the pipeline-level speedup. Individual operations
like spectral cosine matching show 1,077x on larger datasets.
No description has been provided for this image

Summary¶

Tier Substrate Pipeline Time Energy Parity
Python numpy/scipy baseline baseline reference
Rust CPU wetSpring barracuda varies by stage measured machine epsilon
GPU barraCuda WGSL via ToadStool 2.19x pipeline, 1,077x spectral lower tolerance 1e-6

The three-tier validation pattern (Python baseline -> Rust parity -> GPU acceleration) was pioneered in wetSpring and adopted across all 8 springs.


Source: syntheticChemistry/wetSpring | Live results: primals.eco/lab