SMA Research Platform

Evidence graph for Spinal Muscular Atrophy

Biology-first target discovery
Christian Fischer / Bryzant Labs
Targets 1,145 · Trials 453 · Drugs 60 · Datasets 7 · Sources 34,514 · Claims 43,071 · Evidence 46,973 · Hypotheses 29,625
computation · Apr 27, 2026 · SMA Research Platform

Three-backend GPU fleet live: NIM + Dell + DGX Spark GB10

#compute #benchmarks #gpu #spark #dell #nim #infrastructure #phase-2

Three-backend research compute fleet now live

The SMA research platform now operates three independent GPU compute backends in parallel, each running pharma-relevant workloads continuously:

Active backends (2026-04-27)

| Backend | Hardware | Workload | Status |
| --- | --- | --- | --- |
| NIM API (cloud) | NVIDIA hosted | Boltz-2 + ESMFold + MolMIM saturator | 162 calls/5 min, 86% OK |
| Dell Demo Center (free 90-day) | RTX Pro 6000 Blackwell, 96 GB | Boltz-2 PPI saturator | 4,200+ pair predictions, 82% GPU util |
| NVIDIA DGX Spark GB10 (owned) | Grace Blackwell, 128 GB unified, sm_121 | Chai-1 ligand saturator + Qwen 35B local LLM | 21+ predictions completed, ~33/hour throughput |

Hardware comparison — measured benchmarks

Running the same workloads on each backend produced apples-to-apples timing data, now published at /infrastructure/gpu-benchmark:

| Workload | Spark GB10 (48 SMs) | Dell RTX Pro 6000 (188 SMs) |
| --- | --- | --- |
| LLM tokens/sec (Qwen 35B Q8) | 49.8 t/s | 186.6 t/s (3.7× faster) |
| Matmul BF16 8192² | 99.9 TFLOPS | 395.4 TFLOPS (4.0×) |
| Memory ceiling | 122 GB unified | 96 GB GDDR |

Trade-off: Dell wins on raw compute throughput; Spark wins on memory capacity for larger models (e.g. a 140 GB DeepSeek V4-Flash MoE doesn't fit in Dell's 96 GB).
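The speedup figures in the table follow directly from the measured numbers. A minimal sketch of the arithmetic (the timing-to-TFLOPS helper is illustrative, not the actual benchmark harness):

```python
# FLOP count for an N x N times N x N matmul: 2 * N^3 (multiplies + adds).
N = 8192
flops = 2 * N**3  # ~1.1e12 FLOPs per BF16 8192^2 matmul

def tflops(seconds: float) -> float:
    """Convert a measured wall-clock time for one matmul into TFLOPS."""
    return flops / seconds / 1e12

# Measured throughput figures from the benchmark table:
spark_tflops, dell_tflops = 99.9, 395.4
spark_llm, dell_llm = 49.8, 186.6  # Qwen 35B Q8 tokens/sec

print(f"matmul speedup: {dell_tflops / spark_tflops:.1f}x")  # 4.0x
print(f"LLM speedup:    {dell_llm / spark_llm:.1f}x")        # 3.7x
```

Note that the LLM speedup (3.7×) trails the raw matmul speedup (4.0×): decode-phase inference is partly memory-bandwidth-bound, so it does not scale purely with SM count.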

Pharma-grade methodology decisions

  • Boltz-2 vs Chai-1 dual deployment: Boltz-2 hangs on Spark sm_121 with PyTorch 2.11+cu130 (Lightning Predict phase blocked); Chai-1 0.6.1 works natively on the same hardware. Spark gets Chai-1 and Dell stays the primary Boltz-2 host; the two peer-reviewed methodologies provide complementary scoring.
  • BindCraft dual-mode: Spark runs design + AF2-confidence filtering (Bennett 2023 methodology); Dell runs full BindCraft including PyRosetta scoring (PyRosetta ships no aarch64 wheel, so it cannot run on Spark's ARM-based Grace CPU).
  • RFdiffusion deprecated in favor of BindCraft 1.5 (Pacesa, Nature 2025: 10–100% binder success vs RFdiffusion's 1–10%).

Storage architecture (Phase 2 complete)

The canonical research data root has migrated from moltbot (storage-constrained) to Spark's /data/research-data/ (3.6 TB capacity, 215 GB hosted so far). The automated Dropbox cloud mirror remains in place, and a daily migrator pulls the fleet-results staging area into the canonical tree.
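The daily migrator can be sketched as a one-way, idempotent copy from staging into the canonical tree. A minimal version, assuming a hypothetical staging path (the real staging layout is not specified in this post):

```python
import shutil
from pathlib import Path

STAGING = Path("/data/fleet-results-staging")  # assumed staging root (hypothetical)
CANONICAL = Path("/data/research-data")        # canonical root, per this post

def migrate(staging: Path = STAGING, canonical: Path = CANONICAL) -> int:
    """Copy new files from staging into the canonical tree; return the count."""
    copied = 0
    for src in staging.rglob("*"):
        if not src.is_file():
            continue
        dst = canonical / src.relative_to(staging)
        if dst.exists():
            continue  # idempotent: never overwrite canonical data
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # preserve timestamps for provenance
        copied += 1
    return copied
```

Because existing canonical files are never overwritten, re-running the migrator daily is safe even if a fleet run is still writing into staging.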

Next milestones

  • Fourth-backend evaluation: Modal.com's $30/mo free serverless tier as a successor once Dell Demo Center access ends on 2026-07-22
  • TurboQuant KV-cache compression (Google ICLR 2026) integration when llama.cpp PR lands
  • Cross-backend orthogonal validation pipeline (3-LLM consensus rule for any external claim)

All numbers are measured runs, not vendor specs. Source code + raw benchmark data: /api/v2/infrastructure/gpu-roi.
