# Three-backend GPU fleet live: NIM + Dell + DGX Spark GB10
The SMA research platform now operates three independent GPU compute backends in parallel, each running pharma-relevant workloads continuously:
## Active backends (2026-04-27)
| Backend | Hardware | Workload | Status |
|---|---|---|---|
| NIM API (cloud) | NVIDIA hosted | Boltz-2 + ESMFold + MolMIM saturator | 162 calls/5 min, 86% OK |
| Dell Demo Center (free 90-day) | RTX Pro 6000 Blackwell 96 GB | Boltz-2 PPI saturator | 4,200+ pair predictions, 82% GPU util |
| NVIDIA DGX Spark GB10 (owned) | Grace Blackwell 128 GB unified, sm_121 | Chai-1 ligand saturator + Qwen 35B local LLM | 21+ predictions completed, ~33/hour throughput |
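For cross-backend comparison it helps to normalize the status figures above to a common rate. A minimal sketch; the helper name is made up:

```python
def effective_per_hour(calls: int, window_min: float, ok_frac: float) -> float:
    """Successful calls per hour, from a windowed call count and success rate."""
    return calls * (60.0 / window_min) * ok_frac

# NIM saturator: 162 calls per 5-minute window at 86% OK
nim_rate = effective_per_hour(162, 5, 0.86)  # ~1,672 OK calls/hour
```

At roughly 1,670 successful calls/hour the cloud NIM path runs far hotter than the Spark ligand saturator's ~33 predictions/hour, though the two workloads are not directly comparable.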
## Hardware comparison: measured benchmarks
Running the same workloads on each backend produced apples-to-apples timing data, now published at /infrastructure/gpu-benchmark:
| Workload | Spark GB10 (48 SMs) | Dell RTX Pro 6000 (188 SMs) |
|---|---|---|
| LLM tokens/sec (Qwen 35B Q8) | 49.8 t/s | 186.6 t/s (3.7× faster) |
| Matmul BF16 8192² | 99.9 TFLOPS | 395.4 TFLOPS (4.0×) |
| Memory ceiling | 122 GB unified | 96 GB GDDR |
Trade-off: Dell wins on raw compute throughput; Spark wins on memory capacity for larger models (e.g. the 140 GB DeepSeek V4-Flash MoE doesn't fit in Dell's 96 GB).
## Pharma-grade methodology decisions
- Boltz-2 vs Chai-1 dual deployment: Boltz-2 hangs on Spark's sm_121 with PyTorch 2.11+cu130 (blocked in the Lightning predict phase), while Chai-1 0.6.1 runs natively on the same hardware. Spark therefore runs Chai-1 and Dell stays the primary Boltz-2 host; both peer-reviewed methodologies provide complementary scoring.
- BindCraft dual-mode: Spark runs design plus the AF2-confidence filter (Bennett 2023 methodology); Dell runs full BindCraft including PyRosetta scoring, since no aarch64 PyRosetta wheel exists for Spark's Grace CPU.
- RFdiffusion deprecated in favor of BindCraft 1.5 (Pacesa, Nature 2025): 10–100% binder success rates vs RFdiffusion's 1–10%.
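The routing implied by these decisions can be written down as a static table. A sketch; the registry format and names are hypothetical, but the assignments come straight from the bullets above:

```python
# Workload -> backend assignments from the methodology decisions above.
ROUTES = {
    "boltz2": "dell",             # Boltz-2 hangs on Spark sm_121
    "chai1": "spark",             # Chai-1 0.6.1 runs natively on GB10
    "bindcraft-design": "spark",  # design + AF2-confidence filter only
    "bindcraft-full": "dell",     # PyRosetta scoring: no aarch64 wheel
}

def route(workload: str) -> str:
    """Return the assigned backend, or fail loudly on unknown workloads."""
    try:
        return ROUTES[workload]
    except KeyError:
        raise ValueError(f"no backend assignment for {workload!r}")
```

Keeping the table explicit (rather than deriving it from capability probes) makes each routing decision auditable against the methodology notes.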
## Storage architecture (Phase 2 complete)
The canonical research-data root migrated from the constrained moltbot host to Spark's /data/research-data/ (3.6 TB capacity, 215 GB hosted so far). The automated Dropbox cloud mirror remains in place, and a daily migrator pulls the fleet-results staging area into the canonical tree.
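A minimal version of such a staging-to-canonical migrator; this is a hedged sketch under the assumption that the real tool preserves relative paths and never overwrites canonical data (any checksum or dedup logic is omitted):

```python
import shutil
from pathlib import Path

def migrate(staging: Path, canonical: Path) -> int:
    """Move files from the staging tree into the canonical tree,
    preserving relative paths; skip files that already exist."""
    moved = 0
    for src in staging.rglob("*"):
        if not src.is_file():
            continue
        dst = canonical / src.relative_to(staging)
        if dst.exists():
            continue  # never clobber canonical data
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), dst)
        moved += 1
    return moved
```

Because files are moved rather than copied, a second daily run over an unchanged staging tree is a no-op.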
## Next milestones
- 4th backend evaluation: Modal.com's serverless tier ($30/mo free credits) as the successor once the 90-day Dell access ends on 2026-07-22
- TurboQuant KV-cache compression (Google, ICLR 2026): integrate once the llama.cpp PR lands
- Cross-backend orthogonal validation pipeline (3-LLM consensus rule for any external claim)
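One simple form of the 3-LLM consensus rule is majority voting. A sketch assuming a 2-of-3 quorum; the post doesn't specify the threshold, so `quorum` is an assumption:

```python
from collections import Counter

def consensus(verdicts: list[bool], quorum: int = 2) -> bool:
    """Accept an external claim only if at least `quorum` of the
    three independent LLM verdicts say it is supported."""
    assert len(verdicts) == 3, "rule is defined over exactly 3 LLMs"
    return Counter(verdicts)[True] >= quorum
```

Raising `quorum` to 3 gives a stricter unanimity rule at the cost of rejecting more true claims.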
All numbers are from measured runs, not vendor specs. Source code and raw benchmark data: /api/v2/infrastructure/gpu-roi.