SMA Research Platform

Evidence graph for Spinal Muscular Atrophy

Biology-first target discovery
Christian Fischer / Bryzant Labs
1,145 Targets
453 Trials
60 Drugs
7 Datasets
34,514 Sources
43,071 Claims
46,973 Evidence
29,625 Hypotheses
discovery · Apr 11, 2026 · SMA Research Platform

50 Orphan MD Trajectories Analyzed — LIMK2 Pipeline Validated, 4-AP SMN2 Pocket Rediscovered, CFL2 Claim Retracted

#2026-04-10 #MD #MDAnalysis #orphan-analysis #topology-bug #retraction #scientific-integrity #LIMK2 #SMN2 #CFL2

TL;DR

50 completed MD trajectories (47.8 GB total) had been sitting on the cluster without analysis. We ran MDAnalysis-based backbone/ligand/contact/pocket-retention analysis on 44 of them (the other 6 were missing topology files), using a single CPU pass per trajectory. The results forced one retraction, validated one pipeline, and recovered a hidden positive that earlier MD metadata had wrongly marked as negative.

Three headline findings

1. 4-AP + SMN2 Tudor was a hidden positive — topology atom-count artifact hid the binding

The April 2 4-AP + SMN2 holo MD finished with metadata claiming binding_contacts: [], that is, zero stable contacts over 18.5 ns. We interpreted this as a strong negative: "4-AP does not bind SMN2." That interpretation was wrong. The orphan analysis re-ran the trajectory with a topology-fix step. The topology PDB and the DCD disagreed in size (140,793 versus 140,658, reported as a 405-atom mismatch, which is consistent with 135 three-atom waters), and the mismatch caused the ligand atom selection to silently return empty. With the fix:

  • 4-AP engaged 100% of frames over 18.5 ns
  • Pocket Cα distance 4.6 Å throughout — ligand never leaves
  • Top contacts: PRO268 (92%), VAL413 (92%), ASN270 (92%), SER271 (89%), PHE266 (81%), VAL267 (81%), ILE269 (74%), TYR657 (63%)
  • Verdict: WEAK_BINDER (engaged, but protein flexibility is high)
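
The percentages above are contact-persistence values: the fraction of frames in which a residue sits within a distance cutoff of the ligand. A minimal sketch of that calculation follows, in plain Python so it runs without MDAnalysis. The 4.0 Å cutoff, the frame layout, and the function name are assumptions for illustration, not the settings used in analyze_orphan_trajectory.py. The sketch also ignores periodic images.

```python
from math import dist


def contact_persistence(frames, cutoff=4.0):
    """Return {residue: fraction of frames in contact with the ligand}.

    frames: list of dicts with
        "ligand":   list of (x, y, z) ligand atom positions
        "residues": {residue_name: list of (x, y, z) atom positions}
    cutoff: contact distance in angstroms (assumed value, not from the post).
    """
    if not frames:
        raise ValueError("no frames to analyze")

    hits = {}
    for index, frame in enumerate(frames):
        ligand = frame["ligand"]
        # An empty ligand selection must be an error, never "zero contacts".
        if not ligand:
            raise ValueError(f"frame {index}: ligand selection is empty")
        for name, atoms in frame["residues"].items():
            in_contact = any(
                dist(atom, lig_atom) <= cutoff
                for atom in atoms
                for lig_atom in ligand
            )
            hits[name] = hits.get(name, 0) + (1 if in_contact else 0)

    return {name: count / len(frames) for name, count in hits.items()}
```

Raising on an empty ligand selection is the important design choice here: it makes the failure mode described in this post loud instead of producing an empty contact list.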

Cross-connection: Riluzole on SMN2 (SMN2_Riluzole_holo) binds the SAME pocket (GLY294, SER271, VAL272, CYS658, PRO268, TYR657 — shared with 4-AP at PRO268, SER271, TYR657). Two structurally different compounds, same pocket → this is a real druggable site, not a co-solvent artifact.

2. LIMK2 pipeline validated by reference compound contact persistence

Several LIMK2 reference compound MDs (BMS-5, LIMKi3) showed clean stable contact patterns at the predicted ATP-pocket residues, validating the structure-based scoring + POCKET_FIXED placement protocol our active LIMK2-selective screen depends on.

3. CFL2 + 4-AP cross-connection RETRACTED

The April 10 CROSS_CONNECTIONS document (Insight 1) claimed a 4-AP + CFL2 simulation existed and connected the LIMK2-selective campaign to the 4-AP campaign through a shared CFL2 readout. The orphan analysis revealed that CFL2_gpu33887147.dcd is actually an APO CFL2 simulation (35,150 atoms = protein + solvent only, no ligand). The claimed 4-AP + CFL2 MD never happened. Insight 1 is retracted. A revised insight (the SMN2 pocket shared with Riluzole, finding #1 above) replaced it.
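
A check of this kind (is there any ligand in the system at all?) can be run directly on a topology PDB before a trajectory is labeled holo. The sketch below lists residue names that are neither protein nor solvent/ion. The residue-name sets and the function name are assumptions for illustration and would need adjusting to the force field in use. An empty result suggests an apo system.

```python
# Residue names treated as protein (standard amino acids plus common
# protonation-state and capping variants). Assumed list, not exhaustive.
PROTEIN = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
    "HID", "HIE", "HIP", "HSD", "HSE", "HSP", "CYX", "ACE", "NME",
}
# Residue names treated as solvent or ions. Also an assumed list.
SOLVENT_AND_IONS = {
    "HOH", "WAT", "TIP3", "SOL", "NA", "CL", "K", "SOD", "CLA", "POT",
    "Na+", "Cl-", "K+",
}


def candidate_ligand_residues(pdb_lines):
    """Return the set of residue names that are neither protein nor solvent."""
    names = set()
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break  # only inspect the first model
        if line.startswith(("ATOM", "HETATM")):
            # Columns 18-21 hold the residue name (4 wide to allow TIP3).
            resname = line[17:21].strip()
            if resname not in PROTEIN and resname not in SOLVENT_AND_IONS:
                names.add(resname)
    return names
```

If `candidate_ligand_residues` returns an empty set for a file that is supposed to contain a ligand, the trajectory should be treated as apo until proven otherwise.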

The topology atom-count learning

The 4-AP SMN2 false negative was caused by a silent atom-count mismatch between the topology PDB and the DCD trajectory. The ligand selection came back empty, no error was raised, and the contact analysis read the empty result as "no binding." This is a class of bug that turns positives into negatives without any error message. Hard rule going forward:

Before writing any "no binding" or "unstable" conclusion from an MD trajectory, verify topology atom count matches DCD atom count, verify the ligand selection is non-empty, and cross-check surprising negatives with a second analysis.
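
The first two checks in this rule can be automated as a pre-flight guard. The sketch below uses only the standard library: it reads the atom count from a CHARMM/NAMD-style DCD header (32-bit record markers assumed), counts ATOM/HETATM records in the topology PDB, and refuses to continue on a mismatch or an empty ligand. The function names and the ligand residue name passed in are hypothetical, and this is not the code in the linked learning document.

```python
import struct


def dcd_atom_count(path):
    """Read the atom count from a CHARMM/NAMD-style DCD header."""
    with open(path, "rb") as fh:
        first = fh.read(4)
        if len(first) != 4:
            raise ValueError("file too short to be a DCD")
        # The first record marker is 84; its byte order gives the endianness.
        if struct.unpack("<i", first)[0] == 84:
            endian = "<"
        elif struct.unpack(">i", first)[0] == 84:
            endian = ">"
        else:
            raise ValueError("unrecognised DCD header")
        if fh.read(84)[:4] != b"CORD":
            raise ValueError("missing CORD magic")
        fh.seek(4, 1)  # closing marker of the header record
        (title_size,) = struct.unpack(endian + "i", fh.read(4))
        fh.seek(title_size + 4, 1)  # title payload plus closing marker
        marker, n_atoms, _ = struct.unpack(endian + "3i", fh.read(12))
        if marker != 4:
            raise ValueError("unexpected atom-count record")
        return n_atoms


def pdb_atom_counts(path, ligand_resname):
    """Return (total atoms, ligand atoms) for the first model of a PDB."""
    total = ligand = 0
    with open(path) as fh:
        for line in fh:
            if line.startswith("ENDMDL"):
                break
            if line.startswith(("ATOM", "HETATM")):
                total += 1
                if line[17:21].strip() == ligand_resname:
                    ligand += 1
    return total, ligand


def preflight(topology_pdb, trajectory_dcd, ligand_resname):
    """Raise ValueError unless atom counts match and the ligand is present."""
    top_atoms, lig_atoms = pdb_atom_counts(topology_pdb, ligand_resname)
    traj_atoms = dcd_atom_count(trajectory_dcd)
    if top_atoms != traj_atoms:
        raise ValueError(
            f"topology has {top_atoms} atoms, trajectory has {traj_atoms}"
        )
    if lig_atoms == 0:
        raise ValueError(f"no atoms with residue name {ligand_resname!r}")
    return {"atoms": top_atoms, "ligand_atoms": lig_atoms}
```

In an MDAnalysis script, the second check corresponds to asserting that the ligand AtomGroup returned by `select_atoms` is non-empty before any contact analysis runs.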

Full learning published at: https://github.com/Bryzant-Labs/sma-research/blob/main/docs/learnings/topology_atom_count.md

Method

  • analyze_orphan_trajectory.py — single-pass MDAnalysis 2.10 analysis (Kabsch protein RMSD, minimum-image PBC ligand-pocket distance, contact persistence, energy drift)
  • batch_analyze_orphans.py — runner with topology hint map
  • fix_topology_atoms.py — strips tail waters from topology to match DCD atom count
  • 44/50 trajectories analyzed (6 missing topology files)
  • Compute: 8-core CPU, 1-60 s per trajectory

Where the data lives

Why this matters

Three scientific principles, all enforced in this single analysis run:

  1. Don't waste compute — completed runs that never get analyzed are scientific waste
  2. Verify negative results twice — a topology bug nearly closed two real discoveries
  3. Publish retractions with the same rigor as discoveries — Insight 1 is retracted publicly, not silently fixed

CC-BY-4.0. All artifacts open-source.
