
Accelerating SMA Drug Discovery Through Computational Science

We combine molecular screening, evidence analysis, and computational biology to identify and validate new therapeutic candidates for SMA. Our lead finding, the ROCK-LIMK2-CFL2 axis, is supported by 5 of 6 independent research streams.

Search the Evidence Graph
ROCK-LIMK2-CFL2 axis · Nusinersen vs Risdiplam · SMN-independent targets · Actin rod pathway
Computational Discovery Pipeline
Literature: PubMed · bioRxiv · Patents
Evidence Extraction: LLM claim mining
Target Scoring: 8-dimension ranking
Hypotheses: Falsifiable · Tier A/B/C
Virtual Screening: DiffDock v2.2
Drug Design: GenMol · RFdiffusion
Experiment Proposal: Go / No-Go criteria
Drug Screening Funnel

Latest Discoveries


Research Directions

EXPLORATORY 16 active directions

Research directions under active exploration — from spatial multi-omics to engineered probiotics. Each direction connects to specific molecular targets and therapeutic modalities.

Experimental Validation Plan

Priority experiments to validate our computational discoveries. Each phase has explicit go/no-go gates.

Priority 1 — Go/No-Go Gate
Riluzole → SMN2 Binding (SPR) + CORO1C Expression After HDAC Inhibition (qRT-PCR)
SPR for the riluzole-SMN2 interaction (the only validated hit from the 4,116-docking screen). HDAC inhibitor treatment followed by CORO1C expression measurement, to test epigenetic activation of this protective modifier.
Riluzole: FDA-approved, Phase 1 data exists | CORO1C: expression enhancement strategy
Priority 2 — Parallel Track
Riluzole → SMN2 Binding Validation
SPR for riluzole-SMN2 interaction. FDA-approved ALS drug with prior SMA Phase 1 data (PMID: 14623733).
Clinical repurposing path if confirmed
Priority 3 — Tissue Analysis
CORO1C Expression in SMA Mouse (L1 vs L5)
IHC staining of CORO1C in SMA mouse spinal cord. Compare vulnerable (L1) vs resistant (L5) segments.
Collaboration opportunity
Priority 4 — Computational (Done)
Orthogonal Docking Consensus
DONE — Vina consensus confirmed 4-AP/CORO1C binding. 20-pose validation: only riluzole confirmed as reproducible hit.
56 initial hits → 1 validated (riluzole)
Phase 1 (Month 1–3): Computational cross-validation + SPR binding
Phase 2 (Month 3–6): iPSC-MN functional studies
Phase 3 (Month 6–12): SMA mouse model validation
Full validation plan on GitHub — 12 experiments, 5 discoveries, grant opportunities
Calibration Grade: A (89.8%, 227 outcomes)
DiffDock Dockings: 4,116 (630 compounds × 7 targets)
ESM-2 Embeddings: 15 (similarity matrix + contacts)
Variant Predictions: 9/9 (SMN1 mutations correct)
Frontier Approaches
Spatial Multi-Omics
"Google Maps of Muscle"
Slide-seq and MERFISH spatial transcriptomics to map motor neuron vulnerability at single-cell resolution. Identify which cells die first and why.
SMN1 STMN2 NMJ LIVE
NMJ-on-a-Chip
Retrograde muscle-to-nerve signaling
Microfluidic neuromuscular junction models to study retrograde signaling from muscle to nerve. Test whether muscle-derived factors can rescue motor neurons.
NMJ PLS3 ECM LIVE
Bioelectric Reprogramming
Michael Levin, Vmem manipulation
Membrane voltage (Vmem) manipulation to reprogram cell fate. Levin Lab showed bioelectric patterns control regeneration and morphogenesis.
mTOR CD44 LIVE
Epigenetic Dimming
dCas9/CRISPRi without DNA cuts
Dead Cas9 (dCas9) fused to epigenetic modifiers to silence disease-promoting genes without making permanent DNA breaks. Reversible gene regulation.
DNMT3B SMN2 LIVE
DUBTACs
Protein stabilization via deubiquitination
Deubiquitinase-targeting chimeras to stabilize SMN protein by preventing its degradation. The inverse of PROTACs — protect instead of destroy.
SMN Protein UBA1 NEDD4L Exploring
Cross-Species & Evolutionary
Bear Hibernation
Muscle preservation during torpor
Bears maintain muscle mass during months of immobility. Understanding their anti-atrophy mechanisms (NEDD4L suppression, amino acid recycling) could translate to SMA.
NEDD4L mTOR CAST Exploring
NDRG1 / Cell Dormancy
Zebrafish atrofish model
NDRG1 enables cells to enter protective dormancy. The zebrafish atrofish model shows muscle wasting similar to SMA — dormancy pathways may rescue dying motor neurons.
SPATA18 LDHA Exploring
Cross-Species Regeneration
c-Fos/JunB molecular switch
Axolotls and zebrafish regenerate motor neurons via a c-Fos/JunB transcriptional switch. Reactivating these pathways in human motor neurons could promote repair.
STMN2 ANK3 LIVE
Naked Mole Rat
HMM-HA cytoprotection via CD44
Naked mole rats produce high-molecular-mass hyaluronic acid (HMM-HA) that signals through CD44 for extraordinary cytoprotection. Could this shield motor neurons?
CD44 SULF1 CTNNA1 Exploring
Disease Biology
SMA Multisystem
Liver metabolism, Lee Rubin Harvard
SMA is not just a motor neuron disease. Liver metabolic defects, fatty acid oxidation disruption, and pancreatic dysfunction contribute to pathology beyond the spinal cord.
LDHA SMN Protein mTOR LIVE
ECM Engineering
Fibrosis reversal, NMJ stability
Extracellular matrix remodeling drives fibrosis at the neuromuscular junction. Engineering the ECM microenvironment could restore NMJ stability and synaptic transmission.
SULF1 GALNT6 NMJ LIVE
Cross-Disease Learning
ALS, DMD, SBMA shared pathways
ALS, DMD, and SBMA share motor neuron and muscle pathology with SMA. Breakthroughs in one disease may accelerate drug discovery in others through shared molecular targets.
STMN2 UBA1 NCALD Exploring
Unconventional
RNA Decoy / Sponge
hnRNP A1 sequestration
Engineered RNA decoys to sequester hnRNP A1, the splicing repressor that causes SMN2 exon 7 skipping. Soak up the enemy to let SMN2 produce full-length protein.
SMN2 SMN1 LIVE
Mitochondrial Overdrive
PGC-1alpha bioenergetic rescue
PGC-1alpha activation to boost mitochondrial biogenesis and rescue the bioenergetic deficit in SMA motor neurons. Address the energy crisis driving cell death.
SPATA18 LDHA mTOR Exploring
Engineered Probiotics
Gut-brain axis, butyrate HDAC
Engineered probiotic bacteria producing butyrate (HDAC inhibitor) to increase SMN2 expression via the gut-brain axis. Oral, non-invasive, continuous delivery.
SMN2 DNMT3B LY96 Exploring
Mechanotransduction
Vibration-activated HSP
Low-frequency mechanical vibration activates heat shock proteins (HSPs) that stabilize misfolded proteins and protect cells. Non-pharmacological intervention for muscle preservation.
CORO1C CTNNA1 PLS3 Exploring
Warp-Speed Vision
"GitHub for Life"
Gene edit versioning + biological embeddings
Treating DNA sequences like code. Every SMN2 splice variant gets a commit hash. ESM-2 and ProtT5 protein language models predict how mutations cascade through protein folding.
SMN2 SpliceAI ESM-2 LIVE
Agentic Research Swarm
Blackboard architecture, autonomous discovery
A swarm of AI agents: bioRxiv scanner, molecule screener, simulation coder, hypothesis generator. They communicate via a blackboard architecture, compressing years of research into weeks.
bioRxiv ChEMBL Claude LIVE
Digital Twin: Motor Neuron
In silico drug screening at scale
A computational model of the SMA motor neuron metabolism. Test 1 million drug combinations in silico per night. Only the top 3 go to the real lab. Engineering, not lottery.
GEO STRING Proteomics LIVE
OpenSMA-Engine
Open-source datasets + models on HuggingFace
Publishing cleaned SMA datasets, fine-tuned protein models, splice variant benchmarks, and drug-likeness filters as open-source community resources on HuggingFace and GitHub.
HuggingFace RDKit ProtT5 LIVE

Targets

VALIDATED DATA

Genes, proteins, and pathways implicated in SMA pathogenesis, scored across multiple evidence dimensions. The platform tracks 58 molecular targets across three tiers:

  • Primary targets: SMN1 and SMN2, the causal genes of SMA, where loss of full-length SMN protein drives motor neuron degeneration.
  • Established modifier targets: STMN2 (axonal maintenance), PLS3 and NCALD (natural protective modifiers identified in asymptomatic SMN1-deletion carriers), UBA1 (ubiquitin pathway), and CORO1C (actin dynamics).
  • Discovery targets: recently identified through multi-omics convergence analysis and cross-disease research, including PFN1 (profilin, SMA-ALS convergence), CFL2 (cofilin, actin rod formation), ROCK2 (Rho-kinase, druggable with fasudil), and TP53 (p53-mediated motor neuron death).

Each target is scored on evidence depth, source diversity, druggability, and clinical validation.

How does target convergence scoring work?

Each target scored across 5 independent evidence dimensions:

  • Claim Volume — raw evidence mass (distinct assertions)
  • Lab Independence — unique research groups (guards against single-lab bias)
  • Method Diversity — in vitro, animal, patient, computational (cross-validation)
  • Temporal Trend — growing, stable, or declining evidence over recent years
  • Replication — independently confirmed findings across studies and models

Composite score: weighted average (0–100), weights fully transparent.

Tier assignment: Top 5 = Tier A (high-conviction), 6–15 = Tier B, rest = Tier C.

Bayesian calibration: scores validated against known drug outcomes (Grade A: 89.8% concordance).

Source: convergence_engine.py

Symbol | Name | Type | Identifiers | Description

Clinical Trials

VALIDATED DATA

SMA clinical trials aggregated from ClinicalTrials.gov via the v2 API with automated daily refresh. Covers all interventional and observational studies related to spinal muscular atrophy — from early Phase 1 safety trials through Phase 3 efficacy studies and post-marketing surveillance. Each trial entry includes NCT identifier, phase, enrollment, status, intervention type, primary/secondary outcome measures, and where available, published results with adverse events and participant flow data. Use the filters to explore by phase, status, intervention type, or keyword.

NCT ID | Title | Phase | Status | Sponsor | N

Drugs & Therapies

VALIDATED DATA

Approved SMA therapies and pipeline candidates tracked with mechanism of action, clinical status, and computational screening data. Three FDA/EMA-approved treatments target the SMN pathway directly: nusinersen (antisense oligonucleotide, intrathecal), risdiplam (small molecule splicing modifier, oral), and onasemnogene abeparvovec (AAV9 gene therapy, one-time IV). Pipeline drugs explore complementary approaches including muscle-enhancing (apitegromab, anti-myostatin), neuroprotective (fasudil, ROCK inhibition), HDAC-mediated SMN2 upregulation (panobinostat, vorinostat), and pathway-corrective strategies targeting actin dynamics, p53-mediated apoptosis, and axonal transport. Each drug entry links to DiffDock virtual screening results where available.

Name | Brand | Type | Status | Mechanism

Literature

VALIDATED DATA

PubMed papers and patent literature ingested via automated daily pipeline, each scanned by AI (Gemini Flash with Groq fallback) for structured claims about SMA biology, molecular targets, and therapeutic approaches. The ingestion pipeline runs daily at 03:00 UTC, querying PubMed, bioRxiv/medRxiv preprints, ClinicalTrials.gov, and Google Patents. Each abstract passes a two-layer quality filter: first an SMA-relevance gate (must mention SMA, SMN, motor neuron, or approved therapy names), then a post-extraction quality gate that rejects claims about unrelated diseases. Sources are linked to their extracted claims — use the "With claims" filter to see which papers have been processed.

How does paper quality scoring work?

Two-layer quality assessment: Every paper receives both a quantitative score (automated metrics) and a qualitative red-flag analysis (AI-powered logic checks).

Layer 1 — Quantitative Score (0-100): Computed automatically via OpenAlex and Crossref APIs.

  • Citation impact (30 pts) — log-scaled, capped at 500 citations
  • Author h-index (20 pts) — best h-index from first/last author
  • Journal presence (15 pts) — peer-reviewed journal = 15, preprint = 5
  • Recency (15 pts) — papers <3 years = full, decays over time
  • Collaboration (10 pts) — multi-author studies score higher
  • Retraction check — retracted papers score 0 (via Crossref)

Layer 2 — Qualitative Red Flags: Each abstract is analyzed by both rule-based pattern matching and an LLM (Gemini Flash) for scientific quality issues.

  • Logical contradictions — conclusion directly contradicts the presented data (e.g. claims muscle-specific effect but measures only spinal cord)
  • Vehicle/control improvement — control group shows same improvement as treatment, undermining the drug effect
  • No blinding — only flagged for drug intervention studies, not basic neuroscience
  • Dose-response paradox — higher doses show less effect without explanation
  • Small sample size — fewer than 5 subjects per experimental group
  • No independent replication — strong therapeutic claims >10 years old with no clinical follow-up
  • Overselling language — words like 'dramatic', 'remarkable' when actual numbers are modest

Composite Score:

Penalty = (High-severity flags × 15) + (Medium × 5) + (Low × 2)
Adjusted Score = max(5, Quantitative Score - Penalty)
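
As a hedged illustration, the composite formula above can be written as a small Python function. The function and parameter names are hypothetical and are not taken from paper_red_flags.py or source_quality.py; only the multipliers (15, 5, 2) and the floor of 5 come from the text.

```python
def adjusted_score(quantitative: float, high: int = 0, medium: int = 0, low: int = 0) -> float:
    """Apply the red-flag penalty to a Layer 1 quantitative score.

    `high`, `medium`, and `low` are counts of red flags by severity
    (hypothetical parameter names chosen for this sketch).
    """
    penalty = high * 15 + medium * 5 + low * 2
    # The adjusted score never drops below 5, however many flags are raised.
    return max(5, quantitative - penalty)


# Illustrative inputs only: one high, two medium, and one low flag.
example = adjusted_score(72, high=1, medium=2, low=1)
```

With these made-up inputs the penalty is 15 + 10 + 2 = 27, so a quantitative score of 72 becomes 45. A heavily flagged paper bottoms out at the floor of 5 instead of going negative.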

Calibration principles:

  • Basic science (electrophysiology, histology, imaging) does NOT require blinding — only drug studies
  • Normal scientific reasoning is NOT a logic flag — only clear contradictions
  • Review papers should have very few flags
  • Conservative flagging: when in doubt, do not flag

Quality scoring methodology is fully open source: paper_red_flags.py | source_quality.py

PMID | Title | Journal | Date | Claims

Omics Datasets

VALIDATED DATA

Curated omics datasets for SMA research. Tier 1 datasets are directly usable for motor neuron vulnerability analysis; Tier 2-3 require additional QC or serve as validation.

Accession | Title | Modality | Organism | Tissue | Tier

Extracted Claims

VALIDATED DATA

Structured scientific assertions extracted from paper abstracts using multi-LLM analysis with rigorous quality filtering. Each claim is a single factual statement that preserves the original authors' hedging language (e.g., "may regulate" stays "may regulate" — never upgraded to definitive). Claims are typed into 12 categories (gene expression, protein interaction, drug efficacy, splicing event, biomarker, etc.), scored for confidence (0–100%), and linked to both their source paper and relevant molecular targets via 200+ alias patterns. The extraction pipeline uses a two-layer quality gate: disease-relevance filtering removes non-SMA contamination, and word-boundary matching prevents false target links. Click any row to see the full provenance chain: paper title, PubMed ID, abstract excerpt, extraction model, and metadata.

How do claim extraction and scoring work?

Extraction: Each paper abstract is analyzed by Gemini Flash (with Groq fallback) to extract structured claims. Claims preserve original hedging language (“may regulate” stays “may regulate”).

12 claim types: gene_expression, protein_interaction, pathway_membership, drug_target, drug_efficacy, biomarker, splicing_event, neuroprotection, motor_function, survival, safety, other.

Confidence scoring (0–100%) — 6 dimensions weighted:

  • Specificity (15%) — detail level of the predicate
  • Named entities (15%) — genes, measurements, p-values mentioned
  • Type consistency (20%) — claim type matches predicate keywords
  • Evidence strength (25%) — excerpt, method, p-value, effect size, sample size present
  • Source attribution (10%) — PMID + journal present
  • Replication (15%) — same finding from multiple independent sources (≥3 = 1.0)
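
A minimal sketch of how the six weighted dimensions could combine into a 0–100% confidence value. The weights are the ones listed above. Everything else is an assumption made for illustration: the dictionary keys, the function names, the 0–1 scale of each sub-score, and the linear ramp used for replication below three sources (the text only states that three or more sources score 1.0). This is not the code in claim_quality.py.

```python
# Weights as listed in the text; the key names are hypothetical.
CONFIDENCE_WEIGHTS = {
    "specificity": 0.15,
    "named_entities": 0.15,
    "type_consistency": 0.20,
    "evidence_strength": 0.25,
    "source_attribution": 0.10,
    "replication": 0.15,
}


def replication_score(independent_sources: int) -> float:
    """Three or more independent sources score 1.0 (from the text).

    The linear ramp below three sources is an assumption of this sketch.
    """
    return min(max(independent_sources, 0), 3) / 3


def claim_confidence(subscores: dict[str, float]) -> float:
    """Combine per-dimension sub-scores (each assumed to lie in 0..1) into a percentage."""
    missing = set(CONFIDENCE_WEIGHTS) - set(subscores)
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    total = sum(weight * subscores[name] for name, weight in CONFIDENCE_WEIGHTS.items())
    return round(100 * total, 1)
```

Because the weights sum to 1.0, a claim that maxes every dimension reaches 100%, and evidence strength alone can move the result by up to 25 points.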

Quality gates: SMA-relevance filter + disease-contamination check + word-boundary target matching (265+ alias patterns).
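
To illustrate why word-boundary matching matters for target linking, here is a small sketch using Python's `re` module. The alias table is a hypothetical three-entry stand-in for the platform's alias patterns, and the function names are invented for this example. Matching is kept case-sensitive here, which is a design assumption rather than documented behavior.

```python
import re

# Hypothetical alias table; the real pattern list is much larger.
TARGET_ALIASES = {
    "SMN2": ["SMN2"],
    "CAST": ["CAST", "calpastatin"],
    "PLS3": ["PLS3", "plastin 3", "plastin-3"],
}


def compile_alias_patterns(aliases):
    """Build one regex per target, anchored on word boundaries."""
    patterns = {}
    for target, names in aliases.items():
        alternation = "|".join(re.escape(name) for name in names)
        # \b stops a symbol from matching inside a longer token.
        patterns[target] = re.compile(rf"\b(?:{alternation})\b")
    return patterns


PATTERNS = compile_alias_patterns(TARGET_ALIASES)


def link_targets(text):
    """Return the sorted list of targets whose aliases occur as whole words."""
    return sorted(target for target, pattern in PATTERNS.items() if pattern.search(text))
```

With this sketch, a claim mentioning "calpastatin" links to CAST, while tokens such as "broadCAST" or "SMN20" produce no link because the alias is embedded in a longer word.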

Source: claim_quality.py and claim_extractor.py

Claim | Source Paper | Type | Confidence | Targets

Hypothesis Prioritization

HYPOTHESIS

Phase 2: Multi-criteria ranked hypotheses scored across evidence depth, source convergence, therapeutic clarity, target strength, and novelty. Tier A = top 5 high-conviction, Tier B = medium priority, Tier C = needs more evidence.

How does hypothesis scoring work?

5-dimension scoring:

  • Evidence depth (25%) — claim count and LLM confidence
  • Source convergence (20%) — independent papers supporting the hypothesis
  • Therapeutic clarity (20%) — clear drug modality suggestion
  • Target strength (20%) — parent target’s composite convergence score
  • Novelty (15%) — emerging vs well-trodden research angles

Tier assignment: Top 5 = Tier A (ready for computational drug design), 6–15 = Tier B (need more evidence), rest = Tier C.
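
The weighting and rank-based tiering described above can be sketched as follows. The weights and the rank cut-offs (top 5, ranks 6–15, rest) come from the text. The names, the 0–100 scale assumed for each dimension, and the tie-breaking behavior (ties keep their input order) are assumptions of this sketch and are not drawn from hypothesis_generator.py.

```python
# Weights as listed in the text; key names are hypothetical.
HYPOTHESIS_WEIGHTS = {
    "evidence_depth": 0.25,
    "source_convergence": 0.20,
    "therapeutic_clarity": 0.20,
    "target_strength": 0.20,
    "novelty": 0.15,
}


def hypothesis_score(dimensions: dict[str, float]) -> float:
    """Weighted sum of the five dimensions, each assumed to be on a 0-100 scale."""
    return sum(weight * dimensions[name] for name, weight in HYPOTHESIS_WEIGHTS.items())


def assign_tiers(scores: dict[str, float]) -> dict[str, str]:
    """Rank hypotheses by score and assign tiers purely by rank position."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    tiers = {}
    for rank, name in enumerate(ranked, start=1):
        if rank <= 5:
            tiers[name] = "A"
        elif rank <= 15:
            tiers[name] = "B"
        else:
            tiers[name] = "C"
    return tiers
```

Note that tiers here depend on rank, not on an absolute score threshold, so a hypothesis can change tier when other hypotheses are added or rescored even if its own score is unchanged.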

Generated by Claude Sonnet with evidence grounding — every hypothesis links to specific claims and source papers.

Source: hypothesis_generator.py

Prediction Cards

HYPOTHESIS

Evidence-grounded, falsifiable predictions generated from convergence scoring across 5 dimensions: Volume, Lab Independence, Method Diversity, Temporal Trend, and Replication. All scoring weights and methodology are fully transparent. Each card links every claim to its source paper.

ROCK-LIMK-Cofilin Pathway

INTERACTIVE 5 / 6 research streams

Interactive visualization of the ROCK-LIMK2-CFL2 therapeutic axis — the platform’s highest-confidence mechanistic finding. SMN protein loss triggers massive cytoskeletal stress: ROCK1/2 UP, LIMK2 massively UP (+2.81x), CFL2 compensatory UP (+1.83x), 10/14 actin genes upregulated. Toggle between SMA and ALS views to see the striking disease-specific differences: SMA uses LIMK2, ALS uses LIMK1; CFL2 is UP in SMA but DOWN in ALS. Fasudil (ROCK inhibitor) is already validated in SMA mice (Bowerman 2012). Click any node for evidence details.

UP · DOWN · Unchanged · Upstream · Activation · Inhibition · Predicted
Evidence Convergence: 5 / 6 Streams
✅ GEO omics — 3 datasets (GSE69175, GSE108094, GSE208629), 3 independent labs
✅ DiffDock docking — ROCK2, MAPK14, LIMK1, LIMK2
✅ Cross-paper synthesis — 12+ independent publications
✅ Cross-disease ALS/SMA — PFN1/PFN2 convergence, LIMK1 vs LIMK2
✅ Digital twin — actin dynamics compartment modelled
⚪ Wet-lab validation — pending

Evidence Convergence

COMPUTATIONAL

Multi-dimensional evidence convergence scoring across thousands of curated, quality-filtered claims extracted from 6,400+ PubMed sources. Each of the 58 molecular targets is scored across five independent dimensions: Claim Volume (raw evidence mass: how many distinct assertions support this target), Lab Independence (number of unique research groups reporting findings, which guards against single-lab bias), Method Diversity (range of experimental approaches: in vitro, animal model, patient data, computational; cross-validated findings score higher), Temporal Trend (whether evidence is growing, stable, or declining over recent years, which captures scientific momentum), and Replication (how often key findings have been independently confirmed across different studies and model systems). Scores are weighted and combined into a composite convergence score (0–100). All weights and methodology are fully transparent. The engine generates falsifiable predictions grounded in evidence: each prediction card links every supporting claim back to its source paper for full traceability.

Prediction Cards

Evidence Calibration

COMPUTATIONAL

Bayesian back-testing of convergence scores against known drug outcomes — the critical self-check that separates rigorous research from speculation. For each drug with a known clinical outcome (approved, failed in Phase 2/3, or preclinical only), the platform asks: did our evidence scoring predict the right outcome? The calibration process works as follows: (1) Outcome collection — gather real-world drug approval/failure data from ClinicalTrials.gov and FDA records for all 21 tracked drugs. (2) Score comparison — compare each drug's convergence score against its actual clinical outcome. Approved drugs (nusinersen, risdiplam, onasemnogene) should score high; failed drugs should score low. (3) Bayesian updating — use the comparison to compute posterior probabilities, measuring how well evidence mass predicts clinical success. (4) Grade assignment — the system earns a calibration grade (A–F) based on concordance between predicted and actual outcomes. Grade A (current: 89.8%) means the scoring reliably separates successful from unsuccessful therapeutic approaches. A well-calibrated platform means researchers can trust the convergence scores when evaluating novel, untested targets.

Calibration Curve

Convergence score bins vs actual drug success rate. Perfect calibration = diagonal line.
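
One way such a curve could be computed is sketched below: scores are grouped into equal-width bins and the observed success rate is reported per bin. The function name, the choice of five bins, and the tuple layout are assumptions for illustration. The platform's actual binning is not described in the text.

```python
def calibration_curve(scores, outcomes, n_bins=5):
    """Group 0-100 scores into equal-width bins and report observed success per bin.

    `outcomes` holds True for a drug that succeeded and False otherwise.
    Returns (bin_low, bin_high, mean_score, success_rate, count) for each
    non-empty bin. Perfect calibration would put success_rate close to
    mean_score / 100 in every bin.
    """
    width = 100 / n_bins
    bins = [[] for _ in range(n_bins)]
    for score, success in zip(scores, outcomes):
        # A score of exactly 100 goes into the last bin.
        index = min(int(score // width), n_bins - 1)
        bins[index].append((score, success))

    rows = []
    for i, members in enumerate(bins):
        if not members:
            continue
        mean_score = sum(score for score, _ in members) / len(members)
        success_rate = sum(1 for _, success in members if success) / len(members)
        rows.append((i * width, (i + 1) * width, mean_score, success_rate, len(members)))
    return rows


# Made-up scores and outcomes, purely to show the call shape.
rows = calibration_curve([10, 15, 55, 90, 95], [False, False, True, True, False])
```

Empty bins are skipped so that sparse data does not draw misleading points on the curve. With only a handful of drugs per bin, the per-bin rates remain noisy.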

Metrics

Uncertainty Quantification

Wilson score confidence intervals on target support ratios. Grades combine CI tightness, source diversity, and temporal stability. Green = high certainty, amber = moderate, red = uncertain.
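
For reference, a Wilson score interval for a proportion can be computed as below. The sketch assumes that a "support ratio" is the number of supporting claims divided by the total number of claims for a target, which the text does not spell out. The function name and the default z = 1.96 (a 95% interval) are likewise illustrative choices.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    `successes` is the count of supporting observations out of `n` total.
    With no observations the interval is the uninformative (0, 1).
    """
    if n == 0:
        return (0.0, 1.0)
    p = successes / n
    denominator = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denominator
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denominator
    return (max(0.0, centre - half_width), min(1.0, centre + half_width))
```

Unlike the simple normal approximation, the Wilson interval stays inside [0, 1] and remains sensible for small samples and for ratios near 0 or 1, which is the usual reason to prefer it when some targets have only a few claims.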

Target Prioritization

COMPUTATIONAL

Multi-criteria scoring across 7 dimensions: evidence strength, biological coherence, fragility relevance, interventionability, translational feasibility, novelty, and contradiction risk. Composite score determines Phase 3 priority.

How does target prioritization scoring work?

7 scoring dimensions:

  • Evidence strength — volume and quality of supporting claims
  • Biological coherence — pathway consistency and mechanistic logic
  • Fragility relevance — connection to SMN-dependent vulnerability
  • Interventionability — whether a therapeutic modality exists
  • Translational feasibility — path from bench to clinic
  • Novelty — unexplored vs saturated research space
  • Contradiction risk — conflicting evidence or known failures

Composite score determines Phase 3 priority for computational drug design.

Source: target_prioritizer.py

Target Priority Engine v2

COMPUTATIONAL

Multi-criteria decision engine integrating 6 data dimensions: evidence convergence (25%), druggability via DiffDock screening (20%), ESM-2 structural uniqueness (15%), clinical validation from drug outcomes (15%), cross-species conservation (10%), and target novelty (15%).

How does candidate ranking work?

6 data dimensions:

  • Evidence convergence (25%) — multi-source evidence strength
  • Druggability via DiffDock screening (20%) — virtual binding affinity
  • ESM-2 structural uniqueness (15%) — protein embedding distinctiveness
  • Clinical validation (15%) — existing drug outcomes for this target
  • Cross-species conservation (10%) — ortholog data across 7 model organisms
  • Target novelty (15%) — less-studied targets score higher

Composite = weighted sum, ranked for drug design priority.

Source: candidate_ranker.py

Evidence Graph

COMPUTATIONAL

The evidence graph connects claims to their supporting sources. Each assertion is backed by traceable references (PMIDs, clinical trial results). Grouped by source paper, sorted by claim count.

About SMA

Frequently asked questions about Spinal Muscular Atrophy and this research platform.

What is Spinal Muscular Atrophy (SMA)?

Spinal Muscular Atrophy (SMA) is a genetic neuromuscular disease caused by homozygous deletion or mutation of the SMN1 gene on chromosome 5q13. This leads to loss of full-length SMN protein, causing progressive degeneration of motor neurons. It affects approximately 1 in 10,000 live births and is the most common genetic cause of infant death. The severity is primarily modified by the number of SMN2 gene copies a patient carries.

What approved treatments exist for SMA?

Three therapies are currently approved: Nusinersen (Spinraza) — an antisense oligonucleotide targeting SMN2 ISS-N1, administered intrathecally (approved 2016). Risdiplam (Evrysdi) — an oral small molecule SMN2 splicing modifier (approved 2020). Onasemnogene abeparvovec (Zolgensma) — a single-dose intravenous AAV9 gene replacement therapy delivering a functional SMN1 copy (approved 2019). None of these constitutes a cure.

What is the SMA Research Platform?

The SMA Research Platform is an evidence-first drug research platform that aggregates, structures, and prioritizes global SMA evidence automatically. It ingests data from PubMed, ClinicalTrials.gov, STRING-DB, and KEGG. It uses LLM-based claim extraction to identify thousands of structured claims from abstracts, scores 21 molecular targets across 7 dimensions, and prioritizes hundreds of hypotheses into action tiers for accelerating therapeutic development.

What are the key molecular targets for SMA?

The platform tracks 21 molecular targets in two tiers. 10 established targets with composite scores: SMN1, SMN2, SMN Protein, STMN2, mTOR Pathway, NMJ Maturation, UBA1, PLS3, NCALD, and CORO1C. 11 discovery targets identified via multi-omics convergence analysis (GEO datasets GSE69175, GSE108094, GSE208629): CD44 (cell adhesion), SULF1 (ECM remodeling), DNMT3B (epigenetics), ANK3 (axonal integrity), GALNT6 (glycosylation), LY96 (neuroinflammation), SPATA18 (mitochondrial QC), LDHA (metabolism), CAST (calpain inhibition), NEDD4L (ubiquitin pathway), and CTNNA1 (cytoskeleton).

How does the hypothesis prioritization work?

Hypotheses are scored across 5 dimensions: evidence depth (claim count and LLM confidence, 25% weight), source convergence (independent papers, 20%), therapeutic clarity (clear modality suggestion, 20%), target strength (parent target's composite score, 20%), and novelty (emerging vs well-trodden research angles, 15%). The top 5 are assigned Tier A (high-conviction, ready for computational drug design), ranks 6-15 get Tier B (need more evidence), and the rest get Tier C.

Drug Screening

COMPUTATIONAL

This pipeline computationally filters thousands of ChEMBL compounds down to the best candidates for SMA drug discovery. The process runs in six steps:

(1) ChEMBL query: compounds bioactive against top-scored SMA targets are fetched with their SMILES strings.
(2) RDKit descriptor calculation: molecular weight, LogP, rotatable bonds, H-bond donors/acceptors, TPSA, and QED are computed from SMILES.
(3) Lipinski Rule of 5: MW ≤ 500, LogP ≤ 5, HBD ≤ 5, HBA ≤ 10; compounds failing two or more rules are flagged as non-drug-like.
(4) BBB permeability estimate: TPSA < 90 Ų and MW < 450 are used as a heuristic for blood-brain barrier crossing.
(5) CNS MPO score: a 0–6 composite of LogP, LogD, MW, TPSA, HBD, and pKa tuned for CNS drug development.
(6) PAINS filter: substructure alerts for pan-assay interference compounds that cause false positives in biochemical screens.

Why BBB penetration matters for SMA: SMA is caused by the loss of SMN protein in lower motor neurons located in the anterior horn of the spinal cord — a compartment behind the blood-brain barrier. Small-molecule therapeutics must cross this barrier to reach motor neurons. Risdiplam (approved 2020) succeeds partly because of its BBB-permeable profile; many otherwise potent compounds fail in SMA because they cannot access the CNS. Compounds with TPSA > 90 Ų or MW > 500 Da are unlikely to achieve meaningful CNS exposure via oral dosing.

Score glossary: Lipinski — binary pass/fail for oral bioavailability potential. BBB — heuristic estimate of CNS penetration (TPSA + MW). CNS MPO — 0–6 score; ≥ 4 is considered CNS-optimized. QED — 0–1 drug-likeness estimate combining eight Lipinski-adjacent properties; ≥ 0.5 is high quality. PAINS — substructure alert for reactive or promiscuous scaffolds that should be deprioritized.
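
The Lipinski and BBB rules can be sketched in plain Python once the descriptors are available. The sketch below takes precomputed descriptor values as input, so it runs without RDKit. In the pipeline described here those values would come from the RDKit descriptor step. The class and function names are hypothetical, the Lipinski cut-offs use the "≤" form, and the example descriptor values are invented for illustration and do not describe any real compound.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container for this sketch)."""
    mw: float    # molecular weight, Da
    logp: float  # octanol-water partition coefficient
    hbd: int     # hydrogen-bond donors
    hba: int     # hydrogen-bond acceptors
    tpsa: float  # topological polar surface area, square angstroms


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four Rule-of-5 limits are exceeded."""
    return sum([d.mw > 500, d.logp > 5, d.hbd > 5, d.hba > 10])


def passes_lipinski(d: Descriptors) -> bool:
    """Compounds failing two or more rules are flagged as non-drug-like."""
    return lipinski_violations(d) < 2


def bbb_likely(d: Descriptors) -> bool:
    """Heuristic from the text: TPSA < 90 and MW < 450 suggest BBB crossing."""
    return d.tpsa < 90 and d.mw < 450


# Invented descriptor values, for illustration only.
small_lipophilic = Descriptors(mw=320.0, logp=2.5, hbd=1, hba=4, tpsa=60.0)
large_polar = Descriptors(mw=620.0, logp=6.2, hbd=2, hba=8, tpsa=120.0)
```

Here `small_lipophilic` passes both checks, while `large_polar` exceeds two Lipinski limits (MW and LogP) and fails the BBB heuristic. As with the pipeline itself, these are coarse filters and not pharmacokinetic predictions.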

How are screening compounds scored?

Drug-likeness metrics:

  • Lipinski Rule of 5 — binary pass/fail for oral bioavailability (MW ≤500, LogP ≤5, HBD ≤5, HBA ≤10)
  • BBB permeability — heuristic CNS penetration estimate (TPSA + MW). Critical for SMA: motor neurons are in the spinal cord
  • CNS MPO (0–6) — multi-parameter optimization score; ≥4 is CNS-optimized
  • QED (0–1) — Quantitative Estimate of Drug-likeness combining 8 properties; ≥0.5 = high quality
  • PAINS — substructure alert for reactive/promiscuous scaffolds (should be deprioritized)

DiffDock virtual screening: molecular docking using NVIDIA NIM. The confidence score ranks how likely a predicted binding pose is to be correct; it is not a calibrated binding affinity. 20-pose screening is required for reliable results (5-pose screening has a 46% false positive rate from MW bias).

Filters applied: MW ≥150 (excludes small-molecule artifacts), QED ≥0.3.

Source: docking_scorer.py, screening_funnel.py

Note: Drug-likeness predictions use rule-based heuristics (Lipinski Rule of 5, TPSA-based BBB estimate, QED score). These are filtering tools, not validated PK/tox models.

Top Candidates

ChEMBL ID | Structure | Target | MW | LogP | QED | CNS MPO | BBB | Lipinski | PAINS | pChEMBL | Source

Drug Repurposing

COMPUTATIONAL

Drug repurposing means finding new therapeutic uses for existing approved drugs — bypassing the 10–15 years and $1–2B typically required for de novo drug development. Repurposed drugs have already passed safety trials, so clinical translation is dramatically faster: Phase I is often skipped and Phase II can start in 2–3 years rather than 10+.

The platform identifies SMA repurposing candidates through three convergent strategies: (1) Cross-disease mining — drugs approved or in trials for related neuromuscular diseases (ALS, Duchenne Muscular Dystrophy, SBMA, CMT) that share molecular targets with SMA; (2) ChEMBL bioactivity — known compounds with high pChEMBL values (≥ 6.0, corresponding to IC₅₀ ≤ 1 µM) against top-scored SMA targets; (3) Pathway overlap — compounds whose known mechanism overlaps with the actin dynamics, NMJ signaling, or autophagy/survival pathways dysregulated in SMA.
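
The relationship between the pChEMBL cut-off and potency quoted above follows from the definition of pChEMBL as the negative base-10 logarithm of the molar activity value. A minimal sketch, with hypothetical function names and the activity supplied in nanomolar:

```python
import math


def pchembl_from_nm(value_nm: float) -> float:
    """Convert an activity value in nM (for example an IC50) to a pChEMBL-style value.

    pChEMBL = -log10(value in mol/L); since 1 nM = 1e-9 M this is 9 - log10(nM).
    """
    if value_nm <= 0:
        raise ValueError("activity value must be positive")
    return 9 - math.log10(value_nm)


def meets_potency_cutoff(value_nm: float, cutoff: float = 6.0) -> bool:
    """True when the compound is at least as potent as the cut-off."""
    return pchembl_from_nm(value_nm) >= cutoff
```

An IC50 of 1 µM (1000 nM) gives 9 − 3 = 6.0, which is the threshold in strategy (2). Each further tenfold gain in potency adds one unit, so 100 nM corresponds to 7.0.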

Precedent in SMA: Valproic acid (VPA), originally an epilepsy drug, was one of the first compounds tested in SMA clinical trials — its HDAC inhibition was found to increase SMN2 splicing. Olesoxime (a cholesterol-oxime neuroprotective) reached Phase II. Riluzole (ALS-approved) showed modest motor neuron protection in SMA models. The platform extends this approach computationally, scoring each candidate 0–1 based on target relevance, potency, clinical phase, and pathway convergence. Click any row to see full rationale, mechanism, and target link.

Top Candidates

Rank | Compound | SMA Target | Score | Source | Phase | Rationale

Top Drug Candidates

COMPUTATIONAL

In drug discovery, a hit is a compound that shows measurable activity against a target of interest and passes initial computational filters. This section is the unified ranked list — the best compounds from all analysis pipelines: ChEMBL screening, cross-disease repurposing, and DiffDock virtual binding.

Each candidate passes through a 6-stage validation pipeline: (1) Computational — drug-likeness filters (Lipinski, QED, PAINS), BBB/CNS MPO scoring; (2) Structural — DiffDock pose prediction against SMA target binding pockets, confidence scoring; (3) Analog search — ChEMBL SAR analysis to identify structurally similar compounds with known SMA-relevant activity; (4) ADMET prediction — rule-based absorption, distribution, metabolism, excretion, and toxicity estimates; (5) Literature review — automated PubMed search for the compound + SMA target co-occurrence; (6) Experimental design — suggested assay types (SMN2 splicing reporter, NMJ morphology, motor neuron survival) for wet-lab validation.

Candidates are scored 0–1 (integrated score) and assigned a tier: Tier A (≥ 0.6) — strong multi-dimensional evidence, prioritized for experimental follow-up; Tier B (0.4–0.6) — moderate evidence, worth secondary screening; Tier C (< 0.4) — computational-only signal, lower priority. Click any row to see full molecular properties, BBB status, DiffDock score, validation stage, and target link.
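
The tier thresholds above map to a simple lookup. This sketch uses a hypothetical function name. It places a score of exactly 0.6 in Tier A and exactly 0.4 in Tier B, following the "≥ 0.6" and "< 0.4" wording.

```python
def candidate_tier(integrated_score: float) -> str:
    """Map a 0-1 integrated score to Tier A, B, or C using the stated thresholds."""
    if not 0 <= integrated_score <= 1:
        raise ValueError("integrated score must be between 0 and 1")
    if integrated_score >= 0.6:
        return "A"
    if integrated_score >= 0.4:
        return "B"
    return "C"
```

Unlike the rank-based tiers used for targets and hypotheses, these tiers are absolute, so any number of candidates can land in Tier A.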

Note: ADMET predictions use rule-based heuristics (Lipinski Rule of 5, TPSA-based BBB estimate, QED score, PAINS substructure filters). These are computational filtering tools, not validated pharmacokinetic or toxicology models.

AI-Designed Drug Candidates

COMPUTATIONAL

De novo molecules generated by GenMol/MolMIM and SAR campaigns, validated with DiffDock docking against LIMK2 and ROCK2. Ranked by best DiffDock confidence score. Top hits: (S,S)-H-1152 (best LIMK2 dual-target) and genmol_119 (original hit, stereo-resolved).

# | Compound | Target | DiffDock | QED | MW | BBB | Method

Ranked Candidates

# | ChEMBL ID | Target | Score | Tier | QED | BBB | ADMET | pChEMBL | Flags

Compare Candidates

Screening Hits

COMPUTATIONAL

Positive binding predictions from AI-driven virtual screening. Each hit goes through a 6-stage validation pipeline: computational validation, structural analysis, analog search, ADMET prediction, literature review, and experimental design.

What this means for researchers: These are the compounds that passed virtual screening with positive DiffDock confidence scores (> 0), meaning the model predicts that they bind SMA-relevant protein targets. Hits are ranked by confidence score; higher is better. The 6-stage pipeline tracks each hit from computational prediction through to experimental design suggestion. Green dots = completed stage, yellow = in progress, gray = pending.

Confidence score guide: > +0.5 = high-confidence binder (strong signal), +0.1 to +0.5 = moderate binder, 0 to +0.1 = marginal (needs validation). For reference, riluzole (ALS drug) scores +0.082 against LIMK2. Scores below 0 are filtered out and not shown here.
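
The guide can be applied mechanically when ranking a hit list. The sketch below is illustrative only: `confidence_band` and `rank_hits` are hypothetical names, the placement of the exact boundary values (0.5 and 0.1 counted as moderate, 0 kept as marginal) is an assumption, and `compound_x` and `compound_y` are invented entries. Only the riluzole score comes from the text.

```python
def confidence_band(score: float) -> str:
    """Label a DiffDock confidence score using the bands from the guide."""
    if score > 0.5:
        return "high"
    if score >= 0.1:
        return "moderate"
    if score >= 0.0:
        return "marginal"
    return "filtered"


def rank_hits(hits):
    """Drop scores below 0 and sort the rest from highest to lowest confidence."""
    kept = [(name, score, confidence_band(score)) for name, score in hits if score >= 0.0]
    return sorted(kept, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    demo = [("riluzole", 0.082), ("compound_x", 0.61), ("compound_y", -0.30)]
    for name, score, band in rank_hits(demo):
        print(f"{name}\t{score:+.3f}\t{band}")
```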

Note: ADMET predictions in the pipeline use rule-based heuristics (Lipinski, TPSA, PAINS), not validated PK/tox models.

Knowledge Graph

COMPUTATIONAL

Interactive network of SMA molecular targets connected by protein-protein interactions (STRING), shared pathways (KEGG/UniProt), and compound bioactivity (ChEMBL). Click a node to highlight its connections.

Legend: Gene · Protein · Pathway · Other

Drug Outcome Database

VALIDATED DATA

Structured database of drug successes and failures in SMA research. Every outcome traces back to a source paper — capturing not just what worked, but why compounds failed (toxicity, bioavailability, efficacy).

Compound | Target | Outcome | Phase | Failure Reason | Key Finding | Source

Cross-Species Comparative

COMPUTATIONAL

Cross-species conservation mapping of SMA-relevant molecular targets across 7 model organisms. Each organism offers distinct advantages for SMA research. For example, mice (Mus musculus) serve as the primary disease model with the SMN-delta7 and Taiwanese SMA strains; zebrafish (Danio rerio) enable rapid drug screening with motor neuron fluorescent reporters; the naked mole rat (Heterocephalus glaber) shows exceptional neuronal resilience and resistance to neurodegeneration; and the axolotl (Ambystoma mexicanum) offers complete spinal cord regeneration. Conservation scores are computed from NCBI Ortholog data. A score of 71% or higher indicates strong evolutionary conservation, suggesting that the target's function is preserved across species and that findings from model organisms are more likely to translate to humans. Click any species card to see which SMA targets have orthologs in that organism, or click any heatmap cell to view the specific ortholog with links to NCBI Gene and STRING-DB.

Conservation Heatmap (click cells for details)

Research Directions

EXPLORATORY

16 research directions spanning spatial multi-omics, regenerative biology, and computational approaches to SMA. Click any direction to see connected targets, claims, and hypotheses.

Evidence Writer

Generate publication-ready evidence summaries for any SMA target or topic. Powered by Claude Sonnet synthesizing across all platform data (claims, hypotheses, trials, drug outcomes).

SMN2 Grant · NCALD Hypothesis · Nusinersen Briefing · Bioelectric Paper Intro · PLS3 Briefing

Molecule Browser

AI-GENERATED

Browse 800+ AI-generated and computationally screened molecules for SMA drug targets. Includes MolMIM scaffold decorations, GenMol analogs, DiffDock docking results, and ML-proxy 100k virtual screen hits. Filter by target, drug-likeness, BBB permeability, and more. Export as CSV (researchers) or SDF (chemists).

What this means for researchers: Each molecule here is a potential SMA drug candidate. Molecules are generated by AI (GenMol for de novo design, MolMIM for scaffold optimization) or identified through computational screening of ChEMBL. Key properties to evaluate: QED (drug-likeness, ≥ 0.5 is good, risdiplam is ~0.55), BBB permeability (required for CNS drugs targeting spinal motor neurons), Lipinski compliance (predicts oral bioavailability), and DiffDock confidence (> 0 = predicted binder, benchmark: riluzole +0.082). Click any molecule card to see full properties with interpretation. Use Export SDF for PyMOL/RDKit analysis, Export CSV for spreadsheet work.
CSV for spreadsheets • SDF for chemistry tools (PyMOL, RDKit)

CRISPR Guide Design

EXPLORATORY

CRISPR/CRISPRi guide RNA design for SMN2 exon 7 region. Three therapeutic strategies: CRISPRi at ISS-N1 (mimic nusinersen), CRISPRi at ESS (block hnRNP A1), CRISPRa at ESE (enhance Tra2-beta). 20 nt protospacer + NGG PAM, GC 40-70%, polyT filtered.
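
The design criteria above (20 nt protospacer, NGG PAM, GC 40-70%, polyT filter) can be sketched as a simple pass/fail check. This is an illustration under stated assumptions, not the platform's implementation: the function names are hypothetical, "polyT" is assumed to mean a run of four or more T bases, and the test sequences are arbitrary strings rather than real SMN2 guides.

```python
import re

PAM_PATTERN = re.compile(r"^[ACGT]GG$")  # NGG, where N is any base


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def passes_guide_filters(protospacer: str, pam: str) -> bool:
    """Apply the length, alphabet, PAM, GC-content, and polyT filters."""
    protospacer = protospacer.upper()
    if len(protospacer) != 20 or set(protospacer) - set("ACGT"):
        return False
    if not PAM_PATTERN.match(pam.upper()):
        return False
    if not 40.0 <= gc_percent(protospacer) <= 70.0:
        return False
    # Assumed polyT definition: four or more consecutive T bases.
    return "TTTT" not in protospacer


if __name__ == "__main__":
    print(passes_guide_filters("GACCTGAGCTAGCTAGGCAT", "TGG"))  # arbitrary demo sequence
```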

Why CRISPR for SMA? SMA is caused by loss of the SMN1 gene. The paralog SMN2 cannot fully compensate because a single-nucleotide difference (C→T at exon 7 position 6) causes most SMN2 transcripts to skip exon 7. A silencer element in intron 7 called ISS-N1 (Intronic Splicing Silencer N1) reinforces this skipping. CRISPRi targeting ISS-N1 blocks the silencer and promotes exon 7 inclusion, mimicking the mechanism of nusinersen (Spinraza) but as a one-time genomic intervention. GC content of 40–70% optimises guide stability; on-target scores use the Doench 2016 model; specificity scores (CFD) penalise off-target sites. Click any strategy card or guide row for full technical details.
On-Target Score (Doench 2016)
≥0.7 Efficient cleavage expected
0.5-0.7 Moderate efficiency
<0.5 Poor efficiency, avoid
Safety Classification
Safe 0 off-targets with ≤2 mismatches
Caution 1-5 close off-targets
High Risk >5 close off-targets
Published context: Li et al. (2024) demonstrated CRISPRi-mediated ISS-N1 silencing restores SMN2 exon 7 inclusion in patient iPSC motor neurons. See GPU Results CRISPR tab for genome-wide off-target analysis via Cas-OFFinder.

SMN2 Regulatory Motifs

Top Guides (All Strategies)

# | Strategy | Sequence (20 nt) | PAM | Strand | Region | GC% | On-Target | Specificity

AAV Capsid Evaluation

EXPLORATORY

AAV serotype evaluation for SMA gene therapy delivery. 9 capsids scored across motor neuron tropism, BBB crossing, immunogenicity (NAb seroprevalence), manufacturing feasibility, and packaging capacity. Zolgensma uses AAV9 (scAAV9-SMN1).

Why AAV9 for Zolgensma? AAV9 was chosen for Zolgensma because it combines the highest motor neuron tropism (~90%) with efficient blood-brain barrier crossing in neonates and a proven manufacturing process for clinical-grade production. A key limitation is pre-existing neutralising antibodies (NAbs): patients with anti-AAV9 titres above 1:50 are typically excluded. Alternative capsids — including PHP.B (enhanced CNS transduction), AAVrh10 (broader tropism), and AAV-B1 (high MN specificity) — are in preclinical evaluation. Click any serotype row or strategy card to compare tropism, immunogenicity, and clinical precedent.

Capsid Rankings

# | Serotype | MN Tropism | BBB | Immunogenicity | Mfg | Packaging | Score | Clinical Precedent

Gene Edit Versioning

EXPLORATORY

"GitHub for Life": every SMN2 sequence variant is a deterministic commit with a SHA-256 hash. In this framing, the single-nucleotide difference that keeps SMN2 from compensating for lost SMN1 (C→T at exon 7 position 6) is the bug. Therapeutic edits are patches that restore function. Track the lineage from SMN1 (healthy) through SMN2 (disease) to corrected variants.

"GitHub for Life": what does that mean? In software, every code change is a versioned commit with a unique hash. Here we apply the same concept to DNA: each SMN gene variant is hashed deterministically, so identical sequences always produce the same hash and, barring a SHA-256 collision, different sequences produce different hashes. The single C→T substitution at exon 7 position 6 that distinguishes SMN1 from SMN2 changes one base in a 30,000-base sequence, yet this single nucleotide determines whether a patient can walk or not. Therapeutic edits (base editing, prime editing, ASO-mediated splicing correction) are tracked as patches on top of the disease variant. Click any row in the version tree to see the exact base change and its functional impact.

Version Tree

Click any row to expand the full sequence diff, clinical significance, and population frequency.

Commit Hash | Type | Region | Parent | Edit | Impact

Sequence Diffs

Each diff shows the exact nucleotide changes between parent and child variant. Position numbers refer to the SMN exon 7 coordinate system.
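
The hashing and diff idea can be sketched in a few lines. This is a minimal illustration, assuming the hash is taken over the uppercase sequence with whitespace removed; `commit_hash` and `apply_substitution` are hypothetical names, and the 12-base string is a short illustrative fragment, not a reference sequence.

```python
import hashlib


def commit_hash(sequence: str) -> str:
    """SHA-256 hex digest of a sequence, normalized to uppercase without whitespace."""
    normalized = "".join(sequence.split()).upper()
    return hashlib.sha256(normalized.encode("ascii")).hexdigest()


def apply_substitution(sequence: str, position: int, ref: str, alt: str) -> str:
    """Apply a single-base substitution at a 1-based position, checking the reference base."""
    index = position - 1
    if sequence[index] != ref:
        raise ValueError(f"expected {ref} at position {position}, found {sequence[index]}")
    return sequence[:index] + alt + sequence[index + 1:]


if __name__ == "__main__":
    parent = "GGTTTCAGACAA"  # illustrative fragment only
    child = apply_substitution(parent, 6, "C", "T")
    print("parent", commit_hash(parent)[:12], parent)
    print("child ", commit_hash(child)[:12], child)
```

Because the digest depends only on the normalized sequence, re-deriving the same variant by a different route yields the same commit hash.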

Molecular Docking

COMPUTATIONAL

Pharmacophore-based docking score prediction for SMA drug candidates against 7 target binding pockets. Scores compounds from the molecule_screenings database by shape complementarity, H-bond potential, hydrophobic match, electrostatic alignment, and strain penalty.

What this means for researchers: Docking scores predict how well a small molecule fits into a protein binding pocket. Higher composite scores indicate better predicted binding. The binding classes are: strong (composite ≥ 0.7, high-confidence predicted binder), moderate (≥ 0.4 and < 0.7, worth investigating but uncertain), weak (< 0.4, unlikely to bind at therapeutic concentrations). For DiffDock confidence scores: > 0 = predicted binder, -0.5 to 0 = uncertain, < -1.0 = unlikely. Benchmark: riluzole scores +0.082 against its best target.

Key sub-scores: Shape — geometric fit into the pocket. H-Bond — hydrogen bond donor/acceptor complementarity. Hydrophobic — hydrophobic contact area. Electrostatic — charge complementarity. Strain — penalty for unfavorable ligand conformation (lower is better).

Top Predicted Binders

# | Compound | Target | Affinity (kcal/mol) | Shape | H-Bond | Hydrophobic | Score | Class

ML Docking Proxy

COMPUTATIONAL

Machine learning surrogate trained on 4,116 DiffDock v2.2 results. Uses RDKit Morgan fingerprints (ECFP4, 2048-bit) + RandomForest to predict binding confidence ~1000x faster than running DiffDock itself. Enables screening millions of molecules on CPU in minutes.

Actual vs Predicted (Training Set)

Top 20 Feature Importances

# | Feature | Importance | Bar

Target Distribution

Prime Editing Feasibility

EXPLORATORY

Prime editing (PE2/PE3/PEmax) assessment for SMA: SMN2 C6T correction (the root cause fix), ISS-N1 disruption (permanent nusinersen), and ESE strengthening. Compared with approved therapies. Prime editing = reverse transcriptase + Cas9 nickase + pegRNA — no double-strand breaks.

Therapy Comparison

MD Simulations (coming soon)

EXPLORATORY

Molecular Dynamics (MD) simulations model how proteins move, fold, and interact with drug molecules over time. Each simulation runs on GPU hardware using OpenMM, tracking every atom at femtosecond resolution.

What is being simulated? Each row represents a protein (or protein-drug complex) simulated under physiological conditions: 310 K (body temperature), explicit water solvent (TIP3P), 150 mM NaCl, periodic boundary conditions. Simulation types include SMN oligomerization, hnRNP A1-ISS-N1 binding, risdiplam mechanism, NCALD calcium dynamics, PLS3 actin bundling, and SMN-Gemin2 stability.
Key Metrics
RMSD: <2 angstrom plateau = stable fold; increasing = unfolding
Binding energy: <-7 kcal/mol = strong drug binding
Contact persistence: % of time drug maintains key interactions
Verdict
Stable Drug stayed bound throughout simulation
Partial Drug partially dissociated
Dissociated Drug left the binding pocket
What does this mean for drug discovery? DiffDock predicts a static binding pose; MD simulations test whether that pose is dynamically stable. A drug that stays bound for 100 ns is a much stronger candidate than one that dissociates after 10 ns. GPU hours estimate compute cost on a single NVIDIA A100.
Simulation | Target | Type | PDB | Atoms | Time (ns) | GPU Hours

Spatial Multi-Omics

EXPLORATORY

Phase 7.1 — Drug penetration modeling across spinal cord microanatomy. Maps which SMA drugs reach which tissue compartments based on molecular properties, BBB permeability, and CSF exposure. Identifies therapeutic "silent zones" where current drugs underperform.

Spinal Cord Zones

Zone | Region | BBB Perm. | CSF Exp. | Vasc. Density | SMA Relevance | Cell Types

Drug Penetration

Drug | Type | Route | Best Zone | Worst Zone | Ventral Horn | NMJ

Silent Zones

Silent zone analysis requires Slide-seq or MERFISH spatial transcriptomics data. This feature will be populated when real spatial data is integrated from collaborating labs.

Regeneration Signatures

EXPLORATORY

Phase 7.2 — Cross-species regeneration programs in axolotl and zebrafish compared with degeneration in human SMA motor neurons. Identifies conserved repair pathways that are silenced in SMA and could be therapeutically reactivated.

What can SMA research learn from animals that regenerate? Axolotls (Mexican salamanders) and zebrafish can regrow severed spinal cord and peripheral nerve tissue — a capacity completely lost in mammals. By comparing the transcriptional programmes active during their regeneration with the degenerating state of SMA motor neurons, we can identify repair pathways that are silenced in human SMA and might be therapeutically reactivated. Key differences include Wnt/β-catenin signalling (active in regeneration, suppressed in SMA), BDNF/TrkB retrograde survival signals, and cytoskeletal actin dynamics. Genes in the table below are candidates for reactivation strategies. Click any row to see the human ortholog, current SMA expression status, and therapeutic potential.

Regeneration Genes

Gene | Organism | Human Ortholog | Pathway | SMA Status | Reactivation Potential

Pathway Comparisons

Pathway | Regen State | SMA State | Gap Score | Strategy

NMJ Retrograde Signaling

EXPLORATORY

Phase 7.3 — Muscle-to-nerve retrograde signaling at the neuromuscular junction. Tests the "happy muscle → surviving neuron" hypothesis: can improving muscle health rescue motor neurons via retrograde trophic signals?

Retrograde Signals

Signal | Type | Source | Target | SMA Status | Therap. Potential | Evidence

EV Therapeutic Cargo

Cargo | Type | Function | SMA Relevance | Feasibility

Organ-on-Chip Models

Multisystem SMA

EXPLORATORY

Phase 7.4 — SMA is not just a motor neuron disease. Liver, cardiac, metabolic, pancreatic, vascular, skeletal, and GI pathology emerges especially in severe SMA types. Models the full systemic picture and combination therapy strategies.

Why does SMA affect so many organs? SMN protein is required in every cell — it manages the assembly of RNA splicing machinery (snRNPs). Motor neurons are most sensitive because of their extreme length and metabolic demand, but cardiac muscle, hepatocytes, pancreatic beta cells, and vascular endothelium all suffer when SMN is low. In SMA Type I, >60% of patients show cardiac defects; liver enlargement and metabolic dysfunction are common autopsy findings. This is why systemic treatment — not just spinal delivery — matters. Click any row to expand clinical details and biomarkers.

Affected Organ Systems

System | Organ | SMA Types | Prevalence | Severity | SMN-Dependent | Biomarkers

Combination Therapies

Each strategy targets multiple disease axes simultaneously. Click a card to see drugs involved, mechanism rationale, and clinical evidence.

Energy Budget Model

SMA motor neurons run an energy deficit: SMN loss impairs mitochondrial function, actin dynamics require ATP, and retrograde transport of neurotrophic factors stalls. The energy budget model compares ATP supply vs. demand across normal, SMA, and treated motor neurons. A supply/demand ratio below 0.7 predicts neurodegeneration.
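
The threshold rule can be stated as a one-line check. The sketch below is illustrative: the function names are hypothetical, the supply and demand values are invented numbers in arbitrary units, and only the 0.7 cut-off comes from the text.

```python
def supply_demand_ratio(atp_supply: float, atp_demand: float) -> float:
    """ATP supply divided by ATP demand (same arbitrary units for both)."""
    if atp_demand <= 0:
        raise ValueError("atp_demand must be positive")
    return atp_supply / atp_demand


def predicts_neurodegeneration(atp_supply: float, atp_demand: float,
                               threshold: float = 0.7) -> bool:
    """True when the supply/demand ratio falls below the threshold."""
    return supply_demand_ratio(atp_supply, atp_demand) < threshold


if __name__ == "__main__":
    # Invented example values, not model output.
    for label, supply, demand in [("normal", 100, 100), ("SMA", 60, 100), ("treated", 85, 100)]:
        print(label, supply_demand_ratio(supply, demand), predicts_neurodegeneration(supply, demand))
```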

Bioelectric Reprogramming

EXPLORATORY

Phase 7.5 — Ion channel expression, membrane potential (Vmem) states, and electroceutical interventions for SMA motor neurons. Based on Michael Levin's bioelectricity framework: many SMA MNs are alive but electrically dormant — they can potentially be reactivated.

Ion Channels

Gene | Channel | Type | Vmem Role | SMA Expression | Drug Candidates

Vmem States

Electroceuticals

Intervention | Modality | Target State | Evidence | Feasibility

Cross-Species Splicing Map

EXPLORATORY

Phase 9.3 — Axolotl and zebrafish use alternative splicing as a master switch for regeneration. The same genes exist in humans but their regeneration-promoting isoforms are epigenetically silenced. This module maps 10 regeneration-specific splice events to human orthologs.

Why can axolotls regenerate limbs — and can we copy this in humans? The axolotl (Ambystoma mexicanum) and zebrafish (Danio rerio) switch on alternative mRNA isoforms during injury that activate cell proliferation, cytoskeletal remodeling, and axon re-growth programs. These isoforms are encoded in the same genes humans carry — but in us they are epigenetically silenced after embryonic development. By mapping which exons are alternatively spliced in regenerating animals vs. human SMA motor neurons, we identify candidate ASO (antisense oligonucleotide) targets that could reawaken these dormant programs.
Score interpretation: Conservation measures sequence identity between species (≥0.8 = highly conserved, likely functional in humans). Feasibility estimates ASO targeting potential (considers exon accessibility, splice site strength, and existing ASO precedent). Events with high conservation + high feasibility are the strongest candidates for therapeutic reactivation.
Connection to SMN2 exon 7: Nusinersen (Spinraza) proves that ASO-mediated splice switching works for SMA. The same approach could reactivate regeneration-promoting isoforms in genes like ctnnb1 (Wnt pathway), fgf signaling, and cytoskeletal remodelers — giving motor neurons tools to repair rather than just survive.
Axolotl Gene | Human Ortholog | Event Type | Exon | Axolotl State | Human SMA | Conservation | Feasibility

RNA-Binding Prediction

EXPLORATORY

Phase 9.4 — Predicts RNA-binding affinity of compounds toward SMN2 pre-mRNA regulatory elements. 6 RNA target sites mapped (ISS-N1, 5'ss/U1 interface, ESE2, ESS, branch point, TSL2). Benchmarks against known modulators like risdiplam and branaplam.

RNA Target Sites in SMN2

Site | Location | Sequence Motif | Binding Proteins | Druggability | Approved Drug

Known SMN2 Modulators

Compound | MW | Target | EC50 (nM) | Status

Dual-Target Molecules

EXPLORATORY

Phase 6.1 — Compounds that simultaneously modify SMN2 splicing AND influence ion channels. The bioelectricity intersection: fixing the gene is not enough — reactivating the electrical function of rescued motor neurons is the missing therapeutic layer.

Why dual-target matters for SMA SMA motor neurons suffer from two simultaneous problems: (1) low SMN protein from mis-spliced SMN2, causing RNA processing defects and progressive degeneration; and (2) electrical dormancy — surviving motor neurons often have altered membrane potentials and reduced excitability, meaning they cannot fire action potentials properly even if SMN is restored. A dual-target compound addresses both problems simultaneously: it corrects SMN2 splicing (like risdiplam) while also modulating ion channels to restore electrical function. This is critical because clinical evidence shows that SMN-restoring therapies alone do not fully restore motor function — the rescued neurons need to be electrically reactivated.

Score interpretation: SMN2 Score — predicted effect on SMN2 exon 7 inclusion (0–1, higher = more inclusion). Channel Score — predicted modulation of the target ion channel (0–1). Composite — weighted combination prioritizing compounds that score well on both axes simultaneously, with BBB permeability as a requirement.
Compound | SMN2 Score | Ion Channel | Channel Score | BBB | Composite | Status

Digital Twin

EXPLORATORY

Phase 10.3 — Multi-scale computational model of the SMA motor neuron. Simulates drug combinations across 5 compartments (soma, axon, NMJ, dendrites, nucleus) and 8 signaling pathways. Predicts synergistic drug combinations in silico.

What is a Digital Twin? A digital twin is a computational replica of a biological system — here, a single SMA-affected alpha motor neuron. Each of the 5 compartments (soma, axon, NMJ, dendrites, nucleus) has its own health baseline, volume, and disease-specific defects derived from SMA omics data. Signalling pathways modelled include mTOR, MAPK, Wnt/β-catenin, BDNF/TrkB, and actin dynamics. Drug combinations are simulated by applying known mechanisms of action to the relevant compartments and scoring the resulting functional recovery. This allows in silico prediction of synergistic combinations before expensive wet-lab experiments. Click any compartment card or pathway row for details, or follow drug links to the Drugs section.
Patient profile: SMA Type III · Ambulatory · Risdiplam active · 9 years of wearable data
Genotype: SMN1 del | SMN2: 4 copies
Age / Onset: 46yr | Onset 13
HFMSE / RULM: 50/66 | 37/37
Treatment: Nusinersen 25x > Risdiplam
6MWT: 356m (-23.6%)
Weight: 99.1kg (+12kg)
Steps/Day: 4,526
FVC: 5.57L (stable)
RULM: 37/37
CK: ~11x | 3.4x ULN
Charts: 6-Minute Walk Test · Weight Trajectory · Daily Steps (monthly avg) · CK (Creatine Kinase)
Walking Speed Paradox: Speed UP while Distance DOWN
Gait Analysis — Same metrics used by Capogrosso Lab (Columbia) for spinal stimulation outcome measurement. 9 years of continuous wearable data = unprecedented longitudinal gait monitoring in SMA III.
Charts: Step Length (cm) · Double Support % (stability) · Gait Asymmetry % · Lung Function (FVC/FEV1, 8yr) · Strength Training Progression · Weight vs 6MWT (r = -0.93) · Treatment Phases Comparison
Correlation Analysis — What Drives 6MWT Performance?
Factor | Correlation | Strength | Interpretation
Body Weight | r = -0.93 | VERY STRONG | 1kg more = ~6m less 6MWT. Dominant factor.
Daily Steps | r = +0.72 | STRONG | More daily activity = better endurance test.
Strength Training | r = +0.68 | MODERATE | Training periods correlate with better 6MWT + lower weight.
Medication | confounded | UNCLEAR | Medication provides baseline stability. Weight/training effects are ON TOP of medication. Without treatment, decline would likely be much faster. Risdiplam too early to judge (5 months).
Lung Function | no correlation | STABLE | FVC 98% predicted. Not a limiting factor. Decline is muscular.
Testable Prediction (Live Tracking)
Prediction: If body weight drops from 99kg to 85kg through consistent training (2x/week) while maintaining Risdiplam therapy, the 6MWT is expected to improve from 356m to approximately 420-440m within 6-12 months. Note: Medication provides the baseline stability — weight management amplifies the effect.
Basis: Linear regression on weight vs 6MWT data (r=-0.93, n=11 paired measurements). Each 1kg reduction = ~6.1m improvement in 6MWT distance.
Tracking: This prediction will be validated with each quarterly 6MWT. Next test expected ~June 2026. If weight reaches 85kg and 6MWT does NOT improve to 420m+, the model is wrong and medication effect is larger than estimated.
Falsification criteria: Weight at 85kg + 6MWT below 380m = medication effect is larger than estimated, weight alone insufficient. Weight at 85kg + 6MWT above 420m = weight management is the strongest modifiable factor on top of medication.
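
The arithmetic behind the prediction can be checked directly. The sketch below uses the figures stated above (356m at 99kg, target 85kg, ~6.1m per kg); the function names are hypothetical, and the "not covered" label for results between 380m and 420m is an assumption, since the falsification criteria do not address that range.

```python
M_PER_KG = 6.1  # stated regression slope: metres of 6MWT gained per kg lost


def predict_6mwt(current_m: float, current_kg: float, target_kg: float,
                 m_per_kg: float = M_PER_KG) -> float:
    """Linear extrapolation of 6MWT distance after a change in body weight."""
    return current_m + (current_kg - target_kg) * m_per_kg


def interpret_outcome(weight_kg: float, observed_m: float) -> str:
    """Apply the stated falsification criteria once the target weight is reached."""
    if weight_kg > 85:
        return "target weight not reached"
    if observed_m > 420:
        return "weight management is the strongest modifiable factor"
    if observed_m < 380:
        return "medication effect larger than estimated"
    return "not covered by the stated criteria"  # assumed label for 380-420m


if __name__ == "__main__":
    print(round(predict_6mwt(356, 99, 85), 1))
```

A 14kg reduction at 6.1m per kg gives about 85m, so the extrapolation lands near 441m, at the upper end of the 420-440m range quoted in the prediction.
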
Key Findings from 9 Years of Data
Training works. 13 months of strength training (2021-2022): -5.2% body fat, +1.4kg lean mass, best 6MWT (466m). Muscle building IS possible in SMA III.
Weight is the strongest predictor. 80.7kg = best walking performance. 99.1kg = worst. Every kg matters for an SMA patient.
Lung function is preserved. FVC 5.57L (98% predicted) after 8 years — the 6MWT decline is muscular, not respiratory.
Walking speed paradox. Speed INCREASED (3.3 → 4.1 km/h) while distance DECREASED. Endurance declines faster than peak performance — classic SMA fatigability pattern.
Acute events accelerate decline. Paraspinal episode (Oct 2024) dropped 6MWT by 63m. Recovery incomplete — new baseline lower than before.
Upper limbs preserved. RULM 37/37 (full score). Bench press stable at 12.5-13.5kg. SMA III proximal leg weakness with intact upper extremity function.
Strength training program by Sven Knipphals — Der Chiro, Leipzig · Chiropraktik · Training · Gesundheit
Volbedingstr. 2, 04357 Leipzig
Patient SMA-III-001 | Anonymized | 3.7M health records | 9yr Apple Health + clinical data

Lab-OS

EXPLORATORY

Phase 10.4 — Open-source experiment design automation. 8 standardized SMA assays with timeline and protocol specifications. 3 cloud lab integrations (Emerald Cloud Lab, Strateos, Opentrons). Generates complete experiment designs from hypothesis text.

SMA Assay Library

Assay | Category | Readout | Timeline | Cost | Throughput

Cloud Lab Integrations

Federated Learning

EXPLORATORY

Phase 10.5 — Zero-knowledge data sharing framework for SMA research. Enables cross-institutional collaboration without sharing raw patient data. Federated learning protocols, OMOP/OHDSI data model mapping, privacy budget calculator, and 4-tier data sharing framework.

Federated Learning Protocols

Protocol | Algorithm | Use Case | Participants | Utility | Privacy

Data Sharing Tiers

OMOP/OHDSI Mappings

SMA Concept | OMOP Domain | Concept Name | Vocabulary | Notes

Translation & Impact

EXPLORATORY

Phase 11 — Translating platform discoveries into real-world impact. Regulatory pathway mapping (FDA/EMA), grant application templates, and a 5-level hypothesis validation pipeline from computational validation to IND filing.

Regulatory Pathways

Pathway | Agency | Designation | Timeline | SMA Drugs | Relevance

Grant Templates

Validation Pipeline

Level | Name | Assays | Timeline | Go/No-Go

GPU Computational Results

COMPUTATIONAL

Gold-standard computational predictions from DiffDock, SpliceAI, ESM-2, and Cas-OFFinder. Every result is traceable to its tool version, parameters, and input data. View GPU scripts on GitHub →

GitHub: PDBs, CSVs, Logs → Dropbox: MD Trajectories (40GB) → 4-AP Campaign Data →

Computational Results Overview

Results from RFdiffusion binder design, ProteinMPNN sequence design, ESMFold structure validation, MolMIM/GenMol molecule generation, and DiffDock docking campaigns. All data stored in PostgreSQL and queryable via REST API. Click any card to view details.

How to read these results: Every result on this page is a computational prediction, not an experimental measurement. Predictions narrow down which experiments to run first. A positive DiffDock confidence score suggests a compound may bind a target, but this must be confirmed in a binding assay. A high-pLDDT structure is likely accurate, but should be validated experimentally (for example by X-ray crystallography) before it is relied on for drug design. Use these results to prioritize wet-lab experiments, not as final proof of drug activity.
Positive = Strong computational support · Moderate = Warrants further investigation · Weak = Low priority for follow-up

Contact

Questions about the platform, data, or collaboration? Send us a message.

SMA Research Platform
Evidence graph for Spinal Muscular Atrophy research.

Maintained by
Christian Fischer / Bryzant Labs
Leipzig, Germany

Email
bryzant@icloud.com

API
REST API Documentation · Research Links

News & Discoveries

Research highlights, computational discoveries, and platform updates. Each post documents a specific finding with full methodology and source citations. Comment on findings and join the discussion.

Protein Structures

STRUCTURAL

Predicted 3D structures for all SMA research targets using AlphaFold2, ESMFold, and Boltz-2. Each structure is scored by pLDDT (predicted Local Distance Difference Test), a per-residue confidence metric that tells you how reliable each part of the structure is for drug design.

pLDDT Confidence Interpretation
pLDDT ≥ 90: Very high confidence
Backbone and side-chains are well-modeled. Suitable for docking, pocket detection, and structure-based drug design.
pLDDT 70-90: Confident
Backbone is reliable. Side-chain orientations may vary. Usable for initial virtual screening.
pLDDT 50-70: Low confidence
Structure is unreliable. Often corresponds to flexible loops or poorly conserved regions.
pLDDT < 50: Very low
Likely intrinsically disordered. Do NOT use for docking or drug design. May still have biological function.
Why this matters for SMA drug design: 3D protein structure directly determines which pockets small molecules can bind. AlphaFold and ESMFold provide predicted models for nearly all human proteins, although accuracy varies by region, as the pLDDT bands above indicate. Use the "3D" button to visualize structures interactively. Structures with experimental PDB entries should be preferred when available.
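
The pLDDT bands translate into a simple per-residue check. The sketch below is illustrative: `plddt_band` and `usable_fraction` are hypothetical helpers, the per-residue values are invented, and using 70 as the minimum for screening follows the "usable for initial virtual screening" band above.

```python
def plddt_band(plddt: float) -> str:
    """Map a per-residue pLDDT value (0-100) to its confidence band."""
    if plddt >= 90:
        return "very high"
    if plddt >= 70:
        return "confident"
    if plddt >= 50:
        return "low"
    return "very low"


def usable_fraction(per_residue_plddt, min_plddt: float = 70.0) -> float:
    """Fraction of residues at or above the minimum pLDDT for screening."""
    if not per_residue_plddt:
        raise ValueError("per_residue_plddt must not be empty")
    usable = sum(value >= min_plddt for value in per_residue_plddt)
    return usable / len(per_residue_plddt)


if __name__ == "__main__":
    scores = [95.0, 82.0, 65.0, 40.0]  # invented per-residue values
    print([plddt_band(s) for s in scores], usable_fraction(scores))
```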

Structures predicted via AlphaFold DB v6 (EMBL-EBI), ESMFold v1, and Boltz-2 (MIT and Recursion). Method badges indicate prediction source. pLDDT scores from predicted structures. MW estimates: ~110 Da per residue. Pre-existing PDB structures retain original experimental resolution.

Symbol | UniProt | Source | pLDDT | Residues | Druggability | Binders | Molecules | Links

Druggable Pockets

STRUCTURAL

P2Rank-predicted binding pockets across SMA target protein structures. Identifies the cavities on protein surfaces where small molecules can bind — the first step in structure-based drug design.

Pocket Druggability Interpretation P2Rank 2.5.1 (Krivak & Hoksza, J. Cheminf. 2018) uses a random forest classifier trained on experimental protein-ligand complexes to predict binding pockets from protein surface features. Each pocket receives two scores:
Pocket Score (0-100+)
>50 = Well-defined, deep cavity suitable for small molecules
20-50 = Moderate cavity, may require fragment-based approaches
<20 = Shallow or solvent-exposed, poor drug target
Druggability Probability (0-1)
≥0.8 This pocket is targetable by small molecules
0.5-0.8 Possible target, needs optimized ligand design
<0.5 This pocket is too shallow or exposed for standard drugs
What to do next: Proteins with druggable pockets (score >50, probability ≥0.8) should be submitted for virtual screening with DiffDock. Expand any row to see individual pocket residue counts, SAS points, and 3D center coordinates. Cross-reference pocket residues with known drug binding sites from the PDB.

Pocket predictions via P2Rank 2.5.1 with AlphaFold-optimized configuration. Druggable flag requires score > 50 AND probability ≥ 0.8. SAS points = Solvent Accessible Surface (Connolly) points defining the pocket boundary.

Protein | Pockets Found | Top Score | Top Probability | Druggable | Details

ADMET Properties

PHARMACOLOGY

ADMET (Absorption, Distribution, Metabolism, Excretion, Toxicity) predictions for 21,000+ compounds across all SMA targets. Compounds are scored for drug-likeness (QED), blood-brain barrier permeability (BBB), CNS Multi-Parameter Optimization (MPO), Lipinski Rule-of-Five compliance, and physicochemical properties (MW, LogP, TPSA, HBD, HBA). Filter by properties to identify the most promising CNS-penetrant drug candidates for SMA.

Compound | Target | QED | TPSA | MW | LogP | BBB | CNS MPO | Lipinski

Cross-Paper Synthesis

COMPUTATIONAL

Non-obvious connections across thousands of curated claims from different papers — the platform's core differentiator. While individual papers report isolated findings, cross-paper synthesis reveals hidden patterns: targets that co-occur in unrelated studies, shared mechanisms between seemingly independent pathways, and transitive bridges (if Paper A links X→Y and Paper B links Y→Z, the platform discovers X→Z). The analysis builds a co-occurrence matrix across all claims, identifies statistically significant target pairs, and generates synthesis cards that explain the biological connection with full citation trails. This is how the platform discovered the ROCK-cofilin-actin rod pathway as a therapeutic axis — no single paper described the complete pathway, but the synthesis engine connected findings from 12+ independent publications.
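
The transitive-bridge idea can be sketched as follows. This is an illustration, not the synthesis engine itself: `transitive_bridges` is a hypothetical function, links are assumed to be simple (paper, source, target) triples, and the paper identifiers are placeholders. It omits the co-occurrence statistics and significance testing described above.

```python
from collections import defaultdict


def transitive_bridges(links):
    """Find X -> Y -> Z chains where no paper links X -> Z directly.

    `links` is an iterable of (paper_id, source, target) triples. A bridge is
    reported only when the two hops are supported by at least two different papers.
    """
    papers_by_edge = defaultdict(set)
    for paper, source, target in links:
        papers_by_edge[(source, target)].add(paper)

    successors = defaultdict(set)
    for source, target in papers_by_edge:
        successors[source].add(target)

    bridges = set()
    for (x, y), papers_xy in papers_by_edge.items():
        for z in successors.get(y, ()):
            if z == x or (x, z) in papers_by_edge:
                continue  # skip loops and links already reported directly
            if len(papers_xy | papers_by_edge[(y, z)]) >= 2:
                bridges.add((x, y, z))
    return sorted(bridges)


if __name__ == "__main__":
    demo = [
        ("paper_A", "ROCK", "LIMK2"),  # placeholder paper identifiers
        ("paper_B", "LIMK2", "CFL2"),
    ]
    print(transitive_bridges(demo))
```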

Target A | Target B | Shared Papers | Avg Confidence | Score

Synergy Predictions

HYPOTHESIS

AI-predicted drug-target synergy scores combining docking affinity, literature evidence, pathway overlap, and claim support. Identifies the most promising multi-mechanism therapeutic combinations for SMA.
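
One simple way to combine four component scores into a single number is a weighted mean. The sketch below is purely illustrative: the equal weights, the [0, 1] normalization, and the function name are assumptions, and the platform's actual scoring formula is not documented in this section.

```python
# Hypothetical equal weights; the platform's real weighting is not stated.
WEIGHTS = {"docking": 0.25, "literature": 0.25, "pathway": 0.25, "claims": 0.25}


def synergy_score(components: dict, weights: dict = WEIGHTS) -> float:
    """Weighted mean of component scores, each assumed normalized to [0, 1]."""
    total_weight = sum(weights.values())
    return sum(weights[k] * components[k] for k in weights) / total_weight


# Hypothetical component scores for one drug-target pair.
example = {"docking": 0.8, "literature": 0.6, "pathway": 0.4, "claims": 0.2}
print(round(synergy_score(example), 3))
```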

Drug | Target | Synergy Score | Docking | Literature | Pathway | Claims

DiffDock v2.2 Molecular Docking

COMPUTATIONAL

DiffDock v2.2 docking predictions. Extended campaign: 224 dockings across 8 targets (ROCK2, MAPK14, LIMK1, SARM1, and more), plus 378-compound batch screen. View protein binders and AI-generated molecules in GPU Results →

# | Compound | Target | Confidence | Binding Energy | Pose Rank | Status

Scientific Advisory Pack

Auto-generated comprehensive research summary for external collaborators, professors, and grant reviewers.

Platform Analytics

Real-time summary of platform capabilities, evidence depth, and research progress.

Platform Growth

What this platform has computed since launch. Live numbers from the database, factual milestones, and infrastructure used.

Today's Numbers

Growth Timeline

Pipeline Stats

Computational Resources Used

API Guide for Researchers

Query our evidence graph programmatically. No authentication is required for read access. All endpoints are served under /api/v2 and return JSON unless an export format (fmt) is requested.

Quick Start — 3 Commands

# 1. Platform overview
curl -s https://sma-research.info/api/v2/stats | python3 -m json.tool

# 2. Ranked molecular targets
curl -s "https://sma-research.info/api/v2/scores?mode=discovery" | python3 -m json.tool

# 3. Search drug efficacy claims
curl -s "https://sma-research.info/api/v2/claims?claim_type=drug_efficacy&limit=10" | python3 -m json.tool
Swagger UI (Interactive) | ReDoc Reference | OpenAPI JSON

Core Data Endpoints

GET /stats
Platform overview counts for all major tables
curl -s https://sma-research.info/api/v2/stats
GET /targets
All molecular targets. Params: target_type, limit (1-2000), offset
curl -s ".../targets?target_type=gene&limit=200"
GET /targets/symbol/{symbol}
Single target by gene symbol (e.g., ROCK2, LIMK1, SMN2)
curl -s ".../targets/symbol/ROCK2"
GET /targets/{id}/deep-dive
Full target view: claims, hypotheses, drugs, trials, network edges
GET /claims
Search claims. Params: claim_type, confidence_min, target, q, enriched
curl -s ".../claims?claim_type=drug_efficacy&confidence_min=0.8&enriched=true"
GET /hypotheses
Ranked hypotheses. Params: status, limit, offset
curl -s ".../hypotheses?limit=20"
GET /scores
7-dimension target prioritization. Params: mode (discovery|clinical), min_score
curl -s ".../scores?mode=discovery"
GET /drugs
Drugs and therapies. Params: approval_status, drug_type
curl -s ".../drugs?approval_status=approved"
GET /trials
Clinical trials from ClinicalTrials.gov
GET /sources
PubMed literature sources. Params: source_type, limit, offset
GET /news
Research highlights and discoveries. Also: /news/rss for RSS feed
GET /search
Semantic + keyword hybrid search. Params: q, mode (semantic|keyword|hybrid)
curl -s ".../search?q=ROCK+inhibitor&mode=hybrid"
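
Several of the endpoints above accept limit and offset. The sketch below shows one way to build paginated request URLs with the standard library; the helper names are hypothetical, and the total row count is assumed to be known in advance (for example from /stats).

```python
from urllib.parse import urlencode

BASE = "https://sma-research.info/api/v2"


def build_url(path: str, **params) -> str:
    """Join BASE, an endpoint path, and query parameters, skipping None values."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE}{path}?{query}" if query else f"{BASE}{path}"


def page_urls(path: str, total: int, limit: int, **params):
    """Yield one URL per page, stepping offset by limit."""
    for offset in range(0, total, limit):
        yield build_url(path, limit=limit, offset=offset, **params)


# total=450 is an illustrative figure, not a real count.
urls = list(page_urls("/targets", total=450, limit=200, target_type="gene"))
for u in urls:
    print(u)
```

Each URL can then be fetched with curl or requests.get as in the other examples.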

Computational Biology

GET /structures
Predicted protein structures with pLDDT scores. Params: symbol, min_plddt
GET /pockets, /pockets/druggable
Binding pockets from fpocket analysis. Filter by symbol
GET /splice/predict?variant=c.6T>C
SMN2 splice variant effect prediction. Also: /splice/known-variants, /splice/elements
GET /molecules/browser
AI-designed molecules (GenMol). Params: target, bbb_only, min_qed
GET /dock/score
Pharmacophore scoring against 7 binding pockets. Params: pocket, limit
GET /interactions/target/{symbol}
Protein-protein and drug-target interaction network for a gene
GET /cascade/predict
Predict downstream signaling cascade effects. Params: gene, perturbation
GET /screen/dual-target
Dual-target screening candidates and synergy predictions
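
Variant strings such as c.6T>C contain a ">" character, which a shell treats as a redirect when unquoted and which should be percent-encoded in a URL. A minimal sketch of building the /splice/predict URL safely; the helper name is hypothetical.

```python
from urllib.parse import urlencode

BASE = "https://sma-research.info/api/v2"


def splice_predict_url(variant: str) -> str:
    """Build the /splice/predict URL with the variant percent-encoded."""
    return f"{BASE}/splice/predict?{urlencode({'variant': variant})}"


print(splice_predict_url("c.6T>C"))
# -> https://sma-research.info/api/v2/splice/predict?variant=c.6T%3EC
```

When using requests, passing params={"variant": "c.6T>C"} performs the same encoding automatically.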

Data Export

GET /export/{table}?fmt=csv
Bulk download as CSV or JSON. Tables: targets, drugs, trials, claims, hypotheses, graph_edges, drug_outcomes, cross_species_targets, target_scores, molecule_screenings
curl -s ".../export/claims?fmt=csv&limit=5000" -o sma_claims.csv
GET /export/target/{symbol}?fmt=bibtex
Export all evidence for a target as JSON, CSV, or BibTeX citations
curl -s ".../export/target/ROCK2?fmt=bibtex"
GET /molecules/browser/export?fmt=sdf
Download AI-designed molecules as SDF (for cheminformatics tools) or CSV
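
Once a CSV export has been downloaded, it can be processed with the standard library alone. The sketch below parses CSV text into dictionaries and filters on a numeric column; the column names (symbol, composite_score) and the sample rows are assumptions for illustration, so check the header of the actual export.

```python
import csv
import io


def parse_export(csv_text: str) -> list:
    """Parse CSV text (header row required) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


# Hypothetical export content, not real platform data.
sample = "symbol,composite_score\nAAA,0.91\nBBB,0.47\n"
rows = parse_export(sample)

# csv yields strings, so numeric columns need explicit conversion.
high = [r["symbol"] for r in rows if float(r["composite_score"]) >= 0.5]
print(high)
```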

Claim Type Reference

gene_expression protein_interaction pathway_membership drug_target drug_efficacy biomarker splicing_event neuroprotection motor_function survival safety functional_interaction other
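
A client can validate claim_type against the list above before sending a request. This is a sketch: the helper name is hypothetical, and the 0 to 1 range check on confidence_min is an assumption based on the 0.8 value used in the examples.

```python
CLAIM_TYPES = frozenset(
    """gene_expression protein_interaction pathway_membership drug_target
    drug_efficacy biomarker splicing_event neuroprotection motor_function
    survival safety functional_interaction other""".split()
)


def claims_params(claim_type: str, confidence_min: float = 0.0, limit: int = 100) -> dict:
    """Return query parameters for GET /claims after basic validation."""
    if claim_type not in CLAIM_TYPES:
        raise ValueError(f"unknown claim_type: {claim_type!r}")
    if not 0.0 <= confidence_min <= 1.0:  # assumed range
        raise ValueError("confidence_min must be between 0 and 1")
    return {"claim_type": claim_type, "confidence_min": confidence_min, "limit": limit}


params = claims_params("drug_efficacy", confidence_min=0.8, limit=10)
print(params)
```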

Python Example

import requests

BASE = "https://sma-research.info/api/v2"

# Get scored and ranked targets
scores = requests.get(f"{BASE}/scores", params={"mode": "discovery"}).json()

for t in scores[:10]:
    print(f"{t['symbol']:10s} score={t['composite_score']:.3f}")

# Search high-confidence drug efficacy claims
# (requests serializes True as "True"; pass "true" to match the curl example)
claims = requests.get(f"{BASE}/claims", params={
    "claim_type": "drug_efficacy",
    "confidence_min": 0.8,
    "enriched": "true",
    "limit": 100
}).json()

for c in claims:
    print(f"[{c['confidence']:.2f}] {c['predicate'][:80]}")

# Deep-dive: full evidence for a target
target = requests.get(f"{BASE}/targets/symbol/ROCK2").json()
dive = requests.get(f"{BASE}/targets/{target['id']}/deep-dive").json()
print(f"Claims: {len(dive['claims'])}, Hypotheses: {len(dive['hypotheses'])}")

R Example

library(httr)
library(jsonlite)

base_url <- "https://sma-research.info/api/v2"

# All targets with discovery-mode scores
scores <- fromJSON(content(
  GET(paste0(base_url, "/scores"), query = list(mode = "discovery")),
  as = "text", encoding = "UTF-8"
))

# Top 10 by composite score
top10 <- head(scores[order(-scores$composite_score), ], 10)
print(top10[, c("symbol", "composite_score")])

# Export as CSV
resp <- GET(paste0(base_url, "/export/targets"), query = list(fmt = "csv", limit = 5000))
stop_for_status(resp)  # fail early on HTTP errors
writeLines(content(resp, as = "text", encoding = "UTF-8"), "sma_targets.csv")

Rate Limits & Access

No authentication is required for any GET endpoint.
No formal rate limiting is enforced, but please stay under ~10 req/sec sustained.
CORS is restricted to sma-research.info. Use server-side calls or curl from other domains.
Bulk downloads: Use /export endpoints instead of paginating through /claims.
Write access (POST/PUT) requires an admin API key. Contact christian@bryzant.com if needed.
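
To stay under the requested ~10 req/sec, a script can space its calls on the client side. The sketch below is one simple single-threaded approach; the `Throttle` class is hypothetical and not part of any platform SDK. The clock and sleep functions are injectable so the logic can be checked without real delays.

```python
import time


class Throttle:
    """Enforce a minimum interval between calls (client-side, single-threaded)."""

    def __init__(self, max_per_sec: float = 10.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_sec
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time at which the previous call was released

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay


# Usage sketch: call throttle.wait() before each request.
# throttle = Throttle(max_per_sec=5)
# for url in urls:
#     throttle.wait()
#     resp = requests.get(url, timeout=30)
```

For bulk data, the /export endpoints remain the better choice, since one export request replaces many paginated calls.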

Citation

If you use data from this platform, please cite:

Fischer, C. (2026). SMA Research Platform — Open Evidence Graph
for Spinal Muscular Atrophy. https://sma-research.info
Bryzant Labs. Accessed [date].
BibTeX
@misc{fischer2026sma,
  author = {Fischer, Christian},
  title = {{SMA Research Platform --- Open Evidence Graph for SMA}},
  year = {2026},
  url = {https://sma-research.info},
  note = {Accessed: 2026-03-25}
}

Full documentation: Swagger UI | ReDoc | Last updated: 2026-03-25

Latest Claims

Recent evidence claims with source links, quality scores, evidence level, and tissue context. Click paper titles to view on PubMed.
