Literature
VALIDATED DATA
PubMed papers and patent literature are ingested by an automated daily pipeline, and each is scanned by AI (Gemini Flash, with Groq as fallback) for structured claims about SMA biology, molecular targets, and therapeutic approaches. The pipeline runs daily at 03:00 UTC and queries PubMed, bioRxiv/medRxiv preprints, ClinicalTrials.gov, and Google Patents. Each abstract passes a two-layer quality filter: first an SMA-relevance gate (the abstract must mention SMA, SMN, motor neuron, or an approved therapy name), then a post-extraction quality gate that rejects claims about unrelated diseases. Sources are linked to their extracted claims; use the “With claims” filter to see which papers have been processed.
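As a rough illustration of the first filter layer, the SMA-relevance gate can be thought of as a keyword check on the abstract. The sketch below is not the pipeline's actual code: the function name `passes_relevance_gate` and the exact term list are assumptions chosen to match the description above.

```python
import re

# Hypothetical term list: the real pipeline's keywords are not shown on this page.
# The three drug names are approved SMA therapies, used here as examples.
SMA_TERMS = [
    r"\bSMA\b",
    r"spinal muscular atrophy",
    r"\bSMN[12]?\b",
    r"motor neurons?",
    r"nusinersen",
    r"risdiplam",
    r"onasemnogene",
]
_RELEVANCE_PATTERN = re.compile("|".join(SMA_TERMS), re.IGNORECASE)


def passes_relevance_gate(abstract: str) -> bool:
    """Return True if the abstract mentions at least one SMA-related term."""
    return bool(_RELEVANCE_PATTERN.search(abstract))
```

An abstract that fails this check would be dropped before claim extraction; the second, post-extraction gate is not sketched here because its rules are not described in enough detail.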
How does Paper Quality scoring work?
Two-layer quality assessment: Every paper receives both a quantitative score (automated metrics) and a qualitative red-flag analysis (AI-powered logic checks).
Layer 1 — Quantitative Score (0-100): Computed automatically via OpenAlex and Crossref APIs.
- Citation impact (30 pts) — log-scaled, capped at 500 citations
- Author h-index (20 pts) — best h-index from first/last author
- Journal presence (15 pts) — peer-reviewed journal = 15, preprint = 5
- Recency (15 pts) — papers <3 years = full, decays over time
- Collaboration (10 pts) — multi-author studies score higher
- Retraction check — retracted papers score 0 (via Crossref)
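The components above could be combined roughly as follows. This is a sketch under several assumptions, not the code in source_quality.py: the log base, the h-index cap of 50, the recency decay of one point per year after year 3, and the collaboration rule of 2 points per co-author are all hypothetical. Only the point weights, the 500-citation cap, the journal/preprint values, and the retraction rule come from the list. Note that the listed weights add up to 90, so this sketch tops out at 90.

```python
import math


def quantitative_score(
    citations: int,
    best_h_index: int,
    is_peer_reviewed: bool,
    age_years: float,
    n_authors: int,
    retracted: bool,
) -> float:
    """Illustrative Layer 1 score built from the listed components."""
    if retracted:
        return 0.0  # retracted papers score 0

    # Citation impact (30 pts): log-scaled, capped at 500 citations.
    capped = min(max(citations, 0), 500)
    citation_pts = 30 * math.log1p(capped) / math.log1p(500)

    # Author h-index (20 pts): assumed linear up to a hypothetical cap of 50.
    h_pts = 20 * min(max(best_h_index, 0), 50) / 50

    # Journal presence (15 pts): peer-reviewed journal = 15, preprint = 5.
    journal_pts = 15 if is_peer_reviewed else 5

    # Recency (15 pts): full under 3 years, then an assumed 1 pt/year decay.
    recency_pts = 15 if age_years < 3 else max(0.0, 15 - (age_years - 3))

    # Collaboration (10 pts): assumed 2 pts per co-author, capped at 10.
    collab_pts = min(10, 2 * max(n_authors - 1, 0))

    return citation_pts + h_pts + journal_pts + recency_pts + collab_pts
```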
Layer 2 — Qualitative Red Flags: Each abstract is analyzed by both rule-based pattern matching and an LLM (Gemini Flash) for scientific quality issues.
- Logical contradictions — conclusion directly contradicts the presented data (e.g. claims muscle-specific effect but measures only spinal cord)
- Vehicle/control improvement — control group shows same improvement as treatment, undermining the drug effect
- No blinding — only flagged for drug intervention studies, not basic neuroscience
- Dose-response paradox — higher doses show less effect without explanation
- Small sample size — fewer than 5 subjects per experimental group
- No independent replication — strong therapeutic claims >10 years old with no clinical follow-up
- Overselling language — words like ‘dramatic’, ‘remarkable’ when actual numbers are modest
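The rule-based half of this layer can be pictured as a set of regular-expression checks. The sketch below covers only two of the flags and is an assumption about how such rules might look, not the contents of paper_red_flags.py; the pattern names and flag identifiers are hypothetical. A regex can only surface candidates for overselling language: deciding whether the reported numbers are actually modest presumably falls to the LLM pass.

```python
import re

# Hypothetical patterns for two of the red flags listed above.
OVERSELL_WORDS = re.compile(r"\b(dramatic(?:ally)?|remarkabl[ey])\b", re.IGNORECASE)
SAMPLE_SIZE = re.compile(r"\bn\s*=\s*(\d+)", re.IGNORECASE)


def rule_based_flags(abstract: str) -> list[str]:
    """Return candidate red flags found by simple pattern matching."""
    flags = []

    # Small sample size: any reported group size below 5.
    sizes = [int(n) for n in SAMPLE_SIZE.findall(abstract)]
    if sizes and min(sizes) < 5:
        flags.append("small_sample_size")

    # Overselling language: candidate only, the effect size is not checked here.
    if OVERSELL_WORDS.search(abstract):
        flags.append("overselling_language")

    return flags
```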
Composite Score:
Penalty = (High-severity flags × 15) + (Medium × 5) + (Low × 2)
Adjusted Score = max(5, Quantitative Score - Penalty)
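The two formulas above translate directly into a few lines of code. The function and variable names here are illustrative, and the severity labels `"high"`, `"medium"`, and `"low"` are assumed identifiers. How the floor of 5 interacts with retracted papers (which score 0) is not stated, so the sketch does not handle that case.

```python
# Penalty points per flag severity, taken from the formula above.
SEVERITY_PENALTY = {"high": 15, "medium": 5, "low": 2}


def adjusted_score(quantitative_score: float, flag_severities: list[str]) -> float:
    """Subtract the red-flag penalty from the Layer 1 score, with a floor of 5."""
    penalty = sum(SEVERITY_PENALTY[severity] for severity in flag_severities)
    return max(5, quantitative_score - penalty)
```

For example, a paper with a quantitative score of 72 and one flag of each severity has a penalty of 15 + 5 + 2 = 22 and an adjusted score of 50. A paper scoring 20 with two high-severity flags would drop to -10, so the floor returns 5.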
Calibration principles:
- Basic science (electrophysiology, histology, imaging) does NOT require blinding — only drug studies
- Normal scientific reasoning is NOT a logic flag — only clear contradictions
- Review papers should have very few flags
- Conservative flagging: when in doubt, do not flag
The quality scoring methodology is fully open source: paper_red_flags.py | source_quality.py