Skip to content

Concepts

LLM-PathwayCurator is an interpretation QA layer for enrichment analysis (EA).
It transforms EA outputs into audited, decision-grade claims.


What it is (and is not)

It is: - a framework to convert term lists → typed, evidence-linked claims - a mechanical audit suite producing PASS/ABSTAIN/FAIL with reason codes - a way to tune conservativeness via risk–coverage (abstention is a feature)

It is not: - a new enrichment statistic - a free-text summarizer - a biological truth oracle (it audits internal consistency and evidence linkage)


Objects

EvidenceTable (term × gene contract)

One row = one enriched term with explicit supporting genes.
This enables: - term–term overlap (e.g., Jaccard) - term–gene bipartite graph construction - evidence factorization (modules) - stable evidence linkage (hashable gene sets)

Sample Card (study context contract)

A structured record of study intent and context (e.g., condition/tissue/perturbation/comparison).
Used for: - context-conditioned representative selection - context validity gates - context stress tests (e.g., context swap)

Claim (typed JSON; evidence-linked)

A claim is a decision object, not prose.
It must contain resolvable references: - term_id / module_id - supporting-gene set identity (hash) - typed fields (schema-bounded)

Module IDs vs display ranks (M##)

  • module_id is the stable identifier produced by the tool and referenced by downstream artifacts.
  • M01, M02, ... are display ranks (human-facing labels) used for visualization and layout. They must be consistent across plots but should not be treated as stable IDs.

Pipeline responsibilities (A → B → C)

A) Distill (stability distillation; “evidence hygiene”)

  • supporting-gene perturbations (seeded dropout / jitter)
  • survival-like stability proxies (LOO/jackknife, optional extras)
  • does not decide PASS/ABSTAIN/FAIL

B) Modules (evidence factorization)

  • build term–gene graph
  • extract evidence modules (shared vs distinct support)
  • attach module ids / summarize structure
  • does not decide PASS/ABSTAIN/FAIL

C) Claims → Audit → Report

C1 (proposal): select representatives + type claims (LLM optional)
C2 (audit): mechanical gates assign PASS/ABSTAIN/FAIL + reason codes
C3 (report): decision-grade report + provenance

D) Ranked views (presentation utilities)

These steps do not change evidence identity or decisions. They produce ranked summaries and plots for humans.

  • rank: derives a ranked table (e.g., claims_ranked.tsv) for inspection/plotting.
  • plot-ranked: renders Metascape-like bars or packed circles from claims_ranked.tsv (recommended) or audit_log.tsv (fallback).

Decisions

PASS / ABSTAIN / FAIL

  • FAIL: auditable violations (evidence drift, contradictions, schema violations)
  • ABSTAIN: under-supported / unstable / context-nonspecific / stress-inconclusive
  • PASS: survives the predefined gate suite

τ (stability threshold) as an operating point

Higher τ → more conservative (more ABSTAIN, less PASS).
This enables a risk–coverage trade-off.


Stress tests (internal counterfactuals)

Stress tests are specification-driven perturbations (no external knowledge): - context swap: swap Sample Card context keys - evidence dropout: remove supporting genes with probability p

Expected outcome: coverage should decrease and ABSTAIN reasons should shift in a stress-specific way.


Next