Skip to content

User guide

This guide shows how to use LLM-PathwayCurator on your own enrichment results.


1) Create an EvidenceTable

Recommended: Use a built-in adapter to generate evidence_table.tsv. See the adapter docs

You can generate an EvidenceTable via: - adapters (recommended), or - manual TSV export if your pipeline already has term × genes.

Minimum required columns - term_id, term_name, source, stat, qval, direction, evidence_genes

Notes - evidence_genes should be a delimiter-joined list (tool accepts common delimiters; canonical export uses ;). - ORA often has direction=na. Rank-based EA may have up/down.


2) Create a Sample Card

A Sample Card is structured study context. Keep it explicit and minimal: - condition / disease - tissue - perturbation - comparison

Use the schema documented in the package docs (and examples).


3) Run the pipeline

llm-pathway-curator run \
  --sample-card sample_card.json \
  --evidence-table evidence_table.tsv \
  --out out/run1/

4) Read outputs

audit_log.tsv

Contains:

  • decision: PASS / ABSTAIN / FAIL
  • reason codes (stable, finite set)
  • pointers to evidence identities

report.md / report.jsonl

Decision objects for downstream consumption:

  • typed claim fields
  • evidence links (term/module identifiers + hashes)
  • audit outcome and reason codes
  • provenance metadata

Optional: rank & visualize (rank / plot-ranked)

If you want a single ranked table and paper-ready plots (bars / packed circles), use:

  • llm-pathway-curator rank → generates a ranked table (typically claims_ranked.tsv)
  • llm-pathway-curator plot-ranked → renders ranked terms/modules from claims_ranked.tsv (recommended) or audit_log.tsv

A) Rank (produce claims_ranked.tsv)

Run rank on an existing run output directory (the directory that contains audit_log.tsv, run_meta.json, etc.).

llm-pathway-curator rank --help
# Use --help to see the supported inputs and output path options.
````

### B) Plot ranked results (bars / packed circles)

`plot-ranked` can auto-detect inputs under `--run-dir`.
Packed circles require an extra dependency:

```bash
python -m pip install circlify

Bars (Metascape-like)

llm-pathway-curator plot-ranked \
  --mode bars \
  --run-dir out/run1 \
  --out-png out/run1/plots/ranked_bars.png \
  --decision PASS \
  --group-by-module \
  --left-strip \
  --strip-labels \
  --bar-color-mode module

Packed circles (modules → terms)

llm-pathway-curator plot-ranked \
  --mode packed \
  --run-dir out/run1 \
  --out-png out/run1/plots/ranked_packed.png \
  --decision PASS \
  --term-color-mode module

Packed circles (direction shading)

llm-pathway-curator plot-ranked \
  --mode packed \
  --run-dir out/run1 \
  --out-png out/run1/plots/ranked_packed.direction.png \
  --decision PASS \
  --term-color-mode direction

Tip (side-by-side layout): plot-ranked uses a stable module_id → M## display rank and stable module colors, so bars and packed circles can be placed next to each other without label/color drift.


5) Tune conservativeness (τ)

τ controls the stability gate operating point. Conceptually:

  • low τ: higher coverage, potentially higher risk
  • high τ: lower coverage, more abstention

Use τ sweeps for analysis; lock a τ for deployment.


6) Optional: enable proposal-only LLM

When enabled, the LLM can:

  • choose context-consistent representatives
  • emit schema-bounded typed claims

It must never:

  • invent evidence
  • output free text as “evidence”
  • decide PASS/ABSTAIN/FAIL

All decisions remain mechanical and are logged.


7) Reproducibility checklist

  • pin tool version (tag / release)
  • record run_meta.json
  • archive inputs (EvidenceTable + Sample Card)
  • prefer Docker / pinned environment for paper matching

Notes

  • For the underlying design, see Concepts.
  • For deterministic reproduction (benchmarks/figures/Source Data), follow paper/README.md.