API reference¶

This page documents the public surface of LLM-PathwayCurator.
Most users should start with the CLI (llm-pathway-curator run ...). The Python API exists for integration and reproducible orchestration.

CLI¶

Primary entry point: - llm-pathway-curator run ...

The CLI runs the end-to-end pipeline:

EvidenceTable → distill → modules → claims → audit → report

Pipeline¶

End-to-end orchestration (recommended integration point).

llm_pathway_curator.pipeline ¶

RunConfig `dataclass` ¶

RunConfig(
    evidence_table,
    sample_card,
    outdir,
    force=False,
    seed=None,
    run_meta_name="run_meta.json",
    tau=None,
    k_claims=None,
    stress_evidence_dropout_p=None,
    stress_evidence_dropout_min_keep=None,
    stress_contradictory_p=None,
    stress_contradictory_max_extra=None,
)

Pipeline run configuration.

Parameters:

evidence_table (str) –

Path to the input EvidenceTable TSV.
sample_card (str) –

Path to the SampleCard JSON.
outdir (str) –

Output directory path.
force (bool, default: False ) –

If True, allow writing into a non-empty outdir.
seed (int | None, default: None ) –

Random seed used for deterministic steps.
run_meta_name (str, default: 'run_meta.json' ) –

File name for run metadata JSON written under outdir.
tau (float | None, default: None ) –

Optional override for audit threshold tau. If None, uses card.audit_tau().
k_claims (int | None, default: None ) –

Optional override for number of claims to propose.
stress_evidence_dropout_p (float | None, default: None ) –

Probability for evidence gene dropout stress test.
stress_evidence_dropout_min_keep (int | None, default: None ) –

Minimum number of genes to keep per term under dropout stress.
stress_contradictory_p (float | None, default: None ) –

Probability to inject contradictory direction claims.
stress_contradictory_max_extra (int | None, default: None ) –

Cap for number of injected contradictory rows.

Notes

This config is designed to be JSON-serializable via dataclasses.asdict.

run_pipeline ¶

run_pipeline(cfg, *, run_id=None)

Run the full LLM-PathwayCurator pipeline.

Parameters:

cfg (RunConfig) –

Run configuration.
run_id (str | None, default: None ) –

Optional explicit run id. If None, a run id is generated.

Returns:

RunResult –

Summary of the run, including artifact paths and meta_path.

Raises:

FileNotFoundError –

If required input files are missing.
IsADirectoryError –

If a required input path is a directory.
FileExistsError –

If outdir is non-empty and cfg.force is False.
RuntimeError –

If a required step produces zero rows.
Exception –

Any exception raised by underlying steps is propagated after writing run_meta status="error".

Notes

Step order: distill -> modules -> select_claims -> context_review -> stress -> audit -> report -> report_jsonl.

Artifacts and run metadata are written into cfg.outdir. The run_meta.json is updated at each step to support reproducibility and debugging.

Environment variables

Many behaviors can be controlled via env vars, including: - Backend and modes: LLMPATH_BACKEND, LLMPATH_CLAIM_MODE - Context: LLMPATH_CONTEXT_ (gate/review/corpus/weights/rerank) - Stress: LLMPATH_STRESS_ (dropout/contradictory)

Contracts¶

EvidenceTable (TSV contract)¶

EvidenceTable is the normalized term × supporting-genes table used by all downstream stages. It is the stability boundary: if the EvidenceTable is valid, distill/modules/select/audit/report should not break.

llm_pathway_curator.schema ¶

EvidenceTable schema gate for LLM-PathwayCurator. This module defines the tool-facing EvidenceTable contract (v1) that preserves term×gene relationships across enrichment analysis tools (ORA, fgsea/GSEA, etc.). It provides robust IO, conservative column aliasing, spec-owned evidence parsing (delegated to _shared), and provenance metadata (df.attrs) for auditability.

EvidenceTable `dataclass` ¶

EvidenceTable(df)

Tool-facing EvidenceTable wrapper.

This class normalizes heterogeneous enrichment analysis outputs into a stable, auditable internal representation that preserves term×gene relationships.

Notes

The contract requires non-empty term_id, term_name, and evidence_genes.
Parsing/normalization of gene tokens is spec-owned by llm_pathway_curator._shared (e.g., parse_genes, clean_gene_token), to avoid contract drift.
Provenance and health summaries are recorded in df.attrs.

read_tsv `classmethod` ¶

read_tsv(path, *, strict=False, drop_invalid=True)

Read and normalize an evidence table to the contract (v1).

This is the main schema gate that: - aliases common column variants to contract names - cleans required fields - parses evidence_genes via _shared.parse_genes - normalizes numeric fields (stat, qval, pval) - validates the term×gene contract - optionally computes q-values from p-values (BH) when q-values are missing - records provenance and health metrics in df.attrs

Parameters:

path (str) –

Input evidence table path.
strict (bool, default: False ) –

If True, the first invalid row raises ValueError. If False, invalid rows are marked (and optionally dropped). Default is False.
drop_invalid (bool, default: True ) –

If True, drop rows with is_valid=False. Default is True. If False, keep invalid rows and rely on is_valid downstream.

Returns:

EvidenceTable –

Normalized evidence table wrapper.

Raises:

ValueError –

If core required columns are missing after aliasing. If strict=True and an invalid row is encountered.

Notes

Contract-required columns (core) - term_id - term_name - stat - evidence_genes

Output guarantees (post-normalization) - evidence_genes is a list-like object per row (and evidence_genes_str is TSV-safe) - direction is normalized (typically 'up', 'down', 'na') - df.attrs contains: contract_version, read_mode, aliasing, health

Examples:

>>> et = EvidenceTable.read_tsv("evidence_table.tsv")
>>> info = et.summarize()
>>> et.write_tsv("normalized_evidence_table.tsv")

summarize ¶

summarize()

Summarize the normalized EvidenceTable for logging and QA.

Returns:

dict[str, object] –

Summary dictionary including: - contract version - number of terms and sources - direction counts - evidence genes per term quantiles - q-value provenance counts - df.attrs['health'] and df.attrs['aliasing'] (if present)

write_tsv ¶

write_tsv(path)

Write the normalized EvidenceTable to a TSV file.

This writer: - applies a small Excel formula-injection defense for common text fields - serializes evidence_genes as a TSV-friendly string column - emits a stable column order for reproducibility

Parameters:

path (str) –

Output TSV path.

Notes

evidence_genes is written as a joined string under the column name evidence_genes (list form is dropped).
Normalized contract columns are emitted first; remaining columns are sorted.

Sample Card (study context contract)¶

The Sample Card is a structured record of study intent/context (e.g., condition/tissue/perturbation/comparison), used by proposal steps and context validity gates.

llm_pathway_curator.sample_card ¶

SampleCard ¶

Bases: BaseModel

SampleCard: tool-facing context and knob container.

Attributes:

condition, tissue, perturbation, comparison (str) –

Core context keys normalized into stable strings.
notes (str or None) –

Optional free-form notes for humans.
context_tokens_text (str or None) –

Optional free-form text used to derive deterministic context tokens.
context_tokens_policy (dict[str, Any]) –

Tokenization policy for deterministic context tokens.
context_tokens_meta (dict[str, Any]) –

Optional metadata for provenance logging.
k_claims_value (int) –

Top-level k_claims value (stored under JSON key "k_claims").
extra (dict[str, Any]) –

Tool knobs and future-compatible fields. Flattened + alias-canonicalized.

Notes

Contract: - Core context keys are normalized strings; NA is represented by NA_TOKEN. - The neutral disease-like key is "condition" (legacy keys accepted on input). - k_claims is top-level only; it is not stored inside extra. - extra keeps unknown keys for forward compatibility.

apply_patch ¶

apply_patch(patch)

Apply a patch dictionary and return a new SampleCard.

Parameters:

patch (dict[str, Any]) –

Patch values. Core keys are applied at top-level. Other keys are merged into extra.

Returns:

SampleCard –

New SampleCard instance with patch applied.

Notes

Contract enforcement: - Never keeps k_claims or its aliases inside extra. - Accepts legacy disease-like keys to fill "condition" when missing.

audit_min_gene_overlap ¶

audit_min_gene_overlap(default=1)

Get minimum gene overlap for evidence drift checks.

Parameters:

default (int, default: 1 ) –

Default value, by default 1.

Returns:

int –

Minimum overlap threshold.

audit_tau ¶

audit_tau(default=0.8)

Get audit stability tau.

Parameters:

default (float, default: 0.8 ) –

Default value, by default 0.8.

Returns:

float –

Tau value used by the audit layer.

claim_mode ¶

claim_mode(default='deterministic')

Get claim generation mode.

Parameters:

default (str, default: 'deterministic' ) –

Default mode, by default "deterministic".

Returns:

str –

One of {"deterministic", "llm"}.

context_dict ¶

context_dict()

Return core context keys as a dictionary.

Returns:

dict[str, str] –

Mapping from CORE_KEYS to their normalized values.

context_gate_mode ¶

context_gate_mode(default='hard')

Get context gate mode for audit integration.

Parameters:

default (str, default: 'hard' ) –

Default mode, by default "hard".

Returns:

str –

Gate mode normalized to {"off", "note", "hard"}.

context_key ¶

context_key()

Build a stable composite context key string.

Returns:

str –

"condition|tissue|perturbation|comparison" using normalized fields.

context_tokens ¶

context_tokens()

Compute deterministic context tokens used for anchoring.

Returns:

list[str] –

Deterministic token list.

Notes

Priority: 1) context_tokens_text is tokenized via ctx_tokens_v1. 2) Fallback: core context fields are concatenated and tokenized.

context_tokens_effective ¶

context_tokens_effective()

Build a provenance payload for logging (pure function).

Returns:

dict[str, Any] –

Dictionary containing: - version - tokens - n - signature - policy

context_tokens_signature ¶

context_tokens_signature()

Compute a stable short signature for current context tokens.

Returns:

str –

12-hex sha256-based signature.

context_tokens_version ¶

context_tokens_version()

Get effective context tokenization version.

Returns:

str –

Policy version string (currently "ctx_tokens_v1").

enable_context_score_proxy ¶

enable_context_score_proxy(default=False)

Get whether proxy context scoring is enabled.

Parameters:

default (bool, default: False ) –

Default behavior, by default False.

Returns:

bool –

True if proxy context scoring is enabled.

from_json `classmethod` ¶

from_json(path)

Load a SampleCard from a JSON file (tool contract).

Parameters:

path (str or Path) –

Path to a JSON file containing a SampleCard object.

Returns:

SampleCard –

Parsed and normalized SampleCard instance.

Raises:

FileNotFoundError –

If the file does not exist.
ValueError –

If JSON is invalid or not a dict-like object.

Notes

Backward compatibility: - Accepts legacy disease-like keys and hoists into "condition". - Allows k_claims stored in extra or via aliases, but hoists to top-level. - Removes k_claims and its aliases from extra on load.

hub_frac_thr ¶

hub_frac_thr(default=0.5)

Get hub fraction threshold for ABSTAIN_HUB_BRIDGE gating.

Parameters:

default (float, default: 0.5 ) –

Default threshold, by default 0.5.

Returns:

float –

Fraction clamped into [0, 1].

hub_term_degree ¶

hub_term_degree(default=200)

Get hub gene degree threshold for hub-bridge gating.

Parameters:

default (int, default: 200 ) –

Default threshold, by default 200.

Returns:

int –

Threshold (>= 1).

k_claims ¶

k_claims(default=3)

Get number of claims to generate.

Parameters:

default (int, default: 3 ) –

Default count, by default 3.

Returns:

int –

Number of claims (>= 1).

Notes

Top-level k_claims_value has priority. Extra is fallback only.

max_per_module ¶

max_per_module(default=1)

Get maximum claims per module (diversity control).

Parameters:

default (int, default: 1 ) –

Default value, by default 1.

Returns:

int –

Maximum per module (>= 1).

min_union_genes ¶

min_union_genes(default=3)

Get minimum union evidence genes required for support.

Parameters:

default (int, default: 3 ) –

Default minimum, by default 3.

Returns:

int –

Minimum union size (>= 1).

pass_notes ¶

pass_notes(default=True)

Decide whether to emit compact notes for PASS rows.

Parameters:

default (bool, default: True ) –

Default behavior, by default True.

Returns:

bool –

True if PASS rows may receive a short note (e.g., "ok").

preselect_tau_gate ¶

preselect_tau_gate(default=False)

Get whether preselection should apply a tau gate.

Parameters:

default (bool, default: False ) –

Default behavior, by default False.

Returns:

bool –

True if preselection tau gating is enabled.

stability_gate_mode ¶

stability_gate_mode(default='hard')

Get stability gate mode.

Parameters:

default (str, default: 'hard' ) –

Default mode, by default "hard".

Returns:

str –

Gate mode normalized to {"off", "note", "hard"}.

stress_gate_mode ¶

stress_gate_mode(default='off')

Get stress gate mode for audit integration.

Parameters:

default (str, default: 'off' ) –

Default mode, by default "off".

Returns:

str –

Gate mode normalized to {"off", "note", "hard"}.

strict_evidence_check ¶

strict_evidence_check(default=False)

Get strict evidence linkage policy.

Parameters:

default (bool, default: False ) –

Default behavior, by default False.

Returns:

bool –

If True, missing evidence linkage becomes schema violation in audit.

to_json ¶

to_json(path, *, indent=2)

Serialize this SampleCard to JSON.

Parameters:

path (str or Path) –

Output path.
indent (int, default: 2 ) –

JSON indentation level, by default 2.

Returns:

None –

Writes the file.

Notes

Uses model_dump(by_alias=True) so the JSON key is "k_claims".

trust_input_survival ¶

trust_input_survival(default=False)

Decide whether to trust survival values provided in inputs.

Parameters:

default (bool, default: False ) –

Default behavior, by default False.

Returns:

bool –

True if tool should trust input survival rather than recomputing.

Claim schema (typed JSON)¶

Claims are schema-bounded decision objects with resolvable evidence links (term/module identifiers + hashes). Free-text narratives are not treated as evidence.

llm_pathway_curator.claim_schema ¶

Typed, auditable claim schema for LLM-PathwayCurator.

This module defines strict Pydantic models for: - Evidence references (term IDs, optional gene IDs, module ID) - Typed claims (entity, direction, context keys) - Audit decisions (PASS/ABSTAIN/FAIL + reason codes)

Design: - Claim and evidence identifiers are tool-owned and deterministic. - Free-text evidence is disallowed; evidence must be referenced by IDs. - Optional context review fields are supported for audit gating.

Notes

Status vocabulary is intentionally strict to keep denominators auditable.
Gene ID casing is preserved for display; hashing follows tool-wide spec.

AuditedClaim ¶

Bases: BaseModel

Stable audited container.

Attributes:

claim (Claim) –

Typed claim object.
decision (Decision) –

Mechanical decision and reason codes.

Notes

This object is intended as the unit of record for JSONL reports.

Claim ¶

Bases: BaseModel

Typed claim with auditable evidence linkage.

Attributes:

claim_id (str) –

Tool-owned stable identifier. If empty, it is filled deterministically.
entity (str) –

Stable entity identifier (prefer IDs over free text).
direction ({'up', 'down', 'na'}) –

Canonical direction token.
context_keys (list of {"condition", "tissue", "perturbation", "comparison"}) –

Keys the claim is conditioned on. Values live in SampleCard.
evidence_ref (EvidenceRef) –

Evidence reference (IDs only; no free-text evidence).

Optional context review fields

context_evaluated : bool Whether context relevance review was executed. context_method : {"llm", "proxy", "none"} Method used for context review. context_status : {"PASS", "WARN", "FAIL"} or None Result of context review. context_reason : str or None Short reason (length-limited). context_notes : str or None Additional notes (length-limited).

Notes

Invariants enforced: - If context_evaluated is False: method="none" and status/reason/notes are cleared. - If context_evaluated is True: method must be "llm" or "proxy" and status must be provided.

Decision ¶

Bases: BaseModel

Mechanical audit decision for a claim.

Attributes:

status ({'PASS', 'ABSTAIN', 'FAIL'}) –

Final decision label.
reason (str) –

Reason code. Must be "ok" or one of ALL_REASONS.
details (dict) –

Optional structured metadata for debugging or reporting.

Raises:

ValueError –

If reason is not in the allowed vocabulary.

EvidenceRef ¶

Bases: BaseModel

Evidence reference container (strict, tool-friendly).

Attributes:

term_ids (list of str) –

Required. One or more term UID strings that define evidence.
gene_set_hash (str) –

Optional input. If missing/invalid, it is deterministically filled: - from gene_ids when available, else - from term_ids as a fallback.
gene_ids (list of str) –

Optional. Evidence genes for display and hashing (tool spec).
module_id (str) –

Optional. Module identifier for module-level evidence.

Notes

gene_set_hash must be a 12-hex digest (sha256[:12]).
Extra fields are allowed to support non-breaking provenance flags (e.g., gene_set_hash_source).
Term IDs are not uppercased.

Core stages (A → B → C)¶

A) Stability distillation (evidence hygiene)¶

Generates stability proxies from supporting-gene perturbations (e.g., LOO/jackknife-like survival scores). This stage does not decide PASS/ABSTAIN/FAIL.

llm_pathway_curator.distill ¶

distill_evidence ¶

distill_evidence(evidence, card, *, seed=None)

Distill evidence into stability/provenance features (A-stage; deterministic).

This function performs evidence hygiene and produces per-term stability proxies without re-running enrichment. Two modes are supported:

evidence_perturb (default): perturb evidence genes deterministically and compute term survival as the fraction of perturbations that preserve evidence similarity.
replicates_proxy: compute proxy survival from replicate-stacked evidence tables (requires replicate_id; not true patient-level re-run LOO enrichment).

Parameters:

evidence (DataFrame) –

Normalized EvidenceTable-like dataframe with required columns: term_id, term_name, source, stat, qval, direction, evidence_genes.
card (SampleCard) –

Sample card controlling distill knobs under extra (namespaced as distill_*).
seed (int or None, default: None ) –

Global seed for deterministic per-term perturbations.

Returns:

DataFrame –

Distilled table containing stable join keys (term_uid), TSV-friendly genes, survival fields, and knob provenance columns used by downstream modules/audit/report.

Raises:

ValueError –

If required columns are missing, stat is non-numeric, evidence_genes is empty, or replicates_proxy is requested but replicate requirements are not met.

Notes

This stage measures stability and records provenance; it does not decide PASS/ABSTAIN/FAIL.
Contract-critical: term×gene must be preserved post-masking (≥1 evidence gene per term).

B) Evidence modules (term–gene factorization)¶

Constructs the term–gene bipartite graph and extracts evidence modules that preserve shared vs distinct support. This stage does not decide PASS/ABSTAIN/FAIL.

llm_pathway_curator.modules ¶

ModuleOutputs `dataclass` ¶

ModuleOutputs(modules_df, term_modules_df, edges_df)

Container for module factorization outputs.

Attributes:

modules_df (DataFrame) –

Per-module summary table. One row per module_id. Contains stable hashes (terms/genes/content) and representative genes, plus optional survival fields if computed upstream.
term_modules_df (DataFrame) –

Term-to-module assignment table. Contract: one module_id per term_uid.
edges_df (DataFrame) –

Filtered term-by-gene edge table used for module construction. Columns: term_uid, gene_id, weight. Additional debug/provenance lives in edges_df.attrs.

attach_module_drift_stress_tag ¶

attach_module_drift_stress_tag(
    distilled_df,
    drift_df,
    *,
    term_id_col="term_uid",
    stress_col="stress_tag",
    tag="module_drift",
)

Annotate terms with a stress tag when module assignment drifted.

Parameters:

distilled_df (DataFrame) –

Distilled evidence table with term_id_col and an optional stress tag column.
drift_df (DataFrame) –

Drift table containing term_id_col and module_drift (bool).
term_id_col (str, default: 'term_uid' ) –

Term identifier column name (default "term_uid").
stress_col (str, default: 'stress_tag' ) –

Column name used to store stress tags (default "stress_tag").
tag (str, default: 'module_drift' ) –

Tag value to append when drift is detected (default "module_drift").

Returns:

DataFrame –

Copy of distilled_df with updated stress_col. Existing tags are preserved and the new tag is appended if missing.

Raises:

ValueError –

If required columns are missing.

Notes

Does not overwrite non-empty tags; it appends.
Tag splitting/joining is delegated to _shared.split_tags and _shared.join_tags.

attach_module_ids ¶

attach_module_ids(
    evidence_df,
    term_modules_df,
    *,
    term_id_col="term_uid",
    modules_df=None,
)

Attach module identifiers to an evidence table by term_uid.

Parameters:

evidence_df (DataFrame) –

Evidence table that includes term_id_col (typically "term_uid").
term_modules_df (DataFrame) –

Term-to-module table with columns term_id_col and module_id.
term_id_col (str, default: 'term_uid' ) –

Join key column name for term identifiers.
modules_df (DataFrame | None, default: None ) –

Optional per-module table. If provided, module-level survival fields are joined onto each term row.

Returns:

DataFrame –

Copy of evidence_df with: - module_id - module_id_missing (bool) and, optionally, module survival columns if modules_df was provided.

Raises:

ValueError –

If required columns are missing.

build_term_gene_edges ¶

build_term_gene_edges(
    evidence_df,
    *,
    term_id_col="term_uid",
    genes_col="evidence_genes",
)

Build term-by-gene bipartite edges from an evidence table.

Parameters:

evidence_df (DataFrame) –

Evidence table containing at least a term identifier column and a gene evidence column.
term_id_col (str, default: 'term_uid' ) –

Column name for the term identifier in evidence_df.
genes_col (str, default: 'evidence_genes' ) –

Column name for evidence genes in evidence_df. Values can be list-like (preferred) or legacy scalar strings.

Returns:

DataFrame –

Edge table with columns: - term_uid : str - gene_id : str - weight : float

The returned DataFrame also stores a small provenance dict under out.attrs["edges"].

Raises:

ValueError –

If required columns are missing.

Notes

Empty/invalid gene lists produce no edges and are dropped.
List-like gene inputs are processed via vectorized explode.
Scalar/string inputs are parsed via _shared.parse_genes.
Duplicate (term_uid, gene_id) edges are summed into a single row with weight equal to the multiplicity.

compute_term_module_drift ¶

compute_term_module_drift(
    baseline_term_modules_df,
    stressed_term_modules_df,
    *,
    term_id_col="term_uid",
)

Compute per-term drift of module assignment under stress.

Parameters:

baseline_term_modules_df (DataFrame) –

Baseline term-to-module assignments.
stressed_term_modules_df (DataFrame) –

Stressed term-to-module assignments.
term_id_col (str, default: 'term_uid' ) –

Term identifier column name (default "term_uid").

Returns:

DataFrame –

Drift table with columns: - term_uid - module_id_base - module_id_stress - module_drift (bool)

Raises:

ValueError –

If inputs do not have required columns or violate the one-term-one-module contract.

factorize_modules_connected_components ¶

factorize_modules_connected_components(
    evidence_df,
    *,
    method="term_jaccard_cc",
    module_prefix="M",
    max_gene_term_degree=None,
    max_term_degree=None,
    hub_degree_quantile=0.995,
    min_shared_genes=3,
    jaccard_min=0.1,
    term_id_col="term_uid",
    genes_col="evidence_genes",
    sparsity_mode="auto",
    shared_pos_target=0.1,
    sparse_relax_min_shared_genes=2,
    sparse_relax_jaccard_min=0.02,
    pair_sample_max=200000,
    seed=42,
)

Factorize enrichment evidence into stable "evidence modules".

This constructs a term-by-gene bipartite graph from an evidence table and groups related terms into modules. Module identity is stable: module_id is derived from a content hash of (terms, genes).

Parameters:

evidence_df (DataFrame) –

Evidence table containing term identifiers and evidence genes.
method (ModuleMethod, default: 'term_jaccard_cc' ) –

Module construction method. - "term_jaccard_cc": connected components on a term-term graph derived from shared genes (recommended). - "bipartite_cc": connected components on the bipartite graph (legacy).
module_prefix (str, default: 'M' ) –

Prefix prepended to the module_id (default "M").
max_gene_term_degree (int | None, default: None ) –

If set, removes genes whose term-degree is strictly greater than this threshold before module construction.
max_term_degree (int | None, default: None ) –

Deprecated alias for max_gene_term_degree.
hub_degree_quantile (float | None, default: 0.995 ) –

If not None and explicit thresholds are not given, infer the hub degree threshold from the specified quantile of gene term-degree.
min_shared_genes (int, default: 3 ) –

Minimum shared genes for term-term edges (term_jaccard_cc).
jaccard_min (float, default: 0.1 ) –

Minimum Jaccard similarity for term-term edges (term_jaccard_cc).
term_id_col (str, default: 'term_uid' ) –

Column name in evidence_df holding the term identifier. The pipeline convention is "term_uid".
genes_col (str, default: 'evidence_genes' ) –

Column name in evidence_df holding evidence genes.
sparsity_mode (Literal['auto', 'off'], default: 'auto' ) –

If "auto", relaxes thresholds for sparse graphs and may tighten thresholds to avoid giant-component collapse.
shared_pos_target (float, default: 0.1 ) –

Target lower bound for P(shared_genes > 0) under auto sparsity tuning.
sparse_relax_min_shared_genes (int, default: 2 ) –

Relaxed min_shared_genes used when sparsity is detected.
sparse_relax_jaccard_min (float, default: 0.02 ) –

Relaxed jaccard_min used when sparsity is detected.
pair_sample_max (int, default: 200000 ) –

Maximum number of term pairs sampled for sparsity diagnostics.
seed (int, default: 42 ) –

Random seed for sampling-based diagnostics.

Returns:

ModuleOutputs –

Object containing: - modules_df: per-module summary table - term_modules_df: term_uid -> module_id assignments (one per term) - edges_df: filtered edge table used to build modules

Raises:

ValueError –

If an unknown method is requested, required columns are missing, or the term->module contract is violated.

Notes

Hub filtering and sparsity/giant-component heuristics are recorded in edges_df.attrs["modules"] for reproducibility and debugging.
module_id is stable and derived from module content, not from component numbering.

filter_hub_genes ¶

filter_hub_genes(
    edges, *, max_gene_term_degree=200, max_term_degree=None
)

Remove hub genes that connect too many terms (high gene term-degree).

Parameters:

edges (DataFrame) –

Edge table with columns term_uid and gene_id.
max_gene_term_degree (int | None, default: 200 ) –

Hub threshold. Genes with term-degree strictly greater than this value are removed. If None, no hub filtering is applied.
max_term_degree (int | None, default: None ) –

Deprecated alias for max_gene_term_degree. If provided and max_gene_term_degree is None, it is used as the threshold.

Returns:

DataFrame –

Filtered edge table. Hub filter metadata is recorded in out.attrs["hub_filter"].

Raises:

ValueError –

If edges does not have the required columns.

Notes

The filter uses a strict condition: degree > threshold (not >=).

summarize_module_drift ¶

summarize_module_drift(drift_df)

Summarize module drift statistics.

Parameters:

drift_df (DataFrame) –

Output of compute_term_module_drift with required columns: term_uid, module_id_base, module_id_stress, module_drift.

Returns:

dict –

Summary metrics including: - n_terms_total, n_terms_drift, term_drift_rate - n_modules_base, n_modules_stress, n_modules_shared - module_churn_rate

C1) Proposal (deterministic baseline / LLM proposal-only)¶

Proposes typed, evidence-linked candidate claims from distilled evidence and modules. Final acceptance is not decided here.

llm_pathway_curator.select ¶

select_claims ¶

select_claims(
    distilled,
    card,
    *,
    k=50,
    mode=None,
    backend=None,
    claim_backend=None,
    review_backend=None,
    context_gate_mode="soft",
    context_review_mode="off",
    seed=None,
    outdir=None,
    **kwargs,
)

C1: Propose schema-locked pathway claims from distilled evidence.

Parameters:

distilled (DataFrame) –

Distilled evidence table (optionally with module_id and context fields).
card (SampleCard) –

Sample card providing context and selection knobs.
k (int, default: 50 ) –

Number of claims to propose.
mode (str or None, default: None ) –

"deterministic" or "llm". If None, resolved from env/card.
backend (BaseLLMBackend or None, default: None ) –

Backend used for LLM claim proposal when mode="llm".
claim_backend (BaseLLMBackend or None, default: None ) –

Reserved for role-based backends (currently not required here).
review_backend (BaseLLMBackend or None, default: None ) –

Backend used for LLM context review (shortlist-only).
context_gate_mode (str, default: 'soft' ) –

Public API legacy default is "soft". Canonical gate modes are off/note/hard; "soft" is ignored to preserve old behavior.
context_review_mode (str, default: 'off' ) –

"off" or "llm". When "llm", fills pipeline-owned context fields before ranking / proposal.
seed (int or None, default: None ) –

Seed for deterministic tie-breaks and optional stress probes.
outdir (str or None, default: None ) –

Output directory for small caches and artifacts.
**kwargs (Any, default: {} ) –

Forward-compatible extra arguments (ignored here).

Returns:

DataFrame –

Proposed claims table. Includes decision-grade claim_json that embeds EvidenceRef with gene_ids and gene_set_hash.

Notes

Selection-time context knobs (env): - LLMPATH_SELECT_CONTEXT_MODE = off|proxy|review - LLMPATH_SELECT_CONTEXT_GATE_MODE = off|note|hard

Pipeline-owned context review columns (if present) are never overwritten except when LLM review is requested and the existing method is not "llm".

llm_pathway_curator.llm_claims ¶

LLM-based claim proposal for LLM-PathwayCurator.

This module proposes structured Claim objects from distilled evidence using an LLM backend. It is designed to be: - contract-driven (stable IDs, deterministic evidence linking), - robust across heterogeneous backends (OpenAI/Gemini/Ollama/local), - audit-grade (persist prompt/candidates/raw/meta artifacts).

Key ideas

Evidence identity is tool-owned (term_uid + gene_set_hash).
Context VALUES are prompt-facing; context KEYS are contract-facing.
FAIL decisions are never "promoted" by thresholding; gating affects non-FAIL.

Notes

This file contains many private helpers. Public entrypoints: - propose_claims_llm - claims_to_proposed_tsv

LLMClaimResult `dataclass` ¶

LLMClaimResult(
    claims, raw_text, used_fallback, notes, meta
)

Container for LLM claim proposal results.

Attributes:

claims (list[Claim]) –

Validated and post-processed claims. Empty if failure/fallback.
raw_text (str) –

Raw JSON text persisted for audit/debug.
used_fallback (bool) –

True if LLM output was unusable or a soft-error occurred.
notes (str) –

Compact status note (e.g., "ok", "post_validate_failed: ...").
meta (dict[str, Any]) –

Metadata used for reproducibility (k, top_n, hashes, backend class, etc.).

build_claim_prompt ¶

build_claim_prompt(*, card, candidates, k)

Build a compact JSON-only prompt for proposing claims.

Parameters:

card (SampleCard) –

Sample card providing context values and stable context keys.
candidates (DataFrame) –

Candidate evidence rows (top_n pool) used as the ONLY selectable source. Expected columns include term_uid, term_id, term_name, direction, and optionally term_survival and gene_ids_suggest/evidence_genes.
k (int) –

Target number of claims to request from the model.

Returns:

str –

Prompt string instructing the model to return valid JSON only.

Notes

The prompt enforces copy-exact rules for: - entity == term_id - evidence_ref.term_ids == [term_uid] Context values are prompt-facing only; identity uses context KEYS.

claims_to_proposed_tsv ¶

claims_to_proposed_tsv(
    *, claims, distilled_with_modules, card
)

Convert proposed claims into a flat TSV-like DataFrame for export.

Parameters:

claims (list[Claim]) –

Proposed claims (typically from propose_claims_llm).
distilled_with_modules (DataFrame) –

Distilled evidence table used to enrich exported rows with term metadata.
card (SampleCard) –

Sample card providing context values (export columns).

Returns:

DataFrame –

Row-wise export with fields including: claim_id, entity, direction, context_keys, term_uid, module_id, gene_ids, term_ids, gene_set_hash, and serialized claim_json.

Notes

Context VALUES are exported as columns for convenience, but MUST NOT be baked into identity (claim_id / gene_set_hash).

propose_claims_llm ¶

propose_claims_llm(
    *,
    distilled_with_modules,
    card,
    backend,
    k,
    seed=None,
    outdir=None,
    artifact_tag=None,
)

Propose claims via an LLM and write audit-grade artifacts.

Parameters:

distilled_with_modules (DataFrame) –

Distilled evidence table with module information (or sufficient columns to derive term_uid). Must contain: - term_uid OR (source, term_id) - term_id, term_name, source Optional: - module_id, gene_set_hash - evidence_genes / evidence_genes_str / gene_ids_suggest - keep_term, term_survival, stat, context_score
card (SampleCard) –

Sample card providing prompt context and contract keys.
backend (BaseLLMBackend) –

LLM backend adapter.
k (int) –

Target number of claims.
seed (int or None, default: None ) –

Optional seed (best-effort; may be ignored).
outdir (str or None, default: None ) –

Output directory for artifacts. If None, no artifacts are written.
artifact_tag (str or None, default: None ) –

Optional tag to avoid overwriting per-call artifacts.

Returns:

LLMClaimResult –

Claims and metadata. On failure, claims may be empty and used_fallback True.

Raises:

ValueError –

If required columns are missing.
RuntimeError –

If LLM is required by contract and call/validation fails.

Notes

Artifacts (when outdir is set): - llm_claims.prompt.json - llm_claims.candidates.json - llm_claims.raw.json - llm_claims.meta.json Plus tagged variants when artifact_tag is provided.

C2) Mechanical audit (decider)¶

Assigns PASS/ABSTAIN/FAIL with precedence (FAIL > ABSTAIN > PASS) using predefined audit gates. Produces standardized reason codes and audit logs.

llm_pathway_curator.audit ¶

audit_claims ¶

audit_claims(claims, distilled, card, *, tau=None)

Mechanically audit claims against distilled evidence and sample context.

Parameters:

claims (DataFrame) –

Claims table. Must include claim_json with Claim schema JSON.
distilled (DataFrame) –

Distilled evidence table. Must provide term linkage via term_uid or (source, term_id). Evidence genes are read from evidence_genes or evidence_genes_str. Stability uses term_survival when available.
card (SampleCard) –

Sample card providing audit knobs and gate modes.
tau (float or None, default: None ) –

Override stability tau. If None, uses card.audit_tau().

Returns:

DataFrame –

Audited claims with status, reasons, and audit notes.

Raises:

ValueError –

If distilled cannot provide term linkage (missing required columns).

Notes

Status priority is: FAIL > ABSTAIN > PASS.

Major checks: - Linkage: term_id -> term_uid resolution; reject unknown/ambiguous terms. - Evidence identity: gene_set_hash match against computed union evidence genes. - Stability: term-level survival aggregation (min across referenced terms). - Under-support: minimum union evidence genes. - Hub-bridge: abstain when evidence is dominated by hub genes. - Context gate: uses claim schema context review, with optional proxy fallback. - Stress probes: optional internal dropout and contradiction probes and/or external stress columns; treated as ABSTAIN (inconclusive), not FAIL.

llm_pathway_curator.audit_reasons ¶

is_abstain_reason ¶

is_abstain_reason(code)

Check whether a reason code is an ABSTAIN reason.

Parameters:

code (str) –

Reason code string.

Returns:

bool –

True if code is in ABSTAIN_REASONS, otherwise False.

Notes

ABSTAIN_REASONS is part of the paper's reproducible output contract and should remain stable.

is_decision_reason ¶

is_decision_reason(code)

Check whether a string is a valid decision reason code.

This includes the sentinel "ok" as well as all known FAIL/ABSTAIN reason codes.

Parameters:

code (str) –

Decision reason code.

Returns:

bool –

True if code is "ok" or is included in ALL_REASONS, otherwise False.

is_fail_reason ¶

is_fail_reason(code)

Check whether a reason code is a FAIL reason.

Parameters:

code (str) –

Reason code string.

Returns:

bool –

True if code is in FAIL_REASONS, otherwise False.

Notes

FAIL_REASONS is part of the paper's reproducible output contract and should remain stable.

is_known_reason ¶

is_known_reason(code)

Check whether a reason code is known by this module.

Parameters:

code (str) –

Reason code string.

Returns:

bool –

True if code is in ALL_REASONS, otherwise False.

Notes

ALL_REASONS excludes "ok" by design. Use is_decision_reason() when you want to accept the "ok" sentinel.

C3) Reporting (decision-grade outputs)¶

Writes decision objects (report.jsonl / report.md) and renders audit logs with provenance.

llm_pathway_curator.report ¶

write_report ¶

write_report(audit_log, distilled, card, outdir)

Write a human-facing markdown report and TSV artifacts.

Outputs

out/report.md (human-facing summary)
out/audit_log.tsv (canonicalized audit log)
out/distilled.tsv (stringified distilled evidence table)
out/risk_coverage.tsv (optional; when calibration functions exist)

Parameters:

audit_log (DataFrame) –

Audit log DataFrame containing PASS/ABSTAIN/FAIL outcomes and supporting fields.
distilled (DataFrame) –

Distilled evidence table DataFrame.
card (SampleCard) –

SampleCard providing analysis context (condition/tissue/etc.).
outdir (str) –

Output directory path.

Returns:

None –

Notes

This function does NOT write report.jsonl. JSONL export is explicit via write_report_jsonl(...).
Gene symbol mapping in this report is DISPLAY-ONLY: it does not affect auditing or evidence identity.
The report remains best-effort and will fall back to a minimal report if required decision columns are missing.

write_report_jsonl ¶

write_report_jsonl(
    audit_log,
    card,
    outdir,
    *,
    run_id,
    method=None,
    tau=None,
    condition=None,
    comparison=None,
    cancer=None,
    disease=None,
)

Write an audit-grade JSONL report artifact (out/report.jsonl).

This export is designed to be robust and reproducible: - Accepts claim_json or common fallbacks as the payload source. - If typed Claim validation fails, emits a minimal stub instead of crashing. - Missing metric columns do not crash the export (nulls are emitted).

Parameters:

audit_log (DataFrame) –

Audit log DataFrame. Required columns: - status - claim JSON payload column (one of: claim_json, claim_json_str, claim_json_raw). If missing, the payload is synthesized from audit-log columns when possible.
card (SampleCard) –

SampleCard used to supply context defaults and optional metadata.
outdir (str) –

Output directory path.
run_id (str) –

Run identifier string. If empty, a UTC timestamp is used.
method (str | None, default: None ) –

Method label. Default is "llm-pathway-curator".
tau (float | None, default: None ) –

Tau value to store in the JSONL. If None, resolves from card.
condition (str | None, default: None ) –

Optional override for the condition label stored in JSONL.
comparison (str | None, default: None ) –

Optional override for the comparison label stored in JSONL.
cancer (str | None, default: None ) –

Backward-compatible alias for condition (discouraged for new use).
disease (str | None, default: None ) –

Backward-compatible alias for condition (discouraged for new use).

Returns:

Path –

Path to the written report.jsonl.

Raises:

ValueError –

If required columns are missing and the claim payload cannot be synthesized.

Notes

This function does not write report.md. Use write_report for the human-facing markdown report.
Developer-only metadata can be enabled via LLMPATH_REPORT_INCLUDE_DEV_META.

Backends (proposal-only LLM)¶

LLM backends are used only for proposal steps (representative selection + typing) when enabled. Backends should support deterministic settings where possible and persist prompt/raw/meta artifacts.

llm_pathway_curator.backends ¶

BaseLLMBackend ¶

Bases: ABC

Backend-agnostic LLM interface.

This class defines a minimal contract for generating text or JSON strings.

Contract

Input prompt : str

Output json_mode=False Returns a single string (free-form). Implementations may return a human-readable error string on failure. json_mode=True Must return either: (a) a valid JSON string parseable by json.loads, or (b) a standardized soft error JSON string: {"error": {"message": "...", "type": "...", "retryable": true/false}}

Notes

Convenience aliases are provided (invoke, call, complete, chat, and *_json helpers). Subclasses should implement generate.

call ¶

call(prompt, **kwargs)

Alias for invoke.

Parameters:

prompt (str) –

Input prompt string.
**kwargs (Any, default: {} ) –

Optional keyword arguments.

Returns:

str –

Model output string.

chat ¶

chat(messages, **kwargs)

Best-effort chat wrapper.

Parameters:

messages (Any) –

Chat-like messages. Typically a list of dicts or strings. If a list is provided, the last element's "content" field (if dict) is used as prompt.
**kwargs (Any, default: {} ) –

Optional keyword arguments passed to invoke.

Returns:

str –

Model output string.

Notes

This is intentionally lightweight and is not a full chat protocol implementation. It extracts a prompt and delegates to invoke.

chat_json ¶

chat_json(prompt, **kwargs)

Generate JSON output from a prompt (chat-style helper).

Parameters:

prompt (str) –

Input prompt string.
**kwargs (Any, default: {} ) –

Optional keyword arguments (ignored except for future compatibility).

Returns:

str –

JSON string or standardized soft error JSON string.

complete ¶

complete(prompt, **kwargs)

Alias for invoke.

Parameters:

prompt (str) –

Input prompt string.
**kwargs (Any, default: {} ) –

Optional keyword arguments.

Returns:

str –

Model output string.

complete_json ¶

complete_json(prompt, **kwargs)

Generate JSON output from a prompt (completion-style helper).

Parameters:

prompt (str) –

Input prompt string.
**kwargs (Any, default: {} ) –

Optional keyword arguments (ignored except for future compatibility).

Returns:

str –

JSON string or standardized soft error JSON string.

generate `abstractmethod` ¶

generate(prompt, json_mode=False)

Generate a completion for a given prompt.

Parameters:

prompt (str) –

Input prompt string.
json_mode (bool, default: False ) –

If True, the backend must return a JSON string (or a standardized soft error JSON). If False, free-form text is allowed.

Returns:

str –

Model output. See class-level contract for json_mode behavior.

Raises:

NotImplementedError –

If the backend does not implement this method.

generate_json ¶

generate_json(prompt, **kwargs)

Generate JSON output from a prompt (explicit helper).

Parameters:

prompt (str) –

Input prompt string.
**kwargs (Any, default: {} ) –

Optional keyword arguments (ignored except for future compatibility).

Returns:

str –

JSON string or standardized soft error JSON string.

invoke ¶

invoke(prompt, **kwargs)

Invoke the backend with a prompt (alias for generate).

Parameters:

prompt (str) –

Input prompt string.
**kwargs (Any, default: {} ) –

Optional keyword arguments. json_mode is recognized.

Returns:

str –

Model output string.

json ¶

json(prompt, **kwargs)

Alias for JSON generation helpers.

Parameters:

prompt (str) –

Input prompt string.
**kwargs (Any, default: {} ) –

Optional keyword arguments.

Returns:

str –

JSON string or standardized soft error JSON string.

GeminiBackend ¶

GeminiBackend(
    api_key,
    model_name="models/gemini-2.0-flash",
    temperature=0.0,
)

Bases: BaseLLMBackend

Google Gemini backend via google-generativeai.

Parameters:

api_key (str) –

Gemini API key.
model_name (str, default: 'models/gemini-2.0-flash' ) –

Gemini model identifier (e.g., "models/gemini-2.0-flash").
temperature (float, default: 0.0 ) –

Sampling temperature.

Notes

In json_mode, response is requested with MIME type "application/json" and validated. Non-JSON output is converted to standardized soft error JSON.

Initialize the Gemini backend.

Parameters:

api_key (str) –

Gemini API key.
model_name (str, default: 'models/gemini-2.0-flash' ) –

Gemini model identifier.
temperature (float, default: 0.0 ) –

Sampling temperature.

Raises:

ImportError –

If google-generativeai is not installed.

generate ¶

generate(prompt, json_mode=False)

Generate a completion using Gemini.

Parameters:

prompt (str) –

Input prompt string.
json_mode (bool, default: False ) –

If True, attempts to enforce JSON output and validates with json.loads.

Returns:

str –

Free-form text (json_mode=False), or a JSON string / standardized soft error JSON (json_mode=True).

LocalLLMBackend ¶

Bases: BaseLLMBackend

Local/offline backend stub.

This backend does not perform real generation. It exists to support offline workflows and testing paths.

Notes

In json_mode, returns a standardized soft error JSON payload.
In text mode, returns a human-readable placeholder string.

generate ¶

generate(prompt, json_mode=False)

Return a placeholder response (local/offline stub).

Parameters:

prompt (str) –

Input prompt string (ignored).
json_mode (bool, default: False ) –

If True, returns standardized soft error JSON.

Returns:

str –

Placeholder text or standardized soft error JSON.

OllamaBackend ¶

OllamaBackend(
    host=None,
    model_name=None,
    temperature=None,
    timeout=None,
)

Bases: BaseLLMBackend

Ollama backend using HTTP API (/api/generate) via urllib.

Parameters:

host (str | None, default: None ) –

Ollama server base URL (e.g., "http://ollama:11434").
model_name (str | None, default: None ) –

Ollama model name (e.g., "llama3.1:8b").
temperature (float | None, default: None ) –

Sampling temperature.
timeout (float | None, default: None ) –

Legacy single timeout (seconds) applied to both connect/read timeouts.

Notes

urllib accepts a single timeout value. This implementation stores both connect/read timeouts but uses read_timeout for urllib's timeout.
In json_mode, payload includes "format": "json" and output is validated. Non-JSON output is converted to standardized soft error JSON.

Initialize the Ollama backend.

Parameters:

host (str | None, default: None ) –

Base URL for Ollama server. If None, falls back to env defaults.
model_name (str | None, default: None ) –

Model name. If None, falls back to env defaults.
temperature (float | None, default: None ) –

Sampling temperature. If None, falls back to env default.
timeout (float | None, default: None ) –

Legacy single timeout applied to both connect/read.

Notes

Timeout resolution supports: - New envs: LPC_OLLAMA_CONNECT_TIMEOUT / LLMPATH_OLLAMA_CONNECT_TIMEOUT LPC_OLLAMA_READ_TIMEOUT / LLMPATH_OLLAMA_READ_TIMEOUT - Legacy env: LPC_OLLAMA_TIMEOUT / LLMPATH_OLLAMA_TIMEOUT

generate ¶

generate(prompt, json_mode=False)

Generate a completion using Ollama /api/generate.

Parameters:

prompt (str) –

Input prompt string.
json_mode (bool, default: False ) –

If True, requests JSON output and validates with json.loads.

Returns:

str –

Free-form text (json_mode=False), or a JSON string / standardized soft error JSON (json_mode=True).

Notes

Adaptive read-timeout escalation is applied on timeout errors: read_timeout *= factor up to a max, for a limited number of escalations.
connect_timeout is stored for metadata/documentation only and is not used by urllib (single-timeout limitation).

OpenAIBackend ¶

OpenAIBackend(
    api_key, model_name="gpt-4o", temperature=0.0, seed=42
)

Bases: BaseLLMBackend

OpenAI backend using the openai Python SDK (chat completions).

Parameters:

api_key (str) –

OpenAI API key.
model_name (str, default: 'gpt-4o' ) –

Model name (e.g., "gpt-4o").
temperature (float, default: 0.0 ) –

Sampling temperature.
seed (int, default: 42 ) –

Seed used when supported by the API/model. If seeding fails, a fallback call without seed is attempted.

Notes

In json_mode, response_format={"type": "json_object"} is used and output is validated. Non-JSON output is converted to standardized soft error JSON.

Initialize the OpenAI backend.

Parameters:

api_key (str) –

OpenAI API key.
model_name (str, default: 'gpt-4o' ) –

Model name.
temperature (float, default: 0.0 ) –

Sampling temperature.
seed (int, default: 42 ) –

Seed value for deterministic sampling when supported.

Raises:

ImportError –

If the openai package is not installed.

generate ¶

generate(prompt, json_mode=False)

Generate a completion using OpenAI chat completions.

Parameters:

prompt (str) –

Input prompt string.
json_mode (bool, default: False ) –

If True, requests JSON object output and validates with json.loads.

Returns:

str –

Free-form text (json_mode=False), or a JSON string / standardized soft error JSON (json_mode=True).

Notes

If the seeded call fails, a second call without seed is attempted.

get_backend_from_env ¶

get_backend_from_env(seed=None)

Create an LLM backend based on environment variables.

Parameters:

seed (int | None, default: None ) –

Optional seed for backends that support seeded generation.

Returns:

BaseLLMBackend –

Instantiated backend.

Raises:

KeyError –

If a required API key is missing for the selected backend.
ValueError –

If the backend name is unknown.

Notes

Backend selection envs (first non-empty wins): - LPC_BACKEND, BACKEND, LLMPATH_BACKEND

Supported backends: - "openai": uses OpenAI chat completions - "gemini": uses Google Generative AI - "ollama": uses Ollama HTTP API - "local" / "offline": stub backend (no real generation)

Compatibility: - Both "LLMPATH_" and "LPC_" prefixes are accepted for most settings. - For overlapping keys, LPC_ is preferred over vendor env, then LLMPATH_.

retry_with_backoff ¶

retry_with_backoff(retries=3, backoff_in_seconds=1.0)

Decorator factory for exponential backoff retries on backend calls.

Parameters:

retries (int, default: 3 ) –

Maximum number of retry attempts (not counting the initial call).
backoff_in_seconds (float, default: 1.0 ) –

Base backoff duration in seconds. Sleep time grows as: backoff_in_seconds * 2**attempt, with small jitter.

Returns:

callable –

A decorator that wraps a function and retries under certain conditions.

Retry conditions

Retryable exceptions inferred by message heuristics (status/keywords).
Legacy plain-text soft errors: "OpenAI Error: ...", "Gemini Error: ...", "Ollama Error: ..."
Standardized soft error JSON payloads: {"error": {"message": "...", "type": "...", "retryable": ...}}
When json_mode=True: invalid JSON outputs are treated as parse failures and retried at most once.

Notes

json_mode is inferred from kwargs (json_mode=) or from positional ABI: (self, prompt, json_mode=False) when present.

Adapters (Input → EvidenceTable)¶

Adapters normalize upstream enrichment outputs into the EvidenceTable contract. They are intentionally conservative: preserve evidence identity (term × genes), avoid destructive parsing, and keep TSV round-trips stable.

llm_pathway_curator.adapters.fgsea ¶

FgseaAdapterConfig `dataclass` ¶

FgseaAdapterConfig(
    source_name="fgsea",
    require_genes=True,
    keep_pval=True,
    term_id_mode="raw",
    drop_na_qval=True,
    sort_output=True,
)

Configuration for converting an fgsea result table to EvidenceTable.

Attributes:

source_name (str) –

Value to populate the EvidenceTable source column.
require_genes (bool) –

If True, raise an error when leadingEdge yields no genes.
keep_pval (bool) –

If True and pval exists, store it separately (does not replace qval).
term_id_mode (str) –
Term identifier policy.
- "raw": term_id == pathway (recommended; paper-aligned)
- "prefixed_hashed": term_id == "FGSEA:<slug>|<hash>" (legacy)
drop_na_qval (bool) –

If True, drop rows where qval (padj) is missing.
sort_output (bool) –

If True, sort output deterministically by qval asc then abs(stat) desc.

Notes

Defaults are chosen to match the paper-side EvidenceTable behavior: human-readable term IDs, stable ordering, and dropping NA q-values.

read_fgsea_table ¶

read_fgsea_table(path)

Read an fgsea result table from disk.

Supports TSV by default and falls back to delimiter sniffing or whitespace parsing (best-effort).

Parameters:

path (str) –

Path to an fgsea result file.

Returns:

DataFrame –

Parsed fgsea table.

fgsea_to_evidence_table ¶

fgsea_to_evidence_table(fgsea_df, *, config=None)

Convert an fgsea result table to the EvidenceTable contract.

Parameters:

fgsea_df (DataFrame) –

fgsea results table. Must contain (after aliasing) pathway and leadingEdge plus at least one statistic column among NES/ES.
config (FgseaAdapterConfig or None, default: None ) –

Conversion configuration. If None, defaults are used.

Returns:

DataFrame –
EvidenceTable with core columns:
- term_id : str
- term_name : str
- source : str
- stat : float
- qval : float or NA (from padj only)
- direction : {"up", "down", "na"}
- evidence_genes : list[str]
Plus minimal provenance fields (e.g., pval, term_id_h).

Raises:

ValueError –

If required columns are missing, if no stat column is present, if pathway is empty, if the stat column is non-numeric, or if require_genes=True and evidence genes are empty.

Notes

Only padj is treated as q-value (FDR) and mapped to qval. pval is stored separately when present and enabled.
Output ordering can be stabilized via sort_output.

convert_fgsea_table_to_evidence_tsv ¶

convert_fgsea_table_to_evidence_tsv(
    in_path, out_path, *, config=None
)

Read an fgsea table, convert it, and write an EvidenceTable TSV.

This is a convenience wrapper around: read_fgsea_table -> fgsea_to_evidence_table -> TSV write.

Parameters:

in_path (str) –

Path to the fgsea result file.
out_path (str) –

Destination path for the EvidenceTable TSV.
config (FgseaAdapterConfig or None, default: None ) –

Conversion configuration. If None, defaults are used.

Returns:

DataFrame –

EvidenceTable as written, with evidence_genes serialized for TSV.

Raises:

ValueError –

Propagated from fgsea_to_evidence_table on invalid inputs.

llm_pathway_curator.adapters.metascape ¶

MetascapeAdapterConfig `dataclass` ¶

MetascapeAdapterConfig(
    source_name="metascape",
    sheet_name="Enrichment",
    include_summary=False,
    prefer_symbols=True,
    strict_qval=False,
    drop_na_qval=True,
)

Configuration for converting Metascape exports to an EvidenceTable.

Attributes:

source_name (str) –

Value to populate the EvidenceTable source column.
sheet_name (str) –

Excel sheet to read when the input is .xlsx/.xls.
include_summary (bool) –

Whether to include rows whose GroupID ends with "_Summary". The default is False to avoid summary rows being treated as evidence.
prefer_symbols (bool) –

Prefer the Symbols column over Genes when both exist.
strict_qval (bool) –

If True, raise an error when Log(q-value) is present but no valid q-values can be reconstructed.
drop_na_qval (bool) –

If True, drop rows whose reconstructed q-value is missing.

read_metascape_table ¶

read_metascape_table(path, *, sheet_name='Enrichment')

Read a Metascape export file into a DataFrame.

Supports Excel exports (.xlsx/.xls) and delimited text inputs. For Excel, the Enrichment sheet is the canonical input.

Parameters:

path (str) –

Path to a Metascape export file.
sheet_name (str, default: 'Enrichment' ) –

Sheet to read for Excel inputs. Default is "Enrichment".

Returns:

DataFrame –

Parsed Metascape table.

metascape_to_evidence_table ¶

metascape_to_evidence_table(metascape_df, *, config=None)

Convert a Metascape Enrichment table to the EvidenceTable contract.

The resulting EvidenceTable is term-centric (one row per term) and carries evidence genes suitable for downstream factorization.

Parameters:

metascape_df (DataFrame) –

Metascape "Enrichment" sheet as a DataFrame.
config (MetascapeAdapterConfig or None, default: None ) –

Conversion configuration. If None, defaults are used.

Returns:

DataFrame –
EvidenceTable with (at minimum) these columns:
- term_id : str
- term_name : str
- source : str
- stat : float
- qval : float
- direction : str (Metascape ORA yields "na")
- evidence_genes : list[str]
Plus provenance/optional columns (e.g., group_id, is_summary).

Raises:

ValueError –

If required columns are missing, if evidence genes are empty for any row, if Term/Description are empty, or if statistic columns are non-numeric.

Notes

q-values are reconstructed from Log(q-value) using sign inference.
stat is made monotone-positive by taking abs(...) of the chosen log column, for ranking and paper-friendly plotting.

convert_metascape_table_to_evidence_tsv ¶

convert_metascape_table_to_evidence_tsv(
    in_path, out_path, *, config=None
)

Read a Metascape export, convert it, and write an EvidenceTable TSV.

This is a convenience wrapper around: read_metascape_table -> metascape_to_evidence_table -> TSV write.

Parameters:

in_path (str) –

Path to the Metascape export (Excel or text).
out_path (str) –

Destination path for the EvidenceTable TSV.
config (MetascapeAdapterConfig or None, default: None ) –

Conversion configuration. If None, defaults are used.

Returns:

DataFrame –

EvidenceTable as written, with evidence_genes serialized for TSV.

Raises:

ValueError –

Propagated from metascape_to_evidence_table when inputs are invalid or evidence cannot be constructed.

Calibration (risk–coverage)¶

Utilities for selecting an operating point (e.g., τ) along the risk–coverage trade-off. This stage does not change evidence identity; it tunes conservativeness.

llm_pathway_curator.calibrate ¶

CalibrationResult `dataclass` ¶

CalibrationResult(method, params)

Calibration result object.

Attributes:

method ({'none', 'temperature', 'isotonic'}) –

Calibration method identifier.
params (dict[str, Any]) –

Method parameters: - temperature: {"T": float} - isotonic: {"model": fitted_model} - none: {}

Notes

This object is serializable only when params are JSON-safe. (isotonic model objects are not JSON-serializable by default.)

apply ¶

apply(probs)

Apply the calibration mapping to probability-like scores.

Parameters:

probs (ndarray) –

Probability array.

Returns:

ndarray –

Calibrated probabilities clipped to (0, 1).

Raises:

ValueError –

If method is unknown or required params are missing.

apply_isotonic ¶

apply_isotonic(model, probs)

Apply a fitted isotonic regression model to probabilities.

Parameters:

model (Any) –

Fitted isotonic regression model with predict.
probs (ndarray) –

Probability array.

Returns:

ndarray –

Calibrated probabilities (float array).

apply_temperature_scaling ¶

apply_temperature_scaling(probs, T)

Apply temperature scaling to probability-like scores in [0, 1].

Parameters:

probs (ndarray) –

1D probability-like array.
T (float) –

Temperature parameter (must be finite and > 0).

Returns:

ndarray –

Calibrated probabilities clipped to (0, 1).

Raises:

ValueError –

If T is invalid.

calibrate_probs ¶

calibrate_probs(
    probs,
    y_true,
    *,
    method="temperature",
    allow_unlabeled=False,
)

Stage-2 calibration entry point.

Parameters:

probs (ndarray) –

1D probability-like array in [0, 1].
y_true (ndarray or None) –

Optional binary labels in {0, 1}.
method (('none', 'temperature', 'isotonic'), default: "none" ) –

Calibration method. Default is "temperature".
allow_unlabeled (bool, default: False ) –

If True and y_true is None, returns a no-op calibration ("none"). If False and y_true is None, refuses to fit.

Returns:

CalibrationResult –

Calibration mapping object.

Raises:

ValueError –

If inputs are invalid or fitting is requested without labels.

Notes

Design intent: - Keep dependencies optional (no scipy). - Temperature scaling uses deterministic grid search.

compute_counts ¶

compute_counts(status)

Count PASS/FAIL/ABSTAIN/TOTAL from a status series (strict validation).

Parameters:

status (Series) –

Status values. Must normalize into {"PASS", "ABSTAIN", "FAIL"}.

Returns:

dict[str, int] –

Counts with keys: {"PASS", "FAIL", "ABSTAIN", "TOTAL"}.

Raises:

ValueError –

If unknown status values are present (strict spec validation).

extract_probs_and_labels ¶

extract_probs_and_labels(
    audit_log, *, prob_col, label_col=None
)

Extract probability-like scores and optional strict binary labels.

Parameters:

audit_log (DataFrame) –

Audit log table.
prob_col (str) –

Column name containing probabilities/scores.
label_col (str or None, default: None ) –

Column name containing labels. Only exact {0,1} accepted.

Returns:

tuple[ndarray, ndarray or None] –

(probs, labels). Labels are returned as int array when provided.

Raises:

ValueError –

If columns are missing or values are non-numeric/non-finite, or labels are not exactly binary {0,1}.

fit_isotonic_regression ¶

fit_isotonic_regression(probs, y_true)

Fit isotonic regression mapping probs -> calibrated probs.

Parameters:

probs (ndarray) –

1D probability-like array in [0, 1].
y_true (ndarray) –

1D binary labels in {0, 1}.

Returns:

Any –

Fitted isotonic regression model (scikit-learn object).

Raises:

ImportError –

If scikit-learn is not available.
ValueError –

If inputs are invalid.

fit_temperature_scaling ¶

fit_temperature_scaling(
    probs, y_true, *, grid=(0.25, 10.0, 80)
)

Fit a single temperature T > 0 by minimizing NLL (binary labels).

Model

p' = sigmoid(logit(p) / T)

Parameters:

probs (ndarray) –

1D probability-like array in [0, 1].
y_true (ndarray) –

1D binary labels in {0, 1}.
grid (tuple[float, float, int], default: (0.25, 10.0, 80) ) –

(t_min, t_max, n_grid). Search is performed in log-space.

Returns:

float –

Best temperature T, clipped to a conservative range [0.25, 10.0].

Raises:

ValueError –

If inputs are invalid or the grid is invalid.

Notes

No scipy dependency: uses deterministic grid search.

risk_coverage_curve ¶

risk_coverage_curve(
    df,
    *,
    score_col,
    status_col="status",
    decision_thresholds=None,
    pass_if_score_ge=True,
    promote_abstain=True,
    fail_on_degenerate=False,
    max_thresholds=200,
)

Build a Risk–Coverage curve by sweeping a PASS threshold.

Parameters:

df (DataFrame) –

Input table containing score and status columns.
score_col (str) –

Column name of probability-like or score values.
status_col (str, default: 'status' ) –

Column name of base status. Default is "status".
decision_thresholds (list of float or None, default: None ) –

Thresholds to sweep. If None, thresholds are derived from scores.
pass_if_score_ge (bool, default: True ) –

If True, PASS when score >= threshold; else PASS when score <= threshold.
promote_abstain (bool, default: True ) –

If True, among non-FAIL items reassign: PASS if threshold satisfied else ABSTAIN. If False, gate only existing PASS -> ABSTAIN below threshold.
fail_on_degenerate (bool, default: False ) –

If True, raise on degenerate score distributions (<=1 unique value).
max_thresholds (int, default: 200 ) –

Max thresholds when auto-deriving. Must be >= 10.

Returns:

DataFrame –

One row per threshold with risk/coverage metrics and metadata fields: threshold, score_col, status_col, pass_if_score_ge, promote_abstain.

Raises:

ValueError –

If required columns are missing, scores are invalid, statuses are invalid, or thresholds are empty/invalid.

Notes

Safety semantics: - FAIL is never changed. - ABSTAIN never enters the risk denominator.

risk_coverage_from_status ¶

risk_coverage_from_status(status)

Compute spec-safe Risk/Coverage metrics from a status series.

Parameters:

status (Series) –

Status values in {"PASS", "ABSTAIN", "FAIL"}.

Returns:

dict[str, float] –
Metrics with explicit denominators:
- coverage_pass_total PASS / TOTAL
- coverage_decided_total (PASS + FAIL) / TOTAL
- risk_fail_given_decided FAIL / (PASS + FAIL)
- risk_fail_total FAIL / TOTAL
- fail_rate_total Alias of FAIL / TOTAL (kept for backward compatibility)
Also includes count fields as floats: n_pass, n_fail, n_abstain, n_decided, n_total

Notes

"decided" = PASS ∪ FAIL (ABSTAIN excluded). FAIL is a negative decision produced by mechanical audits.

Shared utilities (spec-level)¶

Spec-critical helpers for contract stability (NA handling, gene parsing/joining, stable hashes). If you need to compare outputs across versions, this is the layer that prevents drift.

llm_pathway_curator._shared ¶

canonical_sorted_unique ¶

canonical_sorted_unique(xs)

Canonicalize a list of values into sorted unique strings.

Parameters:

xs (list of object) –

Input values.

Returns:

list of str –

Sorted unique tokens after trimming and NA filtering.

clean_gene_token ¶

clean_gene_token(g)

Clean a single gene-like token conservatively.

Parameters:

g (object) –

Gene-like token.

Returns:

str –

Cleaned token.

Notes

Trims whitespace and strips simple quote wrappers.
Removes common list/export wrappers (brackets, trailing separators).
Does NOT force uppercase (species/ID-system dependent).

dedup_preserve_order ¶

dedup_preserve_order(items)

De-duplicate strings while preserving first occurrence order.

Parameters:

items (list of str) –

Input tokens.

Returns:

list of str –

Deduplicated tokens in first-seen order.

Notes

Empty strings are ignored.

excel_force_text ¶

excel_force_text(s)

Prefix a value with a single quote to force Excel to treat it as text.

Parameters:

s (object) –

Input value.

Returns:

str –

Excel-safe text representation. Empty input returns "".

excel_safe_ids ¶

excel_safe_ids(x, *, list_sep=ID_JOIN_DELIM)

Convert an ID field into an Excel-safe, TSV-friendly text string.

This helper accepts either scalar or list-like inputs, parses them via parse_id_list(), joins the IDs with list_sep, and prefixes a single quote to force Excel "Text" interpretation.

Parameters:

x (object) –

Scalar or list-like ID field.
list_sep (str, default: ID_JOIN_DELIM ) –

Join delimiter for the ID list. Default is ID_JOIN_DELIM.

Returns:

str –

Excel-safe text value. Returns "" if the input is NA-like or empty.

hash_gene_set_12hex ¶

hash_gene_set_12hex(genes)

Compute a set-stable gene-set fingerprint (12-hex), preserving case.

Parameters:

genes (list of object) –

Gene tokens.

Returns:

str –

12-character lowercase hex fingerprint.

Notes

Policy: - order-invariant (set-stable) - clean_gene_token() per token - no forced uppercasing (species/ID dependent)

hash_gene_set_12hex_upper ¶

hash_gene_set_12hex_upper(genes)

Compute a legacy-compatible gene-set fingerprint (12-hex), uppercasing IDs.

Parameters:

genes (list of object) –

Gene tokens.

Returns:

str –

12-character lowercase hex fingerprint.

Notes

Use only when you must match older outputs that case-folded gene IDs.

hash_set_12hex ¶

hash_set_12hex(items)

Compute a generic set-stable fingerprint (12-hex) from a list of items.

Parameters:

items (list of object) –

Input items.

Returns:

str –

12-character lowercase hex fingerprint.

Notes

Trims tokens, drops NA-like values, de-duplicates, sorts, then hashes.

is_na_scalar ¶

is_na_scalar(x)

Determine whether a value should be treated as NA as a scalar.

This function avoids calling pandas.isna on list-like containers because it can return array-like results and break boolean contexts.

Parameters:

x (object) –

Input value.

Returns:

bool –

True if x is a scalar NA value (or None). Containers return False.

Notes

Strings like "na"/"nan" are not treated as scalar NA here; use is_na_token() for token-level NA checks.

is_na_token ¶

is_na_token(s)

Check whether a value represents an NA token (case-insensitive).

This is a spec-level helper used across parsing and TSV round-trips. The NA vocabulary is centralized to prevent contract drift.

Parameters:

s (object) –

Input value.

Returns:

bool –

True if s is None or its trimmed lowercase string form is in the NA token set.

Notes

This function treats empty strings as NA.

join_genes_tsv ¶

join_genes_tsv(genes)

Join gene tokens into a TSV-friendly string.

Parameters:

genes (list of object) –

Gene tokens.

Returns:

str –

Genes joined by GENE_JOIN_DELIM.

Notes

Applies clean_gene_token() and drops empty/NA tokens. Does not sort; preserves input order.

join_id_list_tsv ¶

join_id_list_tsv(ids, *, delim=ID_JOIN_DELIM)

Join generic identifiers into a TSV-friendly string.

The join is stable and order-preserving. This function is intentionally not gene-aware to avoid over-normalization at the spec boundary.

Parameters:

ids (list of object) –

Identifiers to join. None/empty/NA-like tokens are dropped.
delim (str, default: ID_JOIN_DELIM ) –

Delimiter for joining. Default is ID_JOIN_DELIM.

Returns:

str –

Joined identifier string.

Notes

Preserves input order (no sorting).
Does not apply clean_gene_token().

join_tags ¶

join_tags(tags, *, delim=STRESS_TAG_DELIM)

Join tags into a canonical stress tag string.

Parameters:

tags (list of object) –

Tag tokens.
delim (str, default: STRESS_TAG_DELIM ) –

Join delimiter. Default is STRESS_TAG_DELIM (comma).

Returns:

str –

Canonical tag string.

Notes

Trims whitespace, drops empties, and de-duplicates in first-seen order.

looks_like_12hex ¶

looks_like_12hex(x)

Check whether a value is exactly 12 lowercase hex characters.

Parameters:

x (object) –

Input value.

Returns:

bool –

True if x matches the 12-hex pattern (lowercase).

make_term_uid ¶

make_term_uid(source, term_id)

Construct a stable term_uid from (source, term_id).

Parameters:

source (object) –

Term source (e.g., "fgsea", "metascape"). Empty maps to "unknown".
term_id (object) –

Term identifier. Caller should ensure it is non-empty.

Returns:

str –

Term UID formatted as ":".

module_hash_content12 ¶

module_hash_content12(terms, genes)

Compute a module content hash binding both term set and gene set (12-hex).

Parameters:

terms (list of object) –

Term identifiers.
genes (list of object) –

Gene tokens.

Returns:

str –

12-character lowercase hex fingerprint.

Notes

Terms: canonical_sorted_unique() (no uppercasing)
Genes: clean_gene_token() + drop NA/empty + sort/dedup (no uppercasing)
Payload format is stable and explicit to prevent ambiguity.

norm_gene_id_upper ¶

norm_gene_id_upper(g)

Normalize a gene token by applying conservative cleaning and uppercasing.

Parameters:

g (object) –

Gene token.

Returns:

str –

Cleaned and uppercased token.

Notes

This is opt-in for legacy compatibility. The default spec policy in this module is to preserve case.

normalize_direction ¶

normalize_direction(x)

Normalize direction vocabulary across schema/distill/audit/select.

Parameters:

x (object) –

Input scalar.

Returns:

str –

One of {"up", "down", "na"}.

Notes

This is a lightweight normalizer. Unrecognized values map to "na".

normalize_gate_mode ¶

normalize_gate_mode(x, *, default='note')

Normalize a gate mode to canonical vocabulary: {"off", "note", "hard"}.

Parameters:

x (object) –

Input value (canonical, synonym, or legacy form).
default (str, default: 'note' ) –

Default to use when x is empty. If invalid, falls back to "note".

Returns:

str –

Canonical gate mode: "off", "note", or "hard".

Notes

Accepted synonyms include: - off: off, none, disable, disabled - note: note, warn, warning, soft - hard: hard, strict, abstain, on, enable, enabled

normalize_status_series ¶

normalize_status_series(s)

Normalize a pandas Series of statuses to uppercase strings.

Parameters:

s (Series) –

Input series.

Returns:

Series –

Series with string dtype, trimmed and uppercased.

Notes

NA values may become strings (e.g., "nan") after astype(str). Always validate with validate_status_values() when needed.

normalize_status_str ¶

normalize_status_str(x)

Normalize a status value into canonical uppercase text.

Parameters:

x (object) –

Input scalar.

Returns:

str –

Uppercased, trimmed string.

Notes

This function does not validate membership in ALLOWED_STATUSES. Use validate_status_values() for strict checking.

parse_genes ¶

parse_genes(x)

Parse evidence genes from messy inputs into a list of cleaned tokens.

Parameters:

x (object) –

Scalar or list-like gene field.

Returns:

list of str –

Cleaned gene tokens, deduplicated in first-seen order.

Notes

Rules: - NA scalars -> [] - list/tuple -> cleaned per-token - set -> sorted for determinism, then cleaned - string -> split conservatively via split_gene_string()

parse_id_list ¶

parse_id_list(x)

Parse a generic ID field into a list of strings.

This is a tolerant parser for ID-like fields (term IDs, module IDs, gene IDs when treated as IDs, etc.). It is intentionally separate from parse_genes(), which is more gene-token-aware.

Parameters:

x (object) –

Scalar or list-like input.

Returns:

list of str –

Parsed IDs in deterministic order.

Notes

Policy: - NA scalars -> [] - list/tuple -> preserve order (dedup) - set -> sorted for determinism (dedup) - string -> split on strong delimiters first: ',', ';', '|' - whitespace split only if all tokens look identifier-like - drop NA tokens and empties

seed_for_term ¶

seed_for_term(seed, term_uid, term_row_id=None)

Create a deterministic per-term integer seed.

The seed is derived from (seed, term_uid, term_row_id) using a stable hash to keep RNG streams reproducible across platforms.

Parameters:

seed (int or None) –

Optional base seed. None maps to 0.
term_uid (str) –

Stable term identifier (e.g., ":").
term_row_id (int or None, default: None ) –

Optional row identifier to avoid collisions for duplicate term_uids.

Returns:

int –

Deterministic unsigned integer seed.

Raises:

ValueError –

If term_row_id cannot be converted to int (when provided).

seed_int_from_payload ¶

seed_int_from_payload(payload, *, mod=2 ** 31 - 1)

Derive a deterministic integer seed from an arbitrary payload.

Parameters:

payload (object) –

Any JSON-serializable payload.
mod (int, default: 2 ** 31 - 1 ) –

Modulus for the resulting seed. Default is 2**31 - 1.

Returns:

int –

Deterministic integer seed in [0, mod).

Notes

Uses sha256_short(..., n=12) to keep stability aligned with other IDs.

sha256_12hex ¶

sha256_12hex(payload)

Compute a deterministic short SHA-256 hash (first 12 hex chars).

Parameters:

payload (str) –

Stable string payload.

Returns:

str –

12-character lowercase hex digest.

sha256_short ¶

sha256_short(obj, n=12)

Compute a deterministic SHA-256 short hash from an arbitrary payload.

Parameters:

obj (object) –

Payload to hash. It is serialized via stable_json_dumps().
n (int, default: 12 ) –

Number of hex characters to return. Default is 12.

Returns:

str –

Lowercase hex digest prefix.

Raises:

ValueError –

If n is not positive.

Notes

For n == 12, this matches the legacy behavior (sha256_12hex).
SHA-256 hex digests have length 64; if n > 64, the output length is effectively capped at 64 by Python slicing.

split_gene_string ¶

split_gene_string(s)

Split a gene string into candidate tokens using conservative rules.

Parameters:

s (str) –

Input gene string.

Returns:

list of str –

Token candidates (not yet fully cleaned).

Notes

Supported formats: - Comma/semicolon/pipe separated: "A,B", "A;B", "A|B" - Bracketed lists: "['A','B']", '["A","B"]', "{A,B}" - Slash-separated as a last resort: "A/B/C" - Whitespace-separated only if all tokens look gene-like

split_tags ¶

split_tags(s, *, delim=STRESS_TAG_DELIM)

Split a stress tag string into normalized tags.

Parameters:

s (object) –

Input scalar tag string.
delim (str, default: STRESS_TAG_DELIM ) –

Canonical delimiter. Default is STRESS_TAG_DELIM (comma).

Returns:

list of str –

Tags in first-seen order.

Notes

Canonical delimiter is comma.
Legacy '+' is tolerated as an additional delimiter.

stable_json_dumps ¶

stable_json_dumps(obj)

Serialize an object to deterministic JSON for hashing/provenance.

Parameters:

obj (object) –

JSON-serializable object.

Returns:

str –

Deterministic JSON string.

Notes

Uses: - sort_keys=True - separators=(",", ":") - ensure_ascii=False

strip_excel_text_prefix ¶

strip_excel_text_prefix(s)

Strip the Excel "force text" prefix from a value.

Excel-safe exports sometimes prefix values with a single quote ('). This helper removes one leading quote to support downstream parsing.

Parameters:

s (object) –

Input value.

Returns:

str –

Cleaned string without a single leading quote.

validate_status_values ¶

validate_status_values(s_norm)

Strict validation: refuse unknown status values (auditable denominators).

Noise modules (gene noise dictionaries)¶

Curated gene-noise patterns used by masking/evidence hygiene steps.

llm_pathway_curator.noise_lists ¶

Noise module definitions (shared asset; conservative by default).

Rationale (paper-facing)

Marker rankings and enrichment evidence often contain ubiquitous programs (e.g., clonotypes, uninformative locus IDs) that can dominate prompts and confuse LLM interpretation. This module centralizes symbol-centric noise definitions that can be applied in prompt-facing layers while preserving evidence identity in PathwayCurator.

Policy (PathwayCurator)

LLM-PathwayCurator evaluates enrichment interpretations as audited decisions. Therefore, we do not pre-emptively remove broad biological programs (cell cycle, interferon, ribosome/mitochondria, HLA, Ig constants) from evidence by default, because they can be true biology and removing them can inflate ABSTAIN via missing/unstable evidence.

Reproducibility

Edit conservatively: changes may affect benchmark comparability. This file is dependency-free and safe to import.

```

API reference¶

CLI¶

Pipeline¶

llm_pathway_curator.pipeline ¶

RunConfig dataclass ¶

run_pipeline ¶

Contracts¶

EvidenceTable (TSV contract)¶

llm_pathway_curator.schema ¶

EvidenceTable dataclass ¶

read_tsv classmethod ¶

summarize ¶

write_tsv ¶

Sample Card (study context contract)¶

llm_pathway_curator.sample_card ¶

SampleCard ¶

apply_patch ¶

audit_min_gene_overlap ¶

audit_tau ¶

claim_mode ¶

context_dict ¶

context_gate_mode ¶

context_key ¶

context_tokens ¶

context_tokens_effective ¶

context_tokens_signature ¶

context_tokens_version ¶

enable_context_score_proxy ¶

from_json classmethod ¶

hub_frac_thr ¶

hub_term_degree ¶

k_claims ¶

max_per_module ¶

min_union_genes ¶

pass_notes ¶

preselect_tau_gate ¶

stability_gate_mode ¶

stress_gate_mode ¶

strict_evidence_check ¶

to_json ¶

trust_input_survival ¶

Claim schema (typed JSON)¶

llm_pathway_curator.claim_schema ¶

AuditedClaim ¶

Claim ¶

Decision ¶

EvidenceRef ¶

Core stages (A → B → C)¶

A) Stability distillation (evidence hygiene)¶

llm_pathway_curator.distill ¶

distill_evidence ¶

B) Evidence modules (term–gene factorization)¶

llm_pathway_curator.modules ¶

ModuleOutputs dataclass ¶

attach_module_drift_stress_tag ¶

attach_module_ids ¶

build_term_gene_edges ¶

compute_term_module_drift ¶

factorize_modules_connected_components ¶

filter_hub_genes ¶

summarize_module_drift ¶

C1) Proposal (deterministic baseline / LLM proposal-only)¶

llm_pathway_curator.select ¶

select_claims ¶

llm_pathway_curator.llm_claims ¶

LLMClaimResult dataclass ¶

build_claim_prompt ¶

claims_to_proposed_tsv ¶

propose_claims_llm ¶

C2) Mechanical audit (decider)¶

llm_pathway_curator.audit ¶

audit_claims ¶

llm_pathway_curator.audit_reasons ¶

is_abstain_reason ¶

is_decision_reason ¶

is_fail_reason ¶

is_known_reason ¶

C3) Reporting (decision-grade outputs)¶

llm_pathway_curator.report ¶

write_report ¶

RunConfig `dataclass` ¶

EvidenceTable `dataclass` ¶

read_tsv `classmethod` ¶

from_json `classmethod` ¶

ModuleOutputs `dataclass` ¶

LLMClaimResult `dataclass` ¶

generate `abstractmethod` ¶

FgseaAdapterConfig `dataclass` ¶

MetascapeAdapterConfig `dataclass` ¶

CalibrationResult `dataclass` ¶