Schema Extractor

Schema Extractor subpackage. Contains modules and utilities for extracting JSON Schemas using LLMs, scientific publications, and domain-expert feedback.

schema_miner.schema_extractor.extract_schema_stage1(save_schema: bool = False) → dict | None

Extract the initial process schema from a given process specification document using the specified Large Language Model (LLM).

Parameters:

save_schema (bool, optional) – If True, save the extracted schema to the specified result file path. Defaults to False.

Returns dict | None:

A dictionary representing the extracted schema, or None if no schema could be extracted.
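Because the function may return None, callers need to check the result before using it. A minimal sketch of one way to do that follows; SchemaExtractionError and require_schema are hypothetical names introduced for illustration and are not part of schema_miner.

```python
from typing import Optional


class SchemaExtractionError(RuntimeError):
    """Hypothetical error type for this sketch; not part of schema_miner."""


def require_schema(result: Optional[dict], stage: str) -> dict:
    """Return the schema unchanged, or raise if the stage produced None."""
    if result is None:
        raise SchemaExtractionError(f"{stage} returned no schema")
    return result
```

With this helper, a call such as require_schema(extract_schema_stage1(), "stage 1") would fail loudly instead of passing None on to later code.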

schema_miner.schema_extractor.extract_schema_stage2(initial_schema: dict | Path, expert_review: str | Path, scientific_paper: str | Path, save_schema: bool = False) → dict | None

Refine the initial schema from Stage 1 using domain-expert feedback and scientific publications.

Parameters:
  • initial_schema (dict | Path) – The initial schema produced in Stage 1, provided either as a dictionary or as a file path to a serialized schema.

  • expert_review (str | Path) – Domain-expert feedback on the initial schema, provided either as a direct string or as a file path to a text document.

  • scientific_paper (str | Path) – A relevant scientific publication used to refine the schema, provided either as a direct string or as a file path.

  • save_schema (bool, optional) – If True, save the refined schema to the specified result file path. Defaults to False.

Returns dict | None:

A dictionary representing the refined schema, or None if refinement was unsuccessful.
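The dict | Path and str | Path parameter types mean each input can be passed in memory or as a file. The sketch below shows one way such inputs could be normalized. It is not the package's actual implementation: load_schema and load_text are hypothetical helpers, and it assumes that the serialized schema is a UTF-8 JSON file and that a plain str is always literal text rather than a file name.

```python
import json
from pathlib import Path
from typing import Union


def load_schema(schema: Union[dict, Path]) -> dict:
    """Hypothetical helper: accept a dict or a path to a JSON file."""
    if isinstance(schema, dict):
        return schema
    # Assumption: the serialized schema is a UTF-8 encoded JSON document.
    return json.loads(Path(schema).read_text(encoding="utf-8"))


def load_text(source: Union[str, Path]) -> str:
    """Hypothetical helper: accept literal text or a path to a text file."""
    if isinstance(source, Path):
        return source.read_text(encoding="utf-8")
    # Assumption: a plain str is the text itself, never a file name.
    return source
```

Wrapping file locations in Path rather than passing them as bare strings keeps the two cases unambiguous under these assumptions.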

schema_miner.schema_extractor.extract_schema_stage3(refined_schema: dict | Path, expert_review: str | Path, scientific_paper: str | Path, save_schema: bool = False) → dict | None

Finalize the refined schema from Stage 2 using domain-expert feedback and scientific publications.

Parameters:
  • refined_schema (dict | Path) – The refined schema produced in Stage 2, provided either as a dictionary or as a file path to a serialized schema.

  • expert_review (str | Path) – Domain-expert feedback on the refined schema, provided either as a direct string or as a file path to a text document.

  • scientific_paper (str | Path) – A relevant scientific publication used to finalize the schema, provided either as a direct string or as a file path.

  • save_schema (bool, optional) – If True, save the finalized schema to the specified result file path. Defaults to False.

Returns dict | None:

A dictionary representing the finalized schema, or None if finalization was unsuccessful.
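Since each stage consumes the output of the previous one and any stage may return None, the three functions can be chained with an early exit. The following is a rough sketch under that reading of the signatures above. run_pipeline is a hypothetical helper, not part of schema_miner, and it takes the stage functions as arguments so the example runs without the package installed.

```python
from pathlib import Path
from typing import Callable, Optional, Sequence, Tuple, Union

Text = Union[str, Path]
FirstStage = Callable[[], Optional[dict]]
RefineStage = Callable[[dict, Text, Text], Optional[dict]]


def run_pipeline(
    first_stage: FirstStage,
    refinements: Sequence[Tuple[RefineStage, Text, Text]],
) -> Optional[dict]:
    """Run the first stage, then each refinement stage in order.

    Each refinement entry is (stage_function, expert_review, scientific_paper).
    Returns None as soon as any stage yields None.
    """
    schema = first_stage()
    for stage, expert_review, scientific_paper in refinements:
        if schema is None:
            # A previous stage failed, so skip the remaining stages.
            return None
        schema = stage(schema, expert_review, scientific_paper)
    return schema
```

Assuming the package is importable, extract_schema_stage1 could be passed as first_stage, and extract_schema_stage2 and extract_schema_stage3 as the refinement stages, each paired with its own expert review and publication.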