olaf.pipeline.pipeline_component.candidate_term_enrichment package¶

Submodules¶

olaf.pipeline.pipeline_component.candidate_term_enrichment.knowledge_based_enrichment module¶

class olaf.pipeline.pipeline_component.candidate_term_enrichment.knowledge_based_enrichment.KnowledgeBasedCTermEnrichment(knowledge_source: KnowledgeSource, use_synonyms: bool | None = True, enrichment_kinds: Set[str] | None = {'synonyms'})[source]¶

Bases: PipelineComponent

Pipeline component to enrich candidate terms based on an external source of knowledge, e.g., a KG.

Attributes¶

knowledge_sourceKnowledgeSource: The source of knowledge to use for enrichment.
use_synonyms: bool, optional: Wether to use the existing candidate terms synonyms, by default True.
enrichment_kinds: Set[str], optional: The kinds of enrichments to perform. Accepted values are: ‘synonyms’ (default), ‘antonyms’, ‘hypernyms’, and ‘hyponyms’. Other values will be ignored.

check_resources() → None[source]¶: Method to check that the component has access to all its required resources.

get_performance_report() → Dict[str, Any][source]¶

A getter for the pipeline component performance report.: If the component has been optimised, it only returns the best performance. Otherwise, it returns the results obtained with the set parameters.

Returns¶

Dict[str, Any]: The pipeline component performance report.

optimise() → None[source]¶: A method to optimise the pipeline component by tuning the options.

run(pipeline: Pipeline) → None[source]¶

Method that is responsible for the execution of the component.

Parameters¶

pipelinePipeline: The pipeline running.

olaf.pipeline.pipeline_component.candidate_term_enrichment.llm_based_enrichment module¶

class olaf.pipeline.pipeline_component.candidate_term_enrichment.llm_based_enrichment.LLMBasedTermEnrichment(prompt_template: Callable[[str], List[Dict[str, str]]] | None = None, llm_generator: LLMGenerator | None = None)[source]¶

Bases: PipelineComponent

Enrich candidate terms using LLM knowledge.

Attributes¶

prompt_template: Callable[[str], List[Dict[str, str]]]: Prompt template used to give instructions and context to the LLM.
llm_generator: LLMGenerator: The LLM model used to enrich the candidate terms. By default, the zephyr-7b-beta HuggingFace model is used.

check_resources() → None[source]¶: Method to check that the component has access to all its required resources.

get_performance_report() → Dict[str, Any][source]¶

A getter for the pipeline component performance report. If the component has been optimised, it only returns the best performance. Otherwise, it returns the results obtained with the set parameters.

Returns¶

Dict[str, Any]: The pipeline component performance report.

optimise(validation_terms: Set[str], option_values_map: Set[float]) → None[source]¶: A method to optimise the pipeline component by tuning the configuration.

run(pipeline: Pipeline) → None[source]¶

Method that is responsible for the execution of the component.

Parameters¶

pipelinePipeline: The pipeline running.

olaf.pipeline.pipeline_component.candidate_term_enrichment.semantic_based_enrichment module¶

class olaf.pipeline.pipeline_component.candidate_term_enrichment.semantic_based_enrichment.SemanticBasedEnrichment(threshold: float | None = None)[source]¶

Bases: PipelineComponent

Pipeline component to enrich candidate terms based on semantic meaning computed from embeddings similarity. The most similar words in the vocabulary are added as synonyms.

Attributes¶

thresholdfloat, optional: The threshold defines the minimum similarity score required to be synonymous. By default the threshold is set to 0.9.

check_resources() → None[source]¶: Method to check that the component has access to all its required resources.

enrich_term(c_term: CandidateTerm, spacy_model: Language) → None[source]¶: Enrich candidate term synonyms based on most similar words in the vocabulary. Similarity is computed based on vectors cosine similarity measure.

get_performance_report() → Dict[str, Any][source]¶

A getter for the pipeline component performance report. If the component has been optimised, it only returns the best performance. Otherwise, it returns the results obtained with the set parameters.

Returns¶

Dict[str, Any]: The pipeline component performance report.

optimise() → None[source]¶: A method to optimise the pipeline component by tuning the options.

run(pipeline: Pipeline) → None[source]¶

Method responsible for the component execution.

Parameters¶

pipelinePipeline: The pipeline running.

olaf.pipeline.pipeline_component.candidate_term_enrichment package¶

Submodules¶

olaf.pipeline.pipeline_component.candidate_term_enrichment.knowledge_based_enrichment module¶

Attributes¶

Returns¶

Parameters¶

olaf.pipeline.pipeline_component.candidate_term_enrichment.llm_based_enrichment module¶

Attributes¶

Returns¶

Parameters¶

olaf.pipeline.pipeline_component.candidate_term_enrichment.semantic_based_enrichment module¶

Attributes¶

Returns¶

Parameters¶

Module contents¶