olaf.repository.corpus_loader package
Submodules
olaf.repository.corpus_loader.corpus_loader_schema module
olaf.repository.corpus_loader.csv_corpus_loader module
- class olaf.repository.corpus_loader.csv_corpus_loader.CsvCorpusLoader(corpus_path: str, column_name: str)
Bases: CorpusLoader
Corpus loader for a CSV file.
Parameters
- corpus_path : str
Path of the text corpus to use.
- column_name : str
Name of the column to use in the CSV file.
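To make the roles of the two parameters concrete, here is a minimal standard-library sketch of reading one column of a CSV file as a list of documents. It is not the OLAF implementation: `load_csv_column` is a hypothetical helper, and the choices to skip empty cells and to raise `KeyError` for an unknown column are assumptions made for illustration.

```python
import csv


def load_csv_column(corpus_path: str, column_name: str) -> list[str]:
    """Hypothetical helper: return the non-empty values of one CSV column."""
    with open(corpus_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Assumption: a missing column is an error rather than an empty corpus.
        if reader.fieldnames is None or column_name not in reader.fieldnames:
            raise KeyError(f"Column {column_name!r} not found in {corpus_path}")
        # Assumption: each non-empty cell of the column is one document.
        return [row[column_name] for row in reader if row[column_name]]
```

With the real class, the same two values would be passed to the constructor shown above, as `corpus_path` and `column_name`.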
olaf.repository.corpus_loader.json_corpus_loader module
- class olaf.repository.corpus_loader.json_corpus_loader.JsonCorpusLoader(corpus_path: str, json_field: str)
Bases: CorpusLoader
Corpus loader for JSON files located in the same folder.
Parameters
- corpus_path : str
Path of the text corpus to use.
- json_field : str
Name of the field to use in the JSON files.
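The following standard-library sketch shows one way a folder of JSON files could be turned into documents by picking a single field from each file. It is not the OLAF implementation: `load_json_field` is a hypothetical helper, and it assumes that each file holds one top-level JSON object, that only `*.json` files are read, and that files lacking the field are skipped.

```python
import json
from pathlib import Path


def load_json_field(corpus_path: str, json_field: str) -> list[str]:
    """Hypothetical helper: collect one field from every JSON file in a folder."""
    documents = []
    # Sorted so the document order is stable across runs.
    for file in sorted(Path(corpus_path).glob("*.json")):
        with file.open(encoding="utf-8") as handle:
            data = json.load(handle)
        # Assumption: files without the requested field are silently skipped.
        if isinstance(data, dict) and json_field in data:
            documents.append(str(data[json_field]))
    return documents
```

With the real class, the folder path and the field name would be passed to the constructor shown above, as `corpus_path` and `json_field`.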
olaf.repository.corpus_loader.text_corpus_loader module
- class olaf.repository.corpus_loader.text_corpus_loader.TextCorpusLoader(corpus_path: str)
Bases: CorpusLoader
Corpus loader for text files located in the same folder.
If the corpus path is a folder, each text file in the folder is considered one document. If the corpus path is a text file, each line in the text file is considered one document.
Parameters
- corpus_path : str
Path of the text corpus to use. It can be a folder or a file.
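The folder-versus-file rule described for this loader can be sketched with the standard library as follows. This is not the OLAF implementation: `load_text_documents` is a hypothetical helper, and the `.txt` extension filter, the sorted file order, and the skipping of blank lines are assumptions made for illustration.

```python
from pathlib import Path


def load_text_documents(corpus_path: str) -> list[str]:
    """Hypothetical helper mirroring the folder/file rule described above."""
    path = Path(corpus_path)
    if path.is_dir():
        # Folder: each text file is one document (assumes a .txt extension).
        return [
            file.read_text(encoding="utf-8")
            for file in sorted(path.iterdir())
            if file.is_file() and file.suffix == ".txt"
        ]
    # Single file: each line is one document (assumes blank lines are skipped).
    with path.open(encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]
```

The same path string therefore yields a different document granularity depending on whether it points to a folder or to a file.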