
Reference for CSVParser

The CSVParser parses comma-separated values (.csv) files. By default it attempts to infer the delimiter used (comma, semicolon, ...). Alternatively, you can specify the delimiter explicitly.
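To illustrate what delimiter inference from a few sample lines can look like, here is a minimal sketch built on the standard library's csv.Sniffer. It is not CSVParser's actual implementation, which this page does not describe: the function name infer_delimiter, the candidate set and the comma fallback are all assumptions made for the example.

```python
import csv


def infer_delimiter(text: str, n_sample_lines: int = 5, candidates: str = ",;\t|") -> str:
    """Guess the delimiter from the first n_sample_lines lines of text.

    Hypothetical helper for illustration only; not part of CSVParser.
    """
    # Only look at the first few lines, mirroring the idea of n_sample_lines.
    sample = "\n".join(text.splitlines()[:n_sample_lines])
    try:
        # Sniffer picks the candidate whose per-line count is most consistent.
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # Assumed fallback: treat the content as comma-separated.
        return ","


if __name__ == "__main__":
    print(repr(infer_delimiter("a;b;c\n1;2;3\n4;5;6\n")))
```

Sampling only a handful of lines keeps detection cheap on large files, at the cost of possibly misjudging files whose first lines are atypical; passing the delimiter explicitly avoids that risk.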

Bases: AbstractSheetParser[str]

Parser for Comma-Separated Values files (.csv).

__init__(csv_delimiter=None, output_format='json_lines', n_sample_lines=5)

Initializes a CSV parser.

Parameters:

Name Type Description Default
csv_delimiter str | None

The delimiter used in the CSV file. If None, the delimiter is inferred automatically. Defaults to None.

None
output_format Literal['markdown_table', 'json_lines']

Output format.
- markdown_table: renders the data as a Markdown table.
- json_lines: serializes each row as a JSON object on its own line. Column names are repeated on every row, which is more verbose but easier for LLMs to read.
Defaults to "json_lines".

'json_lines'
n_sample_lines int

Number of lines sampled for delimiter detection. Only used when csv_delimiter is None. Defaults to 5.

5
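The difference between the two output formats can be sketched with the standard library alone. The render function below is a hypothetical stand-in, not CSVParser's code, and the exact text CSVParser produces (spacing, quoting, value types) may differ; it assumes the first row holds the column names and keeps every value as a string.

```python
import csv
import io
import json


def render(text: str, delimiter: str = ",", output_format: str = "json_lines") -> str:
    """Render CSV text as JSON lines or as a Markdown table (illustrative only)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]  # assumption: first row is the header

    if output_format == "json_lines":
        # One JSON object per row; column names are repeated on every line.
        return "\n".join(json.dumps(dict(zip(header, row))) for row in body)

    if output_format == "markdown_table":
        # Header row, separator row, then one table row per record.
        lines = [
            "| " + " | ".join(header) + " |",
            "| " + " | ".join("---" for _ in header) + " |",
        ]
        lines += ["| " + " | ".join(row) + " |" for row in body]
        return "\n".join(lines)

    raise ValueError(f"unsupported output_format: {output_format!r}")


if __name__ == "__main__":
    data = "name,age\nAda,36\nLinus,54\n"
    print(render(data, output_format="json_lines"))
    print(render(data, output_format="markdown_table"))
```

In the JSON-lines form each row is self-describing, so a row remains interpretable even when a long document is split into chunks; the Markdown table is more compact but relies on the header row staying in view.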

parse_file(filepath)

Parses a CSV file to a MarkdownDoc.

Parameters:

Name Type Description Default
filepath str

Path to the .csv file.

required

Returns:

Name Type Description
MarkdownDoc MarkdownDoc

The parsed document.

parse_string(string)

Parses a CSV-formatted string to a MarkdownDoc.

Parameters:

Name Type Description Default
string str

The CSV content as a string.

required

Returns:

Name Type Description
MarkdownDoc MarkdownDoc

The parsed document.