Reference for CSVParser
The CSVParser is dedicated to parsing comma-separated values (.csv) files. By default, it attempts to infer the delimiter used (comma, semicolon, ...). Alternatively, you may specify the delimiter explicitly.
Bases: AbstractSheetParser[str]
Parser for Comma-Separated Values files (.csv).
__init__(csv_delimiter=None, output_format='json_lines', n_sample_lines=5)
Initializes a CSV parser.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `csv_delimiter` | `str \| None` | The delimiter used in the CSV file. If None, the delimiter is inferred automatically. | `None` |
| `output_format` | `Literal['markdown_table', 'json_lines']` | Output format. `markdown_table` renders the data as a Markdown table. `json_lines` serializes each row as a JSON object on its own line; it repeats the column names on every row, which is more verbose but easier for LLMs. | `'json_lines'` |
| `n_sample_lines` | `int` | Number of lines sampled for delimiter detection. Only used when csv_delimiter is None. | `5` |
parse_file(filepath)
Parses a CSV file to a MarkdownDoc.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filepath` | `str` | Path to the .csv file. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `MarkdownDoc` | `MarkdownDoc` | The parsed document. |
parse_string(string)
Parses a CSV-formatted string to a MarkdownDoc.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `string` | `str` | The CSV content as a string. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `MarkdownDoc` | `MarkdownDoc` | The parsed document. |
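The two entry points differ only in where the CSV text comes from. The sketch below illustrates that relationship with standard-library code; it assumes, without confirmation from the source, that file parsing can be modeled as reading the file and handing its text to the string path. `rows_from_string` and `rows_from_file` are hypothetical stand-ins, not CSVParser methods, and they return plain row lists rather than a MarkdownDoc.

```python
import csv
import io
import tempfile
from pathlib import Path


def rows_from_string(string: str, delimiter: str = ",") -> list[list[str]]:
    """Hypothetical stand-in for string parsing: split CSV text into rows."""
    return list(csv.reader(io.StringIO(string), delimiter=delimiter))


def rows_from_file(filepath: str, delimiter: str = ",") -> list[list[str]]:
    """Hypothetical stand-in for file parsing: read the file, then reuse the string path."""
    # newline="" leaves line endings untouched, as the csv module recommends.
    with open(filepath, newline="", encoding="utf-8") as handle:
        return rows_from_string(handle.read(), delimiter)


content = "city,country\nOslo,Norway\nLima,Peru\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "cities.csv"
    csv_path.write_text(content, encoding="utf-8")
    from_file = rows_from_file(str(csv_path))

from_string = rows_from_string(content)
print(from_file == from_string)
```

Under this assumption, content already held in memory (for example, a CSV payload from an HTTP response) can go through the string entry point without being written to disk first.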