Reference for CSVParser
The CSVParser parses comma-separated values (.csv) files. By default, it attempts to infer the delimiter used (comma, semicolon, ...). Alternatively, you can specify the delimiter explicitly.
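This reference does not say how CSVParser infers the delimiter. As a rough illustration of the idea only, the sketch below uses the standard library's `csv.Sniffer`. The helper name `infer_delimiter`, the candidate delimiters, and the fallback to a comma are assumptions made for this example, not part of the CSVParser API.

```python
import csv


def infer_delimiter(sample: str, candidates: str = ",;\t|", default: str = ",") -> str:
    """Guess the delimiter of a CSV sample (hypothetical helper, not CSVParser's code)."""
    try:
        # Sniffer looks for a candidate character that occurs consistently on each line.
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # Assumption: fall back to a comma when no delimiter can be determined.
        return default


semicolon_sample = "a;b;c\n1;2;3\n4;5;6\n"
detected = infer_delimiter(semicolon_sample)
print(repr(detected))
```

Inference of this kind is a heuristic, so passing `csv_delimiter` explicitly is the safer choice when the delimiter is known in advance.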
              Bases: AbstractParser
Parser for comma-separated values (.csv) files.
            __init__(csv_delimiter=None, output_format='json_lines')
    Initializes a CSV parser.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| csv_delimiter | str \| None | The delimiter used to parse the .csv files. If None, the parser tries to guess the delimiter. | None | 
| output_format | Literal["markdown_table", "json_lines"] | The output format of the parsed document. `markdown_table`: uses tabulate to build a markdown-formatted table. `json_lines`: each row of the table is output as a JSON line. NOTE: this consumes far more tokens because column names are repeated in every row, but it is easier for LLMs to read. | 'json_lines' | 
            convert_df_to_json_lines(df)
  
      staticmethod
  
    Converts a DataFrame to JSON lines.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| df | DataFrame | the dataframe to convert. | required | 
Returns:
| Name | Type | Description | 
|---|---|---|
| str | str | the json lines. | 
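To make the JSON lines format concrete, here is a minimal standard-library sketch that turns CSV text into one JSON object per row. It does not use a DataFrame and is not the library's implementation. The helper name `csv_text_to_json_lines` and the sample data are assumptions for illustration.

```python
import csv
import io
import json


def csv_text_to_json_lines(text: str, delimiter: str = ",") -> str:
    """Render each CSV row as one JSON object per line (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    # Every row carries its own column names, which is why this format uses more tokens.
    return "\n".join(json.dumps(row) for row in reader)


sample = "name,city\nAda,London\nLinus,Helsinki\n"
json_lines = csv_text_to_json_lines(sample)
print(json_lines)
```

Each output line repeats the keys `name` and `city`, which illustrates the token overhead mentioned in the `output_format` description.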
            convert_df_to_markdown_table(df)
  
      staticmethod
  
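    Converts a DataFrame to a markdown table. Wraps pd.DataFrame.to_markdown() (which relies on tabulate) between a pre-processing and a post-processing step. Pre-processing: remove in text columns. Post-processing: replace multiple spaces with 2 spaces.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| df | DataFrame | the dataframe to convert. | required | 
Returns:
| Name | Type | Description | 
|---|---|---|
| str | str | a markdown-formatted table. | 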
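The post-processing step can be pictured with a short sketch. Markdown tables produced with column alignment contain padding spaces, and collapsing them shortens the output. The regular expression and the sample table below are assumptions for illustration. The actual implementation may differ.

```python
import re

# A padded table, similar in shape to what an aligned markdown renderer produces.
padded = (
    "| name     | city     |\n"
    "|:---------|:---------|\n"
    "| Ada      | London   |"
)

# Assumed post-processing: any run of two or more spaces becomes exactly two spaces.
compact = re.sub(r" {2,}", "  ", padded)
print(compact)
```

The table stays valid markdown because cell padding carries no meaning. Only the character count changes.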
            parse_file(filepath)
    Parses a csv file to markdown.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| filepath | str | the path to the csv file. | required | 
Returns:
| Name | Type | Description | 
|---|---|---|
| MarkdownDoc | MarkdownDoc | the markdown-formatted csv. | 
            parse_string(string)
    Parses a string representing a csv file to markdown.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| string | str | the csv-formatted string. | required | 
Returns:
| Name | Type | Description | 
|---|---|---|
| MarkdownDoc | MarkdownDoc | the markdown-formatted csv. | 
            read_file(filepath)
    Reads the file at the provided filepath. For a list of handled filetypes, refer to https://pandas.pydata.org/docs/reference/api/pandas.read_excel.html.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| filepath | str | path to the file. | required | 
Returns:
| Name | Type | Description | 
|---|---|---|
| str | str | the csv file content as a string. |