Reference for MarkdownLine

MarkdownLine represents a single line of Markdown text together with some associated features, for example whether the line is a header or belongs to a code block.

Bases: BaseModel

JSON schema:
{
  "properties": {
    "text": {
      "description": "the text content of the line",
      "title": "Text",
      "type": "string"
    },
    "line_idx": {
      "description": "the index of the line in the markdown string",
      "title": "Line Idx",
      "type": "integer"
    },
    "isin_code_block": {
      "description": "whether or not the line belongs to a code block",
      "title": "Isin Code Block",
      "type": "boolean"
    },
    "page": {
      "anyOf": [
        {
          "type": "integer"
        },
        {
          "type": "null"
        }
      ],
      "description": "the page the line belongs to (if markdown comes from converted paginated document)",
      "title": "Page"
    }
  },
  "required": [
    "text",
    "line_idx",
    "isin_code_block",
    "page"
  ],
  "title": "MarkdownLine",
  "type": "object"
}
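
The schema marks `page` as required but nullable, so a payload has to carry the key even when the Markdown does not come from a paginated document. The sketch below illustrates that rule with a standard-library check. The helper `check_markdown_line_payload` and the `REQUIRED_FIELDS` table are hypothetical names introduced here for illustration; they are not part of the library, which relies on Pydantic for validation.

```python
from typing import Any

# Required keys and accepted types, copied from the JSON schema above.
# `page` is required but nullable, so None is an accepted value.
REQUIRED_FIELDS: dict[str, tuple[type, ...]] = {
    "text": (str,),
    "line_idx": (int,),
    "isin_code_block": (bool,),
    "page": (int, type(None)),
}


def check_markdown_line_payload(payload: dict[str, Any]) -> list[str]:
    """Hypothetical helper: list the ways `payload` fails the schema (empty if none)."""
    problems = []
    for name, accepted in REQUIRED_FIELDS.items():
        if name not in payload:
            problems.append(f"missing required field: {name}")
            continue
        value = payload[name]
        # bool is a subclass of int in Python, so reject it for integer fields.
        if isinstance(value, bool) and bool not in accepted:
            problems.append(f"wrong type for {name}: bool")
        elif not isinstance(value, accepted):
            problems.append(f"wrong type for {name}: {type(value).__name__}")
    return problems


payload = {"text": "# Title", "line_idx": 0, "isin_code_block": False, "page": None}
print(check_markdown_line_payload(payload))
```

Leaving `page` out entirely is reported as a missing field, whereas passing `page=None` is accepted.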

Fields: text (string), line_idx (integer), isin_code_block (boolean), page (integer or null)

is_bullet_point property

whether or not the line is a bullet point

isin_code_block pydantic-field

whether or not the line belongs to a code block

isin_table property

whether or not the line belongs to a table

line_idx pydantic-field

the index of the line in the markdown string

page pydantic-field

the page the line belongs to (if markdown comes from converted paginated document)

text pydantic-field

the text content of the line

get_header_level()

Gets the header level of this line (1-based)

Raises:

ValueError: if the line is not a header

Returns:

int: the header level; an h1 header returns 1
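
The contract of `get_header_level()` (a 1-based level for headers, `ValueError` otherwise) can be sketched as a standalone function. This is an illustration only: the function name `header_level` is hypothetical, and it assumes headers are written in ATX style (`#` to `######`). How the library itself detects headers is not stated here and may differ.

```python
import re

# ATX header: up to 3 leading spaces, 1 to 6 '#', then whitespace or end of line.
_ATX_HEADER = re.compile(r"^ {0,3}(#{1,6})(?:\s|$)")


def header_level(text: str) -> int:
    """Hypothetical stand-in: return the 1-based header level, or raise ValueError."""
    match = _ATX_HEADER.match(text)
    if match is None:
        raise ValueError(f"line is not a header: {text!r}")
    # The level is the number of leading '#' characters.
    return len(match.group(1))


for line in ["# Title", "### Section", "plain text"]:
    try:
        print(line, "->", header_level(line))
    except ValueError as err:
        print("ValueError:", err)
```

Because the method raises instead of returning a sentinel value, callers that process arbitrary lines need to wrap the call in `try`/`except ValueError`.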