Reference for MarkdownLine
MarkdownLine represents a single markdown-formatted line together with some associated features, for example whether the line is a header or belongs to a code block.
Bases: BaseModel
JSON schema:
{
  "properties": {
    "text": {
      "description": "the text content of the line",
      "title": "Text",
      "type": "string"
    },
    "line_idx": {
      "description": "the index of the line in the markdown string",
      "title": "Line Idx",
      "type": "integer"
    },
    "isin_code_block": {
      "description": "whether or not the line belongs to a code block",
      "title": "Isin Code Block",
      "type": "boolean"
    },
    "page": {
      "anyOf": [
        {
          "type": "integer"
        },
        {
          "type": "null"
        }
      ],
      "description": "the page the line belongs to (if markdown comes from converted paginated document)",
      "title": "Page"
    }
  },
  "required": [
    "text",
    "line_idx",
    "isin_code_block",
    "page"
  ],
  "title": "MarkdownLine",
  "type": "object"
}
Fields:
- text (str)
- line_idx (int)
- isin_code_block (bool)
- page (int | None)
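The schema marks all four fields as required, with `page` accepting either an integer or null. As a rough illustration of what that implies for a raw payload, here is a minimal standard-library sketch. The helper `check_markdown_line_payload` is hypothetical and is not part of the library; the real model is validated by pydantic, which may coerce or reject values differently.

```python
from typing import Any

# Field names and accepted types, copied from the JSON schema.
# `page` is required but nullable (int or None).
REQUIRED_FIELDS = {
    "text": (str,),
    "line_idx": (int,),
    "isin_code_block": (bool,),
    "page": (int, type(None)),
}


def check_markdown_line_payload(payload: dict) -> list:
    """Hypothetical helper: list the problems found, empty if the payload fits."""
    problems = []
    for name, allowed in REQUIRED_FIELDS.items():
        if name not in payload:
            problems.append(f"missing required field: {name}")
            continue
        value: Any = payload[name]
        # bool is a subclass of int in Python, so reject it for integer fields.
        if isinstance(value, bool) and bool not in allowed:
            problems.append(f"wrong type for {name}: bool")
        elif not isinstance(value, allowed):
            problems.append(f"wrong type for {name}: {type(value).__name__}")
    return problems


ok = {"text": "# Title", "line_idx": 0, "isin_code_block": False, "page": None}
missing_page = {"text": "# Title", "line_idx": 0, "isin_code_block": False}
```

Note that `page` must be present even when the markdown does not come from a paginated document; in that case it is passed explicitly as `None`.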
is_bullet_point
property
whether or not the line is a bullet point
isin_code_block
pydantic-field
whether or not the line belongs to a code block
isin_table
property
whether or not the line belongs to a table
line_idx
pydantic-field
the index of the line in the markdown string
page
pydantic-field
the page the line belongs to (if markdown comes from converted paginated document)
text
pydantic-field
the text content of the line
get_header_level()
Gets the header level of this line (1-based)
Raises:
| Type | Description |
|---|---|
| ValueError | if the line is not a header, raises an error |
Returns:
| Name | Type | Description |
|---|---|---|
| int | int | the header level, h1 headers would return 1 |
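The contract above (a 1-based level for headers, `ValueError` otherwise) can be sketched as follows. `LineSketch` is a hypothetical stand-in written for illustration, not the library's `MarkdownLine`. Its parsing rule (ATX-style headers with one to six leading `#` characters, and lines inside code blocks never counting as headers) is an assumption; only the h1-returns-1 behaviour and the `ValueError` come from this reference.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed header rule: 1 to 6 '#' characters, then whitespace or end of line.
_ATX_HEADER = re.compile(r"^(#{1,6})(\s|$)")


@dataclass
class LineSketch:
    """Hypothetical stand-in mirroring the documented fields."""

    text: str
    line_idx: int
    isin_code_block: bool
    page: Optional[int]

    def get_header_level(self) -> int:
        """Return the 1-based header level, or raise ValueError."""
        # Assumption: a '#' line inside a code block is not a header.
        match = None if self.isin_code_block else _ATX_HEADER.match(self.text)
        if match is None:
            raise ValueError(f"line {self.line_idx} is not a header: {self.text!r}")
        return len(match.group(1))


lines = [
    LineSketch(text="# Title", line_idx=0, isin_code_block=False, page=None),
    LineSketch(text="### Details", line_idx=1, isin_code_block=False, page=None),
    LineSketch(text="plain text", line_idx=2, isin_code_block=False, page=None),
]

levels = []
for line in lines:
    try:
        levels.append(line.get_header_level())
    except ValueError:
        levels.append(None)  # not a header
```

Because the method raises rather than returning a sentinel, callers iterating over arbitrary lines need either a `try`/`except ValueError` as shown or a prior check that the line is a header.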