pdftools_toolbox.pdf.structure.node
Classes
Node: This class represents a structure element node in the structure element tree of a tagged PDF.
- class pdftools_toolbox.pdf.structure.node.Node(tag: str, document: Document, page: Page | None)
Bases: _NativeObject
This class represents a structure element node in the structure element tree of a tagged PDF. Nodes may either have a collection of other nodes as children, or be associated with marked content. These two roles cannot be mixed.
- __init__(tag: str, document: Document, page: Page | None)
- Parameters:
tag (str) – Tags should conform to the Standard Structure Types described within the PDF standard or refer to entries in the RoleMap. Allowed values from the PDF standard are: Document, Part, Sect, Art, Div, H1, H2, H3, H4, H5, H6, P, L, LI, Lbl, LBody, Table, TR, TH, TD, THead, TBody, TFoot, Span, Quote, Note, Reference, Figure, Caption, Artifact, Form, Field, Link, Code, Annot, Ruby, Warichu, TOC, TOCI, Index and BibEntry.
document (pdftools_toolbox.pdf.document.Document) – The document containing the structure element tree.
page (Optional[pdftools_toolbox.pdf.page.Page]) – The page on which marked content associated with the structure element node is to be found. This parameter is optional and is best omitted (None) for nodes that are not associated with marked content.
- Raises:
StateError – if the object or the owning document has already been closed
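The list of allowed tag values above can be checked before a node is constructed. The following is a minimal sketch only: STANDARD_TAGS and is_standard_tag are hypothetical helper names and are not part of pdftools_toolbox. The check covers only the standard structure types listed for the tag parameter, so custom tags that refer to RoleMap entries will be reported as non-standard.

```python
# Standard structure types exactly as listed for the `tag` parameter above.
# Hypothetical helper, not part of pdftools_toolbox.
STANDARD_TAGS = frozenset(
    "Document Part Sect Art Div H1 H2 H3 H4 H5 H6 P L LI Lbl LBody "
    "Table TR TH TD THead TBody TFoot Span Quote Note Reference Figure "
    "Caption Artifact Form Field Link Code Annot Ruby Warichu TOC TOCI "
    "Index BibEntry".split()
)


def is_standard_tag(tag: str) -> bool:
    """Return True if `tag` is one of the listed standard structure types.

    The comparison is case-sensitive: "P" matches, "p" does not.
    """
    return tag in STANDARD_TAGS


if __name__ == "__main__":
    for candidate in ("H1", "Table", "h1", "MyCustomTag"):
        print(candidate, is_standard_tag(candidate))
```

A caller might run this check first and fall back to a RoleMap lookup for tags it rejects.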
- property parent: Node
The parent node in the structure element tree.
- Returns:
pdftools_toolbox.pdf.structure.node.Node
- Raises:
StateError – if the object or the owning document has already been closed
OperationError – if the parent is the structure element tree root node
- property children: NodeList
The list of child nodes under this node in the structure element tree. Once child nodes have been added to a node, it can no longer be associated with marked content.
- Returns:
pdftools_toolbox.pdf.structure.node_list.NodeList
- Raises:
StateError – if the object or the owning document has already been closed
pdftools_toolbox.not_found_error.NotFoundError – if the node’s list of children is invalid
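Because every node exposes tag and children, the structure element tree can be traversed recursively. The sketch below rests on two assumptions that this page does not confirm: that a NodeList can be iterated with a for loop, and that reading children on a node without child nodes yields an empty collection rather than raising. The walk function and the FakeNode stand-in are hypothetical names used so the example runs without a PDF document.

```python
from typing import Any, Iterator, Tuple


def walk(node: Any, depth: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (depth, tag) for `node` and all of its descendants, depth-first."""
    yield depth, node.tag
    # Assumption: `children` is iterable like a Python list.
    for child in node.children:
        yield from walk(child, depth + 1)


class FakeNode:
    """Stand-in for illustration only; this is not the library's Node class."""

    def __init__(self, tag, children=()):
        self.tag = tag
        self.children = list(children)


tree = FakeNode("Document", [FakeNode("H1"), FakeNode("L", [FakeNode("LI")])])
outline = list(walk(tree))

# Print an indented outline of the tree, one tag per line.
for depth, tag in outline:
    print("  " * depth + tag)
```

With a real document, the same function would be called on a node obtained from the document's structure element tree instead of on the stand-in.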
- property tag: str
Tags should conform to the Standard Structure Types described within the PDF standard.
- Returns:
str
- Raises:
StateError – if the object or the owning document has already been closed
pdftools_toolbox.not_found_error.NotFoundError – if the node tag is invalid
- property page: Page | None
The page on which marked content associated with the structure element node is to be found. This value is optional and is best left unset (None) for nodes that are not associated with marked content.
- Returns:
Optional[pdftools_toolbox.pdf.page.Page]
- Raises:
StateError – if the object or the owning document has already been closed
- property alternate_text: str | None
Alternate text to be used where the content denoted by the structure element and its children cannot be rendered because of accessibility or other concerns.
- Returns:
Optional[str]
- Raises:
StateError – if the object or the owning document has already been closed
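A common use of alternate_text is an accessibility check that reports Figure nodes lacking a description. The following is a sketch under stated assumptions, not a library feature: figures_missing_alt_text and the fake helper are hypothetical names, children is assumed to be iterable, and the stand-in objects only mimic the tag, alternate_text and children properties documented on this page.

```python
from types import SimpleNamespace
from typing import Any, List


def figures_missing_alt_text(node: Any) -> List[Any]:
    """Collect Figure nodes under `node` whose alternate_text is None or blank."""
    missing = []
    if node.tag == "Figure" and not (node.alternate_text or "").strip():
        missing.append(node)
    # Assumption: `children` is iterable like a Python list.
    for child in node.children:
        missing.extend(figures_missing_alt_text(child))
    return missing


def fake(tag, alt=None, children=()):
    """Build a stand-in node for illustration; not the library's Node class."""
    return SimpleNamespace(tag=tag, alternate_text=alt, children=list(children))


tree = fake("Document", children=[
    fake("Figure", alt="Sales chart for 2023"),
    fake("Sect", children=[fake("Figure"), fake("Figure", alt="  ")]),
])
missing = figures_missing_alt_text(tree)
print(f"{len(missing)} figure(s) without alternate text")
```

Treating whitespace-only text as missing is a design choice of this sketch; a stricter or looser rule may suit a given workflow better.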
- property bounding_box: Rectangle | None
The bounding box of the node's contents. This should only be set for Figure, Formula and Table nodes.
- Returns:
Optional[pdftools_toolbox.geometry.real.rectangle.Rectangle]
- Raises:
StateError – if the object has already been closed