pdftools_toolbox.pdf.structure.tree

Classes

Tree(document)

The logical structure of a document is described by a hierarchy of objects called the structure hierarchy or structure tree.

class pdftools_toolbox.pdf.structure.tree.Tree(document: Document)

Bases: _NativeObject

The logical structure of a document is described by a hierarchy of objects called the structure hierarchy or structure tree.

This interface does not expose the structure tree root itself, but it allows a Document node to be created and referenced directly below the structure tree root.

This interface can only create a structure tree in a new document that holds no content that could carry document structure copied from an existing document. Creating a structure tree fails if the document already contains content that was copied without setting the copy option pdftools_toolbox.pdf.page_copy_options.PageCopyOptions.copy_logical_structure to False. Copying content into a document after a structure tree has been created in it fails as well.

When a structure element tree is created, the document metadata is automatically updated to reflect that this is a tagged PDF.
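The ordering constraint above can be sketched as follows. This is a minimal illustration, not a reference implementation: the helper names are hypothetical, out_doc is assumed to be a freshly created output document, and it is assumed that PageCopyOptions can be constructed without arguments and that copy_logical_structure is a writable property.

```python
def copy_options_without_structure():
    """Build copy options that leave the logical structure of the source behind."""
    # Imported inside the function so this sketch loads even without the SDK.
    from pdftools_toolbox.pdf.page_copy_options import PageCopyOptions

    # Assumption: a no-argument constructor and a writable property.
    options = PageCopyOptions()
    options.copy_logical_structure = False
    return options


def new_structure_tree(out_doc):
    """Create the structure tree on an output document (hypothetical helper)."""
    from pdftools_toolbox.pdf.structure.tree import Tree

    # Any content already in out_doc must have been copied with
    # copy_logical_structure set to False, otherwise Tree() raises ValueError.
    return Tree(out_doc)
```

Any copying that uses these options is assumed to take place before new_structure_tree is called, since copying into a document after its tree exists is documented to fail.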

__init__(document: Document)

Creates a new StructTreeRoot and adds a root-level “Document” node.

Parameters:

document (pdftools_toolbox.pdf.document.Document) – the output document with which the returned structure tree is associated

Raises:

ValueError – if the document is invalid, is an input document, or is a document into which logical structure may already have been copied from an existing document
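A caller that cannot guarantee the state of the document may want to handle the documented ValueError explicitly. The following is a sketch only; try_create_tree is a hypothetical helper name, and the message printed depends on whatever the library reports.

```python
def try_create_tree(out_doc):
    """Return a Tree for out_doc, or None if the document does not allow one."""
    # Imported inside the function so this sketch loads even without the SDK.
    from pdftools_toolbox.pdf.structure.tree import Tree

    try:
        return Tree(out_doc)
    except ValueError as exc:
        # Raised for an invalid document, an input document, or a document
        # that may already contain copied logical structure.
        print(f"Cannot create structure tree: {exc}")
        return None
```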

property document_node: Node

The document node at the top of the structure element tree.

Returns:

pdftools_toolbox.pdf.structure.node.Node

Raises:

StateError – if the object or the owning document has already been closed

property role_map: RoleMap

The role map for structure elements in the structure tree. If it does not exist, it is created.

Returns:

pdftools_toolbox.pdf.structure.role_map.RoleMap
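Both properties are read directly from the Tree instance. The sketch below uses only the two attribute names documented above; tree_parts is a hypothetical helper, and the tree is assumed to belong to a document that is still open.

```python
def tree_parts(tree):
    """Return the document node and the role map of a structure tree."""
    # document_node raises StateError once the tree or its document is closed.
    document_node = tree.document_node
    # Reading role_map creates the role map if the tree does not have one yet.
    role_map = tree.role_map
    return document_node, role_map
```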