pdftools_toolbox.pdf.content.content_extractor

Classes

ContentExtractor(content)

class pdftools_toolbox.pdf.content.content_extractor.ContentExtractor(content: Content)

Bases: _NativeObject, Iterable

__init__(content: Content)

Create a new content extractor.

Parameters:

content (pdftools_toolbox.pdf.content.content.Content) – the content object of a page or group

Raises:
  • OSError – Error reading from the document

  • pdftools_toolbox.corrupt_error.CorruptError – The document is corrupt

  • ValueError – if the document associated with the content object has already been closed

  • ValueError – if the content’s document is an output document
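
Because ContentExtractor derives from Iterable, an instance can be consumed with a plain for loop or a generator expression. The following is a minimal sketch, not an official sample: count_element_types and its extractor_cls parameter are hypothetical names introduced here, and the extractor class is injectable only so the helper can be exercised without the SDK installed.

```python
from collections import Counter


def count_element_types(content, extractor_cls=None):
    """Count the elements an extractor yields, grouped by class name.

    `content` is expected to be the Content object of a page or group.
    `extractor_cls` is a hypothetical hook for substituting a stand-in class.
    """
    if extractor_cls is None:
        # Imported lazily so the sketch can be loaded without the SDK.
        from pdftools_toolbox.pdf.content.content_extractor import ContentExtractor
        extractor_cls = ContentExtractor

    # Construction may raise OSError, CorruptError, or ValueError (see Raises).
    extractor = extractor_cls(content)

    # The extractor is iterable, so it can feed a generator expression directly.
    return Counter(type(element).__name__ for element in extractor)
```

Callers that process untrusted files would typically wrap the call in a try block that handles the exceptions listed under Raises.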

property ungrouping: UngroupingSelection

Configures which groups the extractor ungroups during extraction. Default value: pdftools_toolbox.pdf.content.ungrouping_selection.UngroupingSelection.NONE.

Returns:

pdftools_toolbox.pdf.content.ungrouping_selection.UngroupingSelection

Raises:

StateError – the object has already been closed
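
A short sketch of configuring this property before iterating. It assumes the property is writable, which the description above suggests but this page does not state outright. make_extractor and extractor_cls are hypothetical names, and the selection value is passed in by the caller, so no UngroupingSelection member other than the documented NONE default is presumed.

```python
def make_extractor(content, ungrouping=None, extractor_cls=None):
    """Create an extractor and optionally override its ungrouping selection.

    `ungrouping` should be an UngroupingSelection member; None keeps the default.
    `extractor_cls` is a hypothetical hook for substituting a stand-in class.
    """
    if extractor_cls is None:
        # Imported lazily so the sketch can be loaded without the SDK.
        from pdftools_toolbox.pdf.content.content_extractor import ContentExtractor
        extractor_cls = ContentExtractor

    extractor = extractor_cls(content)
    if ungrouping is not None:
        # Assumption: `ungrouping` has a setter. Accessing it on a closed
        # object is documented to raise StateError.
        extractor.ungrouping = ungrouping
    return extractor
```

Setting the selection before the first iteration keeps the configuration in one place and leaves the documented default untouched when no value is given.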

class ContentExtractorIterator(iterator_handle: c_void_p)

Bases: _NativeObject