Conversion workflow

This workflow is designed for converting documents to PDF 1.x. Unlike the Archive PDF/A workflows, it converts files to PDF only (not PDF/A), does not validate the file format, and cannot sign the output documents.

The workflow supports these features:

  • Compression and optimization for speed or size
  • Optical character recognition (OCR)
  • Office file conversion (as required)
  • Configuration of attachment conversion

Supported file formats for Conversion workflow

This workflow supports these file formats:

Content type | File type
Document formats | PDF 1.x, PDF 2.0, PDF/A-1, PDF/A-2, PDF/A-3
Image formats | JPEG, JPEG 2000, TIFF, BMP, GIF, JBIG2, PNG, HEIC
Email | EML, MSG (without encryption)
Word | DOC, DOT, DOCX, DOCM, DOTX, DOTM, RTF, XML (WordprocessingML 2003)
Excel | XLS, XLT, XLSX, XLSM, XLTX, XLTM, XML (SpreadsheetML 2003)
PowerPoint | PPT, PPS, PPTX, PPTM, PPSX, PPSM
OpenOffice | ODT, ODS, ODP
Other | CSV, HTML, HTM (prepared for archiving), TXT, XML, ZIP (without password protection)

Compared to the Archive PDF/A workflows, the Conversion workflow offers these additional features:

Optimize for speed or size

The workflow's profile offers a setting to optimize for processing time (speed) or for minimal document file size.

Convert mode configuration for child documents (Attachments)

Certain child documents can be skipped (removed) during conversion to PDF, such as attachments of emails or PDF documents. The convert mode can be specified based on the type of the child document, its filename, or the type of its parent document.

For example, executables attached to an email are removed by default. If desired, you can add rules that attach files that cannot be converted (e.g. PDF documents containing unrendered XFA, HTML documents) to the resulting output document in their original source format.
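The rule matching described above can be sketched as follows. This is a minimal illustration only: the `Rule` class, the mode names, the type labels, and the first-match-wins evaluation order are all assumptions made for the example, not the product's actual configuration format or behavior.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional

# Hypothetical mode names; the product's actual values may differ.
CONVERT = "convert"
SKIP = "skip"
ATTACH_ORIGINAL = "attach-original"


@dataclass
class Rule:
    """One convert-mode rule; a criterion left as None matches anything."""
    mode: str
    child_type: Optional[str] = None        # e.g. "exe", "html"
    filename_pattern: Optional[str] = None  # lowercase glob, e.g. "*.exe"
    parent_type: Optional[str] = None       # e.g. "email", "pdf"

    def matches(self, child_type: str, filename: str, parent_type: str) -> bool:
        return (
            (self.child_type is None or self.child_type == child_type)
            and (self.filename_pattern is None
                 or fnmatchcase(filename.lower(), self.filename_pattern))
            and (self.parent_type is None or self.parent_type == parent_type)
        )


def resolve_convert_mode(rules, child_type, filename, parent_type, default=CONVERT):
    """Return the mode of the first matching rule (assumed order), else the default."""
    for rule in rules:
        if rule.matches(child_type, filename, parent_type):
            return rule.mode
    return default


# Example rule set mirroring the text: drop executables attached to emails,
# and keep HTML child documents in their original format.
RULES = [
    Rule(mode=SKIP, filename_pattern="*.exe", parent_type="email"),
    Rule(mode=ATTACH_ORIGINAL, child_type="html"),
]
```

With these rules, an `.exe` attached to an email resolves to the skip mode, while the same file inside a different parent falls through to the default.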

Collect mode configuration

The collect mode configuration defines how a converted document and its child documents are combined. The collect mode can be configured for each document type and also defines how errors are handled.

For example, an email can be converted into a PDF collection (Portfolio) of its body and attachments. When converting Word documents, all embedded files can be merged into the converted PDF.
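As a rough illustration of a per-document-type collect mode, the sketch below maps a document type to a way of combining the converted document with its children. The mode names, the type keys, the default, and the returned structure are hypothetical; error handling, which the collect mode also covers, is left out.

```python
# Hypothetical collect-mode names and document-type keys, for illustration only.
COLLECT_MODES = {
    "email": "portfolio",  # body and attachments as a PDF collection
    "word": "merge",       # embedded files merged into the converted PDF
}


def plan_output(doc_type, main_doc, children, default_mode="merge"):
    """Describe how the converted document and its children would be combined."""
    mode = COLLECT_MODES.get(doc_type, default_mode)  # default is an assumption
    parts = [main_doc, *children]
    if mode == "portfolio":
        # Each part stays a separate item inside one PDF collection.
        return {"kind": "portfolio", "items": parts}
    # Otherwise all parts are appended into a single PDF.
    return {"kind": "merged", "items": parts}
```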

Job and document options for the Conversion workflow

The Conversion workflow accepts job and document options, which pass job-specific and document-specific values to use when processing documents.

Job options

Job options apply to all documents processed in the same job. Any subsequent jobs processed with the workflow profile use the profile's default settings.

Type | Option | Description
OCR | OCR | Turns optical character recognition on or off for the job. All OCR settings must already be configured in the profile. If true, images in the job's documents are processed to recognize text (as appropriate). If false, no OCR is performed.
Metadata | META.AUTHOR | The author of the document
Metadata | META.TITLE | The title of the document
Metadata | META.SUBJECT | The subject of the document
Metadata | META.KEYWORDS | Keywords that apply to the document
Note: Apart from the standard metadata properties, you can also set extended metadata properties.
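A small helper for assembling the job options listed above might look like the following sketch. The option names (`OCR`, `META.AUTHOR`, and so on) come from the table; the function itself, the plain-dictionary representation, and the string encoding of values are assumptions, since this section does not describe how options are submitted to the service.

```python
def build_job_options(ocr=None, author=None, title=None, subject=None, keywords=None):
    """Return a dict of job option names to string values, omitting unset ones."""
    options = {}
    if ocr is not None:
        # Assumed encoding: the text speaks of true/false, so use those strings.
        options["OCR"] = "true" if ocr else "false"
    metadata = (
        ("META.AUTHOR", author),
        ("META.TITLE", title),
        ("META.SUBJECT", subject),
        ("META.KEYWORDS", keywords),
    )
    for name, value in metadata:
        if value is not None:
            options[name] = value
    return options


job_options = build_job_options(ocr=True, author="Jane Doe", title="Quarterly report")
```

Options left unset are omitted on purpose, so that the profile's own settings remain in effect for them.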

Document options

Document options apply only to a specific input. They let you set properties for an individual document rather than globally (through the job or the profile). Any subsequent jobs processed with the workflow profile use the profile's default settings.

Type | Option | Description
Document property | DOC.PASSWORD | Set the password for the document.
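To illustrate the difference in scope, the sketch below combines job options (shared by every document in the job) with document options (attached to one input). The data layout, file names, and the `options_for` helper are hypothetical, and letting a document option take precedence over a job option of the same name is an assumption that this section does not state.

```python
# Job options: shared by all documents in the job.
job_options = {"OCR": "true", "META.AUTHOR": "Records Office"}

# Document options: attached to individual inputs (hypothetical layout).
documents = [
    {"path": "contract.pdf", "options": {"DOC.PASSWORD": "example-password"}},
    {"path": "invoice.docx", "options": {}},
]


def options_for(document, job_options):
    """Merge job options with one document's options (document wins, by assumption)."""
    return {**job_options, **document["options"]}


effective = {doc["path"]: options_for(doc, job_options) for doc in documents}
```

Only `contract.pdf` receives `DOC.PASSWORD`; both documents share the job-level OCR and author values.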