Conversion workflow
This workflow converts documents to PDF 1.x. Unlike the Archive PDF/A workflows, it converts files to PDF only (not PDF/A), does not validate the file format, and cannot sign the output documents.
The workflow supports these features:
- Compression and optimization for speed or size
- Optical character recognition (OCR)
- Office file conversion (as required)
- Configuration of attachment conversion
Supported file formats for Conversion workflow
This workflow supports these file formats:
Content type | File type |
---|---|
Document formats | PDF 1.x, PDF 2.0, PDF/A-1, PDF/A-2, PDF/A-3 |
Image formats | JPEG, JPEG 2000, TIFF, BMP, GIF, JBIG2, PNG, HEIC, HEIF, WebP |
Email | EML, MSG (without encryption) |
Word | DOC, DOT, DOCX, DOCM, DOTX, DOTM, RTF, XML (WordprocessingML 2003) |
Excel | XLS, XLT, XLSX, XLSM, XLTX, XLTM, XML (SpreadsheetML 2003) |
PowerPoint | PPT, PPS, PPTX, PPTM, PPSX, PPSM |
OpenOffice | ODT, ODS, ODP |
Other | CSV, HTML, HTM (prepared for archiving), TXT, XML, ZIP (without password protection) |
Compared to the Archive PDF/A workflows, the Conversion workflow offers these additional features:
Optimize for speed or size
The workflow's profile offers a setting to optimize for processing time (speed) or for minimal document file size.
Convert mode configuration for child documents (Attachments)
Certain child documents can be skipped (removed) during conversion to PDF, such as attachments of emails or PDF documents. The convert mode can be specified based on the type of the child document, its filename, or the type of its parent document.
For example, executables attached to an email are removed by default. If desired, you can add rules that attach files that cannot be converted (e.g. PDF documents containing unrendered XFA, HTML documents) to the resulting output document in their original source format.
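The rule matching described above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the class names, the mode names ("convert", "skip", "attach_source"), the first-match-wins ordering, and the fallback mode are hypothetical and are not the product's actual configuration format.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import List, Optional


@dataclass(frozen=True)
class ChildDocument:
    filename: str
    doc_type: str     # hypothetical type labels, e.g. "executable", "html"
    parent_type: str  # hypothetical type labels, e.g. "email", "pdf"


@dataclass(frozen=True)
class ConvertRule:
    mode: str                              # hypothetical: "convert", "skip", "attach_source"
    doc_type: Optional[str] = None         # None means "any type"
    filename_pattern: Optional[str] = None # glob pattern, None means "any name"
    parent_type: Optional[str] = None      # None means "any parent"

    def matches(self, child: ChildDocument) -> bool:
        # A rule matches only if every criterion it specifies matches.
        if self.doc_type is not None and self.doc_type != child.doc_type:
            return False
        if self.filename_pattern is not None and not fnmatch(
            child.filename.lower(), self.filename_pattern.lower()
        ):
            return False
        if self.parent_type is not None and self.parent_type != child.parent_type:
            return False
        return True


def resolve_convert_mode(
    child: ChildDocument, rules: List[ConvertRule], default: str = "convert"
) -> str:
    # Assumption: the first matching rule wins; otherwise the default applies.
    for rule in rules:
        if rule.matches(child):
            return rule.mode
    return default


RULES = [
    ConvertRule(mode="skip", doc_type="executable", parent_type="email"),
    ConvertRule(mode="attach_source", doc_type="html"),
]
```

With these hypothetical rules, an executable attached to an email resolves to "skip", an HTML child resolves to "attach_source", and any other child falls back to "convert".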
Collect mode configuration
The collect mode configuration defines how a converted document and its child documents are combined. The collect mode can be configured for each document type and also defines how errors are handled.
For example, an email can be converted into a PDF collection (Portfolio) of its body and attachments. When converting Word documents, all embedded files can be merged into the converted PDF.
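As a rough illustration of a per-type collect mode, the sketch below maps a document type to a combination strategy and an error policy. The setting names, the mode values ("collection", "merge"), and the error-policy values are hypothetical; they only model the idea and do not reflect the real profile settings.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class CollectSetting:
    mode: str      # hypothetical values: "collection" or "merge"
    on_error: str  # hypothetical values: "fail" or "skip_child"


# Hypothetical per-type configuration mirroring the two examples in the text.
COLLECT_SETTINGS: Dict[str, CollectSetting] = {
    "email": CollectSetting(mode="collection", on_error="skip_child"),
    "word": CollectSetting(mode="merge", on_error="fail"),
}


def plan_output(doc_type: str, parent: str, children: List[str]) -> dict:
    """Describe how a converted parent and its children would be combined."""
    setting = COLLECT_SETTINGS[doc_type]
    if setting.mode == "collection":
        # Parent and children stay separate entries inside one PDF collection.
        return {"kind": "collection", "entries": [parent, *children],
                "on_error": setting.on_error}
    # Children are appended to the converted parent to form a single PDF.
    return {"kind": "merged", "parts": [parent, *children],
            "on_error": setting.on_error}
```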
Job and document options for the Conversion workflow
The Conversion workflow accepts job options and document options, which pass job-specific and document-specific values to the workflow when it processes documents.
Job options
Job options apply to all documents processed in the same job. Any subsequent jobs processed with the workflow profile use the profile's default settings.
Type | Option | Description |
---|---|---|
OCR | OCR | Turns optical character recognition on or off for the job. All OCR settings must already be configured in the profile. If true, the documents in the job are processed to recognize text in images (as appropriate). If false, no OCR is performed. |
Metadata | META.AUTHOR | The author of the document |
Metadata | META.TITLE | The title of the document |
Metadata | META.SUBJECT | The subject of the document |
Metadata | META.KEYWORDS | Keywords that apply to the document |
Apart from the standard metadata properties, you can also set extended metadata properties.
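A minimal sketch of assembling the job options from the table above follows. The option keys (OCR, META.AUTHOR, META.TITLE, META.SUBJECT, META.KEYWORDS) come from the table; the helper name and the KEY=VALUE serialization are assumptions for illustration, so check the client or API you use for the actual way to pass options.

```python
from typing import List, Optional


def format_job_options(
    ocr: Optional[bool] = None,
    author: Optional[str] = None,
    title: Optional[str] = None,
    subject: Optional[str] = None,
    keywords: Optional[str] = None,
) -> List[str]:
    """Return job options as KEY=VALUE strings (hypothetical serialization)."""
    options = {}
    if ocr is not None:
        # The table describes the OCR option in terms of true/false.
        options["OCR"] = "true" if ocr else "false"
    for key, value in (
        ("META.AUTHOR", author),
        ("META.TITLE", title),
        ("META.SUBJECT", subject),
        ("META.KEYWORDS", keywords),
    ):
        if value is not None:
            options[key] = value
    # Options that were not given are omitted so the profile defaults apply.
    return [f"{key}={value}" for key, value in options.items()]
```

Options left unset are not emitted, which matches the idea that the profile's default settings apply whenever a job does not override them.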
Document options
Document options apply only to a specific input. They let you set properties for an individual document, rather than as a global setting (determined by either the job or the profile). Any subsequent jobs processed with the workflow profile use the profile's default settings.
Type | Option | Description |
---|---|---|
Document property | DOC.PASSWORD | Set the password for the document. |
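The three scopes (profile defaults, job options, document options) can be sketched as layered dictionaries. The precedence shown here, with document options applied last, is an assumption for illustration, as are the function name, the file names, and the sample values; only the option keys come from the tables above.

```python
from typing import Dict

# Hypothetical sample values for each scope.
PROFILE_DEFAULTS: Dict[str, str] = {"OCR": "false"}
JOB_OPTIONS: Dict[str, str] = {"OCR": "true", "META.AUTHOR": "Records team"}
DOCUMENT_OPTIONS: Dict[str, Dict[str, str]] = {
    "contract.pdf": {"DOC.PASSWORD": "example-password"},
}


def options_for(filename: str) -> Dict[str, str]:
    """Resolve the options that would apply to one input document."""
    effective = dict(PROFILE_DEFAULTS)                      # profile defaults first
    effective.update(JOB_OPTIONS)                           # shared by every document in the job
    effective.update(DOCUMENT_OPTIONS.get(filename, {}))    # only for this input
    return effective
```

In this sketch, contract.pdf receives the job-wide values plus its own DOC.PASSWORD, while any other file in the job receives only the job-wide values.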