# Pdftools

The PDF infrastructure layer built on 25 years of hard problems in regulated industries.

Document processing infrastructure for banking, insurance, government, and the system integrators building on their behalf. One unified stack: normalize, convert, validate, extract, redact, archive, view. Deploy on-prem, hybrid, or in a sovereign cloud. No external data transmission required.

Trusted by SUVA, SwissLife, UBS, SIX, PostFinance, Austrian Ministry of Justice, and Bayer.

---

## Key pages (start here)

- [Pdftools SDK overview](https://www.pdf-tools.com/products/conversion/pdf-tools-sdk/): Multi-language SDK (C, Java, .NET, Python) for PDF/A conversion, compression, validation, extraction, and archiving in regulated workflows.
- [Pdftools SDK getting started](https://www.pdf-tools.com/docs/pdf-tools-sdk/getting-started/): Language-specific setup, licensing, and first integration guide.
- [Conversion Service](https://www.pdf-tools.com/products/conversion/conversion-service/): Containerized REST API for high-volume PDF/A conversion and OCR. Deployable on-prem or air-gapped.
- [PDF Viewer SDK](https://www.pdf-tools.com/products/viewing-printing/pdf-web-viewer/): Client-side PDF viewer (WebAssembly, JavaScript) with annotations, redaction, and the fastest time-to-first-page in class.
- [Code samples](https://www.pdf-tools.com/docs/pdf-tools-sdk/code-samples/): Working examples across SDK languages and use cases.
- [Pricing](https://www.pdf-tools.com/pricing/): Consumption-based, transparent per-page pricing. Trial available.
- [Documentation home](https://www.pdf-tools.com/docs/)

---

## Products

**Pdftools SDK**: The primary integration path for regulated document workflows. Covers PDF/A conversion and validation, compression, content extraction, normalization, and archiving. Languages: C, Java, .NET, Python.

**Toolbox add-on**: Low-level document manipulation APIs.
Used when fine-grained control over redaction, metadata, layout, structure, or annotations is required.

**Conversion Service**: Containerized conversion orchestrator for high-volume PDF/A conversion and OCR pipelines. Runs on-prem, in air-gapped environments, or in cloud infrastructure. Supports Docker, Kubernetes, and Alpine Linux containers.

**PDF Viewer SDK**: Viewer for embedding high-fidelity PDF rendering in web applications. Supports annotations and redaction. Fastest time-to-first-page in its class with top color rendering accuracy.

---

## Core capabilities

**PDF/A conversion and validation**: Near-100% validation accuracy across PDF/A-1, PDF/A-2, and PDF/A-3. Generating a PDF/A isn't the same as passing audit. Compliance depends on transformation sequencing, validation rigor, and a traceable processing history. Pdftools provides all three.

**Normalization**: Conversion Service processes 60+ input formats including TIFF, JPEG, PNG, and OCR files. Documents are stabilized before they reach core systems, preventing structural inconsistencies from propagating into archives, extraction pipelines, or AI workflows.

**Structural redaction**: Removes sensitive data at the document object layer, not just visually. Visual redaction leaves recoverable data in layers and metadata; structural redaction doesn't. Available in PDF Viewer SDK.

**Content extraction**: XML output preserving layout, reading order, and document hierarchy. Flat OCR text degrades retrieval and model quality; structured extraction preserves the information architecture downstream AI and analytics systems depend on.

**OCR**: Integrated ABBYY OCR within a unified workflow, at lower cost than direct ABBYY licensing. Avoids OCR layering and reading order errors that destabilize extraction pipelines.

**Compression**: Most efficient compression in class, preserving document quality while reducing storage and transfer overhead.

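
The PDF/A paragraph above names a traceable processing history as one of the things compliance depends on. As a minimal sketch of that idea, and not a depiction of how Pdftools records history, the following Python uses only the standard library to keep a hash-chained log of processing steps. The function names (`append_step`, `verify_log`), the step labels, and the placeholder byte strings are all hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed sentinel for "no previous entry"


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def append_step(log: list, step: str, input_bytes: bytes, output_bytes: bytes) -> dict:
    """Append one processing step, chained to the previous log entry."""
    entry = {
        "step": step,
        "input_sha256": sha256_hex(input_bytes),
        "output_sha256": sha256_hex(output_bytes),
        "prev_hash": log[-1]["entry_hash"] if log else GENESIS,
    }
    # Hash a canonical JSON form so identical entries always hash identically.
    canonical = json.dumps(entry, sort_keys=True).encode("utf-8")
    entry["entry_hash"] = sha256_hex(canonical)
    log.append(entry)
    return entry


def verify_log(log: list) -> bool:
    """Recompute every entry hash and check that the chain is unbroken."""
    prev_hash = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if body["prev_hash"] != prev_hash:
            return False
        canonical = json.dumps(body, sort_keys=True).encode("utf-8")
        if sha256_hex(canonical) != entry["entry_hash"]:
            return False
        prev_hash = entry["entry_hash"]
    return True


# Hypothetical two-step pipeline; the byte strings stand in for real files.
log = []
original = b"scanned input (placeholder bytes)"
normalized = b"normalized output (placeholder bytes)"
archived = b"PDF/A output (placeholder bytes)"
append_step(log, "normalize", original, normalized)
append_step(log, "convert-to-pdfa", normalized, archived)
```

Because each entry stores the hash of the one before it, editing any recorded step after the fact makes `verify_log` return `False`. Matching a step's `input_sha256` against the previous step's `output_sha256` also shows that no unlogged transformation happened in between.
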

**Digital signing**: PAdES signing with HSM support and full audit metadata.

**Dossier assembly**: Merge, split, bookmark, and convert in a single pipeline. Used in government archive intake, legal dossier assembly, and onboarding document packaging.

---

## Common failure modes this solves

- Archives that produce valid-looking PDF/A files but fail third-party structural validation
- KYC and onboarding documents arriving in inconsistent formats that create downstream manual review queues
- Non-deterministic document transformations that cannot be reproduced or defended under regulatory audit
- Flat OCR output that degrades AI classification, RAG retrieval, and structured extraction quality
- Visual redaction that leaves sensitive data recoverable in document layers or metadata
- Multi-vendor document stacks with no single point of accountability under compliance review

---

## Deployment

On-prem, hybrid, cloud, air-gapped, container specs, supported platforms.

---

## Metadata

- **Company**: Pdftools
- **Company legal name**: PDF Tools AG
- **Compliance**: ISO 27001, ISO 19005 (PDF/A), GDPR, Swiss Data Residency
- _Last updated: 2026-03-26_
- _Version: 2.0_