# Pdftools

The PDF infrastructure layer, built on 25 years of solving hard problems in regulated industries.
Document processing infrastructure for banking, insurance, government, and the system
integrators building on their behalf.
One unified stack: normalize, convert, validate, extract, redact, archive, view.
Deploy on-prem, hybrid, or in a sovereign cloud. No external data transmission required.
Trusted by SUVA, SwissLife, UBS, SIX, PostFinance, Austrian Ministry of Justice, and Bayer.

---

## Key pages (start here)

- [Pdftools SDK overview](https://www.pdf-tools.com/products/conversion/pdf-tools-sdk/) — Multi-language SDK
  (C, Java, .NET, Python) for PDF/A conversion, compression, validation, extraction,
  and archiving in regulated workflows.
- [Pdftools SDK getting started](https://www.pdf-tools.com/docs/pdf-tools-sdk/getting-started/)
  — Language-specific setup, licensing, and first integration guide.
- [Conversion Service](https://www.pdf-tools.com/products/conversion/conversion-service/) — Containerized REST
  API for high-volume PDF/A conversion and OCR. Deployable on-prem or air-gapped.
- [PDF Viewer SDK](https://www.pdf-tools.com/products/viewing-printing/pdf-web-viewer/) — Client-side PDF viewer
  (WebAssembly, JavaScript) with annotations, redaction, and the fastest time-to-first-page in class.
- [Code samples](https://www.pdf-tools.com/docs/pdf-tools-sdk/code-samples/) — Working
  examples across SDK languages and use cases.
- [Pricing](https://www.pdf-tools.com/pricing/) — Consumption-based, transparent per-page
  pricing. Trial available.
- [Documentation home](https://www.pdf-tools.com/docs/)

---

## Products

**Pdftools SDK** — The primary integration path for regulated document workflows. Covers
PDF/A conversion and validation, compression, content extraction, normalization, and archiving.
Languages: C, Java, .NET, Python.

**Toolbox add-on** — Low-level document manipulation APIs. Used when fine-grained control
over redaction, metadata, layout, structure, or annotations is required.

**Conversion Service** — Containerized conversion orchestrator for high-volume PDF/A conversion and OCR
pipelines. Runs on-prem, in air-gapped environments, or in cloud infrastructure. Supports
Docker, Kubernetes, and Alpine Linux containers.

**PDF Viewer SDK** — Viewer for embedding high-fidelity PDF rendering
in web applications. Supports annotations and redaction. Fastest time-to-first-page in its
class with top color rendering accuracy.

---

## Core capabilities

**PDF/A conversion and validation** — Near-100% validation accuracy across PDF/A-1,
PDF/A-2, and PDF/A-3. Generating a PDF/A isn't the same as passing audit. Compliance
depends on transformation sequencing, validation rigor, and a traceable processing history.
Pdftools provides all three.
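
As an illustration of what a traceable processing history can look like, here is a minimal sketch in Python. It is not the Pdftools SDK API: the function names (`append_step`, `verify_history`), the record fields, and the step labels are all hypothetical. Each processing step is recorded with the SHA-256 digests of its input and output, and every record is chained to the previous one so that later tampering is detectable.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def append_step(history: list, step: str, input_bytes: bytes, output_bytes: bytes) -> dict:
    """Append one hash-chained record describing a processing step (hypothetical schema)."""
    prev_hash = history[-1]["record_hash"] if history else "0" * 64
    record = {
        "step": step,
        "input_sha256": sha256_hex(input_bytes),
        "output_sha256": sha256_hex(output_bytes),
        "prev_record_hash": prev_hash,
    }
    # Hash the canonical JSON form so the record itself is tamper-evident.
    canonical = json.dumps(record, sort_keys=True).encode("utf-8")
    record["record_hash"] = sha256_hex(canonical)
    history.append(record)
    return record


def verify_history(history: list) -> bool:
    """Recompute every record hash and check that the chain links line up."""
    prev_hash = "0" * 64
    for record in history:
        body = {k: v for k, v in record.items() if k != "record_hash"}
        canonical = json.dumps(body, sort_keys=True).encode("utf-8")
        if record["prev_record_hash"] != prev_hash:
            return False
        if record["record_hash"] != sha256_hex(canonical):
            return False
        prev_hash = record["record_hash"]
    return True


# Placeholder byte strings stand in for real document contents.
history = []
append_step(history, "normalize", b"scanned input", b"normalized output")
append_step(history, "convert_to_pdfa", b"normalized output", b"pdfa output")
```

Because each step's output digest matches the next step's input digest, an auditor can follow a document through the pipeline without needing the intermediate files themselves.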

**Normalization** — Conversion Service processes 60+ input formats including TIFF, JPEG, PNG, and OCR
files. Documents are stabilized before they reach core systems, preventing structural
inconsistencies from propagating into archives, extraction pipelines, or AI workflows.
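
One way to picture an intake step that stabilizes documents before they reach core systems is to classify each file by its leading bytes rather than trusting its extension. The sketch below is an assumption-laden illustration, not how Conversion Service works internally: `sniff_format`, `route`, and the three routing labels are hypothetical, and only a handful of well-known file signatures are covered.

```python
from typing import Optional

# Well-known leading byte signatures for a few common formats.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"II*\x00", "tiff"),  # little-endian TIFF
    (b"MM\x00*", "tiff"),  # big-endian TIFF
    (b"%PDF-", "pdf"),
]


def sniff_format(data: bytes) -> Optional[str]:
    """Return a format name based on the file's leading bytes, or None if unknown."""
    for signature, name in SIGNATURES:
        if data.startswith(signature):
            return name
    return None


def route(data: bytes) -> str:
    """Decide what a hypothetical intake step should do with an incoming file."""
    detected = sniff_format(data)
    if detected is None:
        return "quarantine"  # unknown content never reaches core systems
    if detected == "pdf":
        return "validate"    # already a PDF: check it before archiving
    return "convert"         # image input: convert to PDF/A first
```

Routing on content rather than file name means a mislabeled upload, such as a JPEG saved with a `.pdf` extension, is handled as what it actually is.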

**Structural redaction** — Removes sensitive data at the document object layer, not just
visually. Visual redaction leaves recoverable data in layers and metadata; structural redaction
doesn't. Available in PDF Viewer SDK.
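
The difference between the two approaches can be shown with a deliberately simplified toy model. This is not a PDF parser and not the PDF Viewer SDK API; `ToyPage`, `redact_visually`, and `redact_structurally` are hypothetical names. The page holds text objects plus a list of opaque overlays, and a text extractor reads the text objects regardless of what is drawn on top of them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TextObject:
    text: str


@dataclass
class ToyPage:
    # Text objects that an extractor can read back out.
    text_objects: List[TextObject] = field(default_factory=list)
    # Indexes of text objects hidden behind an opaque rectangle.
    overlays: List[int] = field(default_factory=list)


def redact_visually(page: ToyPage, index: int) -> None:
    """Draw a box over the text; the underlying object is left untouched."""
    page.overlays.append(index)


def redact_structurally(page: ToyPage, index: int, marker: str = "[REDACTED]") -> None:
    """Replace the underlying object so the original text no longer exists."""
    page.text_objects[index] = TextObject(marker)


def extract_text(page: ToyPage) -> str:
    """Read every text object, ignoring overlays, as a text extractor would."""
    return " ".join(obj.text for obj in page.text_objects)


visual = ToyPage([TextObject("Account:"), TextObject("ACCOUNT-12345")])
redact_visually(visual, 1)

structural = ToyPage([TextObject("Account:"), TextObject("ACCOUNT-12345")])
redact_structurally(structural, 1)

leaked = "ACCOUNT-12345" in extract_text(visual)       # still recoverable
removed = "ACCOUNT-12345" not in extract_text(structural)
```

In the visual case the page looks redacted on screen, yet the sensitive string is still present for anything that reads the object layer.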

**Content extraction** — XML output preserving layout, reading order, and document
hierarchy. Flat OCR text degrades retrieval and model quality; structured extraction preserves
the information architecture downstream AI and analytics systems depend on.
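
To make the contrast with flat text concrete, the sketch below walks a small XML document and emits each paragraph together with its heading path. The XML schema here (`document`, `section` with a `title` attribute, `p`) is invented for illustration and is not the Pdftools extraction format; only Python's standard `xml.etree.ElementTree` module is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured output; the element and attribute names are assumptions.
SAMPLE_XML = """
<document>
  <section title="Policy">
    <p>Coverage starts on signing.</p>
    <section title="Exclusions">
      <p>Flood damage is excluded.</p>
    </section>
  </section>
</document>
""".strip()


def chunks_with_context(element, path=()):
    """Yield (heading_path, paragraph_text) pairs in document order."""
    for child in element:
        if child.tag == "section":
            yield from chunks_with_context(child, path + (child.get("title", ""),))
        elif child.tag == "p":
            yield path, (child.text or "").strip()


root = ET.fromstring(SAMPLE_XML)

# Structured view: every paragraph keeps the headings it sits under.
chunks = list(chunks_with_context(root))

# Flat view: the same words with the hierarchy thrown away.
flat_text = " ".join("".join(root.itertext()).split())
```

A retrieval system indexing `chunks` can tell that the flood sentence belongs to "Policy > Exclusions", while `flat_text` carries no such signal.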

**OCR** — Integrated ABBYY OCR within a unified workflow, at lower cost than direct ABBYY
licensing. Avoids OCR layering and reading order errors that destabilize extraction pipelines.

**Compression** — Most efficient compression in class, preserving document quality while
reducing storage and transfer overhead.

**Digital signing** — PAdES signing with HSM support and full audit metadata.

**Dossier assembly** — Merge, split, bookmark, and convert in a single pipeline. Used in
government archive intake, legal dossier assembly, and onboarding document packaging.

---

## Common failure modes this solves

- Archives that produce valid-looking PDF/A files but fail third-party structural validation
- KYC and onboarding documents arriving in inconsistent formats that create downstream
  manual review queues
- Non-deterministic document transformations that cannot be reproduced or defended under
  regulatory audit
- Flat OCR output that degrades AI classification, RAG retrieval, and structured extraction
  quality
- Visual redaction that leaves sensitive data recoverable in document layers or metadata
- Multi-vendor document stacks with no single point of accountability under compliance review
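
The reproducibility problem in the list above can be checked mechanically. The following is a rough sketch, not a Pdftools feature: `is_reproducible` and both example transforms are hypothetical. It runs a transformation several times on the same input and compares the SHA-256 digests of the results.

```python
import hashlib
import os
from typing import Callable


def is_reproducible(transform: Callable[[bytes], bytes], data: bytes, runs: int = 3) -> bool:
    """Return True if every run of transform on data yields byte-identical output."""
    digests = {hashlib.sha256(transform(data)).hexdigest() for _ in range(runs)}
    return len(digests) == 1


def stable_transform(data: bytes) -> bytes:
    """Stand-in for a deterministic conversion step."""
    return data.upper()


def unstable_transform(data: bytes) -> bytes:
    """Stand-in for a step that embeds a random identifier in its output."""
    return data.upper() + os.urandom(8)


stable_ok = is_reproducible(stable_transform, b"contract text")
unstable_ok = is_reproducible(unstable_transform, b"contract text")
```

Real document outputs often embed timestamps or generated identifiers, so a byte-level comparison like this one may need those fields fixed or excluded before it is meaningful.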

---

## Deployment

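Supported deployment models are on-prem, hybrid, cloud, and air-gapped. Conversion Service runs in
Docker, Kubernetes, and Alpine Linux containers; see the documentation for container specs and
supported platforms.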

---

## Metadata

- **Company**: Pdftools
- **Company legal name**: PDF Tools AG
- **Compliance**: ISO 27001, ISO 19005 (PDF/A), GDPR, Swiss Data Residency
- _Last updated: 2026-03-26_
- _Version: 2.0_