Split PDF document

The Pdftools SDK lets you split a single input PDF document into multiple output PDF documents and images. During this process, only the resources required by each page are copied to the output document containing that page. This ensures your output PDF files do not contain redundant or potentially sensitive information.

Quick start

Download the full sample now in C#, Java, C, and Python.

Interested in a C or other language sample? Let us know and we'll add it to our samples backlog.

Depending on your requirements, you can adjust the characteristics of the output document by configuring the PageCopyOptions class used in the assembly process.

You can also generate the output documents as images by converting a PDF document to an image.

Steps to split PDF documents:

Opening the input Document
Creating the DocumentAssembler object
Appending to the output document
Running the Assemble method
Full example

Before you begin

You need to initialize the library.

Opening the input Document

Read the PDF document you want to split. To do this, load the input document from the file system into a read-only PDF Document.

.NET

// Open input document
using var inStream = File.OpenRead(inPath);
using var inDoc = PdfTools.Pdf.Document.Open(inStream);

Java

// Open input document (this try block is closed at the end of the full example)
try (
    FileStream inStr = new FileStream(inPath, FileStream.Mode.READ_ONLY);
    Document inDoc = Document.open(inStr)) {

Creating the DocumentAssembler object

Create the DocumentAssembler object that will generate the output PDF document. To do this, instantiate the DocumentAssembler and pass it an output Stream (for example, a file or memory stream) that will contain the output data.

The following example creates one output PDF document for each input document page.

.NET

// Repeat for each page in the input document
for (int i = 1; i <= inDoc.PageCount; ++i)
{
    // Create the output stream and pass it to the document assembler
    using var outStream = File.Create(outPathPrefix + i + ".pdf");
    using var docAssembler = new PdfTools.DocumentAssembly.DocumentAssembler(outStream);

Java

// Repeat for each page in the input document
for (int i = 1; i <= inDoc.getPageCount(); ++i) {
    // Create the output stream and pass it to the document assembler
    FileStream outStream = new FileStream(outPathPrefix + i + ".pdf", FileStream.Mode.READ_WRITE_NEW);
    DocumentAssembler docAssembler = new DocumentAssembler(outStream);
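
The snippets above name each output file outPathPrefix + i + ".pdf". With ten or more pages, such names do not sort in page order in a plain alphabetical listing (for example, "out_10.pdf" comes before "out_2.pdf"). The following is a minimal sketch of zero-padded names. It is not part of the SDK: the OutputNames class and its forPage method are hypothetical helpers introduced only for illustration.

```java
import java.util.Locale;

/** Hypothetical helper (not part of the Pdftools SDK) for naming per-page output files. */
final class OutputNames {

    /** Builds a file name whose page number is zero-padded to the width of pageCount. */
    static String forPage(String outPathPrefix, int page, int pageCount) {
        // Number of digits in the highest page number, e.g. 3 for a 120-page document
        int width = String.valueOf(pageCount).length();
        // Locale.ROOT keeps the digits ASCII regardless of the default locale
        return String.format(Locale.ROOT, "%s%0" + width + "d.pdf", outPathPrefix, page);
    }

    public static void main(String[] args) {
        // Page 7 of a 120-page document
        System.out.println(forPage("out_", 7, 120)); // prints "out_007.pdf"
    }
}
```

The returned string could then be used in place of outPathPrefix + i + ".pdf" when creating the output stream.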

Appending to the output document

You can select a page range to copy from the input Document by passing firstPage and lastPage parameters to the Append method of the DocumentAssembler object.

In this example, we only append the current page of the input PDF document to each output document.

.NET

// Append the current page of the input PDF document to a single-page output document
docAssembler.Append(inDoc, i, i);

Java

// Append the current page of the input PDF document to a single-page output document
docAssembler.append(inDoc, i, i);
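
The same firstPage and lastPage parameters can describe multi-page chunks instead of single pages. As a rough sketch of how such ranges might be computed, the following example splits pages 1 to pageCount into consecutive 1-based, inclusive ranges. The PageRanges class and its split method are hypothetical helpers, not part of the SDK, and the example runs without the SDK.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of the Pdftools SDK): computes page ranges for splitting. */
final class PageRanges {

    /**
     * Splits pages 1..pageCount into consecutive ranges of at most pagesPerFile pages.
     * Each int[] holds {firstPage, lastPage}, both 1-based and inclusive.
     */
    static List<int[]> split(int pageCount, int pagesPerFile) {
        if (pageCount < 0 || pagesPerFile < 1) {
            throw new IllegalArgumentException("pageCount must be >= 0 and pagesPerFile >= 1");
        }
        List<int[]> ranges = new ArrayList<>();
        for (int first = 1; first <= pageCount; first += pagesPerFile) {
            // The last range may be shorter than pagesPerFile
            int last = Math.min(first + pagesPerFile - 1, pageCount);
            ranges.add(new int[] {first, last});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // A 10-page document split into files of up to 4 pages each
        for (int[] range : split(10, 4)) {
            // With the SDK, each pair would be passed on as append(inDoc, range[0], range[1])
            System.out.println(range[0] + "-" + range[1]);
        }
    }
}
```

For a 10-page document and 4 pages per file, this prints the ranges 1-4, 5-8, and 9-10. Each range would get its own DocumentAssembler and output stream, following the same loop structure as the single-page example.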

Running the Assemble method

After using the Append method to add the required pages to the output PDF document, the final step is to call the Assemble method. This method creates the structure of the output PDF document and writes the document to the output Stream of the DocumentAssembler object.

.NET

// Create the final structure of the output PDF document and write it to the output stream
docAssembler.Assemble();

Java

// Create the final structure of the output PDF document and write it to the output stream
docAssembler.assemble();

tip

Don't forget that some objects (like the Document object) must be explicitly closed. For these objects, we recommend using the language's mechanism for automatically closing objects: the using statement in C# or the try-with-resources statement in Java.
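
To illustrate automatic closing in Java, here is a minimal sketch of the try-with-resources statement. It uses a hypothetical stand-in class (Tracked) rather than SDK types, so it runs without the SDK. The variable names outStream and docAssembler only mirror the examples on this page. The sketch shows that resources are closed automatically, in the reverse order of their declaration.

```java
import java.util.ArrayList;
import java.util.List;

/** Demonstrates the close order of try-with-resources using a stand-in resource type. */
final class CloseOrderDemo {

    /** Hypothetical stand-in for a closable object; records its name when closed. */
    static final class Tracked implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Tracked(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void close() {
            log.add(name);
        }
    }

    /** Opens two resources, lets the try block end, and returns the order in which they were closed. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (Tracked outStream = new Tracked("outStream", log);
             Tracked docAssembler = new Tracked("docAssembler", log)) {
            // Work with the resources here; no explicit close() calls are needed
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "[docAssembler, outStream]"
    }
}
```

Because the resource declared last is closed first, declaring the output stream before the object that writes to it means the writer is closed before the stream it depends on.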

Full example

.NET

// Open input document
using var inStream = File.OpenRead(inPath);
using var inDoc = PdfTools.Pdf.Document.Open(inStream);

// Repeat for each page in the input document
for (int i = 1; i <= inDoc.PageCount; ++i)
{
    // Create the output stream and pass it to the document assembler
    using var outStream = File.Create(outPathPrefix + i + ".pdf");
    using var docAssembler = new PdfTools.DocumentAssembly.DocumentAssembler(outStream);

    // Append the current page of the input PDF document to the output document
    docAssembler.Append(inDoc, i, i);

    // Create the final structure of the output PDF document and write it to the output stream
    docAssembler.Assemble();
}

Java

try (
    // Open input document
    FileStream inStr = new FileStream(inPath, FileStream.Mode.READ_ONLY);
    Document inDoc = Document.open(inStr)) {
    // Repeat for each page in the input document
    for (int i = 1; i <= inDoc.getPageCount(); ++i) {
        try (
            // Create the output stream and pass it to the document assembler
            FileStream outStream = new FileStream(outPathPrefix + i + ".pdf", FileStream.Mode.READ_WRITE_NEW);
            DocumentAssembler docAssembler = new DocumentAssembler(outStream)) {
            // Append the current page of the input PDF document to the output document
            docAssembler.append(inDoc, i, i);

            // Create the final structure of the output PDF document and write it to the output stream
            docAssembler.assemble();
        }
    }
}
