Split PDF document
The Pdftools SDK lets you split a single input PDF document into multiple output PDF documents and images. During this process, only the resources required by each page are copied to the output document containing that page. This ensures your output PDF files do not contain redundant or potentially sensitive information.
Download the full sample now in C#, Java, C, and Python.
Interested in a C or other language sample? Let us know and we'll add it to our samples backlog.
Depending on your requirements, you can adjust the characteristics of the output document by setting the PageCopyOptions class used in the assembly process.
You can also generate the output documents as images by converting a PDF document to an image.
Steps to split PDF documents:
- Opening the input Document
- Creating the DocumentAssembler object
- Appending to the output document
- Running the Assemble method
- Full example
Before you start, you need to initialize the library.
Opening the input Document
Read the PDF document you want to split. To do this, load the input document from the file system into a read-only PDF Document.
- .NET
- Java
// Open input document
using var inStream = File.OpenRead(inPath);
using var inDoc = PdfTools.Pdf.Document.Open(inStream);
try (
    // Open input document
    FileStream inStr = new FileStream(inPath, FileStream.Mode.READ_ONLY);
    Document inDoc = Document.open(inStr)) {
    // Use inDoc here; both resources are closed automatically at the end of the block
}
Creating the DocumentAssembler object
Create the DocumentAssembler object that will generate the output PDF document. To do this, instantiate the DocumentAssembler and pass it an output Stream (for example, a file or memory stream) that will contain the output data.
The following example creates one output PDF document for each input document page.
- .NET
- Java
// Repeat for each page in the input document
for (int i = 1; i <= inDoc.PageCount; ++i)
{
    // Create the output stream and pass it to the document assembler
    using var outStream = File.Create(outPathPrefix + i + ".pdf");
    using var docAssembler = new PdfTools.DocumentAssembly.DocumentAssembler(outStream);
// Repeat for each page in the input document
for (int i = 1; i <= inDoc.getPageCount(); ++i) {
    // Create the output stream and pass it to the document assembler
    FileStream outStream = new FileStream(outPathPrefix + i + ".pdf", FileStream.Mode.READ_WRITE_NEW);
    DocumentAssembler docAssembler = new DocumentAssembler(outStream);
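The snippets above build each output name as outPathPrefix + i + ".pdf". With ten or more pages, such names do not sort in page order in a file listing (page 10 sorts before page 2). The following is a minimal sketch of one way to zero-pad the page number. OutputNames and forPage are hypothetical names used for illustration only; they are not part of the Pdftools SDK.

```java
// Hypothetical helper (not part of the Pdftools SDK): builds output file
// names whose page number is zero-padded to the width of the page count.
final class OutputNames {
    /** Returns prefix + zero-padded page number + ".pdf". */
    static String forPage(String prefix, int page, int pageCount) {
        // Width of the largest page number, e.g. 3 for a 120-page document
        int width = String.valueOf(pageCount).length();
        return prefix + String.format("%0" + width + "d", page) + ".pdf";
    }

    public static void main(String[] args) {
        System.out.println(forPage("out_", 3, 120));
        System.out.println(forPage("out_", 42, 120));
    }
}
```

Assuming this helper, the output stream in the loop could be created from OutputNames.forPage(outPathPrefix, i, inDoc.getPageCount()) instead of the plain concatenation.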
Appending to the output document
You can select a page range to copy from the input Document by passing the firstPage and lastPage parameters to the Append method of the DocumentAssembler object.
In this example, we only append the current page of the input PDF document to each output document.
- .NET
- Java
// Append the current page of the input PDF document to a single-page output document
docAssembler.Append(inDoc, i, i);
// Append the current page of the input PDF document to a single-page output document
docAssembler.append(inDoc, i, i);
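The same firstPage and lastPage parameters can describe multi-page chunks rather than single pages. The sketch below computes 1-based, inclusive page ranges for a fixed chunk size, matching the page numbering used in the loops above. PageRanges and chunk are hypothetical names introduced for illustration; the code does not call the SDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the Pdftools SDK): computes 1-based,
// inclusive page ranges for splitting a document into fixed-size chunks.
final class PageRanges {
    /** Returns {firstPage, lastPage} pairs covering pages 1..pageCount. */
    static List<int[]> chunk(int pageCount, int pagesPerChunk) {
        if (pageCount < 0 || pagesPerChunk < 1) {
            throw new IllegalArgumentException(
                "pageCount must be >= 0 and pagesPerChunk must be >= 1");
        }
        List<int[]> ranges = new ArrayList<>();
        for (int first = 1; first <= pageCount; first += pagesPerChunk) {
            // The last chunk may be shorter than pagesPerChunk
            int last = Math.min(first + pagesPerChunk - 1, pageCount);
            ranges.add(new int[] {first, last});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // A 7-page document split into chunks of up to 3 pages
        for (int[] range : chunk(7, 3)) {
            System.out.println(range[0] + "-" + range[1]);
        }
    }
}
```

Under these assumptions, each pair would be passed as the firstPage and lastPage arguments of one append call, with one DocumentAssembler and one output stream per chunk.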
Running the Assemble method
After using the Append method to add the required pages to the output PDF document, the final step is to call the Assemble method. This method creates the structure of the output PDF document and writes the document to the output Stream of the DocumentAssembler object.
- .NET
- Java
// Create the final structure of the output PDF document and write it to the output stream
docAssembler.Assemble();
// Create the final structure of the output PDF document and write it to the output stream
docAssembler.assemble();
Don't forget that some objects (like the Document object) must be explicitly closed. For these objects, we recommend using the language's mechanism for automatically closing objects: a using declaration in .NET or a try-with-resources statement in Java.
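To illustrate automatic closing in Java, the following sketch uses a stand-in class rather than SDK objects. Tracked is a hypothetical AutoCloseable that only records when it is opened and closed. The sketch shows that try-with-resources closes resources in the reverse order of their declaration, so an assembler declared after its output stream is closed before that stream.

```java
import java.util.ArrayList;
import java.util.List;

final class CloseOrderDemo {
    // Hypothetical stand-in for an SDK object that must be closed
    static final class Tracked implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Tracked(String name, List<String> log) {
            this.name = name;
            this.log = log;
            log.add("open " + name);
        }

        @Override
        public void close() {
            log.add("close " + name);
        }
    }

    /** Opens two resources, does some work, and returns the event log. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (Tracked stream = new Tracked("stream", log);
             Tracked assembler = new Tracked("assembler", log)) {
            log.add("assemble");
        }
        // Both resources are closed here, in reverse declaration order
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The log shows the assembler being closed before the stream, which is why the examples on this page declare the stream first and the object that writes to it second.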
Full example
- .NET
- Java
// Open input document
using var inStream = File.OpenRead(inPath);
using var inDoc = PdfTools.Pdf.Document.Open(inStream);

// Repeat for each page in the input document
for (int i = 1; i <= inDoc.PageCount; ++i)
{
    // Create the output stream and pass it to the document assembler
    using var outStream = File.Create(outPathPrefix + i + ".pdf");
    using var docAssembler = new PdfTools.DocumentAssembly.DocumentAssembler(outStream);

    // Append the current page of the input PDF document to the output document
    docAssembler.Append(inDoc, i, i);

    // Create the final structure of the output PDF document and write it to the output stream
    docAssembler.Assemble();
}
try (
    // Open input document
    FileStream inStr = new FileStream(inPath, FileStream.Mode.READ_ONLY);
    Document inDoc = Document.open(inStr)) {
    // Repeat for each page in the input document
    for (int i = 1; i <= inDoc.getPageCount(); ++i) {
        try (
            // Create output stream for each page of the input document
            FileStream outStream = new FileStream(outPathPrefix + i + ".pdf", FileStream.Mode.READ_WRITE_NEW);
            DocumentAssembler docAssembler = new DocumentAssembler(outStream)) {
            // Append the current page of the input PDF document to the output document
            docAssembler.append(inDoc, i, i);
            // Create the final structure of the output PDF document and write it to the output stream
            docAssembler.assemble();
        }
    }
}