Version: Version 5

Read PDF information

Extensive development effort required

This guide outlines functionality that requires extensive development effort. Unless you want to implement your own PDF viewer and customize it as far as possible, use the readily available features of the PDF Viewer SDK instead of building a viewer from scratch. To test and implement the primary functionality of this SDK, review Getting started with the PDF Viewer SDK.

Learn how to read information from a PDF document using the PDF Web SDK.

The PDF Web SDK lets you access document information without instantiating a viewer and without a server to process the document. Use this functionality to show the user document information, such as the author or the page count, before rendering the document (for example, in a file chooser).

Steps to read information from a PDF document:

  1. Initialize the SDK
  2. Create controller
  3. Open PDF
  4. Read document information programmatically
  5. Full example

Before you begin

Initialize the SDK

Initialize the PDF Web SDK before you instantiate any of its objects or call its API.

import { pdfToolsWebSdk, Pdf, UI } from '@pdftools/pdf-web-sdk';

async function initialize() {
  await pdfToolsWebSdk.initialize({
    path: './pdftools-web-sdk/',
  });
}

initialize();
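Because initialization is asynchronous, code that creates SDK objects should wait for it to finish. The following is a minimal sketch of one way to do that, not part of the SDK: the `once` helper is hypothetical and wraps any async initializer so that it runs at most once and every caller awaits the same promise.

```javascript
// Sketch: run an async initializer at most once and share its promise.
// `once` is a hypothetical helper, not an SDK function.
function once(initFn) {
  let pending = null;
  return () => {
    if (pending === null) {
      // Promise.resolve().then(...) also turns a synchronous throw into a rejection.
      pending = Promise.resolve().then(initFn);
    }
    return pending;
  };
}

// Possible usage with the SDK (assumes the import shown above):
// const ensureInitialized = once(() =>
//   pdfToolsWebSdk.initialize({ path: './pdftools-web-sdk/' })
// );
// await ensureInitialized(); // safe to call from several places
```

Note that in this sketch a failed initialization is also remembered, so later calls receive the same rejected promise.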

Create controller

Create a Pdf.Controller to open the desired document. The controller is responsible for reading document information.

const controller = new Pdf.Controller();

Open PDF

Load the document with the controller so that you can read its information in the next step.

const pdfDocument = await controller.openDocument({
  uri: '/pdf/WebViewer_Demo.pdf',
});
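In a file chooser, a single unreadable file should usually not abort the whole listing. The sketch below wraps the `openDocument` call shown above so that a failure yields `null` instead of an exception. The `tryOpenDocument` helper is hypothetical, and it assumes that `openDocument` rejects its promise when the document cannot be loaded, which this guide does not confirm.

```javascript
// Sketch: open a document and return null instead of throwing on failure.
// `controller` is expected to expose openDocument({ uri }) as shown above.
// Assumption: openDocument rejects its promise when the file cannot be loaded.
async function tryOpenDocument(controller, uri) {
  try {
    return await controller.openDocument({ uri });
  } catch (error) {
    console.warn(`Could not open ${uri}:`, error);
    return null;
  }
}

// Possible usage:
// const pdfDocument = await tryOpenDocument(controller, '/pdf/WebViewer_Demo.pdf');
// if (pdfDocument !== null) { /* read document information */ }
```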

Read document information programmatically

The document loaded via the Pdf.Controller exposes document information in two ways: some values are direct properties of the document, others are stored in its metadata.

// properties stored in the metadata
const metadata = await pdfDocument.getMetadata();
console.log(`title = ${metadata.title}`);
console.log(`author = ${metadata.author}`);
console.log(`creation date = ${metadata.creationDate}`);

// direct properties
console.log(`page count = ${pdfDocument.pageCount}`);

Warning

The Metadata API is still under development, so some information contained in the PDF might not be available through the API yet.

Tip

For a full list of the available information, refer to the API references for Pdf.Document and Pdf.Metadata.
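Since a PDF does not have to carry every metadata field, it can help to substitute a placeholder before showing values to the user. The following is a minimal sketch under that assumption: the `describeDocument` helper is hypothetical, it uses only `getMetadata()`, the three metadata fields, and `pageCount` shown above, and it assumes that a missing field is `undefined`, `null`, or an empty string.

```javascript
// Sketch: collect document information into display-ready values.
// `pdfDocument` is expected to expose getMetadata() and pageCount as shown above.
// Assumption: a metadata field the PDF does not provide is undefined, null, or ''.
async function describeDocument(pdfDocument, fallback = '(unknown)') {
  const metadata = (await pdfDocument.getMetadata()) ?? {};
  const orFallback = (value) =>
    value === undefined || value === null || value === '' ? fallback : String(value);

  return {
    title: orFallback(metadata.title),
    author: orFallback(metadata.author),
    creationDate: orFallback(metadata.creationDate),
    pageCount: pdfDocument.pageCount,
  };
}

// Possible usage:
// const info = await describeDocument(pdfDocument);
// console.log(`${info.title} by ${info.author}, ${info.pageCount} pages`);
```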

Full example

This full sample demonstrates how to open a document and log document information to the console.

import { pdfToolsWebSdk, Pdf } from '@pdftools/pdf-web-sdk';

pdfToolsWebSdk.initialize().then(async () => {
  const controller = new Pdf.Controller();
  const pdfDocument = await controller.openDocument({
    uri: '/pdf/WebViewer_Demo.pdf',
  });

  // properties stored in the metadata
  const metadata = await pdfDocument.getMetadata();
  console.log(`title = ${metadata.title}`);
  console.log(`author = ${metadata.author}`);
  console.log(`creation date = ${metadata.creationDate}`);

  // direct properties
  console.log(`page count = ${pdfDocument.pageCount}`);
});