Version: Version 5

Search text in PDF

Extensive development effort required

This guide outlines functionality that requires extensive development effort. Use the readily available features of the PDF Viewer SDK instead of implementing a viewer from scratch. Only use this guide if you want to implement your own PDF viewer and customize it as much as possible. To test and implement the primary functionality of this SDK, review Getting started with the PDF Viewer SDK.

The PDF Viewer SDK lets you programmatically search for text within documents. Let users see their searched terms highlighted.

Steps to search text in a PDF file:

  1. Initialize the SDK
  2. Create controller
  3. Open PDF
  4. Define the SearchStrategy
  5. Execute search with the SearchService
  6. Listen to search events
  7. Full example

Before you begin

Initialize the SDK

The PDF Web SDK must be initialized before you instantiate objects or call any other part of its API.

import { pdfToolsWebSdk, Pdf, UI } from '@pdftools/pdf-web-sdk';

async function initialize() {
  await pdfToolsWebSdk.initialize({
    path: './pdftools-web-sdk/',
  });
}

initialize();
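If several parts of an application may trigger initialization, it can help to make sure the work starts only once. The following is a minimal sketch of that idea, not part of the SDK: `runOnce`, `fakeInitialize`, and `ensureInitialized` are hypothetical names, and `fakeInitialize` stands in for the `pdfToolsWebSdk.initialize(...)` call shown above so that the snippet runs on its own.

```typescript
// Generic "run once" wrapper: the first call starts the work,
// every later call reuses the same promise.
function runOnce<T>(start: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) {
      pending = start();
    }
    return pending;
  };
}

// Stand-in for `pdfToolsWebSdk.initialize({ path: './pdftools-web-sdk/' })`.
let initCalls = 0;
const fakeInitialize = async (): Promise<void> => {
  initCalls += 1;
};

const ensureInitialized = runOnce(fakeInitialize);

// Both calls share one promise, so the initialization body runs a single time.
const first = ensureInitialized();
const second = ensureInitialized();
```

In a real application you would pass a function that calls the SDK's initializer instead of `fakeInitialize`. Note that this sketch also caches a rejected promise, so a failed initialization is not retried.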

Create controller

Create a Pdf.Controller to open the desired document. The controller is responsible for processing document events such as rendering pages.

const controller = new Pdf.Controller();

Open PDF

Now the document can be loaded.

const pdfDocument = await controller.openDocument({
  uri: '/pdf/WebViewer_Demo.pdf',
});
note

For a purely programmatic approach, the loaded document does not need to be passed to a UI.DocumentView.

Define the SearchStrategy

To search within a document, a SearchStrategy is required. It connects the document to be searched, the search parameters, and the SearchService.

// create the search strategy
const searchStrategy: Pdf.Search.DocumentTextSearchStrategy = new Pdf.Search.DocumentTextSearchStrategy(pdfDocument);
tip

For searching text, the DocumentTextSearchStrategy is provided ready to use, but custom search strategies can be derived from SearchStrategy if needed.

Execute search with the SearchService

// create the search service
const searchService = new Pdf.Search.DocumentTextSearchService(searchStrategy);

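// execute the search
// (property names follow the parameters listed for DocumentTextSearchStrategy;
// check the API reference for the exact shape)
const searchExecution = searchService.execute({
  query: 'pdf',
  caseSensitive: false,
  regularExpression: false,
});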

// wait for search results
const searchResults = await searchExecution.result;
tip

The search parameters accepted by DocumentTextSearchService.execute are defined by the SearchStrategy in use. This sample uses DocumentTextSearchStrategy, which supports a query string, a caseSensitive flag, and a regularExpression flag.
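When the query comes from user input, it can be worth checking it before handing it to the search service. The sketch below is an illustration only: `TextSearchParams` and `buildSearchParams` are hypothetical names, the field names mirror the three parameters described above, and the regular expression check uses JavaScript's `RegExp`, which may not match the dialect the SDK accepts.

```typescript
// Hypothetical shape mirroring the three parameters named above.
interface TextSearchParams {
  query: string;
  caseSensitive: boolean;
  regularExpression: boolean;
}

// Returns search parameters, or null when the input should not be searched.
function buildSearchParams(
  rawQuery: string,
  options: { caseSensitive?: boolean; regularExpression?: boolean } = {},
): TextSearchParams | null {
  // Skip empty or whitespace-only input.
  if (rawQuery.trim().length === 0) {
    return null;
  }

  const regularExpression = options.regularExpression ?? false;
  if (regularExpression) {
    try {
      // Assumption: a pattern that JavaScript rejects is also unusable here.
      new RegExp(rawQuery);
    } catch {
      return null;
    }
  }

  return {
    query: rawQuery,
    caseSensitive: options.caseSensitive ?? false,
    regularExpression,
  };
}

const plainParams = buildSearchParams('pdf');
const brokenPattern = buildSearchParams('(', { regularExpression: true });
```

Here `plainParams` holds the defaults used in this guide, while `brokenPattern` is `null` because `(` is not a valid pattern.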

Listen to search events

Instead of waiting until the search has finished completely, listen to search events so that you can give users timely feedback. The following snippet outlines the supported events and how to listen to them.

// convenience function used in the following event listeners
const resultToString = (result: DocumentTextSearchResult) => {
  return `'${result.text}' found on page ${result.pageNumber} in '${result.contextText}' `;
};

// listen to searchStarted events
searchExecution.addEventListener('searchStarted', (result) => {
  console.log("search started: " + resultToString(result));
});

// listen to searchResultsFound events
searchExecution.addEventListener('searchResultsFound', (result) => {
  console.log("search results found: " + resultToString(result));
});

// listen to searchFinished events
searchExecution.addEventListener('searchFinished', (result) => {
  console.log("search finished: " + resultToString(result));
});

// listen to searchCanceled events
searchExecution.addEventListener('searchCanceled', () => {
  console.log("search canceled");
});

// listen to errorOccurred events
searchExecution.addEventListener('errorOccurred', (error) => {
  console.log("search error occurred: " + error);
});
tip

For more details on events, see Manage events.
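To turn the events above into incremental user feedback, one option is to keep a running snapshot of the hits and the search state. The following sketch shows that pattern under stated assumptions: `SearchHit`, `SearchEventSource`, `trackSearch`, and `FakeExecution` are hypothetical names, each `searchResultsFound` event is assumed to carry a single hit with the fields used by `resultToString`, and `FakeExecution` is an in-memory stand-in for the real search execution so that the snippet runs on its own.

```typescript
// Fields taken from the result object used in the listeners above.
interface SearchHit {
  text: string;
  pageNumber: number;
  contextText: string;
}

// Hypothetical structural view of the execution: only addEventListener is needed.
interface SearchEventSource {
  addEventListener(type: string, listener: (payload?: unknown) => void): void;
}

type SearchState = 'running' | 'finished' | 'canceled' | 'failed';

interface SearchProgress {
  hits: SearchHit[];
  state: SearchState;
}

// Subscribes to the search events and keeps a running snapshot for the UI.
function trackSearch(
  source: SearchEventSource,
  onChange: (progress: SearchProgress) => void,
): SearchProgress {
  const progress: SearchProgress = { hits: [], state: 'running' };
  const finish = (state: SearchState) => () => {
    progress.state = state;
    onChange(progress);
  };

  source.addEventListener('searchResultsFound', (payload) => {
    // Assumption: each event carries one hit, as in the listeners above.
    progress.hits.push(payload as SearchHit);
    onChange(progress);
  });
  source.addEventListener('searchFinished', finish('finished'));
  source.addEventListener('searchCanceled', finish('canceled'));
  source.addEventListener('errorOccurred', finish('failed'));
  return progress;
}

// Tiny in-memory emitter standing in for the real search execution.
class FakeExecution implements SearchEventSource {
  private listeners = new Map<string, Array<(payload?: unknown) => void>>();

  addEventListener(type: string, listener: (payload?: unknown) => void): void {
    const list = this.listeners.get(type) ?? [];
    list.push(listener);
    this.listeners.set(type, list);
  }

  emit(type: string, payload?: unknown): void {
    for (const listener of this.listeners.get(type) ?? []) {
      listener(payload);
    }
  }
}

const fake = new FakeExecution();
let updates = 0;
const progress = trackSearch(fake, () => {
  updates += 1;
});

fake.emit('searchResultsFound', { text: 'pdf', pageNumber: 1, contextText: 'a pdf file' });
fake.emit('searchResultsFound', { text: 'PDF', pageNumber: 3, contextText: 'PDF Viewer' });
fake.emit('searchFinished');
```

With the real SDK, the `searchExecution` object would take the place of `fake`, and the `onChange` callback would update a result list or counter in the page.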

Full example

This full sample demonstrates how to configure and execute a search, and log search results to the console.

import { pdfToolsWebSdk, Pdf, UI } from '@pdftools/pdf-web-sdk';

async function initialize() {
  await pdfToolsWebSdk.initialize({
    path: './pdftools-web-sdk/',
  });

  const controller = new Pdf.Controller();
  const pdfDocument = await controller.openDocument({
    uri: '/pdf/WebViewer_Demo.pdf',
  });

  // create the search strategy
  const searchStrategy: Pdf.Search.DocumentTextSearchStrategy = new Pdf.Search.DocumentTextSearchStrategy(pdfDocument);

  // create the search service
  const searchService = new Pdf.Search.DocumentTextSearchService(searchStrategy);

  // execute the search
  // (property names follow the parameters listed for DocumentTextSearchStrategy;
  // check the API reference for the exact shape)
  const searchExecution = searchService.execute({
    query: 'pdf',
    caseSensitive: false,
    regularExpression: false,
  });

  // convenience function used in the following event listeners
  const resultToString = (result: DocumentTextSearchResult) => {
    return `'${result.text}' found on page ${result.pageNumber} in '${result.contextText}' `;
  };

  // listen to searchStarted events
  searchExecution.addEventListener('searchStarted', (result) => {
    console.log("search started: " + resultToString(result));
  });

  // listen to searchResultsFound events
  searchExecution.addEventListener('searchResultsFound', (result) => {
    console.log("search results found: " + resultToString(result));
  });

  // listen to searchFinished events
  searchExecution.addEventListener('searchFinished', (result) => {
    console.log("search finished: " + resultToString(result));
  });

  // listen to searchCanceled events
  searchExecution.addEventListener('searchCanceled', () => {
    console.log("search canceled");
  });

  // listen to errorOccurred events
  searchExecution.addEventListener('errorOccurred', (error) => {
    console.log("search error occurred: " + error);
  });

  // wait for search results only after the listeners are registered,
  // so that events raised while the search runs are not missed
  const searchResults = await searchExecution.result;
}

initialize();
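The sample above logs results to the console. To show users their search term highlighted in a result list, the matched text can be marked inside its context. This is a minimal sketch under stated assumptions: `escapeHtml` and `highlightHit` are hypothetical helpers, the `text` and `contextText` fields are the ones used by `resultToString`, and `contextText` is assumed to contain `text` verbatim.

```typescript
// Escapes characters that would otherwise be interpreted as HTML.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Wraps the first occurrence of the matched text in a <mark> element.
function highlightHit(hit: { text: string; contextText: string }): string {
  const start = hit.text.length > 0 ? hit.contextText.indexOf(hit.text) : -1;
  if (start < 0) {
    // Assumption did not hold: fall back to the plain, escaped context.
    return escapeHtml(hit.contextText);
  }
  const end = start + hit.text.length;
  return (
    escapeHtml(hit.contextText.slice(0, start)) +
    '<mark>' +
    escapeHtml(hit.text) +
    '</mark>' +
    escapeHtml(hit.contextText.slice(end))
  );
}

const highlighted = highlightHit({ text: 'PDF', contextText: 'Open a PDF <file>' });
const unmatched = highlightHit({ text: 'x', contextText: 'a&b' });
```

The returned string is escaped, so it can be assigned to an element's `innerHTML` in a result list. Only the first occurrence within the context is marked.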