Search text in PDF
This guide outlines functionality that requires extensive development effort. Use the readily available features of the PDF Viewer SDK instead of implementing a viewer from scratch. Only use this guide if you want to implement your own PDF viewer and customize it as much as possible. To test and implement the primary functionality of this SDK, review Getting started with the PDF Viewer SDK.
The PDF Viewer SDK lets you programmatically search for text within documents. Let users see their searched terms highlighted.
Steps to search text in a PDF file:
- Initialize the SDK
- Create controller
- Open PDF
- Define the SearchStrategy
- Execute search with the SearchService
- Listen to search events
- Full example
Learn how to get started and make static assets available.
Initialize the SDK
Before you instantiate objects and work with the PDF Web SDK API, the SDK needs to be initialized.
import { pdfToolsWebSdk, Pdf, UI } from '@pdftools/pdf-web-sdk';
async function initialize() {
  await pdfToolsWebSdk.initialize({
    path: './pdftools-web-sdk/',
  });
}
initialize();
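If several parts of an application may trigger initialization, it can help to make sure the call runs only once. The following is a minimal sketch of that idea; the `once` helper and the `ensureInitialized` name are hypothetical and not part of the SDK, and the initializer is a stand-in so the snippet runs on its own.

```typescript
// Hypothetical helper: wrap an async initializer so it runs at most once.
// Every caller receives the same promise.
function once<T>(init: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) {
      pending = init();
    }
    return pending;
  };
}

// Stand-in for the SDK call. In an application this could be
// () => pdfToolsWebSdk.initialize({ path: './pdftools-web-sdk/' }).
let initCalls = 0;
const ensureInitialized = once(async () => {
  initCalls += 1;
});
```

Callers can then `await ensureInitialized()` before creating a controller. Note that this sketch also caches a rejected promise, so a failed initialization is not retried.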
Create controller
A Pdf.Controller needs to be created to open the desired document. This controller is responsible for processing document events such as rendering pages.
const controller = new Pdf.Controller();
Open PDF
Now the document can be loaded.
const pdfDocument = await controller.openDocument({
  uri: '/pdf/WebViewer_Demo.pdf',
});
For a purely programmatic approach, the loaded document does not need to be passed to a UI.DocumentView.
Define the SearchStrategy
To search within a document, a SearchStrategy is required. It connects the document to be searched, the search parameters, and the SearchService.
// create the search strategy
const searchStrategy: Pdf.Search.DocumentTextSearchStrategy = new Pdf.Search.DocumentTextSearchStrategy(pdfDocument);
For text search, the DocumentTextSearchStrategy is provided ready to use, but custom search strategies can be derived from SearchStrategy if needed.
Execute search with the SearchService
// create the search service
const searchService = new Pdf.Search.DocumentTextSearchService(searchStrategy);
// execute the search
const searchExecution = searchService.execute({
  query: 'pdf',
  caseSensitive: false,
  regularExpression: false,
});
// wait for search results
const searchResults = await searchExecution.result;
The search parameters accepted by DocumentTextSearchService.execute are defined by the SearchStrategy in use. This sample uses DocumentTextSearchStrategy, which supports a query string, a caseSensitive flag, and a regularExpression flag.
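As an illustration of how the three parameters could be combined, the sketch below builds a regular-expression query that matches any of several literal terms. The `TextSearchParams` interface and the `buildAnyOfParams` helper are hypothetical names, not part of the SDK; the exact parameter shape accepted by `execute` and the regular-expression dialect (JavaScript syntax is assumed here) should be checked against the API reference.

```typescript
// Hypothetical parameter shape mirroring the three fields described above.
// The SDK's real type may differ.
interface TextSearchParams {
  query: string;
  caseSensitive: boolean;
  regularExpression: boolean;
}

// Escape characters that have a special meaning in a JavaScript regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build params that match any of the given literal terms.
// Each term is escaped, then the terms are joined with the alternation operator.
function buildAnyOfParams(terms: string[], caseSensitive = false): TextSearchParams {
  const query = terms
    .map((term) => term.trim())
    .filter((term) => term.length > 0)
    .map(escapeRegExp)
    .join('|');
  return { query, caseSensitive, regularExpression: true };
}

const params = buildAnyOfParams(['v1.0', ' pdf ', '']);
```

Escaping each term keeps characters such as `.` from being interpreted as pattern syntax, so only the `|` separators act as regular-expression operators.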
Listen to search events
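Instead of waiting until the search has completely finished, it is good practice to listen to search events so that users get timely feedback. Register the listeners before awaiting searchExecution.result; otherwise the search may already have finished by the time they are attached. The following snippet outlines the supported events and how to listen to them.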
// convenience function used in the following event listeners
const resultToString = (result: DocumentTextSearchResult) => {
  return `'${result.text}' found on page ${result.pageNumber} in '${result.contextText}' `;
};
// listen to searchStarted events
searchExecution.addEventListener('searchStarted', (result) => {
  console.log("search started: " + resultToString(result));
});
// listen to searchResultsFound events
searchExecution.addEventListener('searchResultsFound', (result) => {
  console.log("search results found: " + resultToString(result));
});
// listen to searchFinished events
searchExecution.addEventListener('searchFinished', (result) => {
  console.log("search finished: " + resultToString(result));
});
// listen to searchCanceled events
searchExecution.addEventListener('searchCanceled', () => {
  console.log("search canceled");
});
// listen to errorOccurred events
searchExecution.addEventListener('errorOccurred', (error) => {
  console.log("search error occurred: " + error);
});
For more details on events, learn how to Manage events.
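Beyond logging, the same events can feed application state, for example to show hits per page while the search is still running. The sketch below is one possible way to do this. The `SearchHit`, `SearchEventSource`, and `SearchProgress` types and the `trackSearch` function are hypothetical; only the event names come from this guide. The payload shape (a single hit or an array of hits with `text`, `pageNumber`, and `contextText`) is an assumption, and the SDK's typed event signatures may require adapting the structural type.

```typescript
// Assumed result shape, based on the fields used by resultToString above.
interface SearchHit {
  text: string;
  pageNumber: number;
  contextText: string;
}

// Minimal structural type for the object returned by execute();
// only an addEventListener method is assumed.
interface SearchEventSource {
  addEventListener(type: string, listener: (payload?: unknown) => void): void;
}

type SearchStatus = 'running' | 'finished' | 'canceled' | 'failed';

interface SearchProgress {
  status: SearchStatus;
  hitsByPage: Map<number, SearchHit[]>;
  error?: unknown;
}

// Attach listeners and return a state object that is updated as events arrive.
function trackSearch(source: SearchEventSource): SearchProgress {
  const progress: SearchProgress = { status: 'running', hitsByPage: new Map() };

  source.addEventListener('searchResultsFound', (payload) => {
    // Assumption: the payload is either one hit or an array of hits.
    const hits = (Array.isArray(payload) ? payload : [payload]) as SearchHit[];
    for (const hit of hits) {
      const pageHits = progress.hitsByPage.get(hit.pageNumber) ?? [];
      pageHits.push(hit);
      progress.hitsByPage.set(hit.pageNumber, pageHits);
    }
  });
  source.addEventListener('searchFinished', () => {
    progress.status = 'finished';
  });
  source.addEventListener('searchCanceled', () => {
    progress.status = 'canceled';
  });
  source.addEventListener('errorOccurred', (error) => {
    progress.status = 'failed';
    progress.error = error;
  });

  return progress;
}
```

A UI could read `progress.hitsByPage` on each `searchResultsFound` event to update a result list, and switch to a final state once `progress.status` is no longer `'running'`.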
Full example
This full sample demonstrates how to configure and execute a search, and log search results to the console.
import { pdfToolsWebSdk, Pdf, UI } from '@pdftools/pdf-web-sdk';
async function initialize() {
  await pdfToolsWebSdk.initialize({
    path: './pdftools-web-sdk/',
  });
  const controller = new Pdf.Controller();
  const pdfDocument = await controller.openDocument({
    uri: '/pdf/WebViewer_Demo.pdf',
  });
  // create the search strategy
  const searchStrategy: Pdf.Search.DocumentTextSearchStrategy = new Pdf.Search.DocumentTextSearchStrategy(pdfDocument);
  // create the search service
  const searchService = new Pdf.Search.DocumentTextSearchService(searchStrategy);
  // execute the search
  const searchExecution = searchService.execute({
    query: 'pdf',
    caseSensitive: false,
    regularExpression: false,
  });
  // convenience function used in the following event listeners
  const resultToString = (result: DocumentTextSearchResult) => {
    return `'${result.text}' found on page ${result.pageNumber} in '${result.contextText}' `;
  };
  // listen to searchStarted events
  searchExecution.addEventListener('searchStarted', (result) => {
    console.log("search started: " + resultToString(result));
  });
  // listen to searchResultsFound events
  searchExecution.addEventListener('searchResultsFound', (result) => {
    console.log("search results found: " + resultToString(result));
  });
  // listen to searchFinished events
  searchExecution.addEventListener('searchFinished', (result) => {
    console.log("search finished: " + resultToString(result));
  });
  // listen to searchCanceled events
  searchExecution.addEventListener('searchCanceled', () => {
    console.log("search canceled");
  });
  // listen to errorOccurred events
  searchExecution.addEventListener('errorOccurred', (error) => {
    console.log("search error occurred: " + error);
  });
  // wait for search results, after the listeners are registered
  const searchResults = await searchExecution.result;
}
initialize();