How to extract text from a PDF file in C#, VB.NET?

Syncfusion Essential PDF is a .NET PDF library used to create, read, and edit PDF documents. Using this library, you can extract text from a PDF document.

Essential PDF supports basic text extraction and layout-based extraction.
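
As a rough illustration of the difference between the two modes, the sketch below calls ExtractText() both without arguments and with true on the first page. It assumes the Syncfusion.Pdf.WinForms package from the steps that follow is installed; the file name input.pdf and the class name ExtractionModes are placeholders, not part of the original sample.

```csharp
using System;
using Syncfusion.Pdf;
using Syncfusion.Pdf.Parsing;

class ExtractionModes
{
    static void Main()
    {
        // "input.pdf" is a placeholder path; point it at your own file.
        PdfLoadedDocument loadedDocument = new PdfLoadedDocument("input.pdf");
        PdfPageBase page = loadedDocument.Pages[0];

        // Basic extraction: returns the page text without trying to preserve its layout.
        string basicText = page.ExtractText();

        // Layout-based extraction: passing true asks the library to follow the on-page layout.
        string layoutText = page.ExtractText(true);

        Console.WriteLine("Basic extraction:");
        Console.WriteLine(basicText);
        Console.WriteLine("Layout-based extraction:");
        Console.WriteLine(layoutText);

        // Close the document and release its resources.
        loadedDocument.Close(true);
    }
}
```

Comparing the two outputs on your own document is probably the quickest way to decide which mode suits it.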

Steps to extract text from a PDF programmatically:

  1. Create a new C# console application project in Visual Studio.
  2. Install the Syncfusion.Pdf.WinForms NuGet package as a reference in your .NET Framework application.
  3. Include the following namespaces in the Program.cs file.


using System.IO;
using System.Reflection;
using Syncfusion.Pdf;
using Syncfusion.Pdf.Parsing;



Imports System.IO
Imports System.Reflection
Imports Syncfusion.Pdf
Imports Syncfusion.Pdf.Parsing


  4. Call the ExtractText() method with its parameter set to true to perform layout-based text extraction on a page of the PDF document.


//Extract text from first page
string extractedTexts = page.ExtractText(true);


  5. The following C# and VB.NET code snippets show how to extract text from the PDF document.


//Load an existing PDF
Assembly assembly = typeof(Program).GetTypeInfo().Assembly;
Stream fileStream = assembly.GetManifestResourceStream("ConsoleApplication.input.pdf");
PdfLoadedDocument loadedDocument = new PdfLoadedDocument(fileStream);
//Load first page
PdfPageBase page = loadedDocument.Pages[0];
//Extract text from first page
string extractedTexts = page.ExtractText(true);
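//Write the extracted text to the console
Console.WriteLine(extractedTexts);
//Close the document
loadedDocument.Close(true);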



'Load an existing PDF
Dim assembly As Assembly = GetType(Program).GetTypeInfo().Assembly
Dim fileStream As Stream = assembly.GetManifestResourceStream("ConsoleApplication.input.pdf")
Dim loadedDocument As PdfLoadedDocument = New PdfLoadedDocument(fileStream)
'Load first page
Dim page As PdfPageBase = loadedDocument.Pages(0)
'Extract text from first page
Dim extractedTexts As String = page.ExtractText(True)
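'Write the extracted text to the console
Console.WriteLine(extractedTexts)
'Close the document
loadedDocument.Close(True)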
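
The snippets above read only the first page. If the whole document is needed, one possible approach is to loop over the Pages collection and append each page's text. This is a minimal sketch rather than part of the original sample: it assumes the PDF is loaded from a placeholder file path (input.pdf) instead of an embedded resource, and the class name ExtractAllPages is hypothetical.

```csharp
using System;
using System.Text;
using Syncfusion.Pdf;
using Syncfusion.Pdf.Parsing;

class ExtractAllPages
{
    static void Main()
    {
        // "input.pdf" is a placeholder path; replace it with your own document.
        PdfLoadedDocument loadedDocument = new PdfLoadedDocument("input.pdf");
        StringBuilder allText = new StringBuilder();

        // Visit every page and collect its layout-based text.
        for (int i = 0; i < loadedDocument.Pages.Count; i++)
        {
            PdfPageBase page = loadedDocument.Pages[i];
            allText.AppendLine(page.ExtractText(true));
        }

        Console.WriteLine(allText.ToString());

        // Close the document and release its resources.
        loadedDocument.Close(true);
    }
}
```

A StringBuilder is used here so that large documents do not pay for repeated string concatenation.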


A complete work sample can be downloaded from

The input PDF document is as follows. [Image: Input PDF text to be extracted]

By executing the program, you will get the extracted text as in the following console window. [Image: Text extracted from PDF output]

Refer to the documentation for details on basic and layout-based text extraction with Essential PDF. It also briefly covers OCR processing and image extraction, with code examples.

Refer here to explore the rich set of Syncfusion Essential PDF features.

An online sample that extracts text from a PDF document is also available.


Starting with v16.2.0.x, if you reference Syncfusion assemblies from the trial setup or from the NuGet feed, you must include a license key in your projects. Refer to the link to learn about generating and registering a Syncfusion license key in your application to use the components without a trial message.
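
For orientation, registering the key typically looks like the sketch below. It assumes the Syncfusion.Licensing assembly is referenced in the project, and the key string is a placeholder to be replaced with a key generated for your own account and version.

```csharp
using Syncfusion.Licensing;

class Program
{
    static void Main()
    {
        // Register the key before any Syncfusion type is used.
        // "YOUR LICENSE KEY" is a placeholder, not a working key.
        SyncfusionLicenseProvider.RegisterLicense("YOUR LICENSE KEY");

        // The PDF loading and text extraction code would follow here.
    }
}
```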

