How to extract text from a Word document in C# and VB?
Syncfusion Essential DocIO is a .NET Word library used to create, read, and edit Word documents programmatically without Microsoft Word or Interop dependencies. Using this library, you can extract the text from a Word document and save it as a new Word document or a plain text file.
This article shows how to use the document.GetText() method to extract text directly from a Word document in WinForms using the Syncfusion Word library (Essential DocIO) in C# and VB.NET.
Steps to extract text from a Word document in C#
- Create a new C# console application project.
- Install the Syncfusion.DocIO.WinForms NuGet package as a reference in your .NET Framework application from NuGet.org.
- Include the following namespaces in the Program.cs file.
C#
using Syncfusion.DocIO;
using Syncfusion.DocIO.DLS;
VB
Imports Syncfusion.DocIO
Imports Syncfusion.DocIO.DLS
- Use the following code to extract text from the Word document.
C#
//Loads a template document
WordDocument document = new WordDocument("Template.docx");
//Gets the document text
string text = document.GetText();
//Writes the extracted text to a text file
System.IO.File.WriteAllText("Sample.txt", text);
//Closes the document instance
document.Close();
VB
'Loads a template document
Dim document As WordDocument = New WordDocument("Template.docx")
'Gets the document text
Dim text As String = document.GetText()
'Writes the extracted text to a text file
System.IO.File.WriteAllText("Sample.txt", text)
'Closes the document instance
document.Close()
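The snippets above are meant to sit inside a method. As a minimal sketch, the C# version can be wrapped into a complete console program as shown here. It assumes that Template.docx exists in the working directory and that WordDocument can be used in a using block so the document is released even if an exception is thrown. The per-paragraph loop is an illustrative addition, not part of the original sample; it reads only the top-level paragraphs of each section, so text inside tables is not covered by it.

```csharp
using System;
using System.IO;
using Syncfusion.DocIO;
using Syncfusion.DocIO.DLS;

class Program
{
    static void Main()
    {
        // Assumed input path; replace it with your own document.
        const string inputPath = "Template.docx";
        if (!File.Exists(inputPath))
        {
            Console.Error.WriteLine("Input document not found: " + inputPath);
            return;
        }

        // The using block releases the document even if an exception occurs.
        using (WordDocument document = new WordDocument(inputPath))
        {
            // Extracts the text of the whole document as a single string.
            string text = document.GetText();
            File.WriteAllText("Sample.txt", text);

            // Illustrative addition: prints the text of each top-level paragraph.
            foreach (WSection section in document.Sections)
            {
                foreach (WParagraph paragraph in section.Paragraphs)
                {
                    Console.WriteLine(paragraph.Text);
                }
            }
        }
    }
}
```

Checking for the input file first keeps the failure message readable; without it, a missing file surfaces as an exception from the WordDocument constructor.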
A complete working example of extracting text from a Word document in C# can be downloaded from here.
The input template Word document is as follows:
By executing the program, you will get the output text file as follows:
Take a moment to peruse the documentation, where you can find basic Word document processing options along with features like mail merge, merging and splitting documents, finding and replacing text in a Word document, protecting Word documents, and, most importantly, PDF and image conversions with code examples.
Explore more about the rich set of Syncfusion Word Framework features.
See Also:
Text Conversions in Word Library
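As a rough sketch of the text conversion route, a Word document can also be saved directly as a plain text file instead of calling GetText() and writing the string yourself. This assumes the Save overload that takes a file name and FormatType.Txt; Template.docx and Result.txt are placeholder file names.

```csharp
using Syncfusion.DocIO;
using Syncfusion.DocIO.DLS;

class Program
{
    static void Main()
    {
        // Placeholder file names; adjust them to your own input and output paths.
        using (WordDocument document = new WordDocument("Template.docx"))
        {
            // Saves the document content as a plain text file.
            document.Save("Result.txt", FormatType.Txt);
        }
    }
}
```

GetText() is the better fit when the text is needed in memory for further processing; saving with FormatType.Txt is enough when only the output file matters.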
Starting with v16.2.0.x, if you reference Syncfusion assemblies from the trial setup or from the NuGet feed, include a license key in your projects. Refer to the link to learn about generating and registering a Syncfusion license key in your application to use the components without a trial message.
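A minimal sketch of the registration step follows, assuming the project references the Syncfusion.Licensing assembly and that the key is registered at application startup, before any Syncfusion component is used. The key string is a placeholder, not a real key.

```csharp
using Syncfusion.Licensing;

class Program
{
    static void Main()
    {
        // Placeholder: replace with the license key generated for your account.
        SyncfusionLicenseProvider.RegisterLicense("YOUR LICENSE KEY");

        // Create and use WordDocument instances after the key is registered.
    }
}
```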
Conclusion
I hope you enjoyed learning how to extract text from a Word document in C# and VB.
You can refer to our WinForms feature tour page to learn about its other features and documentation, and how to quickly get started with configuration. You can also explore our PDF example to understand how to create and manipulate data in the .NET PDF.
Current customers can check out our Document processing libraries from the License and Downloads page. If you are new to Syncfusion, you can try our 30-day free trial to check out our WinForms File Format and other WinForms controls.
If you have any queries or require clarifications, please let us know in the comments section below. You can also contact us through our support forums, Direct-Trac, or feedback portal. We are always happy to assist you!