How to convert a webpage to a Word document using C#?


Syncfusion® Essential® DocIO is a .NET Word library used to create, read, edit, and convert Word documents programmatically without Microsoft Word or interop dependencies. Using this library, you can convert a webpage to a Word document using C#.

The sample retrieves the HTML content from each specified URL as follows:

  • Create a web request to the specified URL and set the method to “GET” to retrieve data.
  • Send the request and receive the server’s response.
  • Read the HTML content from the response stream and return it as a string.

Steps to convert a webpage to a Word document:

  1. Create a new .NET Core console application project.
  2. Install the Syncfusion.DocIO.Net.Core NuGet package as a reference to your project from NuGet.org.

Starting with v16.2.0.x, if you reference Syncfusion® assemblies from the trial setup or from the NuGet feed, you must include a license key in your projects. Refer to the Syncfusion® licensing documentation to learn how to generate and register a license key in your application so that the components run without the trial message.
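
The license registration described in the note above is usually a single call made before any DocIO type is used. A minimal sketch, assuming the Syncfusion.Licensing assembly is referenced by the project and using a placeholder key string (not a real key):

```csharp
// Register the license key once, at the very start of Program.cs,
// before creating any WordDocument instance.
// "YOUR LICENSE KEY" is a placeholder; substitute the key generated for your account.
Syncfusion.Licensing.SyncfusionLicenseProvider.RegisterLicense("YOUR LICENSE KEY");
```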

  3. Include the following namespaces in the Program.cs file.
    C#
using System;
using System.IO;
using System.Net;
using Syncfusion.DocIO.DLS;
using Syncfusion.DocIO;
  4. Use the following code example to convert a webpage to a Word document.
    C#
// Request URLs for header, footer, and main body content.
Console.WriteLine("Please enter the URL for the header content:");
string headerHtmlUrl = Console.ReadLine(); 
Console.WriteLine("Please enter the URL for the footer content:");
string footerHtmlUrl = Console.ReadLine(); 
Console.WriteLine("Please enter the URL for the main body content:");
string bodyHtmlUrl = Console.ReadLine();
// Retrieve HTML content from the specified URLs.
string headerContent = GetHtmlContent(headerHtmlUrl);
string footerContent = GetHtmlContent(footerHtmlUrl);
string mainContent = GetHtmlContent(bodyHtmlUrl);
// Create a new Word document instance.
using (WordDocument document = new WordDocument())
{
   // Add a new section to the document.
   WSection section = document.AddSection() as WSection;
   // Append the main content HTML to the paragraph.
   WParagraph paragraph = section.AddParagraph() as WParagraph;
   paragraph.AppendHTML(mainContent);
   // Append the header content HTML to the header paragraph.
   paragraph = section.HeadersFooters.OddHeader.AddParagraph() as WParagraph;
   paragraph.AppendHTML(headerContent);
   // Append the footer content HTML to the footer paragraph.
   paragraph = section.HeadersFooters.OddFooter.AddParagraph() as WParagraph;
   paragraph.AppendHTML(footerContent); 
   // Make sure the output folder exists; FileStream does not create missing directories.
   Directory.CreateDirectory("Output");
   // Save the document in DOCX format.
   using (FileStream outputStream = new FileStream("Output/Output.docx", FileMode.Create, FileAccess.Write))
   {
       document.Save(outputStream, FormatType.Docx);
   }
}
  5. Use the following helper method to fetch the HTML content from a given URL by sending a GET request and reading the server’s response stream.
    C#
/// <summary>
/// Fetches the HTML content from a given URL by sending a GET request and reading the server's response stream.
/// </summary>
string GetHtmlContent(string url)
{
   // Create a web request to the specified URL.
   WebRequest myRequest = WebRequest.Create(url);
   // Set the request method to GET to fetch data from the URL.
   myRequest.Method = "GET";
   // Get the response from the web server and dispose it when done.
   using (WebResponse myResponse = myRequest.GetResponse())
   // Read the response stream as UTF-8 text.
   using (StreamReader sr = new StreamReader(myResponse.GetResponseStream(), System.Text.Encoding.UTF8))
   {
       // Read all content from the response stream and return it as a string.
       return sr.ReadToEnd();
   }
}
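
WebRequest is marked obsolete from .NET 6 onward (compiler warning SYSLIB0014), so on newer target frameworks you may prefer HttpClient for the same GET-and-read steps. The following is a minimal sketch rather than part of the original sample: the HtmlFetcher class, the TryParseHttpUrl and GetHtmlContentAsync method names, and the 30-second timeout are all assumptions chosen for illustration. It also validates the URL typed at the console before sending the request.

```csharp
#nullable enable
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper class; the names here are illustrative, not from the original sample.
static class HtmlFetcher
{
    // A single HttpClient instance is reused for all requests.
    // The 30-second timeout is an assumed value.
    private static readonly HttpClient Client =
        new HttpClient { Timeout = TimeSpan.FromSeconds(30) };

    // Returns true only for absolute http:// or https:// URLs.
    public static bool TryParseHttpUrl(string? input, out Uri? uri)
    {
        uri = null;
        if (string.IsNullOrWhiteSpace(input))
            return false;
        if (!Uri.TryCreate(input.Trim(), UriKind.Absolute, out Uri? parsed))
            return false;
        if (parsed.Scheme != Uri.UriSchemeHttp && parsed.Scheme != Uri.UriSchemeHttps)
            return false;
        uri = parsed;
        return true;
    }

    // Sends a GET request and returns the response body as a string.
    public static async Task<string> GetHtmlContentAsync(string? url)
    {
        if (!TryParseHttpUrl(url, out Uri? uri))
            throw new ArgumentException($"Not a valid http(s) URL: '{url}'", nameof(url));

        using HttpResponseMessage response = await Client.GetAsync(uri);
        // Throw HttpRequestException for non-success status codes (4xx, 5xx).
        response.EnsureSuccessStatusCode();
        // The body is decoded using the charset declared by the server, if any.
        return await response.Content.ReadAsStringAsync();
    }
}
```

With this sketch, a call such as GetHtmlContent(bodyHtmlUrl) in the main code would become await HtmlFetcher.GetHtmlContentAsync(bodyHtmlUrl). Unlike the helper above, which always decodes the response as UTF-8, ReadAsStringAsync follows the charset in the response's Content-Type header when one is present.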

You can download a complete working sample that converts a webpage to a Word document from GitHub.

By executing the program, you will get the Word document as follows.

Output Word document

Take a moment to peruse the documentation, where you can find basic Word document processing options along with features such as mail merge; merging, splitting, and comparing Word documents; finding and replacing text; protecting Word documents; and, most importantly, PDF and image conversions, all with code examples.

Conclusion

I hope you enjoyed learning how to convert a webpage to a Word document.

You can refer to our ASP.NET Core DocIO feature tour page to learn about its other features and documentation, and how to get started quickly. You can also explore our ASP.NET Core DocIO example to understand how to create and manipulate data.

Current customers can check out our components from the License and Downloads page. If you are new to Syncfusion®, you can try our 30-day free trial to check out our other controls.

If you have any queries or require clarifications, please let us know in the comments section below. You can also contact us through our support forums, Direct-Trac, or feedback portal. We are always happy to assist you!
