How to find start and end markers and iterate between them in a Word document?


Syncfusion® Essential® DocIO is a .NET Word library used to create, read, edit, and convert Word documents programmatically without Microsoft Word or interop dependencies. Using this library, you can find start and end markers and iterate between them in a Word document using C#.

To achieve this, locate the paragraphs containing the start and end markers in the Word document, then determine their exact indexes within the document structure. Using these indexes, iterate through all the paragraphs between them to extract or process the desired content.
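
The approach can be tried out without DocIO first. The following is a minimal sketch that models a document as a list of sections, each holding a list of paragraph strings. The `FindMarker` helper and the sample data are hypothetical and are not part of the DocIO API; here the end marker is searched for only after the start marker has been located.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a Word document: each inner list is one section,
// and each string is the text of one paragraph in that section's body.
var sections = new List<List<string>>
{
    new List<string> { "Intro", "GIANT START", "First item" },
    new List<string> { "Second item", "GIANT END", "Outro" }
};

// Returns the (section index, paragraph index) of the first paragraph that starts
// with 'marker', searching from the given position; returns (-1, -1) if none is found.
(int Section, int Paragraph) FindMarker(
    List<List<string>> doc, string marker, int fromSection = 0, int fromParagraph = 0)
{
    for (int s = fromSection; s < doc.Count; s++)
    {
        // Only the first searched section starts part-way through its body
        int first = (s == fromSection) ? fromParagraph : 0;
        for (int p = first; p < doc[s].Count; p++)
        {
            if (doc[s][p].StartsWith(marker, StringComparison.Ordinal))
                return (s, p);
        }
    }
    return (-1, -1);
}

// Locate the start marker, then look for the end marker only after it
(int Section, int Paragraph) start = FindMarker(sections, "GIANT START");
(int Section, int Paragraph) end = (-1, -1);
if (start.Section >= 0)
    end = FindMarker(sections, "GIANT END", start.Section, start.Paragraph + 1);

// Walk every paragraph from the start marker to the end marker, inclusive
var collected = new List<string>();
if (start.Section >= 0 && end.Section >= 0)
{
    for (int sec = start.Section; sec <= end.Section; sec++)
    {
        int from = (sec == start.Section) ? start.Paragraph : 0;
        int to = (sec == end.Section) ? end.Paragraph : sections[sec].Count - 1;
        for (int para = from; para <= to; para++)
            collected.Add(sections[sec][para]);
    }
}

Console.WriteLine(string.Join(" | ", collected));
// prints "GIANT START | First item | Second item | GIANT END"
```

The same index arithmetic carries over to a real document, where the section index and the index within the section body play the roles of the two tuple members.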

Steps to find start and end markers and iterate between them in a Word document:

  1. Create a new .NET Core console application project.
    Create console application in Visual Studio
  2. Install the Syncfusion.DocIO.Net.Core NuGet package as a reference to your project from NuGet.org.
    Add DocIO NuGet package reference to the project
Note:

Starting with v16.2.0.x, if you reference Syncfusion® assemblies from a trial setup or from the NuGet feed, include a license key in your projects. Refer to the link to learn about generating and registering a Syncfusion® license key in your application to use the components without a trial message.

  3. Include the following namespaces in the Program.cs file.
    C#
using System;
using System.IO;
using Syncfusion.DocIO;
using Syncfusion.DocIO.DLS;
  4. Use the following code example to find start and end markers and iterate between them in a Word document.
    C#
// Load the Word document using Syncfusion.DocIO
using (WordDocument document = new WordDocument(Path.GetFullPath(@"Data/Template.docx")))
{
    // Define an array of texts to find within the document
    string[] textsToFind = new string[2] { "GIANT START", "GIANT END" };

    // Iterate through each text in the array and call FindStartEndAndIterate to process them
    foreach (string textToFind in textsToFind)
    {
        FindStartEndAndIterate(document, textToFind);
    }
}
  5. Use the following code example to find the start and end paragraphs based on the search text, and then iterate through the paragraphs in the specified range.
    C#
void FindStartEndAndIterate(WordDocument document, string textToSearch)
{
    WParagraph startPara = null;
    WParagraph endPara = null;
    WSection startSection = null;
    WSection endSection = null;
    string endText = "GIANT"; // A paragraph containing this text marks the end of the range

    // Step 1: Find the start and end paragraphs by iterating through each section of the document
    foreach (WSection section in document.Sections)
    {
        foreach (Entity entity in section.Body.ChildEntities)
        {
            // Check if the entity is a paragraph
            if (entity is WParagraph paragraph)
            {
                string paraText = paragraph.Text;

                // If the paragraph starts with the 'textToSearch', set it as the start paragraph
                if (paraText.StartsWith(textToSearch))
                {
                    startPara = paragraph;
                    startSection = section;
                }
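                // Once the start paragraph is known, a paragraph containing 'endText' becomes the end paragraph.
                // Requiring startPara here prevents an earlier marker from being picked up as the end of the range.
                else if (startPara != null && paraText.Contains(endText))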
                {
                    endPara = paragraph;
                    endSection = section;
                    break; // Stop once the end paragraph is found
                }
            }
        }
        // If both start and end paragraphs have been found, break out of the loop
        if (startPara != null && endPara != null)
            break;
    }
    // If no start or end paragraphs were found, exit the method
    if (startPara == null || endPara == null)
        return;
    // Get the index of the start paragraph within its section's body and the section itself
    int startBodyIndex = startSection.Body.ChildEntities.IndexOf(startPara);
    int startSectionIndex = document.Sections.IndexOf(startSection);
    // Get the index of the end paragraph within its section's body and the section itself
    int endBodyIndex = endSection.Body.ChildEntities.IndexOf(endPara);
    int endSectionIndex = document.Sections.IndexOf(endSection);

    // Step 2: Loop through the sections from the start section to the end section
    for (int sectionIndex = startSectionIndex; sectionIndex <= endSectionIndex; sectionIndex++)
    {
        // Get the current section from the document
        WSection currentSection = document.Sections[sectionIndex];
        // Determine the starting index of body entities in this section
        // If it's the first section, start from startBodyIndex; otherwise, start from the beginning
        int start = (sectionIndex == startSectionIndex) ? startBodyIndex : 0;
        // Determine the ending index of body entities in this section
        // If it's the last section, end at endBodyIndex; otherwise, end at the last entity
        int end = (sectionIndex == endSectionIndex) ? endBodyIndex : currentSection.Body.ChildEntities.Count - 1;
        
        // Step 3: Loop through the paragraphs from start to end within the current section
        for (int paraIndex = start; paraIndex <= end; paraIndex++)
        {
            // Get the current entity in the section's body
            Entity currEntity = currentSection.Body.ChildEntities[paraIndex];

            // Check if the current entity is a paragraph
            if (currEntity is WParagraph currentPara)
            {
                // Print the paragraph text to the console
                Console.WriteLine(currentPara.Text);
            }
        }
    }
}
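
Because both loops above are inclusive, the marker paragraphs themselves are printed along with the content between them. If only the content strictly between the markers is wanted, the per-section bounds can be shifted by one. The sketch below works on plain index values rather than DocIO objects; `InnerBounds` and its parameters are hypothetical names used for illustration only.

```csharp
using System;
using System.Collections.Generic;

// Computes, for each section from the start section to the end section, the inclusive
// range of body indexes that lie strictly between the two markers.
// entityCounts[i] is the number of body entities in section i.
List<(int Section, int First, int Last)> InnerBounds(
    int[] entityCounts, int startSection, int startIndex, int endSection, int endIndex)
{
    var result = new List<(int Section, int First, int Last)>();
    for (int s = startSection; s <= endSection; s++)
    {
        // Skip the start marker itself in the first section
        int first = (s == startSection) ? startIndex + 1 : 0;
        // Stop just before the end marker in the last section
        int last = (s == endSection) ? endIndex - 1 : entityCounts[s] - 1;
        // A section may contribute nothing, for example when the markers are adjacent
        if (first <= last)
            result.Add((s, first, last));
    }
    return result;
}

// Example: section 0 has 3 entities with the start marker at index 1,
// and section 1 has 3 entities with the end marker at index 1.
var ranges = InnerBounds(new[] { 3, 3 }, 0, 1, 1, 1);
foreach (var range in ranges)
    Console.WriteLine($"section {range.Section}: indexes {range.First}..{range.Last}");
// prints:
// section 0: indexes 2..2
// section 1: indexes 0..0
```

In the DocIO version, the equivalent change would be to start from `startBodyIndex + 1` in the first section and end at `endBodyIndex - 1` in the last one.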

You can download a complete working sample to find start and end markers and iterate between them in a Word document from GitHub.

Input Word document


Console output displaying the extracted content between the start and end markers


Take a moment to peruse the documentation where you can find basic Word document processing options along with features like mail merge, merge, split, and compare Word documents, find and replace text in the Word document, protect the Word documents, and most importantly, the PDF and Image conversions with code examples.

Conclusion

I hope you enjoyed learning how to find start and end markers and iterate between them in a Word document using .NET Core.

You can refer to our ASP.NET Core DocIO feature tour page to learn about its other features and documentation, and how to quickly get started with configuration specifications. You can also explore our ASP.NET Core DocIO example to understand how to create and manipulate data.

Current customers can check out our components from the License and Downloads page. If you are new to Syncfusion®, you can try our 30-day free trial to check out our other controls.

If you have any queries or require clarifications, please let us know in the comments section below. You can also contact us through our support forums, Direct-Trac, or feedback portal. We are always happy to assist you!
