
How to redact all occurrences of a specific text in a PDF file?


The WPF PDF Viewer lets you redact every occurrence of a given text in a PDF file. Use the ExtractText method to find the bounds (regions) of each occurrence of the text, and then use the PageRedactor property of the PDF Viewer to mark and redact those regions.

Steps to redact all occurrences of a text in a PDF file

  1. Include the namespaces in the class file.

C#

using Syncfusion.Windows.PdfViewer;
using System.Collections.Generic;
using System.Drawing;
using System.Windows;

  2. Find the bounds (regions) of all occurrences of the text using the ExtractText method. The following code example finds the bounds of every occurrence of a specific text in the PDF file.

C#

/// <summary>
/// Gets all the bounds of the text present in the PDF file.
/// </summary>
/// <param name="text">text to be searched</param>
/// <returns>The collection of page index and the bounds collection of the searched text</returns>
private Dictionary<int, List<RectangleF>> GetTextBounds(string text)
{
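   // An empty search string matches at every index and would never advance the loop below.
   if (string.IsNullOrEmpty(text))
       return new Dictionary<int, List<RectangleF>>();

   text = text.ToLower();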
 
   Dictionary<int, List<RectangleF>> textBounds = new Dictionary<int, List<RectangleF>>();
 
   for (int i = 0; i < pdfViewer.PageCount; i++)
   {
       List<RectangleF> bounds = new List<RectangleF>();
 
       // Extract text and its bounds from the PDF file.
       List<TextData> textDataCollection = new List<TextData>();
       string extractedText = pdfViewer.ExtractText(i, out textDataCollection).ToLower();
 
       int start = 0;
       int indexOfText = 0;
       int end = extractedText.Length;
       int count = 0;
 
       // Iterate and get all the occurrences of the given text.
       while ((start <= end) && (indexOfText > -1))
       {
           count = end - start;
           // Get the next index of the text to be searched 
           indexOfText = extractedText.IndexOf(text, start, count);
           if (indexOfText == -1)
               break;                  
 
           // Holds the bounds of the first character in the text.
           RectangleF startCharacterBounds = textDataCollection[indexOfText].Bounds;
 
           // Holds the bounds of the last character in the text.
           RectangleF endCharacterBounds = textDataCollection[indexOfText + text.Length - 1].Bounds;
 
           // Get the bounds of the whole text.
           RectangleF rectangle = new RectangleF(startCharacterBounds.X, startCharacterBounds.Y,
               endCharacterBounds.X - startCharacterBounds.X + endCharacterBounds.Width,
               startCharacterBounds.Height > endCharacterBounds.Height ? startCharacterBounds.Height : endCharacterBounds.Height);
           bounds.Add(rectangle);
 
           start = indexOfText + text.Length;
       }
       // Add to the collection if any text is obtained.
       if (bounds.Count > 0)
           textBounds.Add(i, bounds);
   }
   return textBounds;
}
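
The search loop and the rectangle arithmetic above can be tried out without a PDF document. The following is a minimal sketch, not part of the PDF Viewer API: the class name TextMatchHelper and the methods FindOccurrences and MergeBounds are hypothetical names introduced here for illustration. It also swaps the ToLower calls for StringComparison.OrdinalIgnoreCase, which avoids the culture-dependent behavior of ToLower and the extra lowered string copies.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;

// Hypothetical helper for illustration; not part of the Syncfusion API.
public static class TextMatchHelper
{
    // Returns the start index of every non-overlapping,
    // case-insensitive match of text inside source.
    public static List<int> FindOccurrences(string source, string text)
    {
        List<int> indexes = new List<int>();

        // Nothing to search for; also prevents an endless loop on empty text.
        if (string.IsNullOrEmpty(source) || string.IsNullOrEmpty(text))
            return indexes;

        int start = 0;
        while (start <= source.Length - text.Length)
        {
            int index = source.IndexOf(text, start, StringComparison.OrdinalIgnoreCase);
            if (index == -1)
                break;

            indexes.Add(index);

            // Continue searching after the end of the current match.
            start = index + text.Length;
        }
        return indexes;
    }

    // Combines the bounds of the first and last characters of a match
    // into one rectangle, using the same arithmetic as GetTextBounds.
    public static RectangleF MergeBounds(RectangleF first, RectangleF last)
    {
        return new RectangleF(first.X, first.Y,
            last.X - first.X + last.Width,
            Math.Max(first.Height, last.Height));
    }
}
```

Like the code above, MergeBounds assumes that the first and last characters of a match lie on the same line; a match that wraps across lines would need one rectangle per line.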
 
  3. Mark the areas that will be redacted. The following code example marks the regions for redaction using the text bounds.

C#

/// <summary>
/// Marks the rectangle regions to be redacted in the PDF pages
/// </summary>
/// <param name="bounds">It has the collection of information about the page index and the bounds of the areas to be redacted</param>
private void MarkRegions(Dictionary<int, List<RectangleF>> bounds)
{
   if (bounds.Count > 0)
   {
       // Iterate the collection and mark regions
       foreach (KeyValuePair<int, List<RectangleF>> textBounds in bounds)
       {
           pdfViewer.PageRedactor.MarkRegions(textBounds.Key, textBounds.Value);
       }
       pdfViewer.PageRedactor.EnableRedactionMode = true;
   }
}
 
  4. Apply redaction to the marked areas. The following code applies redaction using PageRedactor.

C#

// Apply redaction to the marked bounds.
pdfViewer.PageRedactor.ApplyRedaction();
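
The steps above could be wired together in a single event handler, roughly as sketched below. This is a fragment meant to sit in the same class as GetTextBounds and MarkRegions. The handler name RedactButton_Click and the search text "confidential" are placeholders assumed for illustration, and pdfViewer is the same PDF Viewer instance used in the earlier steps with a document already loaded.

```csharp
// Hypothetical click handler; place it in the class that defines
// GetTextBounds and MarkRegions from the steps above.
private void RedactButton_Click(object sender, RoutedEventArgs e)
{
    // "confidential" is placeholder search text.
    Dictionary<int, List<RectangleF>> bounds = GetTextBounds("confidential");

    // Nothing to redact when the text does not occur in the document.
    if (bounds.Count == 0)
        return;

    // Mark every occurrence, then redact the marked regions.
    MarkRegions(bounds);
    pdfViewer.PageRedactor.ApplyRedaction();
}
```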

 

View sample in GitHub.

 
