There are many reasons to replace a word or phrase in a document. Whether you want to erase sensitive content before publicly sharing a document, or hide or remove private information such as email IDs or Social Security Numbers, you need to redact the document's content. This article shows how to redact Word documents programmatically in your .NET applications using C#. We discuss separately how to redact by hiding text and how to find and replace text, words, or phrases using different techniques.
The following topics are covered below:
- .NET API for Replacing Text
- Find & Replace Words or Phrases
- Case-Sensitive Search and Replace Words or Phrases
- Replace Text using Regular Expressions (RegEx)
- Hide the Text with Colored Box
.NET Redaction API for Replacing Text
GroupDocs.Redaction for .NET is a document redaction API that allows finding and replacing the intended data in documents of various file formats. Along with text redaction and rasterization, the API provides metadata, annotation, spreadsheet, and image redaction features. The supported formats for Word documents, spreadsheets, presentations, images, and PDF documents are listed in the documentation.
You can download the DLLs or MSI installer from the downloads section or install the API in your .NET application via NuGet.
PM> Install-Package GroupDocs.Redaction
You do not need to install MS Office or any other third-party software. Let’s now look at different approaches to finding and replacing text in documents. The following is a screenshot of the Word document used in the examples. The same methods work for other document formats without any change to the code.
Find and Replace Words or Phrases in Word document using C#
The following steps explain how to find any word or phrase in a Word document and replace all occurrences with other text in a C# application.
- Load the Word document (DOC/DOCX) using the Redactor class.
- Find the exact phrase or word using the ExactPhraseRedaction class with ReplacementOptions.
- Use the Apply method of Redactor to apply the redaction.
- Save the changes using the Save method.
The following code finds and replaces the word in C#. More precisely, it replaces all the occurrences of “John Doe” with “[censored]”.
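A minimal sketch of the steps above, assuming the GroupDocs.Redaction NuGet package is installed. The file name `sample.docx` is a placeholder, and the save options shown (keep the Word format, add a suffix to the output file name) are one reasonable choice rather than the only one; check the API reference for the version you use.

```csharp
using GroupDocs.Redaction;
using GroupDocs.Redaction.Options;
using GroupDocs.Redaction.Redactions;

class ExactPhraseExample
{
    static void Main()
    {
        // "sample.docx" is a placeholder path; point it at your own document.
        using (Redactor redactor = new Redactor("sample.docx"))
        {
            // Replace every occurrence of "John Doe" with "[censored]".
            redactor.Apply(new ExactPhraseRedaction(
                "John Doe",
                new ReplacementOptions("[censored]")));

            // Keep the Word format instead of rasterizing to PDF, and add a
            // suffix to the output file name so the original stays untouched.
            redactor.Save(new SaveOptions { AddSuffix = true, RasterizeToPDF = false });
        }
    }
}
```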
The output of the code is as follows.
Case-Sensitive Search and Replace in Word files using C#
Similarly, you can perform a case-sensitive redaction of a Word document by finding the exact word and replacing it with another. The following code replaces occurrences of the phrase “John Doe” in a DOCX file using C#, but this time the search is case-sensitive.
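A hedged sketch of the case-sensitive variant, assuming the `ExactPhraseRedaction` constructor overload that takes a case-sensitivity flag as its second argument. The file name is again a placeholder.

```csharp
using GroupDocs.Redaction;
using GroupDocs.Redaction.Options;
using GroupDocs.Redaction.Redactions;

class CaseSensitiveExample
{
    static void Main()
    {
        // "sample.docx" is a placeholder path.
        using (Redactor redactor = new Redactor("sample.docx"))
        {
            // The second argument turns on case-sensitive matching, so
            // "John Doe" is replaced while "john doe" is left alone.
            redactor.Apply(new ExactPhraseRedaction(
                "John Doe",
                true,
                new ReplacementOptions("[censored]")));

            // Save as a Word file with a suffix added to the file name.
            redactor.Save(new SaveOptions { AddSuffix = true, RasterizeToPDF = false });
        }
    }
}
```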
The output of the code is as follows.
Replace Text in Word Files with Regular Expressions (RegEx) using C#
To find and replace any pattern of text in Word (DOC, DOCX) files, you can use regular expressions. The following steps redact a Word document with RegEx using C#.
- Load the Word document using the Redactor class.
- Find the regex matches using the RegexRedaction class with ReplacementOptions.
- Use the Apply method to replace all text that matches the regex.
- Use the Save method to get the redacted Word file.
The following code shows how to find a text pattern in a Word file using RegEx and then replace/hide it with some other text using C#.
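A minimal sketch of the regex approach, under stated assumptions: the pattern below is only an illustration that targets text shaped like a US Social Security Number (`123-45-6789`), and the file name is a placeholder. Substitute whatever pattern fits your data.

```csharp
using GroupDocs.Redaction;
using GroupDocs.Redaction.Options;
using GroupDocs.Redaction.Redactions;

class RegexExample
{
    static void Main()
    {
        // Illustrative pattern: three digits, two digits, four digits,
        // separated by hyphens. Adjust it to the data you need to redact.
        string pattern = @"\d{3}-\d{2}-\d{4}";

        // "sample.docx" is a placeholder path.
        using (Redactor redactor = new Redactor("sample.docx"))
        {
            // Replace every match of the pattern with "[censored]".
            redactor.Apply(new RegexRedaction(
                pattern,
                new ReplacementOptions("[censored]")));

            // Save as a Word file with a suffix added to the file name.
            redactor.Save(new SaveOptions { AddSuffix = true, RasterizeToPDF = false });
        }
    }
}
```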
The output of the above code is as follows.
Hide Confidential Text in Word Documents with Colored Box using C#
If you do not want to replace your private content but just want to cover it, the API allows you to hide that content by drawing a box over it. The following code places a black rectangle over the intended text to black it out using C#.
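A hedged sketch of the colored-box approach, assuming the `ReplacementOptions` constructor that accepts a `System.Drawing.Color`. The phrase “John Doe” and the file name are placeholders carried over from the earlier examples. On newer .NET targets, `System.Drawing.Color` may require an additional package reference.

```csharp
using System.Drawing;
using GroupDocs.Redaction;
using GroupDocs.Redaction.Options;
using GroupDocs.Redaction.Redactions;

class ColoredBoxExample
{
    static void Main()
    {
        // "sample.docx" is a placeholder path.
        using (Redactor redactor = new Redactor("sample.docx"))
        {
            // Passing a color instead of replacement text asks the API to
            // cover each match with a filled rectangle of that color.
            redactor.Apply(new ExactPhraseRedaction(
                "John Doe",
                new ReplacementOptions(Color.Black)));

            // Save as a Word file with a suffix added to the file name.
            redactor.Save(new SaveOptions { AddSuffix = true, RasterizeToPDF = false });
        }
    }
}
```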
The output of the above code is as follows.
Get a Free API License
You can get a free temporary license in order to use the API without the evaluation limitations.
Conclusion
To conclude, you learned how to find text in Word (DOC, DOCX) files using different techniques and how to replace the matches in different ways. More precisely, we discussed how to find text, words, or phrases with an exact, case-sensitive, or regular-expression search in C#. We then replaced the search results either with other text or by placing a colored rectangle over the matched text.
For more about the API, visit the documentation. For queries, contact us via the forum.