Find and Replace Words in Documents using C#

There are many reasons to replace a word or phrase in a document. Whether you want to erase sensitive content before publicly sharing a document, or hide or remove private information such as email IDs or Social Security Numbers, you need to redact the document content. This article shows how to redact PDF, Word, and other documents programmatically in your .NET applications using C#. We will discuss separately how to redact by hiding text and how to find and replace text, words, or phrases using different techniques.

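This article covers the following topics: setting up the .NET redaction API, finding and replacing a word or phrase, case-sensitive replacement, replacement using regular expressions, and covering text with a colored box.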

.NET Redaction API for Replacing Text

GroupDocs.Redaction for .NET is a document redaction API that finds and replaces the intended data in documents of various file formats. Along with text redaction and rasterization, the API provides metadata, annotation, spreadsheet, and image redaction features. The supported formats for Word documents, spreadsheets, presentations, images, and PDF documents are listed in the documentation.

You can download the DLLs or MSI installer from the downloads section or install the API in your .NET application via NuGet.

PM> Install-Package GroupDocs.Redaction

This process does not require MS Office, a PDF editor, or any other third-party software. Let's now look at different approaches to finding and replacing text in documents. The examples use a Word document for demonstration. The same methods work for other document formats without any change in the code.

Find and Replace a Word or Phrase using C#

You can find any word or phrase in a Word, PDF, or other document and replace all of its occurrences with other text from within a C# application.

The following code finds and replaces the word in C#. More precisely, it replaces all the occurrences of “John Doe” with “[censored]”.
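A minimal sketch of such a replacement is given here. It assumes the `Redactor`, `ExactPhraseRedaction`, `ReplacementOptions`, and `SaveOptions` classes of GroupDocs.Redaction for .NET behave as described in the product documentation. The file name `sample.docx` is a placeholder, not something taken from the article.

```csharp
using GroupDocs.Redaction;
using GroupDocs.Redaction.Options;
using GroupDocs.Redaction.Redactions;

class FindAndReplaceExample
{
    static void Main()
    {
        // "sample.docx" is a hypothetical input file; point this at your own document.
        using (Redactor redactor = new Redactor("sample.docx"))
        {
            // Replace every occurrence of "John Doe" with "[censored]".
            redactor.Apply(new ExactPhraseRedaction("John Doe", new ReplacementOptions("[censored]")));

            // Assumption: these options keep the original format and write
            // the result to a copy with a suffix instead of overwriting the input.
            SaveOptions saveOptions = new SaveOptions
            {
                AddSuffix = true,
                RasterizeToPDF = false
            };
            redactor.Save(saveOptions);
        }
    }
}
```

The flow is to load the document, apply a redaction, and save the result. Because the redaction runs against the loaded document rather than a specific format, the same code should work when the input is a PDF or a spreadsheet.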

The output of the code is as follows.

Find and Replace Case-Sensitive Word or Phrase using C#

Similarly, you can perform a case-sensitive redaction that finds the exact phrase and replaces it with other text. The following code replaces the occurrences of “John Doe” in C#, but this time the search is case-sensitive.
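As a sketch, the case-sensitive variant can be written with the `ExactPhraseRedaction` constructor overload that takes a boolean case-sensitivity flag. The overload is assumed from the product documentation, and the file name is again a placeholder.

```csharp
using GroupDocs.Redaction;
using GroupDocs.Redaction.Options;
using GroupDocs.Redaction.Redactions;

class CaseSensitiveReplaceExample
{
    static void Main()
    {
        // "sample.docx" is a hypothetical input file.
        using (Redactor redactor = new Redactor("sample.docx"))
        {
            // The second argument turns on case-sensitive matching, so
            // "john doe" or "JOHN DOE" would be left untouched.
            redactor.Apply(new ExactPhraseRedaction("John Doe", true, new ReplacementOptions("[censored]")));

            // Assumption: keep the original format and save to a suffixed copy.
            redactor.Save(new SaveOptions { AddSuffix = true, RasterizeToPDF = false });
        }
    }
}
```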

The output of the code is as follows.

Replace Text using Regular Expressions (RegEx) in C#

To find and replace any pattern of text, you can use regular expressions (RegEx) in your .NET application.

The following code shows how to find the pattern using RegEx and replace/hide it with some other text using C#.
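One way this might look is sketched here, assuming the `RegexRedaction` class from the product documentation. The pattern is an illustrative choice that matches text shaped like a US Social Security Number (for example `123-45-6789`); it is not taken from the original article, and the file name is a placeholder.

```csharp
using GroupDocs.Redaction;
using GroupDocs.Redaction.Options;
using GroupDocs.Redaction.Redactions;

class RegexReplaceExample
{
    static void Main()
    {
        // Illustrative pattern: three digits, two digits, four digits, separated by hyphens.
        const string ssnPattern = @"\d{3}-\d{2}-\d{4}";

        // "sample.docx" is a hypothetical input file.
        using (Redactor redactor = new Redactor("sample.docx"))
        {
            // Replace every match of the pattern with "[censored]".
            redactor.Apply(new RegexRedaction(ssnPattern, new ReplacementOptions("[censored]")));

            // Assumption: keep the original format and save to a suffixed copy.
            redactor.Save(new SaveOptions { AddSuffix = true, RasterizeToPDF = false });
        }
    }
}
```

Before running a pattern against real documents, it is worth testing it on sample text with `System.Text.RegularExpressions.Regex` to make sure it does not match more than intended.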

The output of the above code is as follows.

Replace the Text with Colored Box in C#

If you do not want to replace your private content but only cover it, the API allows you to hide that content by drawing a box over it. The following code places a black rectangle over the intended text using C#.
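A sketch of the colored-box approach follows. It assumes the `ReplacementOptions` constructor that accepts a `System.Drawing.Color`, as described in the product documentation, and uses a placeholder file name.

```csharp
using System.Drawing;
using GroupDocs.Redaction;
using GroupDocs.Redaction.Redactions;

class ColoredBoxExample
{
    static void Main()
    {
        // "sample.docx" is a hypothetical input file.
        using (Redactor redactor = new Redactor("sample.docx"))
        {
            // Passing a color instead of replacement text asks the API to
            // cover each match with a filled rectangle of that color.
            redactor.Apply(new ExactPhraseRedaction("John Doe", new ReplacementOptions(Color.Black)));

            // Assumption: Save() without arguments uses the default save
            // options, which, as far as the documentation indicates,
            // produce a rasterized PDF copy of the document.
            redactor.Save();
        }
    }
}
```

Rasterizing the output is a reasonable choice for this kind of redaction, because the covered text cannot be recovered by selecting or copying it from under the box.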

The output of the above code is as follows.

Get a Free API License

You can get a free temporary license in order to use the API without the evaluation limitations.
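Once you have a license file, it is typically applied once at application startup. The sketch below assumes the `License` class and its `SetLicense` method from the product documentation; the file path is a placeholder.

```csharp
using GroupDocs.Redaction;

class ApplyLicenseExample
{
    static void Main()
    {
        // "GroupDocs.Redaction.lic" is a hypothetical path to your license file.
        License license = new License();
        license.SetLicense("GroupDocs.Redaction.lic");
    }
}
```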

Conclusion

To conclude, you learned how to find text using different techniques and replace the findings in different ways within MS Word, PDF, and other documents. More precisely, we discussed how to find a word or phrase with a case-insensitive search, a case-sensitive search, or a regular expression in C#. We then replaced the search results either with other text or by placing a colored rectangle over the matched text. For more details about the API, visit the documentation. For queries, contact us via the forum.
