Search software can usually respond quickly because, instead of scanning the text directly, it searches an index. This is the equivalent of finding the pages in a book related to a keyword by looking in the index at the back of the book, as opposed to reading every word on each page.
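To make the book analogy concrete, here is a minimal sketch of an inverted index in plain Java. It is only an illustration of the general idea, not how GroupDocs.Search is implemented; the class name `InvertedIndexSketch`, its methods, and the sample page texts are all assumptions made up for this example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class; not part of GroupDocs.Search.
final class InvertedIndexSketch {

    // Maps each lower-cased word to the sorted set of page numbers that contain it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Splits the page text into words and records the page number under each word.
    void addPage(int pageNumber, String text) {
        for (String word : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!word.isEmpty()) {
                postings.computeIfAbsent(word, k -> new TreeSet<>()).add(pageNumber);
            }
        }
    }

    // Answers a query with a single map lookup; no page text is rescanned.
    Set<Integer> lookup(String word) {
        return postings.getOrDefault(word.toLowerCase(Locale.ROOT), new TreeSet<>());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        // Made-up sample pages, used only to populate the index.
        index.addPage(1, "One morning Gregor Samsa woke up");
        index.addPage(2, "The chief clerk arrived at the door");
        index.addPage(3, "Gregor tried to open the door");
        System.out.println(index.lookup("gregor")); // prints [1, 3]
        System.out.println(index.lookup("door"));   // prints [2, 3]
    }
}
```

The cost of building the map is paid once at indexing time; after that, each query is a lookup rather than a pass over all the text.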
Using GroupDocs.Search for Indexing and Searching
Problem: Suppose you have 10 million documents in different file formats, e.g. MS Word documents, spreadsheets, presentations, etc. Because memory is limited, you cannot keep more than 5% of the entire data in memory. The main issue is how to index and search the collection in this case.
Solution: GroupDocs.Search for .NET provides many ways to perform search operations on document collections of any size. It can index various types of documents and run searches over them. The API supports searching for:
- Text occurrences
- Basic metadata fields
- File names
- Document types
Moreover, it allows searching with different search query types. Advanced search (e.g. fuzzy search, synonym search, boolean search) is also supported.
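For readers unfamiliar with these query types, the sketch below shows in plain Java what fuzzy search and boolean search mean conceptually. It does not use or reproduce the GroupDocs.Search API; the class `QueryTypesSketch`, its method names, and the sample terms are hypothetical, and edit distance is just one common way to define a fuzzy match.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class; not part of GroupDocs.Search.
final class QueryTypesSketch {

    // Levenshtein distance: the minimum number of single-character
    // insertions, deletions, or substitutions needed to turn a into b.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            // Reuse the two rows: the row just computed becomes the previous row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Fuzzy search: keep the indexed terms within maxDistance edits of the query.
    static List<String> fuzzyMatches(Collection<String> indexedTerms, String query, int maxDistance) {
        List<String> matches = new ArrayList<>();
        for (String term : indexedTerms) {
            if (editDistance(term, query) <= maxDistance) {
                matches.add(term);
            }
        }
        return matches;
    }

    // Boolean AND: the documents that appear in both result sets.
    static Set<Integer> andQuery(Set<Integer> left, Set<Integer> right) {
        Set<Integer> result = new TreeSet<>(left);
        result.retainAll(right);
        return result;
    }

    public static void main(String[] args) {
        List<String> terms = List.of("gregor", "samsa", "sister", "salsa");
        // The misspelled query "samza" is one substitution away from "samsa".
        System.out.println(fuzzyMatches(terms, "samza", 1)); // prints [samsa]
        // Documents containing both of two terms.
        System.out.println(andQuery(Set.of(1, 3, 5), Set.of(3, 4, 5))); // prints [3, 5]
    }
}
```

A synonym search can be pictured in the same way: the query term is expanded into a set of equivalent terms, and the results for each are merged.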
Creating an Index
Let’s try the GroupDocs.Search API for indexing a large batch of documents in different file formats (see the supported formats list). Although the index can be created in memory, here let’s create it on disk. You just need to follow these simple steps:
- Create a directory for the index.
- Create another directory and copy all the required documents into it.
- In the code, first initialize an Index object by passing the path of the index directory.
- Add the documents using the AddToIndex("Documents_Folder_Path") method of the Index object.
The C# code will look like this:
// For complete examples and data files, please go to https://github.com/groupdocs-search/GroupDocs.Search-for-.NET
// Create index
Index index = new Index(Utilities.indexPath);
// All files from the folder and its subfolders will be added to the index
index.AddToIndex(Utilities.documentsPath);
Java developers can write the code like this:
// For complete examples and data files, please go to https://github.com/groupdocs-search/GroupDocs.Search-for-Java
// Create index
Index index = new Index(Utilities.INDEX_PATH);
// Add all files from the folder and its subfolders to the index
index.addToIndex(Utilities.DOCUMENTS_PATH);
After creating the index, you will see the index files in the index folder, as in the following screenshot:

Searching The Terms
GroupDocs.Search supports various kinds of queries for search operations, along with more advanced features. Please see this article for details.
Let’s move on to the code.
Suppose the index has already been created as described in the section above. Let’s simply search for a term. Follow the steps below:
- Instantiate Index by passing the index folder path.
- Search for the term using the Index.Search() method, which returns a SearchResults object.
- Show the list of files found.
The C# code will look like this:
// For complete examples and data files, please go to https://github.com/groupdocs-search/GroupDocs.Search-for-.NET
// Load the index
Index index = new Index(Utilities.booksIndex);
SearchResults searchResults = index.Search("Gregor Samsa");
// List the files found
foreach (DocumentResultInfo result in searchResults)
{
    Console.WriteLine(result.FileName);
}
Java developers can write the code like this:
// For complete examples and data files, please go to https://github.com/groupdocs-search/GroupDocs.Search-for-Java
// Load the index
Index index = new Index(Utilities.BOOKS_INDEX);
SearchResults searchResults = index.search("Gregor Samsa");
// List the files found
for (DocumentResultInfo result : searchResults) {
    System.out.println(result.getFileName());
}
The output will look like the following screenshot:

The complete, ready-to-run code sample is available on GitHub.