Index each letter as a separate word using GroupDocs.Search for .NET

Are you looking for a full-text search API that can search across many document formats? In that case, GroupDocs.Search for .NET will meet your requirements. The API creates an index and then performs instant searches across thousands of documents.

For those already working with the API, version 19.10 brings some new features and improvements. Some classes have been renamed to improve code readability, and the API architecture has been optimized for better performance. The changes are minor, so migration should not be too difficult.
After upgrading to v19.10, replace the namespace references across the entire project from GroupDocs.Search to GroupDocs.Search.Legacy to resolve build issues in existing code.

Let's go through the code changes:
Old code sample:

New code snippet:

The changes are minor; for example, SearchParameters has been renamed to SearchOptions.

Improvements

  • Highlight search results in short fragments
  • Enhance document metadata indexing with new formats

New Features

  • Index each letter as a separate word
  • Remove paths from an index

How to highlight search results in short fragments?
This improvement lets you highlight search results in separate short fragments of the text rather than in the whole document. The output is a set of short HTML snippets in which the found terms are highlighted.
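To make the idea concrete, here is a minimal, language-neutral sketch in Python of fragment-based highlighting. It is not the GroupDocs.Search API: the function name `highlight_fragments`, the `context` parameter, and the use of `<b>` tags are all assumptions made for illustration. Each match is returned with a small window of surrounding text instead of the full document.

```python
import html
import re


def highlight_fragments(text, term, context=30):
    """Return short HTML fragments around each case-insensitive match of `term`.

    Hypothetical helper for illustration only; not part of GroupDocs.Search.
    """
    fragments = []
    for match in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
        # Keep at most `context` characters on each side of the match.
        start = max(0, match.start() - context)
        end = min(len(text), match.end() + context)
        before = html.escape(text[start:match.start()])
        found = html.escape(match.group(0))
        after = html.escape(text[match.end():end])
        fragments.append(f"{before}<b>{found}</b>{after}")
    return fragments


if __name__ == "__main__":
    for fragment in highlight_fragments("alpha beta gamma delta", "gamma", context=5):
        print(fragment)
```

In this sketch, overlapping windows are not merged, so two matches that sit close together produce two fragments that share some text.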

How to enhance document metadata indexing with new formats?
This improvement adds support for new document formats. The main content of most of these documents is not textual, so only their metadata is indexed:

  • MP3 – MPEG-2 Audio Layer III;
  • WAV – Waveform Audio File Format;
  • BMP – Bitmap Picture;
  • GIF – Graphics Interchange Format File;
  • JP2 – JPEG 2000 Core Image File.

For the complete list, visit this article.

How to index each letter as a separate word?
This feature is designed for languages written with hieroglyphic (logographic) scripts. It lets you index each character in the text as a separate word, regardless of whether separators are present.
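The difference between separator-based and per-character indexing can be sketched in a few lines of Python. This is a conceptual illustration, not the GroupDocs.Search API: the function names and the sample documents are hypothetical, and the sketch simply treats every non-whitespace character as a word.

```python
import re


def tokenize_by_separators(text):
    """Conventional tokenization: split on runs of non-word characters."""
    return [token for token in re.split(r"\W+", text) if token]


def tokenize_by_character(text):
    """Treat every non-whitespace character as its own word (illustrative assumption)."""
    return [ch for ch in text if not ch.isspace()]


def build_index(documents, tokenize):
    """Map each token to the set of document names that contain it."""
    index = {}
    for name, text in documents.items():
        for token in tokenize(text):
            index.setdefault(token, set()).add(name)
    return index


if __name__ == "__main__":
    # Hypothetical sample documents; neither contains separators between characters.
    docs = {"a.txt": "全文検索", "b.txt": "検索エンジン"}
    by_separator = build_index(docs, tokenize_by_separators)
    by_character = build_index(docs, tokenize_by_character)
    print("検" in by_separator)           # the character is buried inside a longer token
    print(sorted(by_character["検"]))     # the character is searchable on its own
```

With separator-based tokenization, each unbroken run of text becomes a single token, so a search for one character finds nothing. Per-character indexing makes every character individually searchable.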

Ability to remove paths from index
When indexed paths are removed from an index, the index is updated and all removed documents and folders become inaccessible for search.
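The behaviour described above can be illustrated with a toy index in Python. This is only a sketch under stated assumptions: the `TinyIndex` class, its method names, and the sample paths are hypothetical and do not reflect the GroupDocs.Search API. Removing a folder path drops every document beneath it, and later searches no longer return those documents.

```python
from pathlib import PurePosixPath


class TinyIndex:
    """Toy index for illustration only; not part of GroupDocs.Search."""

    def __init__(self):
        self.documents = {}  # document path -> document text

    def add(self, path, text):
        self.documents[str(PurePosixPath(path))] = text

    def remove_paths(self, *paths):
        """Remove documents at or under the given paths; return what was removed."""
        targets = [PurePosixPath(p) for p in paths]
        removed = [
            doc
            for doc in self.documents
            if any(
                PurePosixPath(doc) == target or target in PurePosixPath(doc).parents
                for target in targets
            )
        ]
        for doc in removed:
            del self.documents[doc]
        return removed

    def search(self, word):
        """Return the sorted paths of documents containing `word` as a whole word."""
        needle = word.lower()
        return sorted(
            doc
            for doc, text in self.documents.items()
            if needle in text.lower().split()
        )


if __name__ == "__main__":
    index = TinyIndex()
    index.add("docs/reports/q1.txt", "revenue grew")
    index.add("docs/reports/q2.txt", "revenue fell")
    index.add("docs/notes.txt", "revenue memo")
    index.remove_paths("docs/reports")
    print(index.search("revenue"))
```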


We recommend that you download the latest version and share your experience. If you run into any issues, you can post them on the forum.


