Get Information of Supported Extractors for a Document using GroupDocs.Parser for .NET 18.11

We are pleased to announce the release of version 18.11 of GroupDocs.Parser for .NET. This version comes with one new feature and three enhancements. It allows you to get information about the extractors supported for a document. Furthermore, we have improved text area extraction for PDF documents. For more details, please have a look at the release notes of version 18.11.

Features Introduced

Getting Information of Supported Extractors for a Document

This feature reports which extractors are supported for a given document. For example, you can check whether plain text, formatted text, and metadata can be extracted from a particular document. You can also check whether the document is a container that holds other documents.

For a working example of this feature, please refer to this documentation article.

Enhancements

IFastTextExtractor Interface

GroupDocs.Parser allows changing the default behavior of text extraction. By default, text is extracted using the Standard Extract mode, which produces better-quality text but takes more time. This enhancement allows enabling fast text extraction via the IFastTextExtractor interface. Support for the IFastTextExtractor interface has been added to the following classes:

  • PdfTextExtractor 
  • CellsTextExtractor 
  • SlidesTextExtractor 

For a working example of this feature, please refer to this documentation article.

IDocumentContentExtractor Interface

This enhancement provides access to the Text Analysis API via the IDocumentContentExtractor interface. Support for the IDocumentContentExtractor interface has been added to the following classes:

  • PdfTextExtractor 
  • CellsTextExtractor 
  • SlidesTextExtractor
  • WordsTextExtractor

For a working example of this feature, please refer to this documentation article.

Improved Text Area Extraction for PDF Documents

This enhancement improves the text area extraction for PDF documents. In the latest version, the Y-coordinates of text areas start from the top of the page.
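
If your code stored text area positions produced by an earlier version, the change of origin may matter when comparing them with new results. The sketch below is an illustration only, written in Python for brevity. It assumes the older values measured Y from the bottom of the page to the lower edge of the area, as the native PDF coordinate system does; that assumption is not confirmed by the release notes. The function name and parameters are hypothetical and are not part of the GroupDocs.Parser API.

```python
def to_top_origin(y_bottom: float, area_height: float, page_height: float) -> float:
    """Convert a bottom-origin Y (lower edge of a text area) to a
    top-origin Y (upper edge of the same area).

    Assumption: all three values use the same unit, e.g. PDF points.
    """
    # Distance from the top of the page to the upper edge of the area:
    # full page height minus everything below the upper edge.
    return page_height - y_bottom - area_height


# Illustrative numbers: a US Letter page is 792 points tall.
# An area 20 points high whose lower edge sits 700 points above the
# bottom of the page has its upper edge 72 points below the top.
print(to_top_origin(700, 20, 792))
```

With a top-left origin, areas that appear higher on the page have smaller Y values, so sorting by Y in ascending order yields the reading order from top to bottom.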

Available Channels and Resources

Here are a few channels and resources for you to download, learn, try and get technical support on GroupDocs.Parser:

Have Queries?

If you have any questions or concerns about the API, please get in touch with us on the forum. We’ll be glad to help.

To keep up with our news, you can follow us on Twitter or follow our Facebook page.