GroupDocs.Parser for Java

We are excited to announce that GroupDocs.Parser is coming soon to the Java platform as GroupDocs.Parser for Java. It will be an easy-to-use back-end API that lets users extract raw and formatted text from the supported document formats. It will also allow users to extract metadata from popular document formats. GroupDocs.Parser for Java will soon be available for download.

Salient Features of GroupDocs.Parser for Java

GroupDocs.Parser for Java will come with all the features supported by the GroupDocs.Parser product family. The most notable features of the API include:

  • Extracting Text from Documents
  • Extracting Formatted Text from Documents
  • Extracting Highlights
  • Extracting Structured Text from Documents
  • Searching Text
  • Searching for Whole Words
  • Searching Text with a Regular Expression
  • Extracting Metadata from Documents
  • Working with Containers such as ZIP, OST and Email Containers
  • Encoding Detector
  • Loggers
  • Media Type Detectors
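
To give a feel for what the whole-word and regular-expression search features mean in practice, here is a minimal sketch that uses only the standard java.util.regex package on text that has already been extracted. It is not the GroupDocs.Parser for Java API, which has not been released yet; the class name TextSearchSketch and the methods findWholeWord and findRegex are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: plain JDK code, NOT the GroupDocs.Parser for Java API.
public class TextSearchSketch {

    // Returns the start offset of every whole-word occurrence of `word` in `text`.
    static List<Integer> findWholeWord(String text, String word) {
        // Pattern.quote treats the word literally; \b anchors the match at word boundaries.
        Pattern pattern = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        return findAll(text, pattern);
    }

    // Returns the start offset of every match of the regular expression `regex` in `text`.
    static List<Integer> findRegex(String text, String regex) {
        return findAll(text, Pattern.compile(regex));
    }

    // Collects the start offsets of all non-overlapping matches.
    private static List<Integer> findAll(String text, Pattern pattern) {
        List<Integer> offsets = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            offsets.add(matcher.start());
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Stand-in for text extracted from a document.
        String text = "Parser parses text; a parser extracts text.";
        System.out.println(findWholeWord(text, "text"));   // [14, 38]
        System.out.println(findRegex(text, "[Pp]arser"));  // [0, 22]
    }
}
```

A whole-word search for "ext" on the same sample returns no matches, because "ext" only occurs inside longer words. That is the difference between a whole-word search and a plain substring search.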

The API will initially support the following document types for text extraction:

  • Text Documents
  • Spreadsheet Documents
  • Presentation Documents
  • PDF Documents
  • Email Messages
  • Markdown Documents
  • Electronic Publication Documents
  • FictionBook Documents
  • Microsoft Compiled HTML Help
  • OneNote Documents

First Version Availability

We are finalizing the first release of GroupDocs.Parser for Java and hope you will be able to download it very soon. Please stay tuned for further updates. We would be happy to hear your questions or suggestions on the GroupDocs.Parser forum.