Parse Documents to Extract Text and Metadata using Java

GroupDocs.Parser for Java API has been on the market since last year and has proved to be a powerful document parser API. It allows parsing and reading popular formats of word processing documents, spreadsheets, presentations, ebooks, emails, markup documents, notes, archives, and databases. Besides text, you can also extract images and metadata properties from various document formats, including PDF, XLS, XLSX, CSV, DOC, DOCX, PPT, PPTX, MPP, EML, MSG, OST, PST, ONE, and many more.
December 3, 2019 · 3 min · Usman Aziz

Extract Data Fields from the Documents using GroupDocs.Parser Product Family

Hello everyone! I am back with something new and exciting for developers who deal with automated data extraction from documents. A few years back, we released the GroupDocs.Parser API, which aimed to extract text from various document formats. We kept adding features to it, and today it has become a giant API that provides a wide range of features, including formatted text extraction, highlighted and structured text extraction, metadata extraction, image extraction, and the list goes on.
June 27, 2019 · 3 min · Usman Aziz

Get Information of Supported Extractors for a Document using GroupDocs.Parser for .NET 18.11

We are pleased to announce the release of version 18.11 of GroupDocs.Parser for .NET. This version comes with one new feature and three enhancements. It allows you to get information about the supported extractors for a document. Furthermore, we have improved text area extraction for PDF documents. For more details, please have a look at the release notes of version 18.11. Features Introduced: Getting Information of Supported Extractors for a Document. This feature helps you get information about the supported extractors for a document.
November 19, 2018 · 2 min · Usman Aziz