Tag Archives: Document Parser

Extract Data Fields from the Documents using GroupDocs.Parser Product Family

Hello everyone! I am back with something new and exciting for developers who deal with automated data extraction from documents. A few years back, we released the GroupDocs.Parser API, which aimed to extract text from various document formats. We kept adding features to it, and today it has become a large API that provides a wide range of features, including formatted text extraction, highlighted and structured text extraction, metadata extraction, extraction of images …

Posted in GroupDocs.Parser Product Family

Support for Text and Presentation Templates in GroupDocs.Parser for Java 18.12

We are delighted to announce the release of GroupDocs.Parser for Java 18.12. The latest version allows you to extract tables from PDF documents. Furthermore, we have added support for extracting text and metadata from text and presentation templates. For more details, please have a look at the release notes of version 18.12. Features introduced: Extracting Tables from PDF Documents. This feature is very useful when you want to extract only the tables from a PDF document. For extracting …

Posted in GroupDocs.Parser Product Family

Improved Text Area Extraction for PDF Documents in GroupDocs.Parser for Java 18.11

We are delighted to announce the release of GroupDocs.Parser for Java 18.11. The latest version comes with one new feature and three enhancements. It allows you to get information about the supported extractors for a document. Furthermore, we have improved text area extraction for PDF documents. For more details, please have a look at the release notes of version 18.11. Features introduced: Getting Information of Supported Extractors for a Document. This feature helps to get the information …

Posted in GroupDocs.Parser Product Family

Get Information of Supported Extractors for a Document using GroupDocs.Parser for .NET 18.11

We are pleased to announce the release of version 18.11 of GroupDocs.Parser for .NET. The latest version comes with one new feature and three enhancements. It allows you to get information about the supported extractors for a document. Furthermore, we have improved text area extraction for PDF documents. For more details, please have a look at the release notes of version 18.11. Features introduced: Getting Information of Supported Extractors for a Document. This feature helps to get …

Posted in GroupDocs.Parser Product Family

Introducing Image Extraction in GroupDocs.Parser for Java 18.10

We are delighted to announce the release of GroupDocs.Parser for Java 18.10. The latest release comes with a useful feature: extracting images from documents. This feature is introduced for PDF, spreadsheet, presentation and text document formats. For more details, please have a look at the release notes of version 18.10. Features introduced: Extracting Images from Documents. To extract images from a page of the document, the getImageAreas methods are used. getImageAreas has the following overloads: public IList getImageAreas(int pageIndex); …

Posted in GroupDocs.Parser Product Family

Extract Images from the Documents using GroupDocs.Parser for .NET 18.10

We are back with another monthly release of GroupDocs.Parser for .NET. The latest release comes with a powerful feature: extracting images from the pages of a document. This feature is introduced for PDF, spreadsheet, presentation and text document formats. For more details, please have a look at the release notes of version 18.10. Features introduced: Extracting Images from Documents. To extract images from a page of the document, the GetImageAreas methods are used. GetImageAreas has the following overloads: public IList GetImageAreas(int pageIndex); public IList …

Posted in GroupDocs.Parser Product Family

Extract Text from Databases using GroupDocs.Parser for .NET 18.9

GroupDocs.Parser for .NET 18.9 has been released! The latest version allows you to extract text from databases. You can also extract data from the form fields in a PDF document. Please continue reading for more details on the features introduced in v18.9. Features introduced: Extracting Text from Databases. You can now extract text from databases. To do so, the DbContainer class is used, which implements the IContainer interface. Each data table is represented by an entity. The content of the entity is CSV-presentation …

Posted in GroupDocs.Parser Product Family

Extract Data from PDF Forms using GroupDocs.Parser for Java 18.9

We are pleased to announce that version 18.9 of GroupDocs.Parser for Java has been released. The latest version allows you to extract data from the form fields in a PDF document. It also supports the text analysis API for spreadsheet, presentation and text documents. Furthermore, you can now extract text from databases. Please see the release notes for more details. Features introduced: Extracting Data from the Form Fields in PDF Documents. This feature allows extracting data from PDF forms. It is very …

Posted in GroupDocs.Parser Product Family

Text Analysis API for Spreadsheets, Presentations and Text Documents – GroupDocs.Parser for .NET 18.8

We are pleased to announce the release of version 18.8 of GroupDocs.Parser for .NET. In this version, we have extended the text analysis API to support spreadsheets, presentations and text documents. Furthermore, the latest version allows providing a password for protected documents on demand. We recommend that you use the latest version of the API and share your feedback. Features introduced: Text Analysis API. GroupDocs.Parser allows extracting text areas from the pages of a document. This feature may help you in getting data …

Posted in GroupDocs.Parser Product Family