Category Archive: GroupDocs.Parser Product Family

Official blog with announcements of the latest supported features, hot fixes, technical articles, tips, and videos for GroupDocs.Text – a text extraction API for .NET.

Extract Images from EPUB, FB2, CHM eBooks in Java

eBooks in various formats are common in everyday use. An eBook can contain text as well as images. If you want to use the images of an eBook elsewhere, you can extract them programmatically within your Java application. In this article, you will learn how to automate the extraction of images from eBook files such as EPUB, PDF, FB2, and CHM in Java.

The following topics will be covered below:

  • Java API - Image Extraction from eBooks
  • Extract Images from EPUB eBook in Java
  • Extract Images from PDF, FB2, CHM eBooks in Java

Continue Reading ...

Posted in GroupDocs.Parser Product Family

Extract Images from EPUB, FB2, CHM eBooks in C#

An electronic book, popularly known as an eBook, is a book in digital form that is readable on various electronic devices. These devices include dedicated eReaders such as Kindle, as well as laptops, desktop computers, and smartphones. Many eBook file formats are in use, including EPUB, FictionBook (FB2), Microsoft Compiled HTML Help (CHM), DjVu, MOBI, PDF, and others. This article shows programmers how to extract images from eBooks in C# within .NET applications.

Extract Images from eBooks in C# .NET
EPUB eBook from the Adobe Sample eBook Library

The following topics will be covered in this article:

  • .NET API for Image Extraction from eBooks
  • Extract Images from EPUB eBook in C#
  • Extract Images from FB2, CHM eBooks in C#

Continue Reading ...

Posted in GroupDocs.Parser Product Family

Extract Data from Invoices and Receipts in Java

In the era of online business, the use of digital invoices and receipts has greatly increased. As a result, efficient data extraction from these digital invoices is also in demand. In this article, you will learn how to extract data from PDF invoices or receipts programmatically in Java.

Continue Reading...

Posted in GroupDocs.Parser Product Family

Read PDF Form Fields using C#

In this article, we will learn how to read and parse PDF documents and then programmatically extract PDF form field values in C#. Earlier, we saw how to extract values from PDF forms in Java. After reading these articles, if you have filled-in feedback forms, you can extract their values within your .NET and Java applications for analysis or save them in a database.

Parse PDF Forms to Extract values in C#

Continue Reading

Posted in GroupDocs.Parser Product Family

Read PDF Form Fields in Java

In this article, we will discuss how to parse PDF documents and extract values from PDF forms programmatically in Java. There are many situations where we have several filled-in survey or feedback forms in PDF format from a large audience. We can easily extract the filled-in values and use them for analysis. Let us now move straight to reading these PDF forms and extracting the filled-in field values within Java applications.

Continue Reading

Posted in GroupDocs.Parser Product Family

Extract Images from Documents using C#

In this article, we will learn how to programmatically extract images from PDF, Excel, PowerPoint, and Word documents in a C# application using a document parsing .NET API.

GroupDocs.Parser for .NET is a document parsing and data extraction API for .NET. It supports document parsing and the extraction of images, text, and metadata from word-processing documents, spreadsheets, presentations, archives, and email documents.

Extracted images can be saved in BMP, GIF, JPEG, PNG, and WebP formats.

Posted in GroupDocs.Parser Product Family

Extract Images from Documents using Java

Today, we will learn to programmatically extract images from PDF, Excel, PowerPoint, and Word documents using Java. For the extraction of images, we will use GroupDocs.Parser for Java. This Java API supports the parsing of documents and extraction of images, text, and metadata from word-processing documents, spreadsheets, presentations, archives, and email documents. Extracted images can be saved in BMP, GIF, JPEG, PNG, and WebP formats.

The following topics will be covered in this article:
  • Image Extraction Java API
  • Image Extraction from PDF documents in Java
  • Extract Images from Word, Excel, PowerPoint documents in Java
  • Extract Image from Specific Page in Java

Posted in GroupDocs.Parser Product Family

Extract Data from Database Files using C#

A database is an integral part of most applications. Be it a desktop, web, or mobile application, the database plays a vital role in storing, accessing, and manipulating data. Many database management systems are available for creating and managing databases.

However, there could be a scenario where you need a way to extract data from database files, i.e. .db files, without installing a database management system or writing SQL queries. How … Continue Reading

Posted in GroupDocs.Parser Product Family

Extract Data from Invoices or Receipts in C#

Invoices and receipts are documents that record transactions in a particular format when goods or services are bought or sold. Things have gone digital, and with the popularity of online shopping, digital invoices are widely used. Processing a large number of digital invoices and extracting the information manually is a complex and time-consuming process. Thus, you need a faster and more efficient way to handle such cases. So in this article, … Continue Reading

Posted in GroupDocs.Parser Product Family