Support of .NET Standard 2.0 in GroupDocs.Viewer for .NET 19.10

Hello everyone! We have released v19.10 of GroupDocs.Viewer for .NET and I have a piece of breaking news especially for those who requested the support of .NET Core. The latest release of the API includes 6 new features, 2 improvements, and 11 bug fixes. So let’s have a look at some major updates we have brought for you.

Support of .NET Standard 2.0

As promised, we have added support for .NET Standard 2.0 in v19.10. The API now supports the .NET implementations that target .NET Standard 2.0, including .NET Framework and .NET Core, so you can use GroupDocs.Viewer for .NET in your cross-platform applications. At the moment, there are a few limitations when using the API in a non-Windows environment. You can see the details of the limitations and the possible recommendations here.

Adjusting Page Size when Rendering Emails as HTML

Previously, the feature of setting page size for email messages was available only for image-based rendering. However, we have extended it for HTML-based rendering as well. Now, you can control the page size of the resultant HTML pages as shown in the following code sample:

Support of .gzip and .sxc Formats

In the latest release, we have extended the list of supported file formats by adding support for Gnu Zipped Files (.gzip) and StarOffice Calc Spreadsheets (.sxc).

Bug Fixes

The following bugs that were reported in earlier versions have been fixed.

  • Tiff files are rendered incorrectly.
  • Incorrect image URLs when rendering email as HTML.
  • Blurred image when rendering slides as HTML.
  • Rendering Word document is taking a long time.
  • External resources failed to load when rendering Email messages.
  • Styles are lost when rendering XLSX into HTML.
  • The print preview of the rendered HTML is zoomed in.
  • Exception when rendering Word document as HTML.
  • Rendering Diagram document provides improper output colors.
  • “A null reference or invalid value was found” exception when rendering DWG as HTML.
  • DWG rendered empty.

Other Improvements

  • Improved rendering of the Markdown (*.md) file format.
  • Fit content by width when rendering mail messages into PDF/JPG/PNG.

Have a look at the release notes for more details about the API changes in v19.10. We have also updated the source code examples in the GitHub repository for the latest release and added a sample project for .NET Core. If you have any questions or queries, do let us know on our forum.

Posted in GroupDocs.Viewer Product Family

How To Remove Watermark from PDF Documents

If you are looking to create an application that deletes watermarks from PDF documents, this article is a good learning source for C# developers.

Significance of removing watermarks from PDF documents

Watermarks are very important for anyone who wants to assert copyright in case their data gets copied somewhere. The information contained within the watermark designates the authority of the owner and prevents other users from deleting the copyright. But at times, a watermark becomes troublesome for users who need the information on an urgent basis.

Although a variety of utilities and paid or free online tools may be available to solve this problem, the following section shows how C# developers can do it in their own code.

Delete watermark using C# language

GroupDocs.Watermark is an API for performing watermarking operations on images and documents of different file formats. If you are making a watermark remover app, it provides some useful ways to remove all watermarks, remove watermarks with particular text formatting, or remove hyperlink watermarks.

Let’s learn how a C# developer can remove watermarks from a PDF using the GroupDocs.Watermark for .NET API.

Remove all occurrences of watermarks from a PDF document

The GroupDocs.Watermark API enables you to easily find and remove a particular watermark from a document. The following code serves this purpose.

// Constants.InDocumentPdf is an absolute or relative path to your document. Ex: @"C:\Docs\document.pdf"
using (Watermarker watermarker = new Watermarker(Constants.InDocumentPdf))
{
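    PossibleWatermarkCollection possibleWatermarks = watermarker.Search();
 
    // Remove the possible watermark at the specified index from the document.
    // The Count check avoids an out-of-range index when nothing was found.
    if (possibleWatermarks.Count > 0)
    {
        possibleWatermarks.RemoveAt(0);
    }
 
    // Remove the specified possible watermark from the document.
    if (possibleWatermarks.Count > 0)
    {
        possibleWatermarks.Remove(possibleWatermarks[0]);
    }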
 
    watermarker.Save(Constants.OutDocumentPdf);
}

Remove watermark with particular text formatting

The API also enables you to search for and remove watermarks on the basis of particular text formatting. You can provide a search criterion containing font name, size, color, etc., and the API will find the watermarks with matching properties. The following code snippet shows how to search for and remove watermarks with particular text formatting.

// Constants.InDocumentPdf is an absolute or relative path to your document. Ex: @"C:\Docs\document.pdf"
using (Watermarker watermarker = new Watermarker(Constants.InDocumentPdf))
{
    TextFormattingSearchCriteria criteria = new TextFormattingSearchCriteria();
    criteria.ForegroundColorRange = new ColorRange();
    criteria.ForegroundColorRange.MinHue = -5;
    criteria.ForegroundColorRange.MaxHue = 10;
    criteria.ForegroundColorRange.MinBrightness = 0.01f;
    criteria.ForegroundColorRange.MaxBrightness = 0.99f;
    criteria.BackgroundColorRange = new ColorRange();
    criteria.BackgroundColorRange.IsEmpty = true;
    criteria.FontName = "Arial";
    criteria.MinFontSize = 19;
    criteria.MaxFontSize = 42;
    criteria.FontBold = true;
 
    PossibleWatermarkCollection watermarks = watermarker.Search(criteria);
    watermarks.Clear();
 
    watermarker.Save(Constants.OutDocumentPdf);
}

Remove hyperlink watermarks 

The GroupDocs.Watermark API allows you to search for and remove hyperlinks in a document of any supported format. The following code sample shows how to find and remove hyperlinks with a particular URL from a document.

// Requires: using System; and using System.Text.RegularExpressions; (for Console and Regex)
// Constants.InDocumentPdf is an absolute or relative path to your document. Ex: @"C:\Docs\document.pdf"
using (Watermarker watermarker = new Watermarker(Constants.InDocumentPdf))
{
    PossibleWatermarkCollection watermarks = watermarker.Search(new TextSearchCriteria(new Regex(@"someurl\.com")));
    for (int i = watermarks.Count - 1; i >= 0; i--)
    {
        // Ensure that only hyperlinks will be removed.
        if (watermarks[i] is HyperlinkPossibleWatermark)
        {
            // Output the full url of the hyperlink
            Console.WriteLine(watermarks[i].Text);
 
            // Remove hyperlink from the document
            watermarks.RemoveAt(i);
        }
    }
 
    watermarker.Save(Constants.OutDocumentPdf);
}

The complete, ready-to-run code sample is available on GitHub.

Posted in GroupDocs.Watermark Product Family

Accept or reject PDF comparison changes in C#

You might be looking for an API that provides the ability to apply or discard specific changes between source and target documents and save the resultant document with (or without) the selected changes.

About the API

GroupDocs.Comparison for .NET is a back-end API that can be integrated into any existing or new .NET application without any third-party tool or software dependency. Currently, the API supports all these file formats.

Steps to apply/reject changes

  • Instantiate the Comparer object with the source document path or stream
  • Call the Add method and specify the target document path or stream
  • Call the Compare method
  • Call the GetChanges method and obtain the list of detected changes
  • Set the ComparisonAction of the needed change object to ComparisonAction.Accept or ComparisonAction.Reject
  • Call the ApplyChanges method and pass the collection of changes to it

ApplyChangeOptions Class

Changes – List of changes that must be applied (or not) to the resulting document.

You can see that in the “without change.pdf” file the text “Powerful document comparison APIs” is not highlighted, which means it is the rejected one. A complete list of changes is available in ChangeInfo.

You can get the API from the download section. Feel free to share your concerns on the forum.

Posted in GroupDocs.Comparison Product Family

Lock Watermarks in Word Documents using C#

Watermarking is a popular way of adding labels to documents to indicate their state, such as draft or confidential. It can also be used to add a company’s logo behind the text of the document to avoid ownership disputes. In some cases, people add watermarks to documents to protect the content. But are you sure that the content is protected and no one can remove those watermarks? There is no guarantee, because various third-party tools are available that can wipe out the watermarks from documents.

In such scenarios, you need some way to lock the watermarks to restrict the editing. In this article, I’ll show you how to protect your Word documents and prevent the editing of the watermarks programmatically in C# using GroupDocs.Watermark for .NET API.

Locking Watermark in Word Document

GroupDocs.Watermark provides 5 variants of locking a Word document when adding a watermark.

Allowing Revisions Only

In this case, the user will only be able to add revision marks to the Word document. The content of the document as well as the watermark will be read-only.

Allowing Comments Only

If this type of restriction is applied then the user will only be able to modify the comments in the document.

Allowing Form Fields Only

With this option, the document is split into one-page sections and a locked section with the watermark is added between each two adjacent document sections.

Read Only with Editable Content

In this case, the document is marked as read-only but all the content except the watermark is marked as editable.

Completely Read Only

In this case, the entire document is marked as read-only and nothing can be edited.

Source Code

The following code sample shows how to impose the above-mentioned restrictions on a Word document when adding the watermark.

Have any questions or queries? Do contact us on our forum.

Posted in GroupDocs.Watermark Product Family

Manage contrast when converting a document to Grayscale Image in C#

The first question in your mind could be: is this only about converting an RGB image to grayscale? No, you can convert any supported file format (including images) to a black-and-white or gray monochrome image. The contrast ranges from black at the weakest intensity to white at the strongest.
See the demonstration below. We converted the first page of a PDF file to a grayscale image. You can manage the brightness, contrast or gamma of the resultant image as well. The code will be explained later in this post.

This feature is quite helpful if you are going to do image processing: an RGB image is represented by 3 channels and contains a lot of data/noise, so more computational power is required to process it. A grayscale image, on the other hand, makes this process comparatively easy.
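To make the "3 channels versus 1 channel" point concrete, here is a minimal sketch in plain Java. It does not use GroupDocs.Conversion and says nothing about how the API works internally. The class and method names are hypothetical, the luma weights are the common ITU-R BT.601 ones, and the contrast formula (scaling around mid-gray 128) is just one simple choice.

```java
final class GrayscaleSketch {
    // Collapse three 0-255 channels into one 0-255 gray level (BT.601 luma weights).
    static int toGray(int red, int green, int blue) {
        return (int) Math.round(0.299 * red + 0.587 * green + 0.114 * blue);
    }

    // Stretch (factor > 1) or flatten (factor < 1) contrast around mid-gray, clamped to 0-255.
    static int adjustContrast(int gray, double factor) {
        long value = Math.round((gray - 128) * factor + 128);
        return (int) Math.max(0, Math.min(255, value));
    }

    public static void main(String[] args) {
        int gray = toGray(255, 0, 0); // pure red
        System.out.println("gray level: " + gray);
        System.out.println("with contrast x2: " + adjustContrast(gray, 2.0));
    }
}
```

After this step every pixel carries one value instead of three, which is why later processing has less data to handle.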

Along with common conversion options, we’ll see the advanced properties that we can implement when converting a document to image using GroupDocs.Conversion for .NET.

Understanding the API usage and implementation

GroupDocs.Conversion for .NET is a back-end API used for converting documents between a multitude of supported file formats and image types. Conversion results can easily be customized and tuned with multiple flexible options.
As for its implementation, it’s a back-end API that can be integrated into any .NET application without any dependency. You can use it in your (existing or new) web, desktop or console applications.

ImageConvertOptions Class

The ImageConvertOptions class exposes these members. There are a lot of interesting and helpful properties. Some of them are used in the above code. Beyond those, the output image’s rotation angle, width and height can be controlled. You can also add a watermark to the resultant file. The API also provides format-specific conversion options (e.g. JpegOptions, PsdOptions).
For example, PsdOptions is a subset of ImageConvertOptions that allows enhanced control over conversions to the PSD format. Learn more in this article.

Helpful Resources

You can get the API from the download section. We also have an open-source example project that makes it easy to evaluate the API features; it can be downloaded from GitHub. Feel free to share your concerns on the forum.

Posted in GroupDocs.Conversion Product Family

Build Reports using CSV Data Sources in GroupDocs.Assembly for Java 19.10

We have recently released version 19.10 of GroupDocs.Assembly for Java that brings an exciting feature of building reports using CSV data sources. Along with this, we have simplified the process of working with XML data sources. In this article, I’ll show you how to populate document templates using CSV as well as XML data sources.

Working with CSV Data Source

The Comma Separated Values (CSV) is a file format for storing the data in the form of plain text where the values are separated by commas. CSV is widely used for exchanging data among the applications and therefore, we have added the support of CSV to be used as a data source.

We have introduced the CsvDataSource class to access the CSV data sources where its instance is passed to the DocumentAssembler. Although CSV as a format does not define a way to store values other than of the string type, CsvDataSource is capable of recognizing values of the following types by their string representations:

  • Integer
  • Long
  • Double
  • Boolean
  • Date

For recognition of data types, string representations of corresponding values must be formed using invariant culture settings.
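As an illustration of what recognizing types "by their string representations" can look like, here is a minimal sketch in plain Java. It is not CsvDataSource's actual implementation. The class name, the order of the checks, and the date pattern (inferred from the sample data used in this article) are all assumptions.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.util.Locale;

final class CsvValueRecognizer {
    // Assumed date layout, e.g. "1989-04-01 4:00:00 pm"; the am/pm marker is matched case-insensitively.
    private static final DateTimeFormatter DATE_TIME = new DateTimeFormatterBuilder()
            .parseCaseInsensitive()
            .appendPattern("uuuu-MM-dd h:mm:ss a")
            .toFormatter(Locale.ENGLISH);

    // Tries the narrowest type first and falls back to the original string.
    static Object recognize(String raw) {
        String text = raw.trim();
        try { return Integer.valueOf(text); } catch (NumberFormatException ignored) { }
        try { return Long.valueOf(text); } catch (NumberFormatException ignored) { }
        // Invariant-culture style decimal: digits, a period, digits.
        if (text.matches("[+-]?\\d+\\.\\d+")) return Double.valueOf(text);
        if (text.equalsIgnoreCase("true") || text.equalsIgnoreCase("false")) return Boolean.valueOf(text);
        try { return LocalDateTime.parse(text, DATE_TIME); } catch (DateTimeParseException ignored) { }
        return text;
    }

    public static void main(String[] args) {
        String[] samples = {"30", "3000000000", "1.5", "true", "1989-04-01 4:00:00 pm", "John Doe"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + recognize(sample).getClass().getSimpleName());
        }
    }
}
```

The fixed period as decimal separator mirrors the invariant-culture requirement: a value written as "1,5" would stay a string in this sketch.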

Let’s have a look at how a CSV data source can be used for populating templates. In template documents, a CsvDataSource instance should be treated in the same way as if it were a DataTable instance. The following are the sample CSV, the template and the code to populate the template.

CSV

John Doe,30,1989-04-01 4:00:00 pm
Jane Doe,27,1992-01-31 07:00:00 am
John Smith,51,1968-03-08 1:00:00 pm

Template

<<foreach [in persons]>>Name: <<[Column1]>>, Age: <<[Column2]>>, Date of Birth: <<[Column3]:"dd.MM.yyyy">>
<</foreach>>
Average age: <<[persons.average(p => p.Column2)]>>

Code

The following is the report that will be generated as a result.

Name: John Doe, Age: 30, Date of Birth: 01.04.1989
Name: Jane Doe, Age: 27, Date of Birth: 31.01.1992
Name: John Smith, Age: 51, Date of Birth: 08.03.1968
Average age: 36
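To show where each value in this report comes from, the following plain-Java sketch rebuilds the same lines by hand from the sample CSV rows. It does not use GroupDocs.Assembly and is not a substitute for the template engine. The class name, the date pattern and the rounding of the average are assumptions made for illustration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class CsvReportSketch {
    // Assumed input layout of the third column, e.g. "1989-04-01 4:00:00 pm".
    private static final DateTimeFormatter INPUT = new DateTimeFormatterBuilder()
            .parseCaseInsensitive()
            .appendPattern("uuuu-MM-dd h:mm:ss a")
            .toFormatter(Locale.ENGLISH);
    // Same pattern as the template's "dd.MM.yyyy" format string.
    private static final DateTimeFormatter OUTPUT = DateTimeFormatter.ofPattern("dd.MM.yyyy", Locale.ROOT);

    static List<String> buildReport(List<String> csvRows) {
        List<String> lines = new ArrayList<>();
        long ageSum = 0;
        for (String row : csvRows) {
            String[] columns = row.split(","); // Column1, Column2, Column3
            int age = Integer.parseInt(columns[1].trim());
            LocalDateTime birth = LocalDateTime.parse(columns[2].trim(), INPUT);
            lines.add("Name: " + columns[0].trim() + ", Age: " + age
                    + ", Date of Birth: " + OUTPUT.format(birth));
            ageSum += age;
        }
        lines.add("Average age: " + Math.round((double) ageSum / csvRows.size()));
        return lines;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
                "John Doe,30,1989-04-01 4:00:00 pm",
                "Jane Doe,27,1992-01-31 07:00:00 am",
                "John Smith,51,1968-03-08 1:00:00 pm");
        buildReport(rows).forEach(System.out::println);
    }
}
```

The ages sum to 108 over three rows, which gives the average of 36 shown in the report.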

In this example, we have used Column1, Column2 and so on for the column names in the template. However, if the first row in the CSV file contains the column names, you can configure CsvDataSource to treat the first row as column names. The CSV, template, and code, in this case, would be the following:

CSV

Name,Age,Birth
John Doe,30,1989-04-01 4:00:00 pm
Jane Doe,27,1992-01-31 07:00:00 am
John Smith,51,1968-03-08 1:00:00 pm

Template

<<foreach [in persons]>>Name: <<[Name]>>, Age: <<[Age]>>, Date of Birth: <<[Birth]:"dd.MM.yyyy">>
<</foreach>>
Average age: <<[persons.Average(p => p.Age)]>>

Code

Working with XML Data Sources

To access XML data while building a report, you can use the facilities of DataSet to read XML into it and then pass it to the assembler as a data source. However, if your scenario does not permit specifying an XML schema while loading XML into DataSet, all attributes and text values of XML elements are loaded as strings. Even when an XML schema is not provided, XmlDataSource is capable of recognizing values of the following types by their string representations:

  • Integer
  • Long
  • Double
  • Boolean
  • Date

The following are different scenarios to deal with the XML data sources:

1. If a top-level XML element contains only a sequence of elements of the same type.

In this case, an XmlDataSource instance should be treated in the same way as if it were a DataTable instance.

XML

<Persons>
   <Person>
      <Name>John Doe</Name>
      <Age>30</Age>
      <Birth>1989-04-01 4:00:00 pm</Birth>
   </Person>
   <Person>
      <Name>Jane Doe</Name>
      <Age>27</Age>
      <Birth>1992-01-31 07:00:00 am</Birth>
   </Person>
   <Person>
      <Name>John Smith</Name>
      <Age>51</Age>
      <Birth>1968-03-08 1:00:00 pm</Birth>
   </Person>
</Persons>

Template

<<foreach [in persons]>>Name: <<[Name]>>, Age: <<[Age]>>, Date of Birth: <<[Birth]:"dd.MM.yyyy">>
<</foreach>>
Average age: <<[persons.Average(p => p.Age)]>>

Code

2. If a top-level XML element contains attributes or nested elements of different types.

In this scenario, an XmlDataSource instance should be treated in template documents in the same way as if it was a DataRow instance.
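The two scenarios differ only in the shape of the top-level element. The plain-Java sketch below restates that rule as a small check using the JDK's DOM parser. It is a hypothetical helper for reasoning about your own XML, not part of GroupDocs.Assembly, and its names and exact criteria are assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XmlShapeSketch {
    // True when the root has no attributes and all of its child elements share one tag name
    // (scenario 1, table-like); false otherwise (scenario 2, row-like).
    static boolean looksLikeTable(String xml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
        if (root.hasAttributes()) return false;
        String firstName = null;
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.ELEMENT_NODE) continue; // skip whitespace text nodes
            if (firstName == null) firstName = child.getNodeName();
            else if (!firstName.equals(child.getNodeName())) return false;
        }
        return firstName != null;
    }

    public static void main(String[] args) {
        String persons = "<Persons><Person><Name>John Doe</Name></Person>"
                + "<Person><Name>Jane Doe</Name></Person></Persons>";
        String person = "<Person><Name>John Doe</Name><Age>30</Age><Child>Ann Doe</Child></Person>";
        System.out.println("Persons is table-like: " + looksLikeTable(persons));
        System.out.println("Person is table-like: " + looksLikeTable(person));
    }
}
```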

XML

<Person>
   <Name>John Doe</Name>
   <Age>30</Age>
   <Birth>1989-04-01 4:00:00 pm</Birth>
   <Child>Ann Doe</Child>
   <Child>Charles Doe</Child>
</Person>

Template

Name: <<[Name]>>, Age: <<[Age]>>, Date of Birth: <<[Birth]:"dd.MM.yyyy">>
Children:
<<foreach [in Child]>><<[Child_Text]>>
<</foreach>>

Code

For more details on how to use CSV and XML data sources and how to create the document templates, please visit this documentation article. Try out all these features yourself by downloading or cloning the examples project from GitHub. If you find anything confusing, do let us know via our forum.

Posted in GroupDocs.Assembly Product Family

Move Pages in PDF using GroupDocs.Merger for .NET and Java

If you are a regular user of Word Processing, Spreadsheet, Presentation and PDF documents, you know how much of a struggle typing up these documents can be. Typing, formatting and correcting the layout take considerable effort, as even a simple formatting task usually requires multiple clicks.

Although some document tools create new pages automatically as needed, it is quite difficult to move pages within a document. Consider a scenario where you have just completed a long report. It includes sections, graphs, tables, and images. You receive feedback from your manager that the content on one page should have been part of another. You realize that it is not easy to copy and paste the material, as it contains different images that need to be taken care of. What will you do?

Either you can go for the long procedure of selecting the navigation panel and clicking multiple buttons, or you can use just a line of code to do it all for you. If you are not a regular user of these documents, then you can opt for the long procedure. However, there is a simpler way to complete this task.

Move Pages in PDF using GroupDocs.Merger

If most of your company’s documents rely on the tools mentioned above, you should consider making this task simpler for yourself by using GroupDocs.Merger. GroupDocs.Merger is an easy-to-use API, available for both .NET and Java, which allows you to manipulate document structure and rearrange pages through its MovePage option. Instead of clicking multiple buttons to complete the small task of moving pages, follow these simple steps:

  • Specify current and new page numbers
  • Load your desired document
  • Call MovePage method with MoveOptions
  • Save the resultant document
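Before looking at the API calls, it may help to see what "current page number" and "new page number" mean for the page order. The sketch below models a document as a plain Java list of pages and uses 1-based page numbers. It is an illustration only: the class and method are hypothetical and it does not call GroupDocs.Merger.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MovePageSketch {
    // Returns a copy of the page list with one page moved; page numbers are 1-based.
    static <T> List<T> movePage(List<T> pages, int currentPageNumber, int newPageNumber) {
        if (currentPageNumber < 1 || currentPageNumber > pages.size()
                || newPageNumber < 1 || newPageNumber > pages.size()) {
            throw new IllegalArgumentException("Page numbers must be between 1 and " + pages.size());
        }
        List<T> result = new ArrayList<>(pages);
        T page = result.remove(currentPageNumber - 1); // take the page out
        result.add(newPageNumber - 1, page);           // put it back at its new position
        return result;
    }

    public static void main(String[] args) {
        List<String> pages = Arrays.asList("P1", "P2", "P3", "P4");
        System.out.println(movePage(pages, 1, 3)); // page 1 becomes page 3
    }
}
```

In this model, moving page 1 to position 3 shifts pages 2 and 3 up by one and leaves page 4 where it was.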

The following code demonstrates how a PDF document page can be moved using GroupDocs.Merger for .NET:

Java developers can move a page in a PDF document using GroupDocs.Merger for Java with the following lines of code:

See the following screenshots of source and resultant PDF documents:

Posted in GroupDocs.Merger Product Family

Delete one or more pages from a document in C#

We’ll see how to remove a single page or a collection of specific page numbers from the source document using GroupDocs.Merger for .NET. This is a back-end API that can be integrated in any new or existing .NET application without any dependency.

Does it require any software installation?
Absolutely not. It doesn’t matter if MS Office or any PDF reader is installed on your computer. Using a single GroupDocs.Merger for .NET DLL you can develop a web, console or desktop application to delete the pages.

How simple is it?
Below are the steps to remove document page(s):

  • Initialise RemoveOptions class with page numbers to remove
  • Instantiate Merger object with source document path or stream
  • Call RemovePages method and pass RemoveOptions object to it
  • Call Save method and pass desired file path to save resultant document
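The effect of these steps on the page order can be pictured with a small model. The following sketch is written in plain Java rather than C# and does not use GroupDocs.Merger. It treats a document as a list of pages and removes the given 1-based page numbers. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class RemovePagesSketch {
    // Returns a copy of the page list without the given 1-based page numbers.
    static <T> List<T> removePages(List<T> pages, int... pageNumbers) {
        Set<Integer> toRemove = new HashSet<>();
        for (int number : pageNumbers) {
            if (number < 1 || number > pages.size()) {
                throw new IllegalArgumentException("No such page: " + number);
            }
            toRemove.add(number);
        }
        List<T> result = new ArrayList<>();
        for (int i = 0; i < pages.size(); i++) {
            if (!toRemove.contains(i + 1)) result.add(pages.get(i)); // keep pages not marked for removal
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> pages = Arrays.asList("P1", "P2", "P3", "P4", "P5");
        System.out.println(removePages(pages, 2, 4));
    }
}
```

Collecting all page numbers first and filtering once avoids the index shifting that happens when pages are deleted one at a time.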

Explore API documentation to learn more about the supported features. You can download or clone this open-source example project for API evaluation. In case of any issue, post it on the forum.

Posted in GroupDocs.Merger Product Family

Extract Data from Invoices or Receipts in C#

Invoices and receipts are documents used to record transactions in a particular format when goods or services are bought or sold. Things have gone digital, and with the popularity of online shopping, digital invoices are widely used. Processing a number of digital invoices and extracting the information manually is a complex and time-consuming process, so you need a faster and more efficient way. In this article, I am going to show you how to extract data from a PDF invoice or receipt programmatically in C# using the GroupDocs.Parser for .NET API.

Workflow for Extracting Data from a PDF Invoice

The following is the workflow of how to extract the data from a PDF invoice using GroupDocs.Parser for .NET.

  • Create table parameters for extracting data from the tables.
  • Create template items for extracting data from fields.
  • Parse the invoice according to the given template.
  • Extract the data.

The Invoice

The following is the screenshot of a sample PDF invoice that I’ll use for extracting the data. You can download this invoice from our GitHub repository.

The Code

  • Create the template for the given invoice (read more about templates).
  • Parse the invoice and extract data.

The Output

To learn more about the GroupDocs.Parser for .NET API, visit the documentation. Reach us at our forum if you have any questions or queries.

Posted in GroupDocs.Parser Product Family

Control documents comparison sensitivity in Java

Document comparison is one of the most common procedures, practiced in almost all digital businesses. The objective is always the same: highlight the inserted or deleted items, detect the style changes and generate a summary. Let’s see how GroupDocs.Comparison for Java can help us with this scenario. This is a back-end API that can be integrated into any Java application irrespective of the framework. Explore the API documentation to learn more about the supported features and file formats.

For those who are already using the API, we’ll discuss the new features and improvements introduced in version 19.10.
How about controlling the document comparison sensitivity? We’ve added a sensitivity property to the ComparisonSettings class. This option defines the limit, in percent, at which an element is detected as deleted or inserted.

Minimal value
The minimal value is 0: the comparison process does not occur for any length of sequences of two compared objects.

Value by default
The default percentage is 75: comparison occurs when the percentage of deleted or inserted elements in relation to all elements does not exceed 75%.

Maximum value
The maximum value is 100%: comparison occurs at any length of a common sub-sequence of two compared objects.

Now let’s understand this with a use-case. Suppose we have two words:

  1. oneSource
  2. twoTarget

These two words have a very small common sub-sequence. Therefore, when comparing them at 75% sensitivity, it is not taken into account and we get a completely removed and inserted word as follows:

(twoTarget)[oneSource]

But at 100% sensitivity, this sub-sequence is represented in a different way, despite the fact that it consists of only two letters.

(tw)o[n](Targ)e[Source](t)
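The arithmetic behind this example can be checked with a short sketch. It is written in plain Java and is only one reading of the threshold rule described above, not the library's actual algorithm. The class and method names are hypothetical, and the count of two shared letters ("o" and "e") is taken from the result shown above.

```java
import java.util.Locale;

final class SensitivitySketch {
    // Share of elements that are deleted (or inserted), as a percentage of all elements.
    static double changedPercent(int totalElements, int commonElements) {
        return 100.0 * (totalElements - commonElements) / totalElements;
    }

    // Assumed rule: compare element by element only while the changed share
    // does not exceed the sensitivity limit; otherwise replace the word as a whole.
    static boolean comparesInDetail(int totalElements, int commonElements, int sensitivityPercent) {
        return changedPercent(totalElements, commonElements) <= sensitivityPercent;
    }

    public static void main(String[] args) {
        int letters = "oneSource".length(); // 9, same as "twoTarget"
        int shared = 2;                     // "o" and "e", per the example above
        System.out.println(String.format(Locale.ROOT, "changed: %.1f%%", changedPercent(letters, shared)));
        System.out.println("detailed at 75: " + comparesInDetail(letters, shared, 75));
        System.out.println("detailed at 100: " + comparesInDetail(letters, shared, 100));
    }
}
```

With 7 of 9 letters changed (about 77.8%), the words exceed the default limit of 75 and are replaced as a whole, while a limit of 100 lets the letter-level result through.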

Isn’t it amazing? You can now get more concise comparison results just by controlling the sensitivity.

Did you ever think of getting the coordinates of document changes or differences? It could be confusing at first, so let me elaborate. In your output document, you get every detail of inserted, deleted or style-changed items. The new thing is that you can also get the coordinates where the changes or differences actually occurred. Currently this feature is supported only for Word, PDF, Slide and Diagram formats.

You can get the API from the download section. We also have an open-source example project on GitHub. If you face any issue while evaluating the API, you can post it on the forum.

Posted in GroupDocs.Comparison Product Family