Show & Hide Page Borders while Converting Documents to HTML in C#

You may want to convert a document to HTML to publish its content on your website, or you may be using an online submission system that only accepts HTML. In either case, you need a DOC to HTML converter, and if you need to do the conversion programmatically, this article is for you. It covers the following ways to convert documents to HTML in C#:

  • Simplest conversion of documents like DOCX to HTML in C#.
  • Convert to HTML with customized options.
  • Convert using the option to show or hide page borders.

Prerequisite – C# Document Conversion Library

GroupDocs.Conversion for .NET is a powerful, easy-to-use API that converts documents between a wide range of supported formats. You can download the API from the downloads section or install it from NuGet.

Convert DOCX to HTML in C# – Simple

This is the simplest and most common conversion. It works for any document type on the supported formats list, not only DOCX.

The document is converted to HTML and the output is saved to the path you pass to the Convert method. The following short code sample converts a DOCX file to HTML using the Converter class in C#.

// Converting DOCX to HTML in C#
using (Converter converter = new Converter("document.docx"))
{
    MarkupConvertOptions options = new MarkupConvertOptions();
    converter.Convert("converted.html", options);
}

Convert to HTML with Customized Options

GroupDocs.Conversion provides further options for tuning the conversion result. The customizable options include:

  • Fixed Layout
  • Fixed Layout – Show Borders
  • Format
  • Page Number
  • Pages
  • Pages Count
  • Use PDF
  • Watermark
  • Zoom

See the documentation or the GitHub samples for details of each option. The code sample below applies some of these customizations while converting the same DOCX to HTML.

// Converting DOCX to HTML in C# with advanced options.
using (Converter converter = new Converter("document.docx"))
{
    MarkupConvertOptions options = new MarkupConvertOptions
    { // Setting customized options
        PageNumber = 2,
        PagesCount = 1,
        FixedLayout = true
    };
    converter.Convert("converted.html", options);
}

Convert to HTML – Show or Hide Page Borders

Finally, you can control the visibility of page borders when converting documents to HTML in C#. GroupDocs.Conversion for .NET v20.3 adds this control along with some other new features and improvements. As the example below shows, setting the FixedLayoutShowBorders property of the MarkupConvertOptions class to true or false shows or hides the page borders in the resulting HTML document.

// Converting DOCX to HTML in C# with show or hide borders control.
using (Converter converter = new Converter("document.docx"))
{
    MarkupConvertOptions options = new MarkupConvertOptions
    {
        PageNumber = 2,
        FixedLayout = true,
        PagesCount = 1,
        FixedLayoutShowBorders = false
    };
    converter.Convert("converted.html", options);
}

The images below show the original DOCX document and the converted HTML with and without page borders.

Original DOCX Document
HTML files converted from the DOCX with page borders shown and hidden

See more about GroupDocs.Conversion

Let’s talk more @ Free Support Forum.

Posted in GroupDocs.Conversion Product Family

Convert JPG, PNG, GIF, and TIFF Images to PDF in C#

We convert an Image to PDF because it gives assurance that the image will display correctly across devices without being altered. PDF images are ideal for printing and for storing images online when we intend them to be downloaded. PDFs keep the images in one document so viewers can print and save them easily.

We will use the GroupDocs.Conversion for .NET library to convert raster images to PDF. The library lets us convert the following image formats to PDF:

  • JPG
  • TIFF
  • TIF
  • JPEG
  • PNG
  • GIF
  • BMP
  • ICO
  • CMX
  • DIB
  • JPC

Please check the documentation for the complete list of supported formats.

Convert Image to PDF

First, we need to install GroupDocs.Conversion’s NuGet package. The Development Environment Installation and Configuration article explains in detail the steps to install the NuGet package in Visual Studio.

To convert an image to PDF, follow these steps:

  • Create a new instance of the Converter class and pass the source document path as a constructor parameter.
  • Instantiate PdfConvertOptions class.
  • Call Convert method of Converter class instance and pass filename for the converted document and the instance of PdfConvertOptions from the previous step.
using (Converter converter = new Converter("C:\\Sample.jpg"))
{
    PdfConvertOptions options = new PdfConvertOptions();
    converter.Convert("C:\\Converted.pdf", options);
}

We tried to convert the following image to PDF and got this output file.

JPEG Image

Convert to PDF with advanced options

GroupDocs.Conversion provides PdfConvertOptions to give us control over conversion results when converting Image to PDF. Some of the additional options are:

  • Width – desired image width after conversion
  • Height – desired image height after conversion
  • MarginTop – desired page top margin after conversion
  • MarginBottom – desired page bottom margin after conversion
  • MarginLeft – desired page left margin after conversion
  • MarginRight – desired page right margin after conversion
  • Rotate – page rotation. Available options are: None, On90, On180, On270

The following code sample uses these additional options to convert an image to PDF.

using (Converter converter = new Converter("C:\\Data\\Sample.jpg"))
{
    PdfConvertOptions options = new PdfConvertOptions
    {
        Width = 233,
        Height = 175,
        MarginTop = 20,
        MarginBottom = 20,
        MarginLeft = 20,
        MarginRight = 20,
        Rotate = Rotation.On180
    };
    converter.Convert("C:\\Data\\AdvancedConversion.pdf", options);
}

It generates the following output file.

Output Document After Conversion

More resources

Please check the documentation to learn about the features that the GroupDocs.Conversion for .NET API offers, and the GitHub examples to see these features in action.

Posted in GroupDocs.Conversion Product Family

Search Image Signatures in Word, Excel, PowerPoint, PDF Documents

Electronic Signature is the digital data that is attached to an electronically transmitted document. It verifies the sender’s intention to sign the document.

GroupDocs.Signature for .NET. Search Image Signatures in Documents

As a developer, you can programmatically sign documents and also verify if the document is properly signed with the right signature. GroupDocs.Signature for .NET API provides many such features and gives you complete control over the electronic signing process. It provides different electronic signature implementations, like:

  • Text Signatures (text, annotations, watermarks, stickers)
  • Image Signature – with options like image effects and rotation.
  • Digital Signature – based on digital certificates.
  • Barcode Signature
  • QR code Signature
  • Stamp Signature
  • Metadata Signature
  • FormField Signature

This article shows how a C# developer may search electronic signatures of various types within any document that is supported by the GroupDocs.Signature API for .NET.

Search QR Code signatures in documents

The following minimal example searches for QR Code signatures in a PDF document. The same code works for any of the supported file formats.

// Search QR Code Signatures in PDF document using C#
using (Signature signature = new Signature("signed.pdf"))
{
    // search for signatures in document
    List<QrCodeSignature> signatures = signature.Search<QrCodeSignature>(SignatureType.QrCode);
    Console.WriteLine($"\nSource document contains following signatures.");
    foreach (var qrCodeSignature in signatures)
    {
        Console.WriteLine($"QRCode signature found at page {qrCodeSignature.PageNumber} with type {qrCodeSignature.EncodeType.TypeName} and text {qrCodeSignature.Text}");
    }
}

Search Barcode, QR code, and other signatures in documents

Whether it is a Barcode signature, QR Code signature, Text signature or even a hidden Metadata signature, it is simple to find any type of signature present in a document. The code below shows how different signature types can be extracted from a document in one search.

using (Signature signature = new Signature("signed.pdf"))
{
    // search for signatures in document
    SearchResult result = signature.Search(SignatureType.Text, SignatureType.Barcode, SignatureType.QrCode, SignatureType.Metadata);
    if (result.Signatures.Count > 0)
    {
        Console.WriteLine($"\nSource document contains following signatures.");
        foreach (var resSignature in result.Signatures)
        {
            Console.WriteLine($"Signature found at page {resSignature.PageNumber} with type {resSignature.SignatureType} and Id#: {resSignature.SignatureId}");
        }
    }
    else
    {
        Console.WriteLine("No signature was found.");
    }
}

Search Image Signature in PDF documents and grab images content

As of v20.2, the .NET Signature API not only finds signatures of various types but can also return the image content of image signatures in presentations, spreadsheets, word processing and PDF documents. The following source code shows how to grab the image content after a successful image signature search in a PDF document.

using (Signature signature = new Signature("signed.pdf"))
{
    // setup search options
    ImageSearchOptions searchOptions = new ImageSearchOptions()
    {
        // enable grabbing image content feature
        ReturnContent = true,
        // set minimum size if needed
        MinContentSize = 0,
        // set maximum size if needed
        MaxContentSize = 0,
        // specify exact image type to be returned
        ReturnContentType = FileType.JPEG,
    };
    // search document
    List<ImageSignature> signatures = signature.Search<ImageSignature>(searchOptions);
    Console.WriteLine($"\nSource document contains following image signature(s).");
    // output signatures
    foreach (ImageSignature imageSignature in signatures)
    {
        Console.Write($"Found Image signature at page {imageSignature.PageNumber} and size {imageSignature.Size}.");
        Console.WriteLine($"Location at {imageSignature.Left}-{imageSignature.Top}. Size is {imageSignature.Width}x{imageSignature.Height}.");
    }
    // save signature images, creating the output folder if it does not exist
    string outputPath = "Output";
    if (!Directory.Exists(outputPath))
    {
        Directory.CreateDirectory(outputPath);
    }
    int i = 0; // counter used to give each saved image a unique file name
    foreach (ImageSignature imageSignature in signatures)
    {
        string outputFilePath = System.IO.Path.Combine(outputPath, $"image{i}{imageSignature.Format.Extension}");
        using (FileStream fs = new FileStream(outputFilePath, FileMode.Create))
        {
            fs.Write(imageSignature.Content, 0, imageSignature.Content.Length);
        }
        i++;
    }
}

Key Resources for GroupDocs.Signature for .NET

Explore more about the GroupDocs.Signature for .NET API. You can freely contact support if you need any help.

Posted in GroupDocs.Signature Product Family

Compare Two Files or More in C#

Document comparison is one of the most common requirements in today’s programming world. Whether you compare Word files, Excel files, PDF documents, text files or any other document format, accuracy is the key factor.

Compare Files with Document Comparison API for .NET Developers

This article gives you an idea of how GroupDocs.Comparison helps programmers compare two or more documents in several ways. On-premise APIs of GroupDocs.Comparison are currently available for .NET and Java; this article is aimed at C# developers.

Compare Excel, Word Files or any Document in C#

GroupDocs.Comparison allows developers to compare two or more documents. The resulting document shows the changes between the compared files. The code below shows how to compare two Excel files in just three lines of C#.

  1. Instantiate the Comparer object with the source document path.
  2. Call the Add method to specify the target document path.
  3. Call the Compare method.
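using (Comparer comparer = new Comparer("source.xlsx"))
{
    // target and result file names are placeholders
    comparer.Add("target.xlsx");
    comparer.Compare("result.xlsx");
}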

Comparing Excel spreadsheets or Microsoft Word documents is only a subset of the comparisons supported by the .NET API of GroupDocs.Comparison. Below is the list of supported formats; see the documentation for the current list.

Document Type              File Formats
Presentations              PPT, PPTX, PPS, PPSX, POT, POTX
OpenDocument               ODT, ODP, OTP, ODS, OTT
Microsoft Visio Drawings   VSD, VSDX, VSS, VST, VDX
Note Taking                ONE

Compare two or more Spreadsheets or OneNote Documents in C#

After the release of GroupDocs.Comparison for .NET 20.2, the API now supports:

  • Comparison of more than two Microsoft Excel and OpenOffice spreadsheets (XLS, XLSX, ODS, CSV, …)
  • Compare multiple Microsoft OneNote documents.

The API already supports the comparison of multiple files for various document formats. The following code snippet shows how multiple Excel files can be compared in C#.

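using (Comparer comparer = new Comparer("source.xlsx"))
{
    // target and result file names are placeholders
    comparer.Add("target1.xlsx");
    comparer.Add("target2.xlsx");
    comparer.Compare("result.xlsx");
}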

Compare Documents from Stream in C#

As a programmer, you are not limited to comparing documents on local storage; you can also compare documents from streams.

  1. Just initialize the Comparer object with the source document stream.
  2. Call Add method and specify the target stream.
  3. Call Compare method
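using (Comparer comparer = new Comparer(File.OpenRead("source.docx")))
{
    // target and result file names are placeholders
    comparer.Add(File.OpenRead("target.docx"));
    comparer.Compare(File.Create("result.docx"));
}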

Compare Password Protected Word Documents / Excel Spreadsheet in C#

Password protection is common for official documents. The document comparison .NET API lets developers compare password-protected documents.

The code differs only slightly from the code for unprotected documents: while loading each document, use LoadOptions to specify its password. Below is sample comparison code.

using (Comparer comparer = new Comparer("source.docx", new LoadOptions() { Password = "1234" }))
{
    comparer.Add("target1.docx", new LoadOptions() { Password = "5678" });
    comparer.Add("target2.docx", new LoadOptions() { Password = "5678" });
    comparer.Add("target3.docx", new LoadOptions() { Password = "5678" });
    comparer.Compare("result.docx"); // result file name is a placeholder
}

Comparison of Documents with Specific Settings

Going one step beyond a plain comparison, you can compare multiple documents with customized comparison settings, using code similar to the sample below.

CompareOptions lets you specify comparison options such as the font styling for detected changes.

using (Comparer comparer = new Comparer("source.docx"))
{
    comparer.Add("target.docx"); // target file name is a placeholder
    CompareOptions compareOptions = new CompareOptions()
    {
        InsertedItemStyle = new StyleSettings() { FontColor = System.Drawing.Color.Yellow }
    };
    comparer.Compare("result.docx", compareOptions);
}

Compare Programming Language Files in C#

GroupDocs is continuously adding support for comparing more file formats. Since release v20.2, you can also compare JSON files using the .NET API. The following programming language file formats were recently added to the supported document formats list:

ActionScript    Objective C/C++
Java            Shell/Batch Script, Log, Diff, Config, LESS

Let’s Talk

You can build your own application using the features highlighted above. We will be delighted if you contact us on the forum to discuss a feature, solve a problem, or share your feedback.

Posted in GroupDocs.Comparison Product Family

Convert PowerPoint PPT, PPTX and OpenOffice Presentations to PDF in C#

PDF, the Portable Document Format, is one of the most commonly used file formats. The PPT and PPTX formats of Microsoft PowerPoint are similarly popular for business documents. Given the popularity of both formats and the fixed-layout nature of PDF, there is often a need to convert PPT/PPTX to PDF.

Convert PPT to PDF

This article gives .NET developers a solution for this common conversion. GroupDocs supports the conversion of 50+ document formats and provides on-premise APIs (.NET and Java), cloud APIs and online conversion apps. After reading this article, you will be familiar with different ways to convert Microsoft and OpenOffice presentations using GroupDocs.Conversion for .NET.

Convert PPT to PDF in C#

GroupDocs.Conversion makes this popular conversion easy. With the two lines of C# code below, you can convert any type of presentation, such as PPTX or PPT, to PDF.

using (Converter converter = new Converter("sample.pptx"))
{
    PdfConvertOptions options = new PdfConvertOptions();
    converter.Convert("converted.pdf", options);
}

Convert Specific Slides of PPT to PDF in C#

You may need to convert only selected slides rather than the whole presentation. GroupDocs.Conversion allows converting specific slides of a presentation to the resulting PDF document. The C# source code below shows how to achieve this.

using (Converter converter = new Converter("sample.ppt"))
{
    PdfConvertOptions options = new PdfConvertOptions
    {
        Pages = new List<int> { 1, 3 }
    };
    converter.Convert("converted.pdf", options);
}

Convert Consecutive Pages of PPTX to PDF using C#

A small change in the requirement needs only a small change in the code. To convert a range of consecutive pages to PDF, set the PageNumber and PagesCount properties of the convert options class (here PdfConvertOptions).

using (Converter converter = new Converter("sample.pptx"))
{
    PdfConvertOptions options = new PdfConvertOptions
    {
        PageNumber = 2,
        PagesCount = 3
    };
    converter.Convert("converted.pdf", options);
}

Possible Conversions of PPT/PPTX

PDF is not the only possible target format. The documentation lists all the possible conversions. More usefully for developers, you can retrieve all the possible conversion formats of PPT/PPTX presentations by calling the GetPossibleConversions() method of the Converter class.

const string sourceFile = "sample.pptx";
using (Converter converter = new Converter(sourceFile))
{
    PossibleConversions conversions = converter.GetPossibleConversions();
    Console.WriteLine("{0} is of type {1} and could be converted to:", sourceFile, conversions.Source.Extension);
    foreach (var conversion in conversions.All)
    {
        Console.WriteLine("\t {0} as {1} conversion.", conversion.Format, conversion.IsPrimary ? "primary" : "secondary");
    }
}

PPT to PDF Conversion with Advanced Options

There are many more options for converting presentations. They are rarely needed, but important when they are. PdfConvertOptions gives control over the conversion result when converting to PDF. Along with the common conversion options, PdfConvertOptions has many additional options that are described in the documentation. As an overview, the sample below customizes the PPT conversion with a page range, rotation, DPI, width and height:

using (Converter converter = new Converter("sample.ppt"))
{
    PdfConvertOptions options = new PdfConvertOptions
    {
        PageNumber = 2,
        PagesCount = 1,
        Rotate = Rotation.On180,
        Dpi = 300,
        Width = 1024,
        Height = 768
    };
    converter.Convert("converted.pdf", options);
}

Add Watermark while converting PPTX or PPT to PDF in C#

Want to mark your presentation while converting it to PDF format? Add a watermark to the resulting PDF. The source code below shows how to apply a watermark when a PPT/PPTX presentation is converted to PDF format.

using (Converter converter = new Converter("sample.ppt"))
{
    WatermarkOptions watermark = new WatermarkOptions
    {
        Text = "GroupDocs Watermark",
        Color = Color.Red,
        Width = 100,
        Height = 100,
        Background = true
    };
    PdfConvertOptions options = new PdfConvertOptions
    {
        Watermark = watermark
    };
    converter.Convert("converted.pdf", options);
}

Let’s Talk

You can build your own application using the features highlighted above. We will be delighted if you contact us on the forum to discuss a feature, solve a problem, or share your feedback.

Posted in GroupDocs.Conversion Product Family

Convert WebP to JPG, PNG, and PDF in Java

WebP is an image format introduced by Google that provides lossless and lossy compression for images on the web. WebP images are smaller than images in well-known and widely used formats like PNG and JPG, and hence provide a faster web experience.

Free Online WebP to JPG Converter

If you are here just to convert your WebP files to JPG, PNG, or PDF online, use the free conversion tool by GroupDocs. However, if you want to do the same thing programmatically, continue reading.

WebP images offer transparency like PNG and animation like GIF and, most importantly for web developers, are smaller than JPG files of comparable quality. Even so, the format is not yet universally supported. This incomplete support sometimes forces developers to convert WebP images into JPG, PNG or other formats.
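
Before converting, it can help to confirm that a file really is WebP, since extensions are not always reliable. The following is a minimal sketch that uses only the JDK, not the GroupDocs.Conversion API; the class and method names are hypothetical. It relies on the WebP container layout: a RIFF header whose bytes 8 to 11 spell "WEBP".

```java
import java.nio.charset.StandardCharsets;

final class WebPSignature {

    // A WebP file is a RIFF container: bytes 0-3 are "RIFF", bytes 4-7 hold the
    // chunk size, and bytes 8-11 are "WEBP". Only the first 12 bytes are needed.
    static boolean isWebP(byte[] header) {
        if (header == null || header.length < 12) {
            return false;
        }
        String riff = new String(header, 0, 4, StandardCharsets.US_ASCII);
        String webp = new String(header, 8, 4, StandardCharsets.US_ASCII);
        return riff.equals("RIFF") && webp.equals("WEBP");
    }

    public static void main(String[] args) {
        // Hand-built headers so the sketch runs without any files on disk.
        byte[] webpHeader = "RIFF\0\0\0\0WEBPVP8 ".getBytes(StandardCharsets.US_ASCII);
        byte[] pngHeader = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A, 0, 0, 0, 0x0D};
        System.out.println("WebP header detected: " + isWebP(webpHeader));
        System.out.println("PNG header detected as WebP: " + isWebP(pngHeader));
    }
}
```

In a real application you would read the first 12 bytes of the source file and only pass it to the converter when the check succeeds.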

Convert WebP image to JPG, PNG or PDF formats.

GroupDocs provides the solution to convert 50+ document and image file formats. As a developer, you can use GroupDocs.Conversion on-premise and cloud APIs to convert WebP images in your Java, .NET, and many other supported programming languages based applications. As a normal user, you can use GroupDocs.Conversion App to get your WebP image files converted.

Convert WebP to JPG in Java

While using GroupDocs.Conversion API, you can get the possible conversion formats of the source document by using the getPossibleConversions() method of Class ConversionHandler. You can either pass the source document as an InputStream or just pass the file extension of the source document to get the possible conversion formats.

The source code below shows how to convert a WebP image to JPG format. To convert a WebP file to another supported format, change the output format of the image by setting the appropriate ImageFileType. For instance, to convert WebP to PNG, change the ImageFileType from JPG to PNG.

ConversionHandler conversionHandler = new ConversionHandler(Utilities.getConfiguration());
// Create and set Image Saving Options
SaveOptions saveOption = new ImageSaveOptions();
// Convert the WebP image to JPG or PNG format
String fileName = "image.webp";
ConvertedDocument convertedDocumentPath = conversionHandler.convert(fileName, saveOption);
SaveInfo saveInfo = + "." + convertedDocumentPath.getFileType());

Convert WebP to PDF in Java

A WebP image can be converted not only into other image formats; the GroupDocs.Conversion API also allows conversion into many document formats. The following example shows how a Java developer can convert a WebP image into PDF (Portable Document Format).

ConversionHandler conversionHandler = new ConversionHandler(Utilities.getConfiguration());
// Create and set PDF Save Options
PdfSaveOptions saveOption = new PdfSaveOptions();
// Convert the source WebP image to PDF document.
String sourceFileName = "image.webp";
ConvertedDocument convertedDocumentPath = conversionHandler.convert(sourceFileName, saveOption);
SaveInfo saveInfo = + "." + convertedDocumentPath.getFileType());

Many other open-source examples are publicly available in the GitHub repository. Download the source code and run the examples using the getting started guide. In case of any difficulty, look at the documentation or reach us at any time on the forum.

Have a nice coding day!

Posted in GroupDocs.Conversion Product Family

GroupDocs.Total Discount Offer ends January 31st


Monthly Newsletter

January 2020

25% off Conholdate.Total
Hurry! Offer ends January 31st.

Get 25% off GroupDocs.Total for .NET and Java. Quote HOLOFF2019 when placing your order.

This offer is only available on new GroupDocs.Total purchases and cannot be used in conjunction with other offers, renewals or upgrades. Only available directly from, not through third parties or resellers. Ts&Cs Apply.

Posted in Customer Newsletters

Important Bug Fixes in GroupDocs.Viewer for .NET 19.11 document viewer API

We have rolled out another update for GroupDocs.Viewer for .NET featuring some important bug fixes as well as an improvement related to the MSI package. This release brings no new features, but it addresses some important issues related to the PDF, DWG and ODG file formats. Furthermore, a few compatibility issues that appeared under .NET Standard 2.0 have been resolved. Here is a brief overview of the bug fixes and improvements introduced in v19.11.

Issue: Rendering DWG to image (PNG/JPG) or PDF resulted in an empty output

This issue appeared for some specific DWG files when the contents of the source files were missing in the output and it resulted in blank/empty images or PDF documents.

Issue: The code hangs when rendering PDF document to HTML

One of our customers faced an issue where the API was taking too long to render a particular PDF document into HTML. We have resolved this issue and improved the performance of the API when rendering such PDF documents.

Issue: Console output is printed when rendering ODG images

In previous versions, unnecessary messages were printed in the console window while rendering ODG images. Although this did not affect the API’s functionality or the output, it might have confused developers. We have fixed this issue to prevent unexpected messages from being printed in the console window.

Issue: Compatibility issues under .NET Standard 2.0

In the previous release, we added support for .NET Standard 2.0 for cross-platform development using GroupDocs.Viewer for .NET. This enhancement raised some internal compatibility issues, which we have fixed in v19.11.

Improvement: New ProjectGuid and UpgradeCode for MSI package

We have updated the unique identifier that the OS uses to identify the application installed with an MSI package. Because of this update, you need to manually uninstall the previous version of GroupDocs.Viewer for .NET before installing v19.11 using the MSI package.

Since updates are always important, we recommend upgrading to v19.11 in your applications. If you face any issue or have any questions, feel free to share them with us via our forum.

Posted in GroupDocs.Viewer Product Family

Classify text using IAB-2 or Document taxonomies in C#

A taxonomy or classification is basically an approach in which text is systematically identified and then organized. When you are dealing with a bunch of data (text based or documents), it becomes hard to find a topic of your need until and unless this data is classified or organized. Hence, you have to classify text in order to fetch data/information quickly.

GroupDocs.Classification for .NET

GroupDocs offers a programmable document and text classification API for .NET developers. You just have to add a single DLL (GroupDocs.Classification for .NET) as a reference in your .NET project. The API allows developers to use two different taxonomies: IAB-2 (Interactive Advertising Bureau) and the Documents taxonomy.

IAB-2 text classification

IAB-2 categorizes text into multiple topics and then identifies the text based on the depth level. Call the Classify method with a text as the parameter to perform classification.

This text will be classified as Healthy_Living (IAB-2). Some more examples:

  • Sooner or later technology will overcome labor work – Technology_&_Computing (IAB-2)
  • This game has better graphics on Xiaomi Note 8 pro mobile – Video_Gaming (IAB-2)
  • We need groceries for the next month – Shopping (IAB-2)

Document taxonomy

Documents taxonomy is used to identify different document classes, such as Invoices, CVs, Forms, emails. Call Classify method for “document.pdf” file in the current directory with IAB-2 taxonomy and return 2 best results.

Call Classify method for “document.doc” file with Documents taxonomy, set precision/recall balance to “Precision” and return 4 best results.

API also facilitates classification of password-protected documents.

We recommend that you explore the available resources and evaluate the API. If there is any issue, you can raise it on the forum.

Posted in GroupDocs.Classification Product Family

View Contents of ZIP and TAR Archives using GroupDocs.Viewer for Java 19.11

We are excited to bring a major release of GroupDocs.Viewer for Java API packaging a bunch of new features, improvements, and bug fixes. In the latest release, we have added the support of viewing archives and a couple of code files as well as provided the features of working with security settings in the PDF documents. So let’s walk through the latest release of our document viewer API for Java and check out what you are going to get after upgrading to v19.11.

View ZIP and TAR Archives

The first and foremost feature of v19.11 is viewing the list of files and folders in the ZIP and TAR archives. This feature is quite handy when you want to view the list of the contents without extracting the archives.

A ZIP file packages multiple files or folders into a single archive and compresses them to reduce the file size. Similarly, TAR is a Unix archive format used to bundle files and folders. Both are archive formats; ZIP also compresses its contents, while TAR on its own only bundles them and is usually combined with a separate compressor.

In the following sections, you will see how to view a list of contents from the ZIP or TAR archives without extracting.

View List of Contents in ZIP or TAR Archives

When rendering an archive file as HTML, GroupDocs.Viewer returns an HTML page containing the list of items that are at the root of the archive. In the case of rendering as image or PDF, the API returns one or more pages depending on the number of items. The following code sample demonstrates this feature.
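
To illustrate the idea of listing root-level items without extracting anything, here is a minimal sketch that uses only the JDK's java.util.zip classes rather than the GroupDocs.Viewer API. The class and method names are hypothetical, and the archive is built in memory so that the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class ZipRootListing {

    // Returns the distinct top-level names in a ZIP, in archive order.
    // Root folders keep a trailing "/", root files do not.
    static List<String> rootItems(byte[] zipBytes) {
        Set<String> items = new LinkedHashSet<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            for (ZipEntry e = in.getNextEntry(); e != null; e = in.getNextEntry()) {
                String name = e.getName();
                int slash = name.indexOf('/');
                // Everything up to the first "/" names a root folder; no "/" means a root file.
                items.add(slash < 0 ? name : name.substring(0, slash + 1));
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return new ArrayList<>(items);
    }

    // Builds a small in-memory archive of empty entries for demonstration.
    static byte[] sampleZip(String... entryNames) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buffer)) {
            for (String name : entryNames) {
                out.putNextEntry(new ZipEntry(name));
                out.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] zip = sampleZip("readme.txt", "docs/", "docs/guide.pdf", "images/logo.png");
        for (String item : rootItems(zip)) {
            System.out.println(item);
        }
    }
}
```

Only the entry names are read here; the entry contents are skipped, which is why this kind of listing is cheap even for large archives.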

View List of Folders from ZIP or TAR

ZIP or TAR archives may contain multiple files and folders. These folders may further contain files as well as subfolders. GroupDocs.Viewer also allows viewing the folders that are located at the root of the archive. The following code sample shows how to get a list of folders from a ZIP or TAR archive.

View List of Subfolders within a Certain Folder of ZIP or TAR

There might be the case when you need to obtain the list of subfolders within a root folder in the ZIP or TAR archive. For such a case, you can specify the folder name using ArchiveOptions.setFolderName(“FolderName”) and the API will return the list of subfolders.

View List of Files within a Folder in ZIP or TAR

Now, once you have got the list of folders (and the subfolders as well), you can extract and view the items from your desired folder. The following code sample shows how to view the items of a specific folder in a ZIP or TAR.
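
The folder-level views described above (subfolders of a folder, and files within a folder) can be sketched in the same spirit. This is an illustration using only java.util.zip, not the GroupDocs.Viewer ArchiveOptions API; the class and method names are hypothetical, and the folder argument is assumed to end with "/".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class ZipFolderListing {

    // Returns the direct children of the given folder (for example "docs/"),
    // sorted by name. Subfolders keep a trailing "/", files do not.
    static List<String> children(byte[] zipBytes, String folder) {
        Set<String> result = new TreeSet<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            for (ZipEntry e = in.getNextEntry(); e != null; e = in.getNextEntry()) {
                String name = e.getName();
                if (!name.startsWith(folder) || name.length() == folder.length()) {
                    continue; // outside the folder, or the folder's own entry
                }
                String rest = name.substring(folder.length());
                int slash = rest.indexOf('/');
                // Deeper paths collapse to their first segment, i.e. a direct subfolder.
                result.add(slash < 0 ? rest : rest.substring(0, slash + 1));
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return new ArrayList<>(result);
    }

    // Builds a small in-memory archive of empty entries for demonstration.
    static byte[] sampleZip(String... entryNames) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buffer)) {
            for (String name : entryNames) {
                out.putNextEntry(new ZipEntry(name));
                out.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] zip = sampleZip("readme.txt", "docs/", "docs/guide.pdf",
                "docs/api/", "docs/api/index.html", "images/logo.png");
        for (String item : children(zip, "docs/")) {
            System.out.println((item.endsWith("/") ? "folder: " : "file:   ") + item);
        }
    }
}
```

Filtering the result on the trailing "/" gives either the subfolders or the files of the chosen folder.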

For more details on rendering archives, please visit working with archives.

Working with Security Settings in PDF Documents

The PDF documents allow setting security parameters to restrict unauthorized access. The security can be enabled using:

  • Owner password – The password which is required to change document permissions.
  • User password – The password required to open the document.
  • PDF file permissions – The permissions to allow or deny printing, modification and data extraction.

In the latest release, we have added the feature of setting the above-mentioned security settings when rendering a file into a PDF document. The following code sample shows how to set the owner password, user password, and the permissions to deny the printing.

Detecting Security Settings in PDF Document

You can also check the security settings that are applied to a particular PDF document. For example, you can check if printing of the document is allowed or not as shown in the following code sample.
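
For background, the PDF specification stores these permissions as bit flags in the document's encryption dictionary. The sketch below models only that bit layout (print is bit 3, modify is bit 4, extract is bit 5, counting from 1) to show what "denying printing" and "checking whether printing is allowed" mean at the flag level. It is not the GroupDocs.Viewer API, and the class and method names are hypothetical.

```java
final class PdfPermissionBits {

    // Bit positions from the PDF standard security handler; the specification
    // numbers bits from 1, so bit 3 corresponds to 1 << 2.
    static final int PRINT = 1 << 2;    // bit 3: print the document
    static final int MODIFY = 1 << 3;   // bit 4: modify the contents
    static final int EXTRACT = 1 << 4;  // bit 5: copy or extract text and graphics

    // True when the print bit is set in the permission flags.
    static boolean isPrintingAllowed(int permissions) {
        return (permissions & PRINT) != 0;
    }

    // Returns a copy of the flags with the print bit cleared.
    static int denyPrinting(int permissions) {
        return permissions & ~PRINT;
    }

    public static void main(String[] args) {
        int permissions = PRINT | MODIFY | EXTRACT;
        System.out.println("Printing allowed before: " + isPrintingAllowed(permissions));
        int restricted = denyPrinting(permissions);
        System.out.println("Printing allowed after:  " + isPrintingAllowed(restricted));
    }
}
```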

Support for Viewing Code Files

In addition to the support of ZIP and TAR files, we have also added the feature of viewing C# (.cs) and Visual Basic (.vb) code files.

Bug Fixes

The following is the list of bugs that are fixed in v19.11.

  • Output extension is empty when saving HTML page into cache
  • Object null reference exception when rendering DWG document
  • The Watermark opacity is set twice when rendering as HTML
  • The separator is wrong for the opacity value
  • File is corrupted or damaged exception for presentation documents
  • Unable to render .xls file with exception “file is corrupted or damaged”


Improvements

We have made the following improvements in v19.11.

  • Improved performance for rendering PSD image format into PDF
  • Improved rendering Dicom, Dng and WebP formats into PDF
  • Extended support for CellsOptions.setTextOverflowMode option for rendering the document into image
  • Extended support for CellsOptions.setTextOverflowMode option for rendering into PDF
  • Rendering contact photo from vCard file format (VCF)
  • Improve output for rendering zip archives

Well, this was a brief overview of the major features as well as improvements and bug fixes. You can also visit the release notes of GroupDocs.Viewer for Java 19.11 to know about the public API changes. Visit the documentation of GroupDocs.Viewer for Java for more details and code samples of every feature. You can download/clone the source code examples from the GitHub repository.

In case you find any issue while using our API, we are always available to provide you free support on our forum.

Posted in GroupDocs.Viewer Product Family