Convert WebP to JPG, PNG, TIFF, and PDF in C#

In our previous post, we discussed WebP images and learned how to convert them in Java. In this article, we will learn to programmatically convert WebP images into JPG, PNG, TIFF, and other formats using C#.

First, we will convert WebP images in the simplest way. Later, we will convert with some customized options such as rotate, flip, grayscale, resize, gamma, contrast, and brightness changes, and add a watermark to the converted JPG images.

The steps and code samples in this article use GroupDocs.Conversion for .NET, so please make sure to install the API using either of the following methods:

  • Install using NuGet Package Manager.
  • Download the DLL and reference it into the project.

Convert WebP to JPG, PNG & TIFF in C#

To convert WebP images into other formats, use the Converter class. A simple conversion takes only a few lines of C# code. This example shows the quick conversion of a WebP image to a JPG file. Just follow the steps:

  • Instantiate the Converter object with the source WebP image.
  • Instantiate the image conversion options using the ImageConvertOptions class and set the Format to JPG.
  • Call the Convert method with the output file path and the conversion options.

// Convert WebP image to JPG, PNG, BMP or any other format in C#
using (Converter converter = new Converter("./Resources/groupdocs_conversion-brand.webp"))
{
    ImageConvertOptions options = new ImageConvertOptions
    { // Set the conversion format to JPG
        Format = ImageFileType.Jpg
    };
    converter.Convert(@"./Output/converted-image.jpg", options);
}

Here are the original WebP image and the JPG image converted using the above code:

[Images: original WebP image; converted JPG image]

Using the same code and changing only the file format (“ImageFileType.Jpg” above) and the output file name, you can easily convert your WebP files into JPEG, PNG, TIF, TIFF, BMP, etc.

This was the simple conversion. Now let us convert with different effects.

Convert WebP to JPG, PNG, TIFF with Advanced Options in C#

Along with converting WebP to other formats, we can also apply effects during conversion: convert to grayscale; flip the image horizontally or vertically; rotate the image to any angle; resize the image to make it smaller or larger; change the contrast, brightness, or gamma values; or even apply a watermark to the converted image.

[Images: converted JPG samples labeled WebP to JPG, Grayscale, Resize, Flip, Contrast, Watermark, Rotate, Brightness, and Gamma]

Here is the code that is used to apply these effects. You may apply these effects one by one or in combination to get the desired results.

// Apply effects while converting WebP image to other formats in C#
using (Converter converter = new Converter("./Resources/groupdocs_conversion-brand.webp"))
{
    ImageConvertOptions options = new ImageConvertOptions
    {
        Format = ImageFileType.Jpg,
        Grayscale = true,   // Convert the image in Grayscale
        Height = 141,       // Resize the Image Height
        Width = 167,        // Resize the image Width
        FlipMode = ImageFlipModes.FlipX,    // Flip the image
        Contrast = 50,      // Change the contrast of image
        RotateAngle = 90,   // Rotate the image
        Brightness = 50,    // Change the brightness
        Gamma = 0.5F,       // Gamma Setting
        Watermark =         // Watermark Settings
        {
            Text = "GroupDocs",
            Width = 100,
            Height = 100,
            Background = false,
            Top = 70,
            Left = 90,
            RotationAngle = -45,
        }
    };
    converter.Convert(@"./Output/converted-with-options.jpg", options);
}

Convert WebP to PDF in C#

Along with converting WebP images to other image formats, we can also convert them to PDF. The following few lines of code convert a WebP image to PDF format.

// Convert WebP to PDF in C#
using (Converter converter = new Converter("./Resources/groupdocs_conversion-brand.webp"))
{
    PdfConvertOptions options = new PdfConvertOptions();
    converter.Convert(@"./Output/converted-webp-image.pdf", options);
}

For more details and advanced options for converting to PDF, you may visit the documentation.

See Also

There are many other open-source examples publicly available in the GitHub repository. Download the source code and quickly run the examples using the getting started guide. In case of any difficulty, look at the documentation or reach us at any time on the forum.

Posted in GroupDocs.Conversion Product Family

Classify your Customer Feedback using Sentiment Analysis in C#

Suppose that you have the opportunity to receive text comments from your customers or some other source and you want to evaluate how positive they are. There is a way to analyze such comments called sentiment analysis. Sentiment analysis is based on a deep neural network model that is suitable for a wide range of tasks.

Sentiment Classification API for .NET

If you want to do sentiment analysis programmatically, GroupDocs.Classification serves that purpose. It implements a general-purpose sentiment classifier that can be used to evaluate the tonality of product reviews, shop reviews, application reviews, feedback, etc.

GroupDocs.Classification for .NET

This article will guide you through classifying comments and analyzing their positivity in C# using GroupDocs.Classification for .NET. Before you start, please make sure to install the API using either of the following methods:

  • Install using NuGet Package Manager.
  • Download the DLL and reference it into the project.

How to Classify Text using Sentiment Analysis in C#

For this task, we can use the general-purpose Classifier class, or the SentimentClassifier class, which is a bit simpler and more lightweight. Here are the steps:

  • Initialize the SentimentClassifier.
  • Call the PositiveProbability method of the SentimentClassifier class and pass the text to be analyzed as a parameter.
  • The PositiveProbability method returns the positivity as a value ranging from 0 to 1.

Here is the C# code to find the tone of any statement using the sentiment classification. We have chosen the following sentiment as an example:

“Experience is simply the name we give our mistakes”

// Analyze the positivity of text using sentiment classifier in C#.
var sentiment = "Experience is simply the name we give our mistakes";
var sentimentClassifier = new SentimentClassifier();
// PositiveProbability method returns the positive probability of the sentiment.
var positiveProbability = sentimentClassifier.PositiveProbability(sentiment);
Console.WriteLine($"Positive Probability of the sentiment: {positiveProbability}");

Positive Probability of the sentiment: 0.1118

Any value greater than 0.5 means the sentiment is positive, and a value between 0 and 0.5 means it is negative.

Based on the extracted positivity, you can get the Best Class for that sentiment and the probability of that Best Class. The positive probability here is 0.11, so the comment should be classified as negative: its Best Class is Negative rather than Positive.

So what would be its Best Class Probability? Since the two classes are Positive and Negative, it is 1 - 0.11 = 0.89. Now let us see it in the code:

// Classify the same text and read the best class name and probability in C#.
var sentiment = "Experience is simply the name we give our mistakes";
var sentimentClassifier = new SentimentClassifier();
// Classify method returns ClassificationResult object with the best class probability and name.
var response = sentimentClassifier.Classify(sentiment);
Console.WriteLine($"Best Class Name: {response.BestClassName}");
Console.WriteLine($"Best Class Probability: {response.BestClassProbability}");

Best Class Name: Negative
Best Class Probability: 0.8882

Classify Multiple Comments using Sentiment Analysis in C#

Normally we have thousands of comments and feedback items, so how can we analyze our customers’ feedback? It is simple: just put the feedback in an array. Let the string array be the source of reviews; it could also be a file or the parsed response from a database or service. We can transform the string array into a float array of positive sentiment probabilities.

using System;
using System.Linq;

// Source comments to analyze; they could also come from a file, database, or service.
var sentiments = new string[] {
    "Now that is out of the way, this thing is a beast. It is fast and runs cool.",
    "Experience is simply the name we give our mistakes",
    "When I used compressed air a cloud of dust bellowed out from the card (small scuffs and scratches).",
    "This is Pathetic."
};
var classifier = new GroupDocs.Classification.SentimentClassifier();
// Map each comment to its positive probability.
var sentimentPositivity = sentiments.Select(x => classifier.PositiveProbability(x)).ToArray();
Console.WriteLine(string.Join("\n", sentimentPositivity));

0.8959 - "Now that is out of the way, this thing is a beast..."
0.1118 - "Experience is simply the name we give our mistakes"
0.1252 - "When I used compressed air a cloud ..."
0.0970 - "This is Pathetic."

What can we do with these sentiment scores? We can measure the mean or median sentiment for the target product, shop, etc., select the worst values, and respond to those customers. We can also run analyses such as finding inconsistencies between the positive probability value of a product and its rating.

I hope you find it useful. You can find more on classification from the mentioned resources.

Learn more about the Classification API

Downloading and running GitHub examples is the best and easiest way to get started.

Posted in GroupDocs.Classification Product Family

Search Text in Word, Excel, PDF, ZIP and other Document Formats using C# .NET

Full text search of documents

We often need a full-text search API that enables our applications to search through documents for particular information specified as a textual search query. The documents can be of any format such as Word (Doc, Docx), PDF, HTML, EPUB, Spreadsheet (XLS, XLSX), Presentation (PPT, PPTX), images, and videos.

GroupDocs.Search is a powerful full-text search API that allows you to search through over 70 document formats in your applications. To make it possible to search instantly across thousands of documents, they must be added to the index.

Why Use GroupDocs.Search as a Developer?

  • No additional software is required to search through documents of supported formats.
  • A great variety of indexing and search options is provided to meet any requirements.
  • A wide selection of search types is available in text or object form queries.
  • High indexing and search performance is achieved by unique algorithms and data structures, optimizations and multi-threaded execution.
  • Various ways of visualizing search results in the text of documents are supported.

Please check the About Search Engines article to see where the GroupDocs.Search API fits in the classification of search engines.

Installation

GroupDocs.Search for .NET is hosted on NuGet and can easily be installed using the NuGet Package Manager. Alternatively, you can download the API’s DLL from the Downloads section.

Search Through Office Documents using C#

The following steps explain how to search words or phrases in multiple documents (Word, Excel, PDF and other document formats).

  • Create a new index: First of all, you need to create an index. An index can be created in memory or on disk. An index created in memory cannot be saved after exiting your program. In contrast, an index created on disk may be loaded in the future to continue working. Details on creating an index are described in the section Creating an index.
  • Subscribe to index events: After creating an index, you need to add documents to the index for indexing. Indexing documents can be successful or unsuccessful for various reasons, for example, due to read errors from the disk or the presence of a password to access a document. To receive information about indexing errors, you can subscribe to the ErrorOccurred event. To work with events, see the section Search index events.
  • Index Documents: Document indexing can be performed synchronously or asynchronously. Synchronous indexing means that a thread that started the indexing process will be busy until the operation is completed. However, more often, it is necessary to perform indexing asynchronously, with the ability to execute other tasks in the thread that launched the operation. A detailed description of all aspects of the indexing process is provided in section Indexing.
  • Perform search: When documents are indexed, the index is ready to handle search queries. The following types of search queries are supported: simple, fuzzy, case sensitive, boolean, phrasal, faceted, with wildcards, and others. Description of search queries of various types is presented in the section Searching.
  • Use search results: When a search is completed, you need to interpret the result. The result can be represented by a simple list of documents found, or the words and phrases found can be highlighted in the text of the document. For more information on processing search results, see Search results.

string indexFolder = @"/Users/muhammadsohailismail/MyIndex/"; // Specify the path to the index folder
string documentsFolder = @"/Users/muhammadsohailismail/MyDocuments/"; // Specify the path to a folder containing documents to search

// a) Create new index or
// b) Open existing index
Index index = new Index(indexFolder);

// c) Subscribe to index events
index.Events.ErrorOccurred += (sender, args) =>
{
    Console.WriteLine(args.Message); // Writing error messages to the console
};

// d) Add files synchronously
index.Add(documentsFolder); // Synchronous indexing documents from the specified folder

// e) Perform search
string query = "Worthy"; // Specify a search query
SearchResult result = index.Search(query); // Searching in the index

// f) Use search results
// Printing the result
Console.WriteLine("Documents found: " + result.DocumentCount);
Console.WriteLine("Total occurrences found: " + result.OccurrenceCount);
for (int i = 0; i < result.DocumentCount; i++)
{
    FoundDocument document = result.GetFoundDocument(i);
    Console.WriteLine("\tDocument: " + document.DocumentInfo.FilePath);
    Console.WriteLine("\tOccurrences: " + document.OccurrenceCount);
}

// Highlight occurrences in text
if (result.DocumentCount > 0)
{
    FoundDocument document = result.GetFoundDocument(0); // Getting the first found document
    string path = @"/Users/muhammadsohailismail/Output/Highlighted.html";
    OutputAdapter outputAdapter = new FileOutputAdapter(path); // Creating the output adapter to a file
    HtmlHighlighter highlighter = new HtmlHighlighter(outputAdapter); // Creating the highlighter object
    index.Highlight(document, highlighter); // Generating output HTML formatted document with highlighted search results

    Console.WriteLine();
    Console.WriteLine("Generated HTML file can be opened with Internet browser.");
    Console.WriteLine("The file can be found by the following path:");
    Console.WriteLine(Path.GetFullPath(path));
}

The above code prints a summary of the search results to the console and generates an HTML file with the found occurrences highlighted.

Search in Fields of Documents using C#

Faceted search filters search results by restricting the search to valid document field names. It allows you to search only in certain fields of documents, for example, only in the content field or in the file name field. A simple faceted search example is presented below with queries in text and object form.

string indexFolder = @"c:\MyIndex\";
string documentsFolder = @"c:\MyDocuments\";
 
// Creating an index in the specified folder
Index index = new Index(indexFolder);
 
// Indexing documents from the specified folder
index.Add(documentsFolder);
 
// Search in the content field with text query
SearchResult result1 = index.Search("content: Einstein");
 
// Search in the content field with object query
SearchQuery wordQuery = SearchQuery.CreateWordQuery("Einstein");
SearchQuery fieldQuery = SearchQuery.CreateFieldQuery(CommonFieldNames.Content, wordQuery);
SearchResult result2 = index.Search(fieldQuery);

Using format specific fields

For each document format, there are standard fields that may be present in documents of this type. The library provides the following classes containing constants with the names of standard document fields: EpubFieldNames, FictionBookFieldNames, MailFieldNames, PresentationFieldNames, SpreadsheetFieldNames, WordsFieldNames.

There are also fields that may be present in documents of any type. The names of such fields are represented in the CommonFieldNames class.

The following example shows how to use the standard field names of documents.

string indexFolder = @"c:\MyIndex\";
string documentsFolder = @"c:\MyDocuments\";
 
// Creating an index in the specified folder
Index index = new Index(indexFolder);
 
// Indexing documents from the specified folder
index.Add(documentsFolder);
 
// Search in the company field with text query
string query1 = WordsFieldNames.Company + ": Dycum";
SearchResult result1 = index.Search(query1);
 
// Search in the company field with object query
SearchQuery wordQuery = SearchQuery.CreateWordQuery("Dycum");
SearchQuery fieldQuery = SearchQuery.CreateFieldQuery(WordsFieldNames.Company, wordQuery);
SearchResult result2 = index.Search(fieldQuery);

Detailed information about faceted search is presented on the page Faceted search.

Conclusion

This article has explained how to search through documents (DOCX, PDF, Excel, Text files) for particular information in C#. It also explained how to search in the fields of documents. GroupDocs.Search contains several other features; please check the documentation to learn more about them.

Posted in GroupDocs.Search Product Family

Text Indexing and Search your Directories using C#

Using the .NET API, you can perform a search by parts (chunks) and specify the number of search threads in C#. This feature is most beneficial when you search text in large indexes that contain thousands of documents. Furthermore, you can now get the start and end time and the total time taken by a search.

The following code snippet shows how to create an index and then search text in chunks from the mentioned folder in C# using GroupDocs.Search for .NET. For the best performance and the latest features, I recommend installing and using the latest version of the API.

Search Text by Indexing in C#

The following example shows how to perform the search by parts/chunks.

  • Create the Index with your index folder.
  • Add your documents folder to the created index.
  • Create the SearchOptions and set IsChunkSearch to true to search by chunks/parts.
  • Call the Search method of your index, providing your search query and the search options.
  • Now traverse each remaining chunk by calling SearchNext and passing it the NextChunkSearchToken of the previous result.

string indexFolder = @"c:\MyIndex\";
string documentsFolder = @"c:\MyDocuments\";
string query = "Einstein";
// Creating an index in the specified folder
Index index = new Index(indexFolder);
// Indexing documents from the specified folder
index.Add(documentsFolder);
// Creating a search options instance
SearchOptions options = new SearchOptions();
options.IsChunkSearch = true; // Enabling the search by chunks
// Starting the search by chunks
SearchResult result = index.Search(query, options);
Console.WriteLine("Document count: " + result.DocumentCount);
Console.WriteLine("Occurrence count: " + result.OccurrenceCount);
// Continuing the search by chunks
while (result.NextChunkSearchToken != null)
{
    result = index.SearchNext(result.NextChunkSearchToken);
    Console.WriteLine("Document count: " + result.DocumentCount);
    Console.WriteLine("Occurrence count: " + result.OccurrenceCount);
}

For any suggestions, questions, or confusion related to the .NET Search API, you may use the forum for a quick response. You may quickly create a thread to share your thoughts.

Posted in GroupDocs.Search Product Family

Split or Merge PDF, Word, Excel Documents in Java

Worried about merging or splitting documents of various types on multiple platforms? There could be many questions on your mind:

  • How to merge PDF documents together in Java?
  • How to split Word documents or merge Excel spreadsheets?
  • What to do if I need to merge PPT/PPTX presentations?
  • Many more questions; the list may not end.

GroupDocs.Merger for Java

GroupDocs provides a document merging solution for all such requirements. Its Java API allows you to merge documents and manipulate document structure in Java across a wide range of supported document formats. It further allows manipulating document pages, page transformations, information extraction from the documents, generating previews, and much more.

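In this article, we will look at merging PDF files, merging Word, Excel, and PowerPoint documents, merging selected document pages, and splitting documents into multiple documents.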

The code samples and steps explained below use GroupDocs.Merger for Java, so you may download it or integrate it into your Maven-based applications with pom.xml configurations.

Merge PDF files in Java

We can combine two or more PDF files in just a few lines of code. The code snippet below, taken from the examples, shows how to merge multiple PDF documents in Java. The steps are very simple once you have decided which documents to join together:

  • Instantiate the Merger object with the first document, with which the other documents are to be merged.
  • Call the join method, passing the document to merge.
  • Call the join method again to merge more documents.
  • Call the save method to save the final output.

// Set paths for the documents to join together in a single file.
String filePath1 = "document-1.pdf";
String filePath2 = "document-2.pdf";
String filePath3 = "document-3.pdf";
// Merge multiple PDF documents into a single PDF file.
Merger merger = new Merger(filePath1);
merger.join(filePath2); // Joining 2nd Document
merger.join(filePath3); // Joining 3rd Document
// Save the merged document.
String filePathOutput = "mergedDocument.pdf";
merger.save(filePathOutput);

Merge Excel, Word, PowerPoint Documents in Java

You can combine multiple Word documents, Excel spreadsheets, PowerPoint presentations, and in fact almost any documents of the same format. The above code for joining PDF documents can be used to merge a wide variety of documents. At the bottom of the article, I list the file formats that can be merged with the same code. As an example, here is how more than two Word documents can similarly be combined into a single Word file in just a few lines of Java code.

// Merge multiple Word documents into a single DOCX file.
Merger merger = new Merger("document1.docx");
merger.join("document2.docx"); // Joining 2nd Document
merger.join("document3.docx"); // Joining 3rd Document
// Save the merged document.
merger.save("mergedDocument.docx");

Merge Document Pages in Java

Multiple documents can also be merged using only selected pages by specifying the desired page range. The code remains similar to the above, with just a small change: set your merging options using the JoinOptions class.

Below is the source code snippet that shows how to merge documents by specifying certain pages.

// Set the start and end page number in JoinOptions class.
JoinOptions joinOptions = new JoinOptions(1, 2);
// Merge two files with selective pages using join method.
Merger merger = new Merger("document-1.docx");
merger.join("document-2.docx" , joinOptions);
merger.save("merged-Document.docx");

Split Documents into Multiple Documents in Java

Just like we have merged documents above, we can also split Word documents, Excel spreadsheets, presentations, PDF files, and many other documents quickly in different ways.

  • Split by exact page numbers
  • Split a document to several multi-page documents
  • Split by page range
  • Split by Even and Odd pages

Split by Exact Page Numbers

We can split a document in Java by providing exact page numbers. The following code will split a PDF file into 3 single-page documents, one for each of the listed pages.

  • Initialize the PageSplitOptions object with the output file and the mode to split.
  • Instantiate the Merger object with the source file or stream to split.
  • Call the split method to split the provided document and save the results.

String filePath = "document.pdf";
String filePathOut = "document_{0}.{1}";
// Split the document into multiple single page documents.
PageSplitOptions splitOptions = new PageSplitOptions(filePathOut, new int[] { 3, 6, 8 });
Merger merger = new Merger(filePath);
merger.split(splitOptions);
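
The output path above is a template with the placeholders {0} and {1}. As a rough illustration of how such a template can expand into file names, here is a small sketch in plain Java. It assumes that {0} is filled with a number identifying the output part and {1} with the file extension; the article does not state this, and the class and method names (OutputNameSketch, outputName) are hypothetical and not part of GroupDocs.Merger.

```java
import java.text.MessageFormat;

// Hypothetical helper, not part of GroupDocs.Merger.
// Assumption: {0} is a number identifying the output part, {1} is the file extension.
final class OutputNameSketch {

    // Expands a template such as "document_{0}.{1}" into a concrete file name.
    static String outputName(String template, int number, String extension) {
        // The number is passed as text so MessageFormat does not add
        // locale-specific digit grouping (for example "1,000").
        return MessageFormat.format(template, String.valueOf(number), extension);
    }

    public static void main(String[] args) {
        System.out.println(outputName("document_{0}.{1}", 3, "pdf"));
    }
}
```

Keeping the placeholders in a single template string means one path can describe every output file, whatever the number of parts turns out to be.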

Split Document into Multipage Documents

If you have a document with 6 pages, the small modification to the above code shown below will split your document into 3 separate documents in the following manner:

Document Name | Page Numbers
document_1    | 1, 2
document_2    | 3, 4, 5
document_3    | 6

PageSplitOptions splitOptions = new PageSplitOptions(filePathOut,  PageSplitMode.Interval, new int[] { 3, 6 });
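
To make the page assignment listed above easier to follow, here is a small sketch in plain Java that reproduces it without the library. It assumes that each number passed in interval mode marks the first page of a new document, which is what the listed result suggests; the class and method names (IntervalSplitSketch, splitByInterval) are hypothetical and not part of GroupDocs.Merger.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of GroupDocs.Merger: it only mirrors the
// page assignment described in the article.
final class IntervalSplitSketch {

    // Groups pages 1..pageCount so that every value in startPages begins a new document.
    // Assumes startPages is sorted in ascending order.
    static List<List<Integer>> splitByInterval(int pageCount, int[] startPages) {
        List<List<Integer>> documents = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        int next = 0; // index of the next boundary in startPages
        for (int page = 1; page <= pageCount; page++) {
            if (next < startPages.length && page == startPages[next]) {
                // Close the document collected so far and start a new one.
                if (!current.isEmpty()) {
                    documents.add(current);
                }
                current = new ArrayList<>();
                next++;
            }
            current.add(page);
        }
        if (!current.isEmpty()) {
            documents.add(current);
        }
        return documents;
    }

    public static void main(String[] args) {
        // A 6-page document with boundaries at pages 3 and 6.
        System.out.println(splitByInterval(6, new int[] { 3, 6 }));
    }
}
```

For a 6-page document and the values { 3, 6 }, the sketch groups the pages as 1-2, 3-5, and 6, matching the three documents described above.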

Split by Start & End Page Range

If you want to split a document by just providing a page range, here is how a PowerPoint presentation can be split into 3 single-page presentations (pages 3, 4, and 5).

String filePath = "presentation.ppt";
String filePathOut = "presentation_{0}.{1}";
// Split the presentation into multiple single page presentations.
PageSplitOptions splitOptions = new PageSplitOptions(filePathOut, 3, 5);
Merger merger = new Merger(filePath);
merger.split(splitOptions);

Split by Even or Odd Page Ranges

You can also split by even or odd pages within a range. The following PageSplitOptions will split the provided document into multiple one-page documents for the odd pages in the range of 3 to 8.

PageSplitOptions splitOptions = new PageSplitOptions(filePathOut, 3, 8, RangeMode.OddPages);
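
As a rough illustration of which pages such a range selects, the following plain-Java sketch lists the pages for a start page, an end page, and an all/odd/even filter. It is only a model of the behaviour described in this article, assuming both ends of the range are inclusive; the names RangeSplitSketch, Mode, and selectPages are hypothetical and not part of GroupDocs.Merger.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of GroupDocs.Merger.
// Assumption: the start and end pages of the range are both inclusive.
final class RangeSplitSketch {

    enum Mode { ALL_PAGES, ODD_PAGES, EVEN_PAGES }

    // Returns the page numbers in [start, end] that match the given mode.
    static List<Integer> selectPages(int start, int end, Mode mode) {
        List<Integer> pages = new ArrayList<>();
        for (int page = start; page <= end; page++) {
            boolean odd = page % 2 != 0;
            if (mode == Mode.ALL_PAGES
                    || (mode == Mode.ODD_PAGES && odd)
                    || (mode == Mode.EVEN_PAGES && !odd)) {
                pages.add(page);
            }
        }
        return pages;
    }

    public static void main(String[] args) {
        System.out.println(selectPages(3, 5, Mode.ALL_PAGES)); // the 3-to-5 range from the previous section
        System.out.println(selectPages(3, 8, Mode.ODD_PAGES)); // odd pages in the range of 3 to 8
    }
}
```

Under that assumption, the 3-to-8 odd-page split corresponds to pages 3, 5, and 7, each saved as its own one-page document.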

Documents that can be Split or Merged

As promised, here is the list of document formats that can be merged or split with the above examples. You may visit docs anytime to check the updated list.

Document Type | File Formats
Word Processing | DOC, DOCX, DOCM, DOT, DOTX, DOTM, ODT, OTT, RTF, TXT
Spreadsheets | XLS, XLSX, XLSM, XLSB, XLT, XLTX, XLTM, ODS, CSV, TSV
Presentations | PPT, PPTX, PPS, PPSX, ODP, OTP
Drawings | VSDX, VSDM, VSSX, VSSM, VSTX, VSTM, VDX, VSX, VTX
Web | HTML, MHT
Page Description Languages | TEX, XPS
eBooks & Others | PDF, EPUB, ONE

Good to see you here. Feel free to contact us on the forum if you face any difficulty, have any confusion, or want to share suggestions.

Posted in GroupDocs.Merger Product Family

Insert OLE Objects in Word, Excel, PowerPoint with C#

OLE stands for Object Linking and Embedding. It is provided by Microsoft and allows you to create and edit documents containing items or objects that are created by various applications.

As an example, you can embed spreadsheets, images, and sound clips as OLE objects in a Word document. You can then use these OLE objects in the Word document without switching between multiple applications again and again. You can embed or insert such objects programmatically using OLE in C#.

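This article will guide you on how to insert a PDF into a Word document, a Word document into an Excel spreadsheet, and a PDF into a PowerPoint presentation as OLE objects.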

The steps and code samples in this article use GroupDocs.Merger for .NET, so please make sure to install the API using either of the following methods:

  • Install using NuGet Package Manager.
  • Download the DLL and reference it into the project.

Insert PDF as OLE Object into MS Word Document in C#

Here are the steps and C# code sample to show how to embed a PDF file into a Word document as an OLE Object:

  1. Instantiate the OleWordProcessingOptions with embedding options and the document to embed in a Word document.
  2. Instantiate the Merger object with the source Word document path or stream.
  3. Call the ImportDocument method and pass the OleWordProcessingOptions object that was set up in step 1.
  4. Call the Save method to get the resultant Word document having a PDF document as an OLE object.

// Embed a PDF file into a Word document as an OLE Object in C#
int pageNumber = 2;
OleWordProcessingOptions oleWordProcessingOptions = new OleWordProcessingOptions(@"embedded-doc.pdf", pageNumber)
{ 
    Width = 300, // Just setting the height & width, you have more options.
    Height = 300
};
// Use Merger class to start with source Word document and embed PDF as OLE object.
using (Merger merger = new Merger(@"source-doc.docx"))
{
    merger.ImportDocument(oleWordProcessingOptions);
    merger.Save(@"word-document-with-OLE.docx");
}

Insert Word Document as OLE Object into Excel Spreadsheet in C#

We can also embed OLE objects into Excel spreadsheets. The C# code sample and steps below explain how to add a Word document into an Excel spreadsheet as an OLE object:

  1. Instantiate the OleSpreadsheetOptions with embedding options and the document to embed in an Excel spreadsheet.
  2. Instantiate the Merger object with the source spreadsheet path or stream.
  3. Call the ImportDocument method and pass the OleSpreadsheetOptions object that was set up in step 1.
  4. Finally, call the Save method to get the resultant Excel spreadsheet having a Word document as an OLE object.

// Embed a Word file into an Excel Spreadsheet as an OLE Object in C#
int pageNumber = 2;
OleSpreadsheetOptions oleSpreadsheetOptions = new OleSpreadsheetOptions(@"embedded-doc.docx", pageNumber)
{
    RowIndex = 2, // Setting the row & column index, you have more options.
    ColumnIndex = 2
};
// Using Merger class with source spreadsheet and embedding a Word document as an OLE object.
using (Merger merger = new Merger(@"sample-doc.xlsx"))
{
    merger.ImportDocument(oleSpreadsheetOptions);
    merger.Save(@"excel-sheet-with-ole.xlsx");
}

Add PDF as OLE Object to PowerPoint Presentation in C#

Similarly, here we are inserting objects in a PowerPoint presentation.

  1. Instantiate the OlePresentationOptions with embedding options and the document to embed in a PowerPoint presentation.
  2. Instantiate the Merger object with the source presentation path or stream.
  3. Call the ImportDocument method and pass the OlePresentationOptions object that was set up in step 1.
  4. Finally, call the Save method to get the resultant PowerPoint presentation with a PDF document as an OLE object.

// Embed a PDF file into a PowerPoint presentation as an OLE Object in C#
int pageNumber = 2;
OlePresentationOptions olePresentationOptions = new OlePresentationOptions(@"embedded.pdf", pageNumber)
{
    X = 10, // Setting only X & Y coordinates, you can customize more.
    Y = 10
};
// Using Merger class to embed a PDF file as an OLE object in the PowerPoint presentation.
using (Merger merger = new Merger(@"sample-presentation.ppt"))
{
    merger.ImportDocument(olePresentationOptions);
    merger.Save(@"powerpoint-presentation-with-ole.ppt");
}

Conclusion

We have discussed how easily and quickly we can insert OLE objects in Word, Excel, or PowerPoint documents programmatically in C#. There is only a small difference in the code for each case, i.e., a different OLE options class and its options for each file format:

  • OleWordProcessingOptions to embed OLE objects in a Word document.
  • OleSpreadsheetOptions to embed OLE objects in Excel Spreadsheets.
  • OlePresentationOptions to embed OLE objects in PowerPoint presentation.

You can learn more about the API from the documentation, or talk to us on the Free Support Forum.

Posted in GroupDocs.Merger Product Family

C# Diff Library for Comparing Text Files

GroupDocs.Comparison for .NET is a C# library that lets you compare documents and find differences. You can compare and merge Microsoft Word, Excel, PowerPoint, OpenDocument, PDF, Text, HTML and many other documents, retrieve a list of changes between the source and target document(s), apply or reject changes, and save the results. In addition, GroupDocs.Comparison can identify styling and formatting changes such as bold, italic, underline, strikethrough, and font type.

The change detection algorithms used by GroupDocs.Comparison detect differences in various document parts and blocks:

  • Text blocks – paragraphs, words and characters;
  • Tables;
  • Images;
  • Shapes etc.

Here are simple steps to compare two text files and show differences: 

  • Instantiate the Comparer object with the source document path or stream;
  • Call the Add method and specify the target document path or stream;
  • Call the Compare method.

The following code snippets demonstrate the simplest case of document comparison in a few lines of code.

Compare documents from local file

using (Comparer comparer = new Comparer("source.docx"))
{
    comparer.Add("target.docx");
    comparer.Compare("result.docx");
}

Compare documents from the stream

using (Comparer comparer = new Comparer(File.OpenRead("source.docx")))
{
    comparer.Add(File.OpenRead("target.docx"));
    comparer.Compare(File.Create("result.docx"));
}

Let’s say you have two contracts in DOCX format that were concluded in different years. If you use the above code to compare these contracts, you get a DOCX file where the deleted elements are marked in red, the added in blue, and the modified in green as shown below:

Accept or Reject detected differences

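GroupDocs.Comparison provides the ability to apply or discard specific changes between the source and target documents and to save the resultant document with (or without) the selected changes.

The following are the steps to apply or reject changes in the resultant document:

  • Instantiate the Comparer object with the source document path or stream;
  • Call the Add method and specify the target document path or stream;
  • Call the Compare method;
  • Call the GetChanges method to obtain the list of detected changes;
  • Set the ComparisonAction of each change that should be accepted or rejected;
  • Call the ApplyChanges method and pass the collection of changes to it.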

The following code sample shows how to accept/reject detected differences.

using (Comparer comparer = new Comparer("source.docx"))
{
    comparer.Add("target.docx");
    comparer.Compare();
    ChangeInfo[] changes = comparer.GetChanges();
    changes[0].ComparisonAction = ComparisonAction.Reject;
    comparer.ApplyChanges(File.Create("result.docx"), new SaveOptions(), new ApplyChangeOptions() { Changes = changes });
}
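
To reject or accept more than one change, the same calls can be applied in a loop. The sketch below rejects every detected change so that none of the target's changes are applied. It assumes that the GroupDocs.Comparison package is referenced, that source.docx and target.docx exist in the working directory, and that the types live in the namespaces shown (verify this against your installed version). The class name is hypothetical.

```csharp
using System.IO;
using GroupDocs.Comparison;
using GroupDocs.Comparison.Options; // SaveOptions, ApplyChangeOptions (assumed namespace)
using GroupDocs.Comparison.Result;  // ChangeInfo, ComparisonAction (assumed namespace)

class RejectAllChanges
{
    static void Main()
    {
        using (Comparer comparer = new Comparer("source.docx"))
        {
            comparer.Add("target.docx");
            comparer.Compare(); // run the comparison without saving a result yet

            ChangeInfo[] changes = comparer.GetChanges();
            foreach (ChangeInfo change in changes)
            {
                // Mark each detected change as rejected.
                change.ComparisonAction = ComparisonAction.Reject;
            }

            // Dispose the output stream once the result has been written.
            using (Stream output = File.Create("result.docx"))
            {
                comparer.ApplyChanges(output, new SaveOptions(),
                    new ApplyChangeOptions { Changes = changes });
            }
        }
    }
}
```

Wrapping the output stream in its own using block is a design choice of this sketch: it makes sure the file is flushed and closed even if ApplyChanges throws.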

Generate document pages preview

GroupDocs.Comparison lets you generate page previews for the source, target and resultant document(s) using the GeneratePreview method of the Document class.

The PreviewOptions class is used to manage the preview generation process: it specifies the desired page numbers, image format, and so on.

The following are the steps to generate a document preview with GroupDocs.Comparison API:

  • Create a new instance of the Comparer class and pass the source document path as a constructor parameter;
  • Add the target document(s) to the comparison using the Add method;
  • Use the Source and Targets properties of the Comparer object to access the source and target documents, each of which provides the GeneratePreview method;
  • Instantiate the PreviewOptions object with:
    • a delegate for each page stream creation (see event handler CreatePageStream);
    • the image preview format: PNG / JPG / BMP;
    • the page numbers to process;
    • a custom size for the preview images (if needed).
  • Call the GeneratePreview method of the Source or Targets document and pass the PreviewOptions to it.
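
The steps above, applied to the source document, can be sketched as follows. This is a minimal example under stated assumptions: the GroupDocs.Comparison package is referenced, source.docx and target.docx exist in the working directory, and the namespaces are as shown. The class name and the PagePath helper are hypothetical names introduced only for illustration.

```csharp
using System.IO;
using GroupDocs.Comparison;
using GroupDocs.Comparison.Options; // PreviewOptions, PreviewFormats (assumed namespace)

class SourcePreview
{
    // Hypothetical helper: builds the file path for one preview page.
    public static string PagePath(string folder, int pageNumber)
    {
        return Path.Combine(folder, $"source_{pageNumber}.png");
    }

    static void Main()
    {
        using (Comparer comparer = new Comparer("source.docx"))
        {
            comparer.Add("target.docx");

            // The delegate is called for each requested page and returns a writable stream.
            PreviewOptions previewOptions = new PreviewOptions(
                pageNumber => File.Create(PagePath(".", pageNumber)));
            previewOptions.PreviewFormat = PreviewFormats.PNG;
            previewOptions.PageNumbers = new int[] { 1 };

            // Source is the document that was passed to the Comparer constructor.
            comparer.Source.GeneratePreview(previewOptions);
        }
    }
}
```

Calling GeneratePreview on an item of comparer.Targets instead of comparer.Source would produce previews for a target document in the same way.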

Get page previews for resultant document

using (Comparer comparer = new Comparer("source.docx"))
{
    comparer.Add("target.docx");
    comparer.Compare("result.docx");
    Document document = new Document(File.OpenRead("result.docx"));
    PreviewOptions previewOptions = new PreviewOptions(pageNumber =>
    {
        // Verbatim string so the trailing backslash is not treated as an escape.
        var pagePath = Path.Combine(@"C:\", $"result_{pageNumber}.png");
        return File.Create(pagePath);
    });
    previewOptions.PreviewFormat = PreviewFormats.PNG;
    previewOptions.PageNumbers = new int[] { 1, 2 };
    document.GeneratePreview(previewOptions);
}

Compare multiple documents

GroupDocs.Comparison allows comparing more than two documents. The following code sample shows how to compare multiple documents programmatically.

using (Comparer comparer = new Comparer("source.docx"))
{
    comparer.Add("target1.docx");
    comparer.Add("target2.docx");
    comparer.Add("target3.docx");
    comparer.Compare("result.docx");
}
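
When the list of targets is not known in advance, the same Add call can be placed in a loop. The following is a minimal sketch that assumes the GroupDocs.Comparison package is referenced and that the listed files exist; the class name and the file names are hypothetical placeholders.

```csharp
using GroupDocs.Comparison;

class CompareManyTargets
{
    static void Main()
    {
        // Hypothetical file names; replace them with your own documents.
        string[] targets = { "target1.docx", "target2.docx", "target3.docx" };

        using (Comparer comparer = new Comparer("source.docx"))
        {
            foreach (string target in targets)
            {
                comparer.Add(target); // each call registers one more target
            }

            // Compare the source against all registered targets and save the result.
            comparer.Compare("result.docx");
        }
    }
}
```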

Installation

NuGet is the easiest way to download and install GroupDocs.Comparison for .NET. Please get a temporary license to test the library without any functional restrictions.

Please check the documentation to learn more about the library. We also offer free technical support so please feel free to contact us – we will be happy to help.

Posted in GroupDocs.Comparison Product Family

Show & Hide Page Borders while Converting Documents to HTML in C#

You may want to convert a document to HTML to get content for your website, or you may have come across an online document submission site that requires documents in HTML format. In either case, you need a DOC to HTML converter. If you need to convert your documents to HTML programmatically, this article is for you. It covers the following ways to convert documents to HTML in C#:

  • Simplest conversion of documents like DOCX to HTML in C#.
  • Convert to HTML with customized options.
  • Convert using the option to show or hide page borders.

Prerequisite – C# Document Conversion Library

GroupDocs.Conversion for .NET is a powerful, easy-to-use API that can convert a document from any format in its wide list of supported document formats into any supported target format. You may download the API from the downloads section or install it from NuGet.

Convert DOCX to HTML in C# – Simple

This is the simplest and most useful conversion. In fact, you can convert any of your documents to HTML in the same way: just check that your format is in the supported formats list and go ahead with the conversion.

Your document will be converted to HTML and saved to the output path that you pass to the Convert method. The following small code sample shows the conversion of a DOCX file into HTML using the Converter class in C#.

// Converting DOCX to HTML in C#
using (Converter converter = new Converter("document.docx"))
{
    MarkupConvertOptions options = new MarkupConvertOptions();
    converter.Convert("converted.html", options);
}

Convert to HTML with Customized Options

GroupDocs.Conversion provides several other options to get the desired conversion result. The customized options include:

  • Fixed Layout
  • Fixed Layout – Show Borders
  • Format
  • Page Number
  • Pages
  • Pages Count
  • Use PDF
  • Watermark
  • Zoom

You may visit the documentation or GitHub samples to see each option in detail. The code sample below shows some of these customizations while again converting DOCX to HTML.

// Converting DOCX to HTML in C# with advanced options.
using (Converter converter = new Converter("document.docx"))
{
    MarkupConvertOptions options = new MarkupConvertOptions
    { // Setting customized options
        PageNumber = 2,
        PagesCount = 1,
        FixedLayout = true
    };
    converter.Convert("converted.html", options);
}

Convert to HTML – Show or Hide Page Borders

Last but not least, you can now control the visibility of page borders while converting documents to HTML in C#. Release v20.3 of GroupDocs.Conversion for .NET gives C# programmers this control, along with some other new features and improvements. The example below shows that by setting the FixedLayoutShowBorders property of the MarkupConvertOptions class to true or false, you can show or hide the page borders in the resultant HTML document.

// Converting DOCX to HTML in C# with show or hide borders control.
using (Converter converter = new Converter("document.docx"))
{
    MarkupConvertOptions options = new MarkupConvertOptions
    {
        PageNumber = 2,
        FixedLayout = true,
        PagesCount = 1,
        FixedLayoutShowBorders = false
    };
    converter.Convert("converted.html", options);
}
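
To compare both settings side by side, the conversion can be wrapped in a small method that takes the border flag as a parameter. This is a minimal sketch, assuming the GroupDocs.Conversion package (v20.3 or later) is referenced, that document.docx exists in the working directory, and that MarkupConvertOptions lives in the namespace shown. The class, method and output file names are hypothetical.

```csharp
using GroupDocs.Conversion;
using GroupDocs.Conversion.Options.Convert; // MarkupConvertOptions (assumed namespace)

class BordersDemo
{
    // Hypothetical helper: picks an output name that reflects the border setting.
    public static string OutputName(bool showBorders)
    {
        return showBorders ? "converted-with-borders.html" : "converted-no-borders.html";
    }

    static void ConvertWithBorders(bool showBorders)
    {
        using (Converter converter = new Converter("document.docx"))
        {
            MarkupConvertOptions options = new MarkupConvertOptions
            {
                FixedLayout = true,
                FixedLayoutShowBorders = showBorders // true shows borders, false hides them
            };
            converter.Convert(OutputName(showBorders), options);
        }
    }

    static void Main()
    {
        ConvertWithBorders(true);  // writes converted-with-borders.html
        ConvertWithBorders(false); // writes converted-no-borders.html
    }
}
```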

The images below show the original DOCX document and the converted HTML with and without page borders.

Docx document to convert into HTML
Original DOCX Document
HTML File with page borders and no borders.
The above figure shows the HTML files converted from DOCX with the show-borders option turned on and turned off.

See more about GroupDocs.Conversion

Let’s talk more @ Free Support Forum.

Posted in GroupDocs.Conversion Product Family