Get List of Indexed Documents using GroupDocs.Search for .NET 18.9

We are pleased to announce the monthly release of GroupDocs.Search for .NET 18.9. Using the latest version, you can now get the list of indexed documents and extract a document's text from the index archive. Moreover, the encodings used to extract text from TXT files are now saved automatically. We recommend that you install and use the latest version of the API.

Enhancements

Following are the enhancements introduced in the latest version:

Get List of Indexed Documents

Using the GroupDocs.Search API, you can get a list of indexed documents and of items from container documents such as ZIP archives and OST and PST files. This example shows how to get a list of indexed documents from an index:


string indexFolder = @"c:\MyIndex";
  
// Creating index from existing folder
Index index = new Index(indexFolder);
  
// Getting list of indexed documents
DocumentInfo[] documents = index.GetIndexedDocuments();
  
// Getting items of container document
DocumentInfo[] items = index.GetIndexedDocumentItems(documents[0]);

For more details on this feature, please visit this documentation article.

Extract Document Text

Using the latest version, you can now extract document text from the index archive, or directly from a document if archiving is not used. The extracted text can be used to check the encoding that was used for indexing text documents. It can also be used for a quick manual check of whether particular words are present in documents. This example shows how to extract document text from the index:


string indexFolder = @"c:\MyIndex";
   
// Creating index from existing folder
Index index = new Index(indexFolder);
   
// Getting list of indexed documents
DocumentInfo[] documents = index.GetIndexedDocuments();
  
// Extracting HTML formatted document text
string htmlText = index.ExtractDocumentText(documents[0], null);

This example shows how to extract document text to a file:


string indexFolder = @"c:\MyIndex";
   
// Creating index from existing folder
Index index = new Index(indexFolder);
   
// Getting list of indexed documents
DocumentInfo[] documents = index.GetIndexedDocuments();
  
// Extracting HTML formatted document text to a file
index.ExtractDocumentText(@"c:\DocumentText.html", documents[0], null);

For more details on this feature, please visit this documentation article.

Save Encodings Automatically

GroupDocs.Search for .NET 18.9 implements automatic saving of encodings which were used to extract text from TXT files. In practice, this means that there is no longer any need to provide an encoding when generating document text with highlighted found words. This example shows how to generate HTML formatted text with highlighted found words:


string indexFolder = @"c:\MyIndex";
string documentFolder = @"c:\MyDocuments";
  
// Creating index
Index index = new Index(indexFolder);
  
// Subscribing to file indexing event
index.FileIndexing += (sender, args) =>
{
    // Setting encoding for each text file during indexing
    args.Encoding = Encodings.windows_1251;
};
  
// Adding text documents encoded in windows-1251 to index
index.AddToIndex(documentFolder);
  
// Searching for word 'человеческий'
SearchResults results = index.Search("человеческий");
  
// Generating HTML formatted text with highlighted found words
// There is no need to provide the encoding again - it is saved in the index
string htmlText = index.HighlightInText(results[0]);

For more details on this feature, please visit this documentation article.

Available Channels and Resources

Here are a few channels and resources for you to download, try, learn and get technical support on GroupDocs.Search:

Feedback

If you have any suggestions, questions, or queries related to the GroupDocs.Search for .NET API, we will be happy to hear from you. Just create a forum thread to share your thoughts.


Announcing Hotfix release of GroupDocs Signature for .NET 18.9.1


This blog post covers the hotfix introduced in GroupDocs.Signature for .NET 18.9.1. In the previous version of the API, there was an issue when signing PDFs with Metadata Signatures. This issue has now been fixed. We therefore recommend that you download the new release to enhance your document e-signing experience.

Bug Fix

  • Exception is thrown while signing PDF with Metadata Signatures

Available Channels and Resources

Here are a few channels and resources for you to learn, try and get technical support on GroupDocs.Signature API for .NET:

Feedback

As always, you are welcome to share your feedback to improve this product. We will be happy to know your thoughts. Just create a forum thread and our dedicated support team will be there to respond.


Introducing Metered License Support in GroupDocs.Editor for Java 18.9


We are glad to announce another monthly release, GroupDocs.Editor for Java 18.9. This release comes with a multitude of new features. You can now manipulate documents under a Metered License. Improvements such as a security update and a few fixes are also included. We recommend that you download the latest version of the API and share your feedback.

Features

  • The following features for Cells documents are introduced:
    • Conversion to HTML format
    • Generation from input HTML
    • Ability to specify a separator
  • Opening encrypted documents with password
  • Encrypting output Cells document with setting a password
  • Support of Metered license system
  • Generate password-protected Cells and Words document
  • Additional parameters when processing text-based spreadsheet
  • Adjust memory usage during Cells and Words document processing
  • ExcludeHiddenWorksheets option
  • PDF standards compliance level when generating PDF from HTML

Improvements

    • Processing of multiple consecutive spaces in Words processing module for round-trip scenarios
    • Space processing for bidirectional text
    • List processing in round-trip scenarios
    • Security improvements update

    Bug Fixes

    • Length and Resolution parsing modules
    • ArgumentException with sample document

    Available Channels and Resources

    Here are a few channels and resources for you to download, learn, try and get technical support on GroupDocs.Editor:

    Feedback

    As always, if you have any questions or suggestions, feel free to write on our forum.


    Introducing Metadata Signatures in GroupDocs.Signature for .NET 18.9


    We are delighted to announce another monthly release, GroupDocs.Signature for .NET 18.9, with a multitude of new features such as the ability to save Image documents as PDF and to sign PDFs with Metadata Signatures. This release also includes a few fixes. We therefore recommend that you download the new version of the API and evaluate these features to enhance your document e-signing experience.

    Features

    Metadata Signatures for PDF

    A Metadata Signature is an additional document property that contains special attributes/tags for keeping non-visual information inside the document.
    The following example demonstrates how to compose Metadata Signature options for a PDF document:

    PdfMetadataSignOptions result = new PdfMetadataSignOptions();
    result.MetadataSignatures.Add(new PdfMetadataSignature("Author", "Mr.Sherlock Holmes"));
    result.MetadataSignatures.Add(new PdfMetadataSignature("CreationDate", DateTime.Now));
    result.MetadataSignatures.Add(new PdfMetadataSignature("Creator", "Dr.Whatson"));
    result.MetadataSignatures.Add(new PdfMetadataSignature("ModDate", DateTime.Now));
    result.MetadataSignatures.Add(new PdfMetadataSignature("Producer", "BakerStreet.Inc"));
    result.MetadataSignatures.Add(new PdfMetadataSignature("Subject", "Baskervalley"));
    result.MetadataSignatures.Add(new PdfMetadataSignature("Title", "OfficeDocument"));
    result.MetadataSignatures.Add(new PdfMetadataSignature("Trapped", "Information"));
    result.MetadataSignatures.Add(new PdfMetadataSignature("IsSigned", true));
    result.MetadataSignatures.Add(new PdfMetadataSignature("SignatureId", 112233));
    result.MetadataSignatures.Add(new PdfMetadataSignature("Amount", 123.456));
    return result;
    

    Search Metadata Signature

    Users of this API can search for Metadata Signatures within a document.
    The following example demonstrates how to search for Metadata Signatures in a PDF document:

    // setup search options
    PdfSearchMetadataOptions searchOptions = new PdfSearchMetadataOptions();
    // search document
    SearchResult result = handler.Search("SignedMetadata.pdf", searchOptions);
    // output signatures
    foreach (BaseSignature signature in result.Signatures)
    {
        PdfMetadataSignature metadataSignature = signature as PdfMetadataSignature;
        if (metadataSignature != null)
        {
            Console.WriteLine("Pdf Metadata: {0}:{1}  = {2}", metadataSignature.TagPrefix, metadataSignature.Name, metadataSignature.ToString());
        }
    }
    

    Metadata Signature Entity and Collection

    Users of this API can also add a collection of Metadata Signatures to a document.
    The following example demonstrates how to add a collection of Metadata Signatures to a PDF document:

    // setup options with text of signature
    PdfMetadataSignOptions signOptions = new PdfMetadataSignOptions();
    // Specify different Metadata Signatures and add them to the options' signature collection
    // setup Author property
    PdfMetadataSignature mdSign_Author = new PdfMetadataSignature("Author", "Mr.Sherlock Holmes");
    signOptions.MetadataSignatures.Add(mdSign_Author);
    // setup data of document id
    PdfMetadataSignature mdSign_DocId = new PdfMetadataSignature("DocumentId", Guid.NewGuid().ToString());
    signOptions.MetadataSignatures.Add(mdSign_DocId);
    // setup data of sign date
    PdfMetadataSignature mdSign_Date = new PdfMetadataSignature("SignDate", DateTime.Now, "pdf");
    signOptions.MetadataSignatures.Add(mdSign_Date);
    // sign document
    string signedPath = handler.Sign("test.pdf", signOptions,
        new SaveOptions { OutputType = OutputType.String, OutputFileName = "Pdf_Documents_Metadata" });
    Console.WriteLine("Signed file path is: " + signedPath);
    

    Save Image Documents as PDF

    Image Documents can be saved as PDF using this latest release of the API.

    MatchType for Text Verification Options

    The API now provides the ability to verify Text Signatures with the extended MatchType option.

    Bug Fixes

    • Incorrect signing image documents with .psd, .wmf and .svg format
    • Output PDF incorrectly signed with Digital Certificates
    • Unable to search Digital signature in Cells with extended options

    Available Channels and Resources

    Here are a few channels and resources for you to learn, try and get technical support on GroupDocs.Signature API for .NET:

    Feedback

    As always, you are welcome to share your feedback to improve this product. We will be happy to know your thoughts. Just create a forum thread and our dedicated support team will be there to respond.


    Extract Text from Databases using GroupDocs.Parser for .NET 18.9


    GroupDocs.Parser for .NET 18.9 has been released! The latest version allows you to extract text from databases. You can also extract data from the form fields in a PDF document. Please read on for more details on the features introduced in v18.9.

    Features Introduced

    Extracting Text from Databases

    You can now extract text from databases. For this purpose, the DbContainer class, which implements the IContainer interface, is used. Each data table is represented by an entity, and the content of the entity is a CSV representation of the data table.

    For a working example, please refer to the following article:

    Extracting Data from the Form Fields in PDF Documents

    This feature allows extracting data from PDF forms. It is very useful when you are only concerned with the data in the forms of the PDF document. The GetFormData method of the PdfTextExtractor class is used for this purpose.

    For a working example, please refer to the following article:

    Available Channels and Resources

    Here are a few channels and resources for you to download, learn, try and get technical support on GroupDocs.Parser:

    Have Queries?

    If you have any queries or concerns about the API, please feel free to get in touch with us on the forum. We’ll be glad to address them.


    Extract Data from PDF Forms using GroupDocs.Parser for Java 18.9


    We are pleased to announce that version 18.9 of GroupDocs.Parser for Java has been released. The latest version allows you to extract data from the form fields in a PDF document. It also supports the text analysis API for spreadsheet, presentation and text documents. Furthermore, you can now extract text from databases. Please see the release notes for more details.

    Features Introduced

    Extracting Data from the Form Fields in PDF Documents

    This feature allows extracting data from PDF forms. It is very useful when you are only concerned with the data in the forms of the PDF document. The getFormData method of PdfTextExtractor class is used for this purpose.

    For a working example, please refer to the following article:

    Extracting Text from Databases

    You can now extract text from databases. For this purpose, the DbContainer class, which implements the IContainer interface, is used. Each data table is represented by an entity, and the content of the entity is a CSV representation of the data table.
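
To make the idea of a CSV representation of a data table concrete, here is a minimal Java sketch that renders an in-memory table as CSV text. The `TableToCsv` class and its methods are hypothetical illustrations and are not part of GroupDocs.Parser. The quoting follows the common RFC 4180 convention, which is an assumption and may differ from what DbContainer actually produces.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of GroupDocs.Parser: renders rows of a data table as CSV text.
final class TableToCsv {

    // Quote a field only when it contains a comma, a quote, or a line break (RFC 4180 style).
    static String escape(String field) {
        if (field.contains(",") || field.contains("\"") || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    // Join the fields of each row with commas and the rows with CRLF.
    static String toCsv(List<List<String>> rows) {
        return rows.stream()
                .map(row -> row.stream().map(TableToCsv::escape).collect(Collectors.joining(",")))
                .collect(Collectors.joining("\r\n"));
    }

    public static void main(String[] args) {
        List<List<String>> table = Arrays.asList(
                Arrays.asList("id", "title"),
                Arrays.asList("1", "Hello, world"),
                Arrays.asList("2", "She said \"hi\""));
        System.out.println(toCsv(table));
    }
}
```

Each row of the table becomes one CSV line, so text extracted this way can be fed to any CSV reader or indexed as plain text.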

    For a working example, please refer to the following article:

    Text Analysis API

    GroupDocs.Parser allows extracting text areas from the pages of a document. This feature may help you get data for text analysis. The DocumentContent abstract class provides the API to extract text areas from document pages. In the current version of the API, this feature has been extended to spreadsheets, presentations and text documents.

    For a working example, please refer to the following article:

    Requesting the Password for Protected Documents

    There are two ways to provide a password for protected documents. When the password is known, the Password property of the LoadOptions class is used. When it is not known before opening whether the document is protected, the PasswordProvider property of the LoadOptions class is used instead. It supplies the password on demand through the IPasswordProvider interface.
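
The on-demand idea can be sketched in plain Java as follows. Every type here (`PasswordSource`, `Doc`, `open`) is a hypothetical stand-in written for illustration. None of them is the GroupDocs.Parser API, whose actual IPasswordProvider signature is not shown in this post.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative stand-ins only; these are NOT the GroupDocs.Parser types.
final class OnDemandPasswordDemo {

    // Hypothetical callback that plays the role the text assigns to a password provider.
    interface PasswordSource {
        String requestPassword(String documentName);
    }

    // Hypothetical document record: a null password means "not protected".
    static final class Doc {
        final String name;
        final String password;
        final String text;

        Doc(String name, String password, String text) {
            this.name = name;
            this.password = password;
            this.text = text;
        }
    }

    // Returns the text; the source is consulted only when the document is protected.
    static String open(Doc doc, PasswordSource source) {
        if (doc.password == null) {
            return doc.text; // unprotected: no password is requested
        }
        String supplied = source.requestPassword(doc.name);
        if (!doc.password.equals(supplied)) {
            throw new IllegalArgumentException("Wrong password for " + doc.name);
        }
        return doc.text;
    }

    public static void main(String[] args) {
        AtomicInteger requests = new AtomicInteger();
        PasswordSource source = name -> {
            requests.incrementAndGet();
            return "secret";
        };
        System.out.println(open(new Doc("open.txt", null, "plain text"), source));
        System.out.println(open(new Doc("locked.pdf", "secret", "protected text"), source));
        System.out.println("Password requests: " + requests.get());
    }
}
```

The point of the callback design is that the caller does not need to know in advance which documents are protected: the password is asked for only when one actually is.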

    For a working example, please refer to the following article:

    Breaking Change

    Since version 18.9, the security of Metered licensing has been improved. Metered licensing is now applicable only in Java runtime version 8u101 or above. Please use other types of licensing if you are using v18.9 or above in Java 7.
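
An application that has to choose a licensing type at startup could test the runtime version first. The sketch below is one possible check based on the standard `java.version` system property. The `RuntimeCheck` class is a hypothetical helper and not part of any GroupDocs API, and the parsing assumes the usual version-string formats such as `1.8.0_101` and `11.0.2`.

```java
// Illustrative check, not part of any GroupDocs API.
final class RuntimeCheck {

    // True when a java.version string denotes Java 8 update 101 or any later runtime.
    static boolean isAtLeast8u101(String version) {
        // Drop build or pre-release suffixes such as "-b13" or "-ea".
        String core = version.split("[-+]")[0];
        String[] parts = core.split("[._]");
        int first = Integer.parseInt(parts[0]);
        if (first >= 9) {
            return true; // modern scheme: "9", "11.0.2", "17"
        }
        if (first != 1 || parts.length < 2) {
            return false;
        }
        int major = Integer.parseInt(parts[1]); // legacy scheme: "1.8.0_101"
        if (major != 8) {
            return major > 8;
        }
        // The update number is the fourth component; a bare "1.8.0" counts as update 0.
        int update = parts.length >= 4 ? Integer.parseInt(parts[3]) : 0;
        return update >= 101;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        System.out.println(version + " -> " + isAtLeast8u101(version));
    }
}
```

When the check returns false, the application would fall back to another type of licensing, as the note above recommends.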

    Available Channels and Resources

    Here are a few channels and resources for you to download, learn, try and get technical support on GroupDocs.Parser:

    Feedback

    As always, if you have any questions or suggestions, feel free to write on our forum.


    Optimize Memory Usage using GroupDocs.Editor for .NET 18.9


    We are pleased to announce another monthly release, GroupDocs.Editor for .NET 18.9, with many new features. The main features introduced in this release are the ability to adjust memory usage and the generation of password-protected documents. A security improvement and a few bug fixes are also included. We recommend that users download the latest API to benefit from the new and enhanced features.

    Features

    • Generate password-protected Cells and Words document
    • Additional parameters when processing text-based spreadsheet
    • Adjust memory usage during Cells and Words document processing
    • ExcludeHiddenWorksheets option
    • PDF standards compliance level when generating PDF from HTML

    Improvements

    • Security improvements update

    Bug Fixes

    • Length and Resolution parsing modules
    • ArgumentException with sample document

    Available Channels and Resources

    Here are a few channels and resources for you to download, learn, try and get technical support on GroupDocs.Editor:

    Feedback

    As always, if you have any questions or suggestions, feel free to write on our forum.


    Strengthen the Protection of Text Watermark in Presentation Documents using GroupDocs.Watermark for Java 18.8

    We are pleased to introduce version 18.8 of GroupDocs.Watermark for Java. This version includes 2 new features, 1 enhancement, 1 bug fix, and a breaking change. It supports skipping unreadable characters during text watermark search. Furthermore, we have added a new feature to strengthen the protection of text watermarks in PowerPoint documents. Please continue reading to learn more about version 18.8.

    Features Introduced

    Skipping Unreadable Characters During Text Watermark Search

    A watermark’s text may contain unreadable characters, which can interfere with searching for the watermark. The latest version of GroupDocs.Watermark can find a text watermark even if it contains unreadable characters between the letters.
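
The general technique can be sketched with a regular expression that tolerates invisible characters between the letters of the search text. This is only an illustration: the `UnreadableTolerantSearch` class is hypothetical, and treating Unicode control (Cc) and format (Cf) characters as "unreadable" is an assumption, since the post does not say which characters GroupDocs.Watermark skips.

```java
import java.util.regex.Pattern;

// Illustrative only: GroupDocs.Watermark's own matching rules are not described in this post.
final class UnreadableTolerantSearch {

    // Assumption: "unreadable" means Unicode control (Cc) or format (Cf) characters.
    private static final String SKIPPABLE = "[\\p{Cc}\\p{Cf}]*";

    // Builds a pattern that matches the text even when skippable characters sit between its letters.
    static Pattern tolerantPattern(String text) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (regex.length() > 0) {
                regex.append(SKIPPABLE);
            }
            regex.append(Pattern.quote(new String(Character.toChars(codePoint))));
            i += Character.charCount(codePoint);
        }
        return Pattern.compile(regex.toString());
    }

    static boolean contains(String haystack, String watermarkText) {
        return tolerantPattern(watermarkText).matcher(haystack).find();
    }

    public static void main(String[] args) {
        // "Draft" with a zero-width space (U+200B) and a soft hyphen (U+00AD) hidden inside.
        String slideText = "Status: D\u200Bra\u00ADft copy";
        System.out.println(contains(slideText, "Draft"));
        System.out.println(slideText.contains("Draft"));
    }
}
```

In this sketch the tolerant search finds the watermark, while a plain `String.contains` call misses it because of the hidden characters.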

    For a working example, please refer to the following article:

    Strengthening the Protection of Text Watermark in Presentation Documents

    Using unreadable characters in the watermark text prevents its modification through the Find and Replace dialog and therefore strengthens the protection of text watermarks in presentation documents. GroupDocs.Watermark for Java 18.8 allows including unreadable characters in a text watermark for this purpose.
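
Why this defeats a literal Find and Replace can be shown with a small sketch that interleaves an invisible character into the watermark text. The `WatermarkObfuscator` class is hypothetical, and the choice of the zero-width space (U+200B) is an assumption. The post does not state which characters the library inserts or how it is configured to do so.

```java
// Illustrative only, not the GroupDocs.Watermark API.
final class WatermarkObfuscator {

    // Assumption: a zero-width space (U+200B) serves as the "unreadable" character.
    private static final char ZERO_WIDTH_SPACE = '\u200B';

    // Inserts the invisible character between every pair of adjacent code points.
    static String interleave(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (out.length() > 0) {
                out.append(ZERO_WIDTH_SPACE);
            }
            out.appendCodePoint(codePoint);
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String protectedText = interleave("Confidential");
        // A literal search for the visible text no longer matches the stored text.
        System.out.println(protectedText.contains("Confidential"));
        System.out.println(protectedText.length());
    }
}
```

Because the stored string differs from what the user types into a search box, a literal search for the visible text fails to match it.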

    For a working example, please refer to the following article:

    Enhancements

    Support of SmartArt and CustomXml Drawing Types in Spreadsheet Documents

    The latest version of GroupDocs.Watermark also supports SmartArt and CustomXml drawing types for Excel documents. You can now find and remove SmartArt and CustomXml drawing types from the worksheets.

    For a working example, please refer to the following article:

    Bug Fixes

    GroupDocs.Watermark for Java 18.8 includes the fix for the following issue.

    • Locking watermark in PPTX, PPT is not working

    Breaking Change

    Since version 18.8, the security of Metered licensing has been improved. Metered licensing is now applicable only in Java runtime version 8u101 or above. Please use other types of licensing if you are using v18.8 or above in Java 7.

    Feedback

    As always, if you have any questions or suggestions, feel free to write on our forum.
