A taxonomy, or classification scheme, is a systematic way of identifying and organizing text. When you work with a large volume of text or documents, finding material on a particular topic is hard unless that data has been classified. Classifying text makes it possible to retrieve information quickly.
GroupDocs.Classification for .NET
GroupDocs offers a programmable document and text classification API for .NET developers. You only need to add a single DLL (GroupDocs.Classification for .NET) as a reference in your .NET project. The API supports two taxonomies: IAB-2 (Interactive Advertising Bureau) and the Documents taxonomy.
IAB-2 text classification
The IAB-2 taxonomy categorizes text into multiple topics and identifies it according to the depth level. To classify a piece of text, call the Classify method with the text as its parameter.
This text will be classified as Healthy_Living (IAB-2). Some more examples:
- Sooner or later technology will overcome labor work – Technology_&_Computing (IAB-2)
- This game has better graphics on Xiaomi Note 8 pro mobile – Video_Gaming (IAB-2)
- We need groceries for the next month – Shopping (IAB-2)
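The text classification call described above might look like the following minimal sketch. It reuses the groceries sentence from the list, which this article maps to Shopping. The `Classifier` class, the single-argument `Classify` overload, and the `BestClassName` and `BestClassProbability` properties are written from memory of the GroupDocs.Classification samples, so treat them as assumptions and check them against the API reference for your installed version.

```csharp
using System;
using GroupDocs.Classification; // assumed namespace of the Classifier class

class TextClassificationSketch
{
    static void Main()
    {
        // Assumption: Classifier exposes a parameterless constructor.
        var classifier = new Classifier();

        // Assumption: Classify(string) treats its argument as raw text and
        // applies the IAB-2 taxonomy when no taxonomy is specified.
        var response = classifier.Classify("We need groceries for the next month");

        // Assumption: the response exposes the top category name and its probability.
        Console.WriteLine($"{response.BestClassName} ({response.BestClassProbability})");
    }
}
```

The probability printed next to the category name depends on the model shipped with the library, so it is not shown here.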
The Documents taxonomy identifies document classes such as invoices, CVs, forms, and emails.
To classify a file with the IAB-2 taxonomy, call the Classify method for the “document.pdf” file in the current directory and request the 2 best results.
To use the Documents taxonomy instead, call the Classify method for the “document.doc” file, set the precision/recall balance to “Precision”, and request the 4 best results.
The API also supports classification of password-protected documents.
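The two document classification calls described above could be sketched as follows. This is a hedged illustration, not the library's confirmed signature: the parameter order (file name, folder, number of results, taxonomy, precision/recall balance), the `Taxonomy` and `PrecisionRecallBalance` enum names, the `GroupDocs.Classification.DTO` namespace, and the `BestResults`, `Name`, and `Probability` members are all assumptions. The folder for “document.doc” is not stated in the text, so the current directory is used here for illustration only.

```csharp
using System;
using System.IO;
using GroupDocs.Classification;     // assumed namespace of the Classifier class
using GroupDocs.Classification.DTO; // assumed namespace of the enums and response types

class DocumentClassificationSketch
{
    static void Main()
    {
        var classifier = new Classifier();
        string folder = Directory.GetCurrentDirectory();

        // Assumption: Classify(fileName, folder, bestClassesCount, taxonomy).
        // "document.pdf" with the IAB-2 taxonomy, 2 best results.
        var iabResponse = classifier.Classify("document.pdf", folder, 2, Taxonomy.Iab2);

        // Assumption: a fifth argument selects the precision/recall balance.
        // "document.doc" with the Documents taxonomy, 4 best results.
        var docResponse = classifier.Classify(
            "document.doc", folder, 4, Taxonomy.Documents, PrecisionRecallBalance.Precision);

        // Assumption: BestResults holds the requested top results in ranked order.
        foreach (var result in iabResponse.BestResults)
            Console.WriteLine($"IAB-2: {result.Name} ({result.Probability})");

        foreach (var result in docResponse.BestResults)
            Console.WriteLine($"Documents: {result.Name} ({result.Probability})");
    }
}
```

Favoring “Precision” would be expected to return fewer but more reliable categories, which suits cases where a wrong label costs more than a missing one.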
Below are some helpful resources.
We recommend that you explore these resources and evaluate the API. If you run into any issue, you can raise it on the forum.