2 min read

M-Files Smart Classifier - Overview

M-Files offers several Intelligence Services to improve document tagging, recognition, and creation. These services include the M-Files Smart Extractor, Smart Metadata, and today's topic: Smart Classifier.

What is Smart Classifier?

The purpose of Smart Classifier is to equip your Document Management System with tools that automatically provide document class suggestions upon data ingestion. Each time a document is added to your vault, its content and structure are analyzed and compared to existing documents. M-Files then provides a suggestion for the classification of the new document based on data trends within your vault.

Each time a new document is classified, M-Files goes through an internal process of learning and validation, and over time it gets better at analyzing content and suggesting classes. This internal process consists of a background operation that periodically samples a document learning set, using a naïve Bayes classifier to determine results.
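To make the naïve Bayes idea more concrete, here is a minimal sketch in Python. It is not M-Files code and does not reflect how Smart Classifier is implemented internally; the class name, method names, and sample texts are all hypothetical, and it looks only at words, whereas Smart Classifier considers broader document characteristics. It simply shows how a classifier of this family can learn from already-classified documents and then suggest a class for a new one.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple tokenizer)."""
    return text.lower().split()


class NaiveBayesSuggester:
    """Toy multinomial naive Bayes over word counts (hypothetical, not an M-Files API)."""

    def __init__(self):
        self.class_doc_counts = Counter()        # documents seen per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocabulary = set()                  # every word seen during training

    def train(self, text, label):
        """Learn from one document whose class is already known."""
        self.class_doc_counts[label] += 1
        for word in tokenize(text):
            self.word_counts[label][word] += 1
            self.vocabulary.add(word)

    def suggest(self, text):
        """Return the class with the highest log-probability for the text."""
        total_docs = sum(self.class_doc_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, doc_count in self.class_doc_counts.items():
            # Prior: how common this class is among the training documents.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                # Laplace (add-one) smoothing so unseen words do not zero out a class.
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical training documents for two classes.
model = NaiveBayesSuggester()
model.train("invoice total amount due payment", "Invoice")
model.train("invoice number amount tax total", "Invoice")
model.train("agreement between parties term termination", "Contract")
model.train("contract agreement parties signature", "Contract")

print(model.suggest("amount due on invoice"))
```

The sketch also hints at why consistent vocabulary within a class and clear differences between classes matter: the suggestion depends entirely on how distinct each class's learned statistics are.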

Smart Classifier is not designed to extract data like an OCR engine would. Instead, M-Files looks at the overall characteristics of a document, such as headers, logos, page counts, pixel gradients, language family, and vernacular. This means that the documents that use this service must have the following attributes:

  • Consistent internal structure and vernacular
  • Noticeable differences from other intelligently analyzed classes

When these two attributes do not apply to the document classes with which you want to use Smart Classifier, M-Files may confuse the documents and suggest the incorrect class. In addition, for proper recognition and analysis of a document class, it's important to prime your vault with samples of each class you'd like to train, usually 50-75 documents per class.
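Before enabling Smart Classifier, it can help to check which classes already have enough samples. The Python sketch below assumes you have obtained per-class document counts by some means (for example, from a vault report); the function name, the constant, and the counts are hypothetical, and the threshold of 50 is simply the lower end of the 50-75 guideline above, not an official M-Files setting.

```python
# Lower end of the 50-75 guideline; an assumption you may want to raise.
MIN_SAMPLES_PER_CLASS = 50


def classes_needing_samples(sample_counts, minimum=MIN_SAMPLES_PER_CLASS):
    """Return {class_name: shortfall} for every class below the minimum."""
    return {
        name: minimum - count
        for name, count in sample_counts.items()
        if count < minimum
    }


# Hypothetical counts of already-classified documents per class.
counts = {"Invoice": 62, "Contract": 18, "Purchase Order": 50}
print(classes_needing_samples(counts))
```

Classes reported here are candidates for more priming before you rely on their suggestions.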

When Is Smart Classifier Not Viable?

There are a few scenarios where Smart Classifier may not be the best option for use with particular classes, and will not provide optimal results:

  • The document layout and design vary drastically within the same class
  • The contents of two or more document classes are very similar to each other
  • The language used within a document class differs (e.g., English and French)

In these cases, do not enable Smart Classifier for the affected document classes, and exclude those classes from the document training set.

In addition, Smart Classifier often comes up in conversations with clients about initial document migration efforts. As mentioned previously, though, Smart Classifier requires that each class it identifies already have documents within the system, because those documents are used to initially train the intelligence service. It is therefore not a fit for the very start of a migration, but it can be used later in your migration efforts once your vault is primed.

Interested in learning more? Check out our other insights from TEAM IM, or reach out to us on our website at www.teamim.com
