Comparison with other types of classifiers
The following table summarizes the advantages and disadvantages of the various classifier types:
Criterion | Machine Learning | Fingerprinting | Pre-Defined Policies | User-Defined Dictionaries and Regular Expressions |
Coverage | High: Covers any document with semantic similarities to the learned data | Medium: Detects only derivatives of fingerprinted documents | Limited to the existing pre-defined types | Unlimited, provided that the user has properly defined the dictionaries and the regular expressions |
Accuracy | Depends on the data | Very High | High for data types that are common enough | Medium |
“Zero-Day” Protection | High | Very Low | High | High |
Size/Footprint | Medium | High | Low | Low |
Deployment and Configuration Effort | Medium (may require some tuning) | Medium | Low | High (requires careful setup and tuning) |
For more information on how to use machine learning, see: