Comparison with other types of classifiers

The following table summarizes the advantages and disadvantages of the various classifier types:

|  | Machine Learning | Fingerprinting | Pre-Defined Policies | User-Defined Dictionaries and Regular Expressions |
| --- | --- | --- | --- | --- |
| Coverage | High: Covers any document with semantic similarities to the learned data | Medium: Detects only derivatives of fingerprinted documents | Limited to the existing pre-defined types | Unlimited, providing that the user has properly defined the dictionaries and the regular expressions |
| Accuracy | Depends on the data | Very High | High for data types that are common enough | Medium |
| “Zero-Day” Protection | High | Very Low | High | High |
| Size/Footprint | Medium | High | Low | Low |
| Deployment and Config Effort | Medium (may require some tuning) | Medium | Low | High: requires careful setting and tuning |

For more information on how to use machine learning, see: