Classifying Content

Forcepoint DLP policies use content classifiers to describe the data that is being protected. Content can be classified according to file properties, key phrases, scripts, regular expression (regex) patterns, and dictionaries. Forcepoint DLP can also fingerprint data, or administrators can provide examples of the type of data to protect so the system can learn from them and make decisions.

Use the Main > Policy Management > Content Classifiers page to start classifying data.

To start, select one of the listed content classifiers.

Classifier: Description

Attributes
  • Patterns & Phrases: Classify data using regex patterns, key phrases, dictionaries, and scripts. Regex patterns are used to identify alphanumeric strings of a certain format, such as 123-45-6789.
  • File Labeling: Classify data by using the labeling system(s).
  • File properties: Classify data by file name, type, or size. File name identifies files by their extension. File type identifies files by their magic number (an internal identifier).

Fingerprints
  • File fingerprinting: Fingerprint files or directories, including Microsoft SharePoint and IBM Domino directories.
  • Database fingerprinting: Fingerprint database records directly from your database table, Salesforce table, or CSV file.

Machine Learning
  • Machine learning: Provide examples of the data to protect, so the system can learn from them and identify data of a similar nature.
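
To make the attribute classifiers more concrete, the following is a minimal Python sketch of the two ideas described above: matching a string format such as 123-45-6789 with a regex, and telling a file's name (extension) apart from its type (magic number). It is an illustration only, not Forcepoint code. The pattern, the function names, and the short magic-number table are all assumptions chosen for the example.

```python
import re

# Hypothetical pattern for strings shaped like 123-45-6789 (3-2-4 digits).
FORMAT_3_2_4 = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# A few well-known magic numbers (leading bytes). Illustrative, not exhaustive.
MAGIC_NUMBERS = {
    b"%PDF-": "pdf",
    b"PK\x03\x04": "zip",  # also the container format behind .docx and .xlsx
    b"\x89PNG\r\n\x1a\n": "png",
}


def find_pattern_matches(text: str) -> list[str]:
    """Return every substring that has the 3-2-4 digit format."""
    return FORMAT_3_2_4.findall(text)


def type_by_name(filename: str) -> str:
    """Classify by extension only; this trusts whatever the file is called."""
    return filename.rsplit(".", 1)[-1].lower() if "." in filename else ""


def type_by_magic(data: bytes) -> str:
    """Classify by leading bytes; renaming the file does not change the result."""
    for magic, kind in MAGIC_NUMBERS.items():
        if data.startswith(magic):
            return kind
    return "unknown"
```

In this sketch, a PDF that has been renamed to report.txt is reported as "txt" by type_by_name but still as "pdf" by type_by_magic, which is why the two file properties are listed separately.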

Forcepoint provides predefined classifiers for the most common use cases. These are described in Predefined Policies and Classifiers.

To classify content, administrators can:

  • Select one of the predefined classifiers.
  • Customize a classifier as needed.
  • Create a new classifier from scratch.

Important: After classifying content, add the content classifier to a rule and policy; otherwise, it has no effect. You are prompted to do this when you create a new classifier.

The diagram below illustrates the granularity of each content classifier.

After classifying data, create a rule containing the content classifier and the conditions under which content should be considered a match. For example, a rule can trigger an incident if the content contains 3 keywords and an attachment over 2 MB. In the rule, you also define the sources and destinations to analyze.
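
The example condition above (3 keywords and an attachment over 2 MB) can be sketched in Python as follows. This is a rough illustration of how two conditions combine into one match decision, not how Forcepoint DLP evaluates rules. The Message class, the function names, the reading of "3 keywords" as three distinct keywords, and the definition of 1 MB as 1024 * 1024 bytes are all assumptions made for the example.

```python
from dataclasses import dataclass, field

MB = 1024 * 1024  # assumption: binary megabytes


@dataclass
class Message:
    """Hypothetical container for the content being analyzed."""
    body: str
    attachment_sizes: list[int] = field(default_factory=list)  # sizes in bytes


def keyword_hits(text: str, keywords: set[str]) -> int:
    """Count how many distinct keywords appear in the text, ignoring case."""
    lowered = text.lower()
    return sum(1 for kw in keywords if kw.lower() in lowered)


def is_incident(
    msg: Message,
    keywords: set[str],
    min_keywords: int = 3,
    min_attachment_bytes: int = 2 * MB,
) -> bool:
    """Match only when BOTH conditions hold: enough keywords and a large attachment."""
    enough_keywords = keyword_hits(msg.body, keywords) >= min_keywords
    large_attachment = any(size > min_attachment_bytes for size in msg.attachment_sizes)
    return enough_keywords and large_attachment
```

With a keyword set such as {"confidential", "salary", "merger"}, a message whose body contains all three words and carries a 3 MB attachment matches; the same body with a 1 MB attachment does not.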

Note that the system does not analyze all types of data. For example, it does not analyze the metadata of plain text files or the data inside Windows .cab files.

Before creating a database fingerprinting classifier, read the Preparing for database fingerprinting and Creating a validation script sections.

Forcepoint DLP automatically runs validation scripts on your new database fingerprinting classifiers if the scripts are set up properly.