Detectors

Detectors let users match and categorize files during scans based on regular expressions or keywords (positive or negative terms). They use AI and ML search techniques, such as Fuzzy Word Search and Percolation, to search through documents much more quickly than a traditional pattern-matching search.

Three types of detectors are available:

  • Path: Search within the file path (location, name, and extension).
  • Content: Search within the contents of files.
  • Attribute: Define criteria for "critical" and "sensitive" based on attributes in scan results such as ML output or path and content detector hits.

An example of a detector that a user could set up is “Employee Salary”. A user might want to ensure that documents containing this information are not shared publicly or circulated widely within the organization.
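As a rough illustration of how the three detector types could relate to the “Employee Salary” example, the sketch below models each type as a plain Python function. Everything in it is an assumption made for illustration: the `ScannedFile` record, the `publicly_shared` attribute, both patterns, and the rule that maps hits to "critical" or "sensitive" are hypothetical and do not reflect the product's actual configuration format or its Lucene-based matching.

```python
import re
from dataclasses import dataclass


@dataclass
class ScannedFile:
    """Hypothetical scan result; not the product's real data model."""
    path: str
    content: str
    publicly_shared: bool  # assumed attribute, for illustration only


# Hypothetical patterns written in Python regex syntax (not Lucene syntax).
PATH_PATTERN = re.compile(r"(salary|payroll)[^/]*\.(xlsx|csv|pdf)$", re.IGNORECASE)
CONTENT_PATTERN = re.compile(r"\b(annual salary|base pay)\b", re.IGNORECASE)


def path_detector(f: ScannedFile) -> bool:
    # Path detector: looks only at the location, name, and extension.
    return PATH_PATTERN.search(f.path) is not None


def content_detector(f: ScannedFile) -> bool:
    # Content detector: looks only at the text inside the file.
    return CONTENT_PATTERN.search(f.content) is not None


def attribute_detector(f: ScannedFile):
    # Attribute detector: derives a label from other scan attributes,
    # here the path/content hits plus the assumed sharing flag.
    hit = path_detector(f) or content_detector(f)
    if hit and f.publicly_shared:
        return "critical"
    if hit:
        return "sensitive"
    return None
```

In this sketch a publicly shared file named `salary_review.xlsx` would be labeled "critical", while an unshared document that merely mentions an annual salary would be labeled "sensitive".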

Note: Pattern matching and Detectors use different regex syntax. Pattern matching uses Go (Golang) regular expression syntax, whereas Detectors use the Apache Lucene regular expression format.
Note: The tokens added to a detector are combined as an OR condition: a match on any one token counts as a hit. AND conditions are not available in detectors, but equivalent behavior can be configured indirectly through the data asset registry or directly through regex pattern matching.
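The difference between OR-combined tokens and an AND condition expressed as a single regex can be sketched as follows. This is a minimal illustration in Python, not the product's implementation: the token list, function names, and pattern are hypothetical. The AND pattern lists both orderings with alternation rather than lookaheads, because Go's regexp package does not support lookahead; the pattern as written uses only constructs that Go's syntax also accepts.

```python
import re

# Hypothetical detector tokens. Any single match counts as a hit (OR).
TOKENS = ["salary", "payroll", "compensation"]


def detector_hit(text: str, tokens=TOKENS) -> bool:
    """Return True if at least one token appears in the text."""
    lowered = text.lower()
    return any(token in lowered for token in tokens)


# AND expressed as one regex: both words must appear, in either order.
# (?i) ignores case; (?s) lets "." match across line breaks.
BOTH_TERMS = re.compile(r"(?is)employee.*salary|salary.*employee")


def both_terms_hit(text: str) -> bool:
    """Return True only if both 'employee' and 'salary' appear."""
    return BOTH_TERMS.search(text) is not None
```

With these definitions, a document containing only the word "payroll" triggers `detector_hit`, whereas `both_terms_hit` requires both "employee" and "salary" to be present.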

Importing Detectors

Note: Detectors are installed by default through the Quick Start page, so importing them is normally not required.
  1. Navigate to Administration > Detectors.
  2. Click Import from file.
  3. Select the detectors file to import. A new set of detectors will be imported into your environment.

Similarly, you can click the EXPORT button to export a list of detectors.