Avoid fingerprinting short values

Fingerprinting columns that contain short field values can generate many false-positive incidents, because short values match unrelated content by coincidence.

For numeric fields, we recommend that you fingerprint only values with 5 or more digits (>= 10000) because:

  • 4-digit numbers easily match years (which appear frequently in email)
  • 3-digit numbers are quite common
  • 1- and 2-digit numbers match days of the month

The validation script template removes numbers with values less than the configured minimum (see the Patterns & Phrases section for more details).
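
The filtering rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the product's validation script template: the function names and the `minimum` parameter are hypothetical, and the sketch assumes the column has already been read into a list of strings. It also assumes that entries which are not plain non-negative integers should be dropped.

```python
def keep_numeric_value(value, minimum=10000):
    """Return True if value is a plain non-negative integer >= minimum.

    Hypothetical helper: 'minimum' stands in for the configured minimum.
    """
    digits = value.strip()
    # Assumption: anything that is not made of ASCII digits only
    # (empty strings, signs, decimals, letters) is dropped.
    if not (digits.isascii() and digits.isdigit()):
        return False
    # Compare by value, so a zero-padded entry such as "00042" is
    # treated as 42 and removed.
    return int(digits) >= minimum


def filter_numeric_column(values, minimum=10000):
    """Keep only the entries that are safe to fingerprint."""
    return [v for v in values if keep_numeric_value(v, minimum)]


if __name__ == "__main__":
    column = ["2024", "123", "31", "10000", "987654", "00042"]
    print(filter_numeric_column(column))
```

With the default minimum, years, 3-digit numbers, and day-of-month values are removed, and only the entries with a value of 10000 or more remain.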

Note: If you must fingerprint a numeric column and removing numbers is not an option, make sure that this column is always combined with another column in the policy rule. For example, if it is an account number field, combine it with the Name, Address, or SSN of the person owning the account.

For non-numeric fields, we recommend that you fingerprint values with 4 or more characters. The reasoning is that:

  • 3 letters are commonly used as abbreviations (TLA, three-letter abbreviation)
  • 2 letters match U.S. states, country codes, etc.
  • 1 letter has no real meaning

The validation script template removes non-numeric values that are shorter than the configured minimum length, measured in characters.
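
As a rough illustration of the length rule for non-numeric fields, the Python sketch below keeps only values that meet a minimum character count. It is not the product's validation script template: `filter_text_column` and `min_length` are hypothetical names, the column is assumed to be available as a list of strings, and surrounding whitespace is assumed not to count toward the length.

```python
def filter_text_column(values, min_length=4):
    """Keep only values with at least min_length characters.

    Hypothetical helper: 'min_length' stands in for the configured length.
    Assumption: leading and trailing whitespace is ignored when measuring.
    """
    return [v for v in values if len(v.strip()) >= min_length]


if __name__ == "__main__":
    column = ["CA", "IBM", "x", "Smith", "Ngo ", "Khan"]
    print(filter_text_column(column))
```

With the default of 4 characters, state codes, three-letter abbreviations, and single letters are removed before the column is fingerprinted.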

Note: If you must fingerprint a non-numeric column and removing values is not an option, make sure that this column is always combined with another column in the policy rule. For example, if it is a last name field, combine it with the first name, address, or SSN of the same person. Regardless, do NOT fingerprint fields shorter than 3 characters.