Pattern classifiers

This section lists the predefined pattern classifiers. Administrators can also create new classifiers.

For more information, see Adding or editing a regular expression classifier.

Classifier Description
.htpasswd File Name Detection of file names associated with .htpasswd files.
.REG File Detection of the content of .REG files (Windows Registry entries).
10 Digit Account Number with support Detection of any 10 digit number in proximity to an account number support term (can be used for various account types as long as they are 10 digits).
5-8 Digit Account Number with support Detection of any 5-8 digit number in proximity to an account number support term (can be used for various account types as long as they are 5-8 digits).
5-9 Digit Account Number Detection of any 5-9 digit account numbers.
9 Digit Account Number with support Detection of any 9 digit number in proximity to an account number support term (can be used for various account types as long as they are 9 digits).
Account 5 to 8 digits Detection of 5-8 digit account numbers.
Account and Password Detection of a 5-10 digit account number, in proximity to a password with a password related term next to it.
Account Number 5-9 digits, with Hebrew or English Support Detection of any 5-9 digit account numbers, when found in proximity to account related terms in English or Hebrew.
Account Number 6-13 digits Detection of 6-13 digit account numbers.
Account Number 6-13 digits near Account Number Terms in Hebrew and English Detection of 6-13 digit account numbers, in proximity to account related terms in English or Hebrew.
Account Number Terms Hebrew and English Support Detection of account terms in English or Hebrew.
Argentina Swift codes Detection of SWIFT codes for Argentina major banks.
Australia Swift codes Detection of SWIFT codes for Australia major banks.
Australian Bank Account Numbers Detects Australian bank account numbers. Looks for 6- to 10-digit numbers. For example: 1234567
Australian Bank Account support terms Detects Australian bank account support terms. For example: Acc. no., account number
Austria Swift codes Detection of SWIFT codes for Austria major banks.
Bahrain Swift codes Detection of SWIFT codes for Bahrain major banks.
Base64/Hexadecimal Characters Block Detection of a block of Base64 or Hexadecimal Characters.
Belgium Swift codes Detection of SWIFT codes for Belgium major banks.
Belgium: Passports Detection of Belgian passport numbers.
Brazil Swift codes Detection of SWIFT codes for Brazil major banks.
Brazil: RG Numbers Terms Detection of RG (Registro Geral) related terms.
C++ Source Code Extensions Detection of file extensions associated with C++ source code files.
CAD igs text format Detection of CAD igs text files.
Canada Swift codes Detection of SWIFT codes for Canada major banks.
Canadian Driver License Support Detection of Canadian driver license support terms.
Canadian Government ID Detection of Canadian Government IDs.
Canadian Indian Status Detection of Canadian Indian Status Numbers.
Canadian Permanent Resident Detection of Canadian Permanent Resident Numbers.
CCN support terms Detection of credit card support terms.
Chile Swift codes Detection of SWIFT codes for Chile major banks.
China Swift codes Detection of SWIFT codes for China major banks.
Clinical Trial Numbers Detection of numbers likely to appear in ‘Clinical Trial’ documents.
Colombia Swift codes Detection of SWIFT codes for Colombia major banks.
Confidential Arabic Terms in Header/Footer Detection of documents with terms in English or Arabic indicating confidentiality in the header or footer.
Confidential Header/Footer Detection of documents with terms indicating confidentiality in the header or footer.
CUI Banner Marking (Wide) Detection of Controlled Unclassified Information (CUI) banner markings, including the case-insensitive phrases “CONTROLLED” or “CUI”. For example: “Controlled”.
CUI Banner Marking (Default) Detection of Controlled Unclassified Information (CUI) banner markings. The classifier will be triggered by the case-insensitive terms “CONTROLLED” or “CUI” followed by category markings and/or limited dissemination control markings. For example: “CONTROLLED//SP-ADJ//FEDCON”.
CUI Banner Marking in Header/Footer (Wide) Detection of Controlled Unclassified Information (CUI) banner markings in the header or footer part of a file, including the case-insensitive phrases “CONTROLLED” or “CUI”. For example: “Controlled”. Header and footer are extracted as such for some OpenDocument and Microsoft Office file formats (.doc, .docx, .odp, .ods, .odt, .pptx, .xls, .xlsx).
CUI Banner Marking in Header/Footer (Default) Detection of Controlled Unclassified Information (CUI) banner markings in the header or footer part of a file. The classifier will be triggered by the case-sensitive terms "CONTROLLED" or "CUI" or by the same case-insensitive terms, followed by category markings and/or limited dissemination control markings. For example: "CONTROLLED". Header and footer are extracted as such for some OpenDocument and Microsoft Office file formats (.doc, .docx, .odp, .ods, .odt, .pptx, .xls, .xlsx).
CUI Banner Marking in Header/Footer (Narrow) Detection of Controlled Unclassified Information (CUI) banner markings in the header or footer part of a file. The classifier will be triggered by the case-insensitive terms "CONTROLLED" or "CUI", followed by category markings and/or limited dissemination control markings. For example: "CONTROLLED//SP-ADJ//FEDCON". Header and footer are extracted as such for some OpenDocument and Microsoft Office file formats (.doc, .docx, .odp, .ods, .odt, .pptx, .xls, .xlsx).
CUI Designation Indicator (Wide) Detection of Controlled Unclassified Information (CUI) designation indicators, which may or may not include parameters after each field. For example: "Controlled by: CUI Category(ies): Limited Dissemination Control: POC:".
CUI Designation Indicator (Default) Detection of Controlled Unclassified Information (CUI) designation indicators, including parameters for all fields. For example: "Controlled by: CL&S INFOSEC CUI Category(ies): NNPI Limited Dissemination Control: NOFORN POC: John Brown, 703-555-0123".
CUI Portion Marking (Wide) Detection of Controlled Unclassified Information (CUI) portion markings, including the case-insensitive string "(CUI)". For example: "(cui)".
CUI Portion Marking (Default) Detection of Controlled Unclassified Information (CUI) portion markings. The classifier will be triggered by parentheses containing the case-sensitive string "CUI" or by the case-insensitive string, followed by category markings and/or limited dissemination control markings. For example: "(CUI)".
CUI Portion Marking (Narrow) Detection of Controlled Unclassified Information (CUI) portion markings. The classifier will be triggered by parentheses containing the case-insensitive term "CUI", followed by category markings and/or limited dissemination control markings. For example: "(CUI//SP-ADJ//FEDCON)".
Cyber Bullying Detection of expressions that are indicative of cyber bullying.
Cypriot Tax Identification Code Detects Cypriot tax identification codes. For example: “12000017M”.
Czech Republic Swift codes Detection of SWIFT codes for Czech Republic major banks.
Date Of Birth without support term Detection of possible dates of birth, without support terms.
Deep Web URLs: .i2p (Wide) Detects URLs that appear in analyzed content such as textual documents or email messages and end with the .i2p pseudo-top-level domain. For example: The string “forum.i2p”.
Deep Web URLs: .i2p (Default) Detects URLs that appear in analyzed content such as textual documents or email messages, begin with “http/s” and end with the .i2p pseudo-top-level domain. For example: The string “http://hosts.i2p”.
Deep Web URLs: .onion Detects URLs that appear in analyzed content such as textual documents or email messages and end with the .onion pseudo-top-level domain designating an anonymous hidden service reachable via the Tor network. For example: The string “i4rx33ibdndtqayr.onion”.
Denmark Swift codes Detection of SWIFT codes for Denmark major banks.
Disgruntled Employee Detects expressions that are indicative of disgruntled employees. For example: “I hate my boss”, “I am miserable at my job”.
Driver License Support Detection of driver license support terms.
Driver License: Alaska Detection of Alaska driver license.
Driver License: Alberta Detection of Alberta driver license.
Driver License: Arizona Detection of Arizona driver license.
Driver License: Arkansas Detection of Arkansas driver license.
Driver License: Australia Detection of Australian driver license.
Driver License: British Columbia Detection of British Columbia driver license.
Driver License: California Detection of California driver license.
Driver License: Canada all patterns Detection of various Canadian driver license formats.
Driver License: Colorado Detection of Colorado driver license.
Driver License: Connecticut Detection of Connecticut driver license.
Driver License: Florida Detection of Florida driver license.
Driver License: Georgia Detection of Georgia driver license.
Driver License: Hawaii Detection of Hawaii driver license number.
Driver License: Idaho Detection of Idaho driver license.
Driver License: Illinois Detection of Illinois driver license.
Driver License: Kansas Detection of Kansas driver license number.
Driver License: Louisiana Detection of Louisiana driver license.
Driver License: Maine Detection of Maine driver license.
Driver License: Manitoba Detection of Manitoba driver license.
Driver License: Maryland Detection of Maryland driver license.
Driver License: Michigan Detection of Michigan driver license.
Driver License: Minnesota Detection of Minnesota driver license.
Driver License: Montana Detection of Montana driver license.
Driver License: New Brunswick Detection of New Brunswick driver license.
Driver License: New Hampshire Detection of New Hampshire driver license.
Driver License: New Jersey Detection of New Jersey driver license.
Driver License: New York Detection of New York driver license.
Driver License: Newfoundland and Labrador Detection of Newfoundland and Labrador driver license.
Driver License: North Carolina Detection of North Carolina driver license.
Driver License: Nova Scotia Detection of Nova Scotia driver license.
Driver License: Ohio Detection of Ohio driver license.
Driver License: Oklahoma Detection of Oklahoma driver license.
Driver License: Ontario Detection of Ontario driver license.
Driver License: Oregon Detection of Oregon driver license.
Driver License: Pennsylvania Detection of Pennsylvania driver license.
Driver License: Prince Edward Island Detection of Prince Edward Island driver license.
Driver License: Quebec Detection of Quebec driver license.
Driver License: Rhode Island Detection of Rhode Island driver license.
Driver License: Saskatchewan Detection of Saskatchewan driver license.
Driver License: Tennessee Detection of Tennessee driver license.
Driver License: Texas Detection of Texas driver license.
Driver License: US all patterns Detection of various US driver license formats.
Driver License: US all patterns with Support Detection of various US driver license formats, with support term in proximity.
Driver License: Washington Detection of Washington driver license.
Driver License: Wisconsin Detection of Wisconsin driver license.
EIN pattern Detection of Employer ID Numbers (EIN).
Energy File Extensions Detection of files containing petrophysical data.
Energy Logs and Survey Reports Detection of terms related to Prospecting Logs and Survey Reports.
England Swift codes Detection of SWIFT codes for England major banks.
Estonia Swift codes Detection of SWIFT codes for Estonia major banks.
EU National Insurance Number Detection of various European national insurance number formats (UK NINO, French INSEE, Spanish DNI, Italian Codice Fiscale).
F# Source Code Extensions Detection of F# files according to their extension.
Finland Swift codes Detection of SWIFT codes for Finland major banks.
Form 10-Q Phrases Detection of Form 10-Q phrases.
Form W-2 Header Detection of terms taken from the Form W-2 Header (e.g., “FORM W 2” or “Form W-2”).
France Swift codes Detection of SWIFT codes for France major banks.
German ID Number (Wide) Detects German ID numbers, which consist of 9 characters that include both letters and digits. For example: "L9V3K744K".
Germany Swift codes Detection of SWIFT codes for Germany major banks.
Greece Swift codes Detection of SWIFT codes for Greece major banks.
Hong Kong: Address in Chinese (Wide) Permissive detection of Hong Kong address in Chinese.
Hong Kong: Address in Chinese (Default) Detection of Hong Kong address in Chinese.
Hong Kong: Address in English (Wide) Permissive detection of Hong Kong address in English. For example: "5 Edinburgh Place, Central".
Hong Kong: Address in English (Default) Detection of Hong Kong address in English. For example: "5 Edinburgh Place, Central District".
Hong Kong: Address in English (Narrow) Restrictive detection of Hong Kong address in English. For example: "5 Edinburgh Place, Hong Kong Island, Hong Kong".
Hong Kong Swift codes Detects SWIFT codes for Hong Kong major banks.
Hungary Swift codes Detection of SWIFT codes for Hungary major banks.
ICD9 Codes Detection of ICD9 codes.
Identification Number with proximity Detection of Identification numbers, when found in proximity to identification related terms.
IL buy or sell instructions Detection of buy and sell instructions in Hebrew.
IL buy or sell instructions support Detection of buy and sell support instructions in Hebrew.
IL Insurance Policy: 10 digits Detection of 10 digit policy numbers.
IL Insurance Policy: 10 digits - support Detection of 10 digit Israeli Insurance Numbers with terms in proximity.
IL Insurance Policy: 8 digits Detection of 8 digit policy numbers.
IL Insurance: Claim support Detection of terms to support identification of an Israeli Insurance Claim Number.
IL Insurance: Generic with proximity Detection of a generic Israeli Insurance Number with terms in proximity.
IL Life Insurance support Detection of insurance terms in Hebrew.
Illinois: State ID Detection of Illinois state ID.
India Swift codes Detection of SWIFT codes for India major banks.
India: Form 16 Headings Detection of India Form 16 headings.
India: PAN Detection of Indian PAN number.
Indonesia Swift codes Detection of SWIFT codes for Indonesia major banks.
Indonesian Single Identity Numbers (Wide) Detects valid 16-digit delimited or un-delimited Indonesian Single Identity Numbers (Nomor Induk Kependudukan) without limitations on the first 2 digits (Province code). For example: “3313034604790001”.
IP Address Detection of an IP Address.
IP Address - Narrow Detection of an IP address, when found in proximity to IP related term such as “IP” or “subnet”.
IP Address - Wide Detection of all possible forms of IP addresses.
Ireland Account Pattern Detection of Irish bank account numbers.
Ireland Account Terms Detection of Ireland Bank Account Terms.
Ireland Swift codes Detection of SWIFT codes for Ireland major banks.
Irish Drivers License Detection of Irish driver’s license.
Irish Passport Detection of Irish Passport numbers.
Irish PPS Terms Detection of terms related to Irish PPS (Personal Public Service) number.
Israel Swift codes Detection of SWIFT codes for Israel major banks.
Italian Phone Number (Wide) Detects 9- to 11-digit Italian telephone numbers (landline and mobile). For example: "+39-06-555-5555".
Italy Swift codes Detection of SWIFT codes for Italy major banks.
Japan Swift codes Detection of SWIFT codes for Japan major banks.
Japan: Account Detection of a Japanese account number.
Japan: Account 1st Format Detection of a Japanese account number.
Java Source Code Extensions Detection of file extensions associated with Java source code files.
Korea Republic Swift codes Detection of SWIFT codes for Korea Republic major banks.
Kotlin Source Code Extensions Detection of file extensions associated with Kotlin source code files.
Latvia Swift codes Detection of SWIFT codes for Latvia major banks.
Lithuania Swift codes Detection of SWIFT codes for Lithuania major banks.
Luxembourg Swift codes Detection of SWIFT codes for Luxembourg major banks.
Macau ID - formal Detection of Macau ID number (formal form).
Macau ID - non formal Detection of Macau ID number (non formal form).
Malaysian Swift codes Detection of SWIFT codes for Malaysia major banks.
Maltese Identity Card Number Detects Maltese identity card numbers. For example: “19999981M”.
Manuscript Terms 1 Detection of manuscript patterns.
Manuscript Terms 2 Detection of manuscript related terms.
Manuscript Terms 3 Detection of manuscript terms that support detection of manuscripts.
Mexico CPISP (Clave Personal Interna del Servidor Publico) Detection of Mexico CPISP.
Mexico Swift codes Detection of SWIFT codes for Mexico major banks.
Microsoft License Keys Detection of Microsoft license keys.
MySQL-Format Database Dump (Wide) Detection of textual MySQL-format database dumps using lenient heuristics. For example: CREATE TABLE ‘users’ (‘username’ varchar(16) NOT NULL,’password’ varchar(16) NOT NULL, PRIMARY KEY (‘id’) ); INSERT INTO ‘users’ VALUES (‘John’,’QWERTY12’),(‘Jane’,’QWERTY13’);.
MySQL-Format Database Dump (Default) Detection of textual MySQL-format database dumps using strict heuristics. For example: CREATE TABLE ‘users’ (‘username’ varchar(16) NOT NULL, ‘password’ varchar(16) NOT NULL, PRIMARY KEY (‘id’) ); INSERT INTO ‘users’ VALUES (‘John’,’QWERTY12’),(‘Jane’,’QWERTY13’);.
Netherlands Swift codes Detection of SWIFT codes for Netherlands major banks.
Netherlands Passport Numbers Detection of Passport Numbers of the Netherlands.
Netherlands: Bank Account Terms Detection of Dutch Bank Account related terms.
Network Terms Detection of network related terms.
Network Terms and IP Addresses Detection of network related terms and IP addresses.
New Zealand Swift codes Detection of SWIFT codes for New Zealand major banks.
Norway Swift codes Detection of SWIFT codes for Norway major banks.
Password as URL parameter Detection of password as URL parameter.
Passwords pattern Detection of common passwords, maximum of 300 Passwords.
Pattern - 10 digits non delimited Pattern - 10 digits non-delimited.
Perl Source Code Extensions Detection of Perl files according to their extension.
Peru Swift codes Detection of SWIFT codes for Peru major banks.
Philippines Swift codes Detection of SWIFT codes for Philippines major banks.
Physical Information - Blood Type Detection of Private Physical Information - Blood Type (English/Hindi).
Physical Information - Build Detection of Private Physical Information - Build (English/Hindi).
Physical Information - Complexion Detection of Private Physical Information - Complexion (English/Hindi).
Physical Information - Eye Color Detection of Private Physical Information - Eye Color (English/Hindi).
Physical Information - Hair Color Detection of Private Physical Information - Hair Color (English/Hindi).
Physical Information - Height Detection of Private Physical Information - Height (English/Hindi).
Physical Information - Sex Detection of Private Physical Information - Sex (English/Hindi).
Physical Information - Weight Detection of Private Physical Information - Weight (English/Hindi).
Poland Swift codes Detection of SWIFT codes for Poland major banks.
Polish ID support terms Detection of terms related to Polish ID number.
Polish Name Detection of a Polish name.
Polish NIP support terms Detection of terms related to Polish NIP number (a number used for tax identification).
Portugal Swift codes Detection of SWIFT codes for Portugal major banks.
Prices with Currencies Detection of a price in various currencies.
Problem Gambling Detection of expressions that are indicative of problem gambling. For example: “I am addicted to gambling”, “My gambling is out of control”.
Proprietary Header/Footer Detection of documents with terms indicating proprietary data in the header or footer.
Protein pattern Detection of Protein patterns.
Python Source Code Extensions Detection of file extensions associated with Python source code files.
Romania Swift codes Detection of SWIFT codes for Romania major banks.
Russia Swift codes Detection of SWIFT codes for Russia major banks.
Russian Passport - no terms Detection of a Russian passport, ignoring terms.
Russian Passport - significant Detection of a Russian passport with a passport term in proximity.
Russian Passport Filter for Spreadsheet Files Detection of a salient part of a Russian passport.
Russian phone numbers pattern (optional delimiters) - wide Detection of Russian phone numbers with optional delimiters (including period).
Russian phone numbers pattern (with delimiters) Detection of delimited Russian phone numbers.
Saudi Arabia Swift codes Detection of SWIFT codes for Saudi Arabia major banks.
Security Accounts Manager (SAM) Files (Registry) Detection of SAM textual files as they appear in the Windows registry.
Singapore Swift codes Detection of SWIFT codes for Singapore major banks.
Slovak ID Number (Wide) Detects Slovak ID numbers, which consist of 2 letters and 6 digits. For example: "EN470543".
Slovakia Swift codes Detection of SWIFT codes for Slovakia major banks.
Slovenia Swift codes Detection of SWIFT codes for Slovenia major banks.
Social Security Numbers Pattern Detection of Social Security Numbers.
Social Security Numbers Pattern (with prefixes) Detection of Social Security Numbers.
Social Security Numbers Terms Detection of Social Security Numbers support terms.
Source Code Extensions Detection of C and Java files according to their extension.
South Africa Swift codes Detection of SWIFT codes for South African major banks.
SPICE Source Code - Constant Declaration Detection of constants declaration in the SPICE programming language.
SPICE Source Code - Simulator Language Declaration Detection of a SPICE simulator language declaration.
SPICE Source Code - Sub-Circuit Declaration Detection of a Sub-Circuit declaration in the SPICE programming language.
SPICE Source Code - Various Key Words 1 Detection of various keywords in the SPICE programming language.
SPICE Source Code - Various Key Words 2 Detection of various keywords in the SPICE programming language.
SPICE Source Code - Various Key Words 3 Detection of various keywords in the SPICE programming language.
SSN or TIN in an IRS Form Detection of a Social Security Number or a Taxpayer Identification Number near a related term in a format common to IRS forms.
Suicidal thoughts (Default) Detection of expressions that are indicative of suicidal thoughts.
Sweden Swift codes Detection of SWIFT codes for Sweden major banks.
Swift Source Code Extensions Detection of file extensions associated with Swift source code files.
Swiss AHV Number (New Format) Detection of a Swiss AHV (Swiss Social Security) number in its new format (introduced on July 1, 2008).
Swiss AHV Number (Old Format) Detection of a Swiss AHV (Swiss Social Security) number in its old format.
Switzerland Swift codes Detection of SWIFT codes for Switzerland major banks.
Taiwan Swift codes Detection of SWIFT codes for Taiwan major banks.
Thailand Swift codes Detection of SWIFT codes for Thailand major banks.
Turkey Swift codes Detection of SWIFT codes for Turkey major banks.
UK Bank Account Numbers Detection of United Kingdom bank account numbers.
UK Bank Account support terms Detection of UK bank account support terms.
UK Bank Sort Codes Detection of United Kingdom sort codes. This classifier may cause false positives.
UK National Insurance Number Detection of a UK national insurance number (NINO).
UK National Insurance Number - no proximity Detection of a UK national insurance number (NINO) without terms in proximity.
UK Passport number Detection of a UK passport number.
UK Postal Codes Detection of the postal codes used in the United Kingdom according to the BS 7666 postal code format rules.
UK Tax ID Detection of a UK tax ID number.
Ukrainian ID Number (Wide) Detects Ukrainian ID numbers, which consist of 2 letters and 6 digits. For example: "KM456986".
United States Swift codes Detection of SWIFT codes for United States major banks.
US Grades Detection of grades in proximity to an academic subject.
US ITIN Detection of Individual Taxpayer Identification Number (ITIN).
UTM distances Detection of numbers representing distance in meters as used in the UTM coordinate system.
Verilog Source Code - Entire Module Declaration Detection of Verilog source code - looking for an entire Verilog module declaration.
Verilog Source Code - Module Header Declaration Detection of Verilog source code - looking for Verilog module declaration (header only).
VHDL Source Code - Declaration Footer Detection of VHDL source code - looking for a terminating declaration of Architecture, Component, Process or Entity.
VHDL Source Code - Use Statement Detection of VHDL source code - looking for a use statement declaration.
Vietnam CMND Number Detects valid 9-digit delimited or un-delimited Vietnamese CMND numbers. For example: 331-147-981.
W-2 Form support terms 2 Detection of terms taken from the W-2 Form Header (like “FORM W 2” or “Form W-2”).
Year Period Detection of a period denoted by starting year and ending year (e.g., 1999-2002).
Zip Plus 4 Detection of Zip codes.
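
As a rough illustration of how pattern classifiers of this kind work, the sketch below approximates four of the descriptions above with Python regular expressions. All expressions and names are assumptions made for illustration; this section does not give the product's actual patterns, and the predefined classifiers may apply additional validation that is not shown here.

```python
import re

# Hypothetical approximations of four classifier descriptions from the list above.
# These are NOT the product's own expressions.

# Slovak / Ukrainian ID Number (Wide): 2 letters and 6 digits, e.g. "EN470543".
TWO_LETTERS_SIX_DIGITS = re.compile(r"\b[A-Z]{2}\d{6}\b")

# Vietnam CMND Number: 9 digits, optionally delimited, e.g. "331-147-981".
# The hyphen as the only delimiter is an assumption taken from the example.
VIETNAM_CMND = re.compile(r"\b\d{3}-?\d{3}-?\d{3}\b")

# Deep Web URLs: .i2p (Default): begins with http/s and ends with .i2p.
I2P_URL_DEFAULT = re.compile(r"\bhttps?://[\w.-]+\.i2p\b", re.IGNORECASE)

# Year Period: a starting year and an ending year, e.g. "1999-2002".
YEAR_PERIOD = re.compile(r"\b(\d{4})-(\d{4})\b")


def find_year_periods(text):
    """Return year ranges whose start year is not later than the end year."""
    return [
        m.group(0)
        for m in YEAR_PERIOD.finditer(text)
        if int(m.group(1)) <= int(m.group(2))
    ]


def classify(text):
    """Return the names of the sketch patterns that match the text."""
    hits = []
    if TWO_LETTERS_SIX_DIGITS.search(text):
        hits.append("two_letters_six_digits")
    if VIETNAM_CMND.search(text):
        hits.append("vietnam_cmnd")
    if I2P_URL_DEFAULT.search(text):
        hits.append("i2p_url_default")
    if find_year_periods(text):
        hits.append("year_period")
    return hits
```

For instance, classify("see http://hosts.i2p") reports only the .i2p pattern, while the bare string "forum.i2p" is not reported because this sketch requires the http/s prefix, mirroring the difference between the Default and Wide variants described above.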