Configuring content categorization

When a web page is requested, content categorization is performed if:

  • The URL has not already been blocked by the active policy
  • The URL is not in the Forcepoint URL Database
  • The URL has an elevated risk profile, as identified by Forcepoint Security Labs

The category that is determined by content categorization is forwarded to Filtering Service for policy enforcement.

Content categorization can, optionally, include analysis of URL links embedded in the content. Such analysis can provide more accurate categorization of certain types of content. For example, a page that otherwise has little or no undesirable content, but that links to sites known to have undesirable content, can itself be more accurately categorized. Link analysis is particularly good at finding malicious links embedded in hidden parts of a page, and in detecting pages returned by image servers that link thumbnails to undesirable sites. For more information about how analysis of link neighborhoods can improve coverage, read the Forcepoint Security Labs blog post In Bad Company.
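As a rough illustration of the idea behind link analysis, the following Python sketch collects the links embedded in a page, including links inside hidden elements and thumbnail images, and flags the page when enough of them point to hosts on a known-undesirable list. This is not the algorithm used by the product, which is not described here. The host list, the threshold, and the names LinkCollector, linked_hosts, and flagged_by_links are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects href/src targets from every tag, visible or hidden."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def linked_hosts(html):
    """Return the set of distinct hosts that the page links to."""
    collector = LinkCollector()
    collector.feed(html)
    hosts = set()
    for link in collector.links:
        host = urlparse(link).hostname  # None for relative links
        if host:
            hosts.add(host)
    return hosts


def flagged_by_links(html, bad_hosts, threshold=1):
    """True if at least `threshold` distinct linked hosts are in bad_hosts.

    Both bad_hosts and threshold are illustrative assumptions.
    """
    return len(linked_hosts(html) & bad_hosts) >= threshold


# Hypothetical data: a page with innocuous visible content and a hidden bad link.
BAD_HOSTS = {"bad.example"}
page = (
    '<p>Gallery</p>'
    '<div style="display:none"><a href="http://bad.example/x">.</a></div>'
    '<a href="http://ok.example/"><img src="http://ok.example/t.png"></a>'
)
print(flagged_by_links(page, BAD_HOSTS))
```

In this sketch the visible text of the page plays no part in the decision. Only the neighborhood of linked hosts does, which is why a page with little undesirable content of its own can still be flagged.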

The effectiveness of content categorization and link analysis is quantified in several presentation reports. See Presentation reports for more information.

Important: If you plan to generate reports of advanced analysis activity, enable full URL logging (see Configuring how URLs are logged). Otherwise, log records include only the domain (for example, www.domain.com) of the categorized site, even though individual pages within a site may fit into different categories.
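To see why domain-only logging can obscure per-page results, consider the following Python sketch. It is an illustration only and does not reflect the actual log record format. The sample URLs, the category names, and the functions log_key and summarize are hypothetical.

```python
from urllib.parse import urlsplit


def log_key(url, full_url_logging):
    """Return what would identify the request in a log record (assumed model)."""
    if full_url_logging:
        return url
    return urlsplit(url).hostname  # domain only


def summarize(records, full_url_logging):
    """Group the categories seen under each logged key."""
    summary = {}
    for url, category in records:
        summary.setdefault(log_key(url, full_url_logging), set()).add(category)
    return summary


# Hypothetical categorization results for two pages on the same site.
records = [
    ("http://www.domain.com/news/article1", "News"),
    ("http://www.domain.com/games/poker", "Gambling"),
]

print(summarize(records, full_url_logging=False))
print(summarize(records, full_url_logging=True))
```

With domain-only logging, both categories collapse under the single key www.domain.com, so a report cannot show which page received which category. With full URL logging, each page keeps its own record.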

If your site uses WebCatcher to report uncategorized URLs to Forcepoint LLC (see What is WebCatcher?), URLs categorized through content categorization are forwarded for inclusion in the Forcepoint URL Database.

To configure content categorization:

Steps

  1. Go to the Settings > Scanning > Scanning Options page.
  2. Select Off to disable content categorization.
  3. Select On (default) to enable content categorization.
  4. Select Analyze links embedded in Web content to include embedded link analysis in content analysis. Requests that are blocked as a result of link analysis are logged and can be viewed in Scanning Activity presentation reports.
  5. When you are finished, click OK to cache your changes. Changes are not implemented until you click Save and Deploy.

Next steps

The algorithms used to perform content categorization are tuned by Forcepoint Security Labs to provide the best results for most organizations. However, if the Optimized setting does not produce the results you expect, you can adjust the sensitivity level to produce more restrictive or more permissive results. See the Content Gateway advanced analysis options section of this screen.