Setting up the Databricks Workspace Datasource Connector for Unstructured Data

Prerequisites

  • The datasource must be reachable from the network where the product is deployed (for example, via a private DNS name or a routable IP address).
  • If a firewall is in place, ensure the port used to reach the datasource is open (Databricks workspaces are typically served over HTTPS on port 443).
  • Create a user account with read access to the datasource metadata. The product uses this account to run the scanning process.
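
The reachability and firewall prerequisites can be checked before configuring the connector. The following is a minimal sketch, assuming the workspace is served over HTTPS on port 443 and that the check is run from the host where the product is deployed. The function names parse_host_port and is_reachable and the example hostname are hypothetical and not part of the product.

```python
import socket
from urllib.parse import urlparse


def parse_host_port(workspace_url: str, default_port: int = 443) -> tuple[str, int]:
    """Extract (hostname, port) from a workspace URL or a bare hostname."""
    # urlparse only recognises the host part when a scheme is present.
    if "://" not in workspace_url:
        workspace_url = "https://" + workspace_url
    parsed = urlparse(workspace_url)
    if not parsed.hostname:
        raise ValueError(f"no hostname found in {workspace_url!r}")
    # Assumption: HTTPS on 443 unless the URL names another port.
    return parsed.hostname, parsed.port or default_port


def is_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical workspace URL; replace with your own.
    host, port = parse_host_port("https://example.cloud.databricks.com")
    print(f"{host}:{port} reachable: {is_reachable(host, port)}")
```

A False result only shows that a TCP connection could not be opened. It does not distinguish between a DNS problem, a routing problem, and a blocked port.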

Creating Database Credentials

  1. Navigate to Administration > Data Sources and choose Databricks Workspace from the DATABRICKS provider.

  2. Click NEW CREDENTIALS to create a new credential for Databricks Workspace.

  3. Provide a Credentials name.

  4. Provide the Host information. To find the Host (the Databricks workspace URL), follow these steps:
    1. Log in to your Databricks account, select Workspaces from the left navigation menu, and click the workspace you want to connect.

    2. In the Configuration tab, locate the URL field and copy the complete workspace (Host) URL.

    3. Paste the copied workspace (Host) URL into the Host field.

  5. Provide the Access Token. To generate the Access Token, follow these steps:
    1. In the Databricks workspace, click your user profile icon in the top-right corner and go to Settings.

    2. Under User settings, click Developer, then click Manage to open the Access tokens screen.

    3. Click Generate new token.

      1. Enter a comment to identify the token.
      2. Specify the token Lifetime (days) as required.
      3. Under Scope, select Other APIs.
      4. Under API scope(s), add the following scopes: files, scim, access-management, and workspace.
      5. Click Generate.

    4. Copy the generated token immediately.

      Note: The token will not be visible again after this step.

  6. Paste the copied token into the Access Token field and click SAVE & CLOSE.
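
Before saving, the Host and Access Token pair can be sanity-checked outside the product. The following is a minimal sketch, assuming the token is accepted as a Bearer token by the Databricks SCIM "Me" endpoint (/api/2.0/preview/scim/v2/Me) and that the scim scope selected above is sufficient for that call. The function names and the example host and token values are hypothetical.

```python
import json
import urllib.request


def normalize_host(host: str) -> str:
    """Return the workspace URL with an https:// scheme and no trailing slash."""
    host = host.strip().rstrip("/")
    if not host.startswith(("http://", "https://")):
        host = "https://" + host
    return host


def build_me_request(host: str, token: str) -> urllib.request.Request:
    """Build a GET request for the SCIM 'Me' endpoint with Bearer authentication."""
    url = normalize_host(host) + "/api/2.0/preview/scim/v2/Me"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def check_token(host: str, token: str, timeout: float = 10.0):
    """Return the userName the token authenticates as, if the response has one.

    urllib raises HTTPError for 4xx/5xx responses, for example when the
    token is invalid, expired, or lacks the required scope.
    """
    with urllib.request.urlopen(build_me_request(host, token), timeout=timeout) as resp:
        # Assumption: the response is a SCIM user resource with a userName field.
        return json.load(resp).get("userName")


if __name__ == "__main__":
    # Hypothetical values; use the Host URL and token copied in the steps above.
    print(check_token("https://example.cloud.databricks.com", "EXAMPLE_TOKEN"))
```

An HTTP 401 or 403 error from this call suggests the token or its scopes should be rechecked before the credentials are saved.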