Azure Blob Streaming Configuration

This section describes how to configure an Azure Blob connection with real-time events monitoring and data streaming. To enable Real Time Events Monitoring (Streaming) for an existing Azure Blob connection, complete the following steps.

Prerequisites

Existing Azure Blob connection: An Azure Blob scan configuration must already exist. If you have not created an Azure Blob scan yet, follow the steps in the Azure Blob section to set up a scan and ensure the necessary credentials are in place.

  1. Select an Existing Scan Configuration
    1. Navigate to the Scan configurations page.
    2. Find the existing Azure Blob scan configuration and select Edit Configuration from the options menu.

  2. Enable Data Streaming
    1. Within the Edit Azure Blob Scan Configuration page, check Subscribe to events streaming (DDR).
    2. Copy the Webhook URL provided, as you will use it later in the Azure Portal.

  3. Configure Azure Event Grid Subscription
    1. Navigate to Azure Portal and open your Storage Account.

    2. Select the required account from the list of Storage Accounts.

    3. In the left-hand menu, select Events and click Create Event Subscription from the menu.

    4. In Create Event Subscription Window fill in the details:

      1. Give it a Name.
      2. Select endpoint type Web Hook.
      3. Click Configure an endpoint.
        <figure><img src="../../.gitbook/assets/cab519c5-725f-4f62-a8d4-3bce7eb60737 (1).png" alt=""><figcaption></figcaption></figure>
      4. Paste the Webhook URL copied in step 2 into the Subscriber Endpoint field and confirm the selection.
    5. Go to the Filters tab at the top.

    6. In the Subject Filters section, enter the correct path format for your subscription:
      • Use the following pattern: /blobServices/default/containers/{connectionDetails.ContainerName}/blobs/{connectionDetails.FolderPath}
      • For example, if the container is mycontainer and the folder path is accuracy test/repository1, the path will look like:
        /blobServices/default/containers/mycontainer/blobs/accuracy test/repository1

        Make sure to replace {connectionDetails.ContainerName} and {connectionDetails.FolderPath} with the actual container name and folder path from your scan configuration.

    7. Click Create to complete the Event Subscription setup.
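The subject filter from step 3.6 can also be assembled programmatically, which helps avoid typos in long container and folder paths. A minimal Python sketch, assuming the container name and folder path come from your scan configuration (the function name is illustrative):

```python
def build_subject_filter(container_name: str, folder_path: str = "") -> str:
    """Build the Event Grid subject filter for a blob container.

    Mirrors the pattern
    /blobServices/default/containers/{ContainerName}/blobs/{FolderPath}.
    """
    prefix = f"/blobServices/default/containers/{container_name}/blobs/"
    # Strip stray leading/trailing slashes so the path concatenates cleanly.
    return prefix + folder_path.strip("/")

# The example from step 3.6: container "mycontainer",
# folder path "accuracy test/repository1".
print(build_subject_filter("mycontainer", "accuracy test/repository1"))
# -> /blobServices/default/containers/mycontainer/blobs/accuracy test/repository1
```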
  4. Assign Required Azure Permissions
    Ensure the following permissions are assigned to the Azure Storage Account:
    • EventGrid Data Contributor
    • EventGrid EventSubscription Contributor
    • EventGrid TopicSpaces Publisher

    For details on assigning these roles, refer to the Azure Blob section.

  5. Create Azure Event Hub
    1. In the Azure Portal, navigate to Event Hubs and click Create.

    2. In the Create Namespace window, fill in the details:
      • Give it a Name
      • Select your subscription and resource group
      • Select a location
      • Pricing tier: Standard
      • Throughput Units: 1

    3. Click on Review + Create and then Create after validation.

    4. After the namespace is created, click the + Event Hub button.

    5. In the Create Event Hub window, fill in a name, click Review + Create, and then Create after validation. Save the name of the Event Hub you created in this step, as it will be used later in step 8.9 to replace {eventHubName}.

    6. Configure access policy:
      1. In the Event Hubs namespace window, click Settings/Shared access policies and then the + Add button.

      2. In the new tab, fill in the details: set LogicAppsListenerPolicy as the name, select the Listen permission, and click Save.
      3. Click the newly created policy, then copy and save the Connection string–primary key. It will be needed later in step 8.8.2.
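The Connection string–primary key copied above is a semicolon-separated list of key=value pairs. A small sketch, assuming the standard Event Hubs connection string format, that splits it into parts you may want to verify before pasting it into the Logic App connection (the sample values are illustrative only):

```python
def parse_connection_string(conn_str: str) -> dict:
    """Split an Event Hubs connection string into its key=value parts.

    A typical value looks like:
    Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=...;SharedAccessKey=...
    """
    parts = {}
    for segment in conn_str.strip().strip(";").split(";"):
        # Split on the first "=" only; SharedAccessKey values may contain "=".
        key, _, value = segment.partition("=")
        parts[key] = value
    return parts

# Hypothetical example values for illustration only.
sample = ("Endpoint=sb://mynamespace.servicebus.windows.net/;"
          "SharedAccessKeyName=LogicAppsListenerPolicy;"
          "SharedAccessKey=abc123")
parsed = parse_connection_string(sample)
print(parsed["SharedAccessKeyName"])  # -> LogicAppsListenerPolicy
```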
  6. Configure Azure Storage Diagnostic settings:
    1. Navigate to Azure Portal and open your Storage Account.

    2. Select the required account from the list of Storage Accounts.

    3. In the left-hand menu, select Monitoring/Diagnostic settings and click blob.

    4. In the Diagnostic settings window, click the + Add diagnostic setting button.

  7. In the Create Diagnostic setting window, fill in the details:
    1. Give it a Name
    2. Under Category groups, select allLogs
    3. Under Destination details, select Stream to an event hub and choose the newly created Event Hub Namespace and Event Hub.

    4. Click Save.
  8. Configure Azure Logic Apps
    1. Go to Azure Logic Apps and click the Add button.

    2. In the Create Logic App window, select Workflow Service Plan.
    3. In the Create Logic App (Workflow Service Plan) window, fill in the details:
      1. Select your subscription and resource group
      2. Give the logic app a name
      3. Select a region
      4. The pricing plan should be WS1
      5. In the Monitoring tab, select No for Application Insights

      6. Click the Review + create button.
    4. Click Create after validation.
    5. In the newly created logic app, click Workflows/Workflows and then the + Add button.
    6. In the new workflow tab, fill in a name, select State type: Stateful, and click Create.

    7. In the created workflow, go to Developer/Designer and click Add a trigger, then search for Event Hub and select When events are available in Event Hub.

    8. Configure API connection
      1. Click the trigger, set the Event Hub Name to Temp (a temporary value), and then click Change connection.

      2. Then click Add New and fill in the details. Enter any name for the connection and use the Connection string–primary key saved in step 5.6.3.
      3. On the Change Connection tab, click Details and copy the Name from the connection details. Save this Name, as it will be used later in step 8.9 to replace {connectionName}.

      4. Click Save in the workflow designer window.

    9. In the workflow navigation tab, go to Developer/Code, paste the code provided below, and click Save:
      1. Replace {FolderPath} with the path to the streaming folder. For example, if you want to get events from the folder "StreamingFolder", which is located in the container "DocumentsShare" inside the folder "Personal", the path should be "DocumentsShare/Personal/StreamingFolder".
      2. Replace {WebhookUrl} with the webhook URL provided in the application in the scan configuration window.
      3. Replace {eventHubName} with the Azure Event Hub name created in step 5.5.
      4. Replace {connectionName} with the connection name saved in step 8.8.3.
        {
            "definition": {
                "$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
                "actions": {
                    "Filter_Records": {
                        "type": "Query",
                        "inputs": {
                            "from": "@triggerBody()?['ContentData']?['records']",
                            "where": "@and(not(empty(item()?['uri'])),or(contains(item()?['uri'], '{FolderPath}/'),contains(item()?['uri'], '{FolderPath}?')))"
                        },
                        "runAfter": {}
                    },
                    "Condition": {
                        "type": "If",
                        "expression": "@greater(length(body('Filter_Records')), 0)",
                        "actions": {
                            "HTTP-copy": {
                                "type": "Http",
                                "inputs": {
                                    "uri": "{WebhookUrl}",
                                    "method": "POST",
                                    "headers": {
                                        "Content-Type": "application/json"
                                    },
                                    "body": {
                                        "event": "@setProperty(triggerBody(),'ContentData',setProperty(triggerBody()?['ContentData'],'records',body('Filter_Records')))"
                                    }
                                },
                                "runAfter": {}
                            }
                        },
                        "else": {},
                        "runAfter": {
                            "Filter_Records": [
                                "Succeeded"
                            ]
                        }
                    }
                },
                "contentVersion": "1.0.0.0",
                "outputs": {},
                "triggers": {
                    "When_events_are_available_in_Event_Hub": {
                        "type": "ApiConnection",
                        "inputs": {
                            "host": {
                                "connection": {
                                    "referenceName": "{connectionName}"
                                }
                            },
                            "method": "get",
                            "path": "/@{encodeURIComponent('{eventHubName}')}/events/batch/head",
                            "queries": {
                                "contentType": "application/json",
                                "consumerGroupName": "$Default",
                                "maximumEventsCount": 50
                            }
                        },
                        "recurrence": {
                            "interval": 30,
                            "frequency": "Second"
                        },
                        "splitOn": "@triggerBody()"
                    }
                }
            },
            "kind": "Stateful"
        }
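The placeholder replacement in step 8.9 can also be done with a short script instead of hand-editing the JSON, which makes it easy to confirm that no token was missed. A sketch, assuming the placeholder tokens shown above; the sample values and the small JSON fragment are illustrative, not the full workflow definition:

```python
import json

def fill_workflow_template(template: str, values: dict) -> str:
    """Replace each {Placeholder} token with its real value."""
    for placeholder, value in values.items():
        template = template.replace("{" + placeholder + "}", value)
    return template

# A fragment of the workflow definition above, for illustration.
fragment = json.dumps({
    "uri": "{WebhookUrl}",
    "path": "/@{encodeURIComponent('{eventHubName}')}/events/batch/head",
    "referenceName": "{connectionName}",
})

# Illustrative values; substitute the real ones from your configuration.
filled = fill_workflow_template(fragment, {
    "WebhookUrl": "https://example.com/webhook",   # from step 2
    "eventHubName": "my-event-hub",                # from step 5.5
    "connectionName": "eventhubs-1",               # from step 8.8.3
})
print(json.loads(filled)["path"])
# -> /@{encodeURIComponent('my-event-hub')}/events/batch/head
```

Note that only the exact `{Placeholder}` tokens are replaced; the Logic App expressions such as `@{encodeURIComponent(...)}` are left untouched.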
        

Troubleshooting

If you experience any issues with the configuration, ensure that:

  1. The Webhook URL is correct and matches the configuration in Azure.
  2. The required Azure permissions are correctly assigned.
  3. Steps 8.8 and 8.9 were executed properly and all placeholders were replaced with real values.
  4. You can also check whether a trigger failed by navigating to the Logic App configured in the previous steps, opening the workflow, and reviewing the Trigger History. If you see any failed triggers, inspect the error details to identify the issue.

Next Steps

After configuring the event subscription:

  • Documents may be uploaded to the configured path.
  • The events triggered by these uploads will be processed by the Data Streaming setup, and the results will appear in the dashboard.