Showing content from https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blob-inventory-report-analytics below:

Tutorial: Analyze blob inventory reports - Azure Storage

By understanding how your blobs and containers are stored, organized, and used in production, you can better optimize the tradeoffs between cost and performance.

This tutorial shows you how to generate and visualize statistics such as data growth over time, data added over time, number of files modified, blob snapshot sizes, access patterns over each tier, and how data is distributed both currently and over time (for example, across tiers, file types, containers, and blob types).

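In this tutorial, you learn how to generate an inventory report, set up a Synapse workspace, set up Synapse Studio, run a sample notebook that generates analytic data, and visualize the results in Power BI.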

Prerequisites

Generate an inventory report

Enable blob inventory reports for your storage account. See Enable Azure Storage blob inventory reports.

You might have to wait up to 24 hours after enabling inventory reports for your first report to be generated.
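
For orientation, an inventory policy is a JSON document that contains one or more rules. The following is a minimal sketch that builds such a document in Python. The rule name and destination container are hypothetical placeholders, and the selected fields and the Parquet default are assumptions made for illustration; check the exact schema against the article on enabling inventory reports before you use it.

```python
import json


def build_inventory_policy(rule_name, destination_container,
                           fmt="Parquet", schedule="Daily"):
    """Return a blob inventory policy as a plain dict (illustrative sketch)."""
    if fmt not in ("Csv", "Parquet"):
        raise ValueError(f"unsupported format: {fmt}")
    if schedule not in ("Daily", "Weekly"):
        raise ValueError(f"unsupported schedule: {schedule}")
    return {
        "enabled": True,
        "type": "Inventory",
        "rules": [
            {
                "enabled": True,
                "name": rule_name,                     # hypothetical rule name
                "destination": destination_container,  # container that receives reports
                "definition": {
                    "objectType": "Blob",
                    "format": fmt,
                    "schedule": schedule,
                    "filters": {"blobTypes": ["blockBlob"]},
                    # Assumed field selection; adjust to the statistics you need.
                    "schemaFields": [
                        "Name",
                        "Creation-Time",
                        "Last-Modified",
                        "Content-Length",
                        "BlobType",
                        "AccessTier",
                    ],
                },
            }
        ],
    }


def policy_to_json(policy):
    """Serialize the policy so it can be saved or pasted into a policy editor."""
    return json.dumps(policy, indent=2)
```

Calling `policy_to_json(build_inventory_policy("daily-rule", "inventory-reports"))` would produce a JSON string that you could compare against the policy you configure in the portal.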

Set up a Synapse workspace
  1. Create an Azure Synapse workspace. See Create an Azure Synapse workspace.

    Note

    As part of creating the workspace, you'll create a storage account that has a hierarchical namespace. Azure Synapse stores Spark tables and application logs in this account and refers to it as the primary storage account. To avoid confusion, this article uses the term inventory report account to refer to the account that contains the inventory reports.

  2. In the Synapse workspace, assign the Contributor role to your user identity. See Azure RBAC: Owner role for the workspace.

  3. Give the Synapse workspace permission to access the inventory reports in your storage account by navigating to your inventory report account, and then assigning the Storage Blob Data Contributor role to the system managed identity of the workspace. See Assign Azure roles using the Azure portal.

  4. Navigate to the primary storage account and assign the Storage Blob Data Contributor role to your user identity.
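
The steps above use the Azure portal. If you prefer to script the role assignments, the following is a minimal sketch that assembles the equivalent Azure CLI command as an argument list. The subscription ID, resource group, account name, and object ID are hypothetical placeholders, and the sketch only builds the command; it doesn't run it.

```python
def storage_account_scope(subscription_id, resource_group, account_name):
    """Build the Azure resource ID that scopes a role to one storage account."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account_name}"
    )


def role_assignment_command(assignee_object_id, role, scope):
    """Return the argv for `az role assignment create` (not executed here)."""
    return [
        "az", "role", "assignment", "create",
        "--assignee", assignee_object_id,  # user or managed identity object ID
        "--role", role,                    # built-in role name
        "--scope", scope,                  # resource ID from storage_account_scope
    ]
```

You could pass the returned list to `subprocess.run` after signing in with `az login`. Building the scope in a separate function makes it easier to reuse the same command for both the inventory report account and the primary storage account.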

Set up Synapse Studio
  1. Open your Synapse workspace in Synapse Studio. See Open Synapse Studio.

  2. In Synapse Studio, make sure that your identity is assigned the role of Synapse Administrator. See Synapse RBAC: Synapse Administrator role for the workspace.

  3. Create an Apache Spark pool. See Create a serverless Apache Spark pool.

Set up and run the sample notebook

In this section, you'll generate statistical data that you'll visualize in a report. To simplify this tutorial, this section uses a sample configuration file and a sample PySpark notebook. The notebook contains a collection of queries that execute in Azure Synapse Studio.

Modify and upload the sample configuration file
  1. Download the BlobInventoryStorageAccountConfiguration.json file.

  2. Update the placeholders in that file.

  3. Upload this file to the container in your primary storage account that you specified when you created the Synapse workspace.
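
Before you upload the file, it can help to confirm that no placeholder was left unedited. The following is a minimal sketch that assumes placeholders are string values wrapped in angle brackets (for example, `<some-value>`); that convention and the key names in the usage note are assumptions, not taken from the sample file.

```python
import json
import re

# Assumed placeholder convention: the whole value is wrapped in angle brackets.
PLACEHOLDER = re.compile(r"<[^<>]*>")


def find_unfilled(value, path=""):
    """Recursively collect the paths of string values that still look like placeholders."""
    hits = []
    if isinstance(value, dict):
        for key, item in value.items():
            hits += find_unfilled(item, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        for index, item in enumerate(value):
            hits += find_unfilled(item, f"{path}[{index}]")
    elif isinstance(value, str) and PLACEHOLDER.fullmatch(value.strip()):
        hits.append(path)
    return hits


def check_config_file(file_path):
    """Load a JSON configuration file and return the paths that still need values."""
    with open(file_path, encoding="utf-8") as handle:
        return find_unfilled(json.load(handle))
```

An empty list from `check_config_file("BlobInventoryStorageAccountConfiguration.json")` would suggest that every placeholder matching the assumed pattern has been replaced.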

Import the sample PySpark notebook
  1. Download the ReportAnalysis.ipynb sample notebook.

    Note

    Make sure to save this file with the .ipynb extension.

  2. Open your Synapse workspace in Synapse Studio. See Open Synapse Studio.

  3. In Synapse Studio, select the Develop tab.

  4. Select the plus sign (+) to add an item.

  5. Select Import, browse to the sample file that you downloaded, select that file, and select Open.

    The Properties dialog box appears.

  6. In the Properties dialog box, select the Configure session link.

    The Configure session dialog box opens.

  7. In the Attach to drop-down list of the Configure session dialog box, select the Spark pool that you created earlier in this article. Then, select the Apply button.

Modify the Python notebook
  1. In the first cell of the Python notebook, set the value of the storage_account variable to the name of the primary storage account.

  2. Update the value of the container_name variable to the name of the container in that account that you specified when you created the Synapse workspace.

  3. Select the Publish button.
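
To show how these two values typically come together, the following minimal sketch builds the `abfss://` URI that Spark uses to address a path in a storage account with a hierarchical namespace. Whether the sample notebook builds its paths exactly this way is an assumption, and the endpoint suffix shown applies to the Azure public cloud.

```python
def abfss_uri(storage_account, container_name, path=""):
    """Return the ABFS URI for a path inside a container of a hierarchical-namespace account."""
    base = f"abfss://{container_name}@{storage_account}.dfs.core.windows.net"
    # Normalize so that the result has exactly one slash between host and path.
    return f"{base}/{path.lstrip('/')}"
```

For example, `abfss_uri(storage_account, container_name, "BlobInventoryStorageAccountConfiguration.json")` gives the location of the uploaded configuration file, which is why both variables must match the primary storage account and its container.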

Run the PySpark notebook
  1. In the PySpark notebook, select Run all.

    It will take a few minutes to start the Spark session and another few minutes to process the inventory reports. The first run could take a while if there are numerous inventory reports to process. Subsequent runs will only process the new inventory reports created since the last run.

    Note

    If you make any changes to the notebook while the notebook is running, make sure to publish those changes by using the Publish button.

  2. Verify that the notebook ran successfully by selecting the Data tab.

    A database named reportdata should appear in the Workspace tab of the Data pane. If this database doesn't appear, then you might have to refresh the web page.

    The database contains a set of tables. Each table contains information obtained by running the queries from the PySpark notebook.

  3. To examine the contents of a table, expand the Tables folder of the reportdata database. Then, right-click a table, select New SQL script, and then select Select TOP 100 rows.

  4. You can modify the query as needed and then select Run to view the results.
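
As noted in step 1, later runs process only the inventory reports created since the last run. This article doesn't describe how the notebook tracks that state, so the following is a sketch of one common pattern rather than the notebook's actual logic: keep a record of processed report paths and skip them on the next run. The report paths and the `process` callback are hypothetical.

```python
def select_new_reports(available, processed):
    """Return the report paths that haven't been processed yet, in sorted order."""
    seen = set(processed)
    return sorted(path for path in available if path not in seen)


def run_once(available, processed, process):
    """Process each new report and record it only after processing succeeds."""
    new_reports = select_new_reports(available, processed)
    for path in new_reports:
        process(path)           # hypothetical callback that loads one report
        processed.append(path)  # remember it so the next run skips it
    return new_reports
```

Recording a path only after `process` returns means that a report whose processing fails is retried on the next run instead of being silently skipped.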

Visualize the data
  1. Download the ReportAnalysis.pbit sample report file.

  2. Open Power BI Desktop. For installation guidance, see Get Power BI Desktop.

  3. In Power BI, select File, Open report, and then Browse reports.

  4. In the Open dialog box, change the file type to Power BI template files (*.pbit).

  5. Browse to the location of the ReportAnalysis.pbit file that you downloaded, and then select Open.

    A dialog box appears that asks you to provide the name of the Synapse workspace and the database name.

  6. In the dialog box, set the synapse_workspace_name field to the workspace name and set the database_name field to reportdata. Then, select the Load button.

    A report appears which provides visualizations of the data retrieved by the notebook. The following images show the types of the charts and graphs that appear in this report.
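
To make concrete the kind of aggregate that sits behind a chart such as data distribution across tiers, the following is a minimal sketch in plain Python. It isn't one of the notebook's queries; it assumes each inventory row is a dict that carries the `AccessTier` and `Content-Length` inventory fields, and any rows you pass in are your own sample data.

```python
from collections import defaultdict


def bytes_by_tier(rows):
    """Sum blob sizes per access tier; rows without a tier are grouped as 'Unknown'."""
    totals = defaultdict(int)
    for row in rows:
        tier = row.get("AccessTier") or "Unknown"
        totals[tier] += int(row.get("Content-Length", 0))
    return dict(totals)


def share_by_tier(rows):
    """Return each tier's fraction of the total stored bytes."""
    totals = bytes_by_tier(rows)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {tier: 0.0 for tier in totals}
    return {tier: size / grand_total for tier, size in totals.items()}
```

The same grouping idea extends to the other breakdowns mentioned in this tutorial, such as totals per container or per blob type, by changing the key used for grouping.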

Next steps
