A RetroSearch Logo

Home - News ( United States | United Kingdom | Italy | Germany ) - Football scores

Search Query:

Showing content from https://docs.snowflake.com/en/user-guide/data-quality-intro below:

Website Navigation


Introduction to data quality and data metric functions

Introduction to data quality and data metric functions

Data quality uses data metric functions (DMFs), which include Snowflake-provided system DMFs and user-defined DMFs, to monitor the state and integrity of your data. You can use DMFs to measure key metrics, such as, but not limited to, freshness and counts that measure duplicates, NULLs, rows, and unique values.

About data quality and DMFs

Data quality focuses on knowing the state and integrity of your data, which includes data freshness and accuracy with respect to true data values compared to null values or blank fields in a column, to make data-driven decisions. You can measure the quality of your data by using DMFs. Snowflake provides built-in system DMFs in the SNOWFLAKE.CORE schema to measure common metrics without having to define them. You can also define your own custom DMFs to fine-tune your data quality measurements more precisely, and these DMFs are stored in the database and schema of your choice.

Whether you use system DMFs, custom DMFs, or both, after you assign a DMF to a table or view, Snowflake records the results of scheduling the DMF in a dedicated event table for data metric functions. You can specify the frequency for how often the DMF is called. For example, you can schedule the DMFs on a particular table to run three times daily. You can modify the frequency as needed based on your own internal data quality requirements. All DMFs that are set on the table follow the same schedule.

After you schedule the DMFs to run, you can configure alerts to notify you when changes to data quality occur. By combining the DMF and alert functionality, you can have consistent threshold notifications for data quality on the tables that you measure. These insights enhance your data governance posture by enabling the following:

The data quality workflow of defining, measuring, and monitoring data can then be applied to additional workloads.

Supported table kinds

You can set a DMF on the following kinds of table objects:

You cannot set a DMF on a hybrid table or a stream object.

Billing and pricing

Snowflake creates and manages the virtual warehouse objects to support this feature. You can use this feature without having to provision or use any additional virtual warehouse resources outside of your normal SQL query workflow.

Calling a DMF uses serverless compute resources as described in more detail in the Snowflake Service Consumption Table. The table lists the credit pricing per cloud region and Snowflake Edition.

The credits that you use are listed in the “Data Quality Monitoring” category on your monthly bill. These credits include compute consumed by all system or user-defined data quality metrics that you use. You are not billed for creating a DMF.

For more information, see Serverless credit usage.

Benefits

Using DMFs enhances your data quality efforts and provides these benefits:

Facilitate compliance:

By knowing the state of your data, it’s easier to demonstrate how you adhere to compliance and regulatory standards. This decreases your risk and enhances your data governance posture.

Meet service-level agreements (SLAs):

Accurate data metrics, such as freshness, help to meet SLAs between data providers, data consumers, and clients.

Credibility:

DMFs provide data validation, which facilitates trustworthy data-driven decision making.

Consistency:

Using system DMFs and repeatedly using proper custom DMFs leads to consistent data quality evaluation over time. This enhances the credibility of your data.

Optimize for specific use cases:

Custom DMFs enable data engineers to design accurate metrics to measure the data, which leads to more accurate optimizations for targeted applications of the data.

Automated measurement:

Calling the DMF is automated after you assign the DMF to a table or view, and specify the schedule for the DMF to run. No additional work is needed to actively measure your data quality. For more information, see Schedule the DMF to run and View results of a data metric function

Performance management:

Snowflake evaluates how you use DMFs and optimizes the query evaluation to provide optimal query performance.

Considerations Limitations

Note the following limitations when using DMFs:

Access control privileges

These sections summarize information on how to manage DMFs:

Additionally, you can grant application roles to manage access to the results of DMFs. For more information, see Managing access to the DMF results

DMF privileges

Snowflake supports the following privileges to use with a DMF.

The USAGE privilege on the parent database and schema are required to perform operations on any object in a schema.

Privilege

Object

Usage

CREATE DATA METRIC FUNCTION

Schema

Enables creating a new DMF in a schema.

USAGE

Data metric function

Enables calling the DMF.

EXECUTE DATA METRIC FUNCTION

Account

Enables using serverless compute resources when calling a DMF.

OWNERSHIP

Data metric function

Transfers ownership of the DMF, which grants full control over the DMF. Required to alter most properties of the DMF.

Additionally, if you’d like to manage the assignment of a DMF on a table or view, use a role with these privileges:

Privilege

Object

Notes

OWNERSHIP

Table or view

OWNERSHIP is a special privilege on an object that is automatically granted to the role that created the object, but can also be transferred using the GRANT OWNERSHIP command to a different role by the owning role (or any role with the MANAGE GRANTS privilege).

USAGE

Data metric function

This privilege is for user-defined DMFs.

For system DMFs, grant the SNOWFLAKE.DATA_METRIC_USER database role to the table owner role.

USAGE

Database, schema

These objects are the database and schema that store the user-defined DMF.

EXECUTE DATA METRIC FUNCTION

Account

This privilege enables you to control which roles have access to serverless compute resources.

DMF command reference

Snowflake supports the following commands to work with DMFs:

Additionally, use the ALTER TABLE and the ALTER VIEW commands to do the following:

For representative examples, see Use data metric functions to perform data quality checks.

When using the GRANT <privileges> … TO ROLE and REVOKE <privileges> … FROM ROLE commands, you can:

You can use the GRANT OWNERSHIP command to transfer ownership to an account role.

The GRANT and REVOKE commands require you to specify the arguments of the user-defined DMF that you create. For example:

GRANT USAGE ON FUNCTION
  governance.dmfs.count_positive_numbers(TABLE(NUMBER, NUMBER, NUMBER))
  TO data_engineer;

Copy

Summary of DMF commands, operations, and privileges

The following table summarizes the relationship between DMF privileges and DDL operations.

The USAGE privilege on the parent database and schema are required to perform operations on any object in a schema.

In addition to the requirements in the table, if you’re using a system DMF, use the ACCOUNTADMIN role or a role that is granted IMPORTED PRIVILEGES on the shared SNOWFLAKE database.

Operation

Privilege required

Use a system DMF.

See System data metric functions.

Create a DMF with the CREATE DATA METRIC FUNCTION command.

A role with the CREATE DATA METRIC FUNCTION privilege on the schema.

Modify a DMF with an ALTER FUNCTION command.

A role with the OWNERSHIP privilege on the DMF.

Add a DMF or drop a DMF from a table or view with an ALTER TABLE or ALTER VIEW command.

A role with the following privileges:

Identify metric references by calling the DATA_METRIC_FUNCTION_REFERENCES function.

Both of the following:

Describe a DMF with a DESCRIBE FUNCTION command.

A role with the USAGE privilege on the DMF.

Drop a DMF from the current or specified schema with a DROP FUNCTION command.

A role with the OWNERSHIP privilege on the DMF.

List data metric functions with a SHOW DATA METRIC FUNCTIONS or SHOW FUNCTIONS command.

A role with the USAGE privilege on the DMF.

Call a DMF manually using SELECT.

A role with the global EXECUTE DATA METRIC FUNCTION privilege or a role with the USAGE privilege on the DMF.

View DMF results.

For more information, see View results of a data metric function.


RetroSearch is an open source project built by @garambo | Open a GitHub Issue

Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo

HTML: 3.2 | Encoding: UTF-8 | Version: 0.7.4