note
Databricks enables predictive optimization by default for all accounts created on or after November 11, 2024. Starting May 7, 2025, Databricks will enable predictive optimization by default for all existing Databricks accounts. This will roll out gradually based on your region and will be completed by mid-October, 2025. You can check whether predictive optimization is enabled for your account.
Predictive optimization removes the need to manually manage maintenance operations for Unity Catalog managed tables on Databricks.
With predictive optimization enabled, Databricks automatically does the following:
Maintenance operations run only when needed, which eliminates both unnecessary maintenance runs and the burden of tracking and troubleshooting performance.
Databricks recommends using predictive optimization for all Unity Catalog managed tables. For example, automatic liquid clustering intelligently optimizes data layout based on your data usage patterns. See Use liquid clustering for tables.
What operations does predictive optimization run?
Predictive optimization runs the following operations automatically for enabled tables:
(1) OPTIMIZE does not run ZORDER when executed with predictive optimization. On tables that use Z-order, predictive optimization ignores the Z-ordered files.
If automatic liquid clustering is enabled, predictive optimization might select new clustering keys before clustering data. See Automatic liquid clustering.
warning
The retention window for the VACUUM command is determined by the delta.deletedFileRetentionDuration table property, which defaults to 7 days. This means VACUUM removes data files that are no longer referenced by a Delta table version in the last 7 days. If you'd like to retain data for longer (such as to support time travel for longer durations), you must set this table property appropriately before you enable predictive optimization, as in the following example:
SQL
ALTER TABLE table_name SET TBLPROPERTIES ('delta.deletedFileRetentionDuration' = '30 days');
If you configure delta.deletedFileRetentionDuration below the default of 7 days, predictive optimization runs VACUUM with a retention duration of 7 days.
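Before enabling predictive optimization, it can help to confirm what retention a table currently has. The following is a minimal sketch that assumes a hypothetical table named main.sales.orders; substitute your own table name. If the property was never set explicitly, the output may not list a value, in which case the 7-day default described above applies.

```sql
-- main.sales.orders is a hypothetical table name used for illustration.
-- Shows the current value of the retention property, if one is set on the table.
SHOW TBLPROPERTIES main.sales.orders ('delta.deletedFileRetentionDuration');
```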
Predictive optimization identifies tables that would benefit from ANALYZE, OPTIMIZE, and VACUUM operations and queues them to run using serverless compute for jobs. Your account is billed for compute associated with these workloads using a serverless jobs SKU.
See pricing for Databricks managed services. See Use system tables to track predictive optimization.
Prerequisites for predictive optimization
You must fulfill the following requirements to enable predictive optimization:
You can enable predictive optimization for an account, a catalog, or a schema. All Unity Catalog managed tables inherit the account value by default. You can override the account default for a catalog or schema to enable or disable predictive optimization at that level.
note
If your account was created on or after November 11, 2024, predictive optimization is enabled by default. Starting May 7, 2025, predictive optimization is enabled by default for all existing accounts. This will roll out gradually based on your region and will be completed by mid-October, 2025.
You must have the following privileges to enable or disable predictive optimization at the specified level:
Enable or disable predictive optimization for your account
An account admin can complete the following steps to enable predictive optimization for all metastores in an account. Objects in the account inherit this setting by default, but the setting can be overridden at the catalog or schema level:
note
Predictive optimization uses an inheritance model. When enabled for a catalog, schemas inherit the property. Tables within an enabled schema inherit predictive optimization. To override this inheritance behavior, you can explicitly enable or disable predictive optimization for a catalog or schema.
note
You can disable predictive optimization at the catalog or schema level before enabling it at the account level. If predictive optimization is later enabled on the account, it is blocked for tables in these objects.
Use the following syntax to enable or disable predictive optimization, or to return to the default of inheriting from the parent object:
SQL
ALTER CATALOG [catalog_name] { ENABLE | DISABLE | INHERIT } PREDICTIVE OPTIMIZATION;
ALTER { SCHEMA | DATABASE } schema_name { ENABLE | DISABLE | INHERIT } PREDICTIVE OPTIMIZATION;
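As an illustration of the inheritance model, the statements below apply the syntax above to concrete objects. The catalog name main and the schema names main.staging and main.sales are hypothetical placeholders, not names from this documentation.

```sql
-- Hypothetical object names; replace with your own catalog and schemas.

-- Enable predictive optimization for every schema and table in the catalog.
ALTER CATALOG main ENABLE PREDICTIVE OPTIMIZATION;

-- Override the catalog setting for one schema, for example a scratch area.
ALTER SCHEMA main.staging DISABLE PREDICTIVE OPTIMIZATION;

-- Remove an explicit setting so the schema follows its parent catalog again.
ALTER SCHEMA main.sales INHERIT PREDICTIVE OPTIMIZATION;
```

With these three statements, tables in main.staging are excluded, while tables in main.sales and in any other schema of main follow the catalog-level setting.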
Check whether predictive optimization is enabled
The Predictive Optimization field is a Unity Catalog property that indicates whether predictive optimization is enabled. If predictive optimization is inherited from a parent object, this is indicated in the field value.
Use the following syntax to see if predictive optimization is enabled:
SQL
DESCRIBE (CATALOG | SCHEMA | TABLE) EXTENDED name
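For instance, the check can be run at each level of the hierarchy. This sketch assumes a hypothetical catalog main, schema main.sales, and table main.sales.orders; in each result, look for the Predictive Optimization field.

```sql
-- Hypothetical object names; replace with your own.
DESCRIBE CATALOG EXTENDED main;
DESCRIBE SCHEMA EXTENDED main.sales;
DESCRIBE TABLE EXTENDED main.sales.orders;
```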
Use system tables to track predictive optimization
Databricks provides the system table system.storage.predictive_optimization_operations_history for observability into predictive optimization operations, costs, and impact. See Predictive optimization system table reference.
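A query along the following lines can summarize recent activity per table and operation. This is a sketch only: the column names used here (catalog_name, schema_name, table_name, operation_type, usage_quantity, start_time) are assumptions about the table's schema and should be checked against the system table reference before use.

```sql
-- Column names below are assumed; verify them in the system table reference.
-- Summarizes predictive optimization runs and usage over the last 30 days.
SELECT
  catalog_name,
  schema_name,
  table_name,
  operation_type,
  COUNT(*) AS runs,
  SUM(usage_quantity) AS total_usage
FROM system.storage.predictive_optimization_operations_history
WHERE start_time >= current_timestamp() - INTERVAL 30 DAYS
GROUP BY catalog_name, schema_name, table_name, operation_type
ORDER BY total_usage DESC;
```

If any of the assumed columns is missing, running SELECT * with a small LIMIT against the same table is a safe way to inspect the actual schema first.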
Predictive optimization is not available in all regions. See Features with limited regional availability.
For tables with a deleted file retention duration (delta.deletedFileRetentionDuration) configured below the default of 7 days, predictive optimization runs VACUUM with a retention duration of 7 days. See Configure data retention for time travel queries.
Predictive optimization does not perform maintenance operations on the following tables: