Databricks Runtime 16.4 LTS

The following release notes provide information about Databricks Runtime 16.4 LTS, powered by Apache Spark 3.5.2.

Databricks released this LTS version in May 2025. This release has two variants: one supporting Scala 2.12 and one supporting Scala 2.13.

Starting with DBR 17 (Spark 4), only Scala 2.13 will be supported. To help you transition, two images are available in 16.4 LTS: one with Scala 2.12 and one with Scala 2.13. Use the 2.13 image to test and update your code base to Scala 2.13 before migrating to DBR 17.

Scala 2.13 migration guidance​

If your code uses the sparklyr R library, you must use the image that supports Scala 2.12 instead.

Breaking changes between Scala 2.12 and 2.13

Major breaking changes

Databricks Runtime considers a breaking change major when it requires you to make significant code changes to support it.

Minor breaking changes​

Databricks Runtime considers a breaking change minor when no version-specific error messages are raised, but your code generally fails to compile under the new version. In this case, the compiler error messages usually provide enough information to update your code.

For the library versions supported by each Scala version of Databricks Runtime 16.4 LTS, see the installed library sections later in these release notes.

New features and improvements

Dashboards, alerts, and queries are supported as workspace files

Dashboards, alerts, and queries are now supported as workspace files. You can now programmatically interact with these Databricks objects from anywhere the workspace filesystem is available, including writing, reading, and deleting them like any other file. To learn more, see What are workspace files? and Programmatically interact with workspace files.
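
As an illustration, here is a minimal sketch of reading a saved query as a workspace file from compute, where workspace files are exposed under the /Workspace path. The path and file name are hypothetical.

# Hypothetical workspace path; adjust to a query, alert, or dashboard you own.
path = "/Workspace/Users/someone@example.com/queries/daily_report"

# Workspace files can be read and written like any other file from compute.
with open(path, "r") as f:
    print(f.read())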

Liquid clustering auto-compaction improvement​

Unity Catalog-managed liquid clustering tables now trigger auto-compaction to automatically reduce small file problems between OPTIMIZE runs.

For more details, see Auto compaction for Delta Lake on Databricks.

Auto Loader can now clean processed files in the source directory​

Customers can now instruct Auto Loader to automatically move or delete files that have been processed. Opt in to this feature by using the cloudFiles.cleanSource Auto Loader option.

For more details, see Auto Loader options under cloudFiles.cleanSource.
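
A minimal sketch of opting in follows. The paths and source format are hypothetical, and the MOVE value plus the cloudFiles.cleanSource.moveDestination option are assumptions to verify against the Auto Loader options page.

# Hypothetical paths; confirm cleanSource values and companion options in the
# Auto Loader options reference.
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.cleanSource", "MOVE")  # or "DELETE"
    .option("cloudFiles.cleanSource.moveDestination", "s3://my-bucket/archive/")
    .load("s3://my-bucket/landing/")
)

(
    df.writeStream
    .option("checkpointLocation", "s3://my-bucket/_checkpoints/landing")
    .toTable("my_catalog.my_schema.bronze_events")
)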

Type widening support added for streaming from Delta tables​

This release adds support for streaming from a Delta table that has type-widened column data, and for sharing a Delta table with type widening enabled using Databricks-to-Databricks Delta Sharing. The type widening feature is currently in Public Preview.

For more details, see Type widening.

IDENTIFIER support now available in DBSQL for catalog operations​

Databricks customers can now use the IDENTIFIER clause when performing catalog operations such as CREATE CATALOG.

This new syntax allows customers to dynamically specify catalog names using parameters defined for these operations, enabling more flexible and reusable SQL workflows. As an example of the syntax, consider CREATE CATALOG IDENTIFIER(:param) where param is a parameter provided to specify a catalog name.

For more details, see IDENTIFIER clause.
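
A minimal sketch of the CREATE CATALOG example above, passing the catalog name as a named parameter through spark.sql; the catalog name is hypothetical.

# IDENTIFIER resolves the parameter into a catalog name at runtime.
spark.sql(
    "CREATE CATALOG IF NOT EXISTS IDENTIFIER(:param)",
    args={"param": "dev_catalog"},  # hypothetical catalog name
)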

Collated expressions now provide autogenerated transient aliases​

Autogenerated aliases for collated expressions will now always deterministically incorporate COLLATE information. Autogenerated aliases are transient (unstable) and should not be relied on. Instead, as a best practice, use expression AS alias consistently and explicitly.
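
A short sketch of the recommended pattern; the table, column, and collation names are hypothetical.

# Rely on an explicit alias rather than the autogenerated (transient) one.
spark.sql("""
    SELECT city COLLATE UNICODE_CI AS city_ci
    FROM my_catalog.my_schema.cities
""").show()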

Add filter pushdown API support to Python data sources​

In Databricks Runtime 16.4 LTS, Databricks added filter pushdown support to Python data source batch reads, with an API similar to the SupportsPushDownFilters interface. You can now implement DataSourceReader.pushFilters to receive filters that may be pushed down, select which filters to push down, track them, and return the remaining filters for Apache Spark to apply.

Filter pushdown allows the data source to handle a subset of filters. This can improve performance by reducing the amount of data that needs to be processed by Spark.

Filter pushdown is only supported for batch reads, not for streaming reads. The new API must be added to DataSourceReader, not to DataSource or DataSourceStreamReader. Your implementation must interpret the list of filters as a logical AND of its elements. This method is called once during query planning. The default implementation returns all filters, indicating that no filters can be pushed down. Override this method in a subclass to implement your filter pushdown logic.

Initially, to keep the API simple, Databricks supports only V1 filters that have a column, a boolean operator, and a literal value. Filter serialization is a placeholder that will be implemented in a future release. For example:

Python

from abc import ABC
from typing import Iterable, List

class DataSourceReader(ABC):
    def pushFilters(self, filters: List["Filter"]) -> Iterable["Filter"]:
        # Default: push nothing down; return every filter for Spark to apply.
        return filters

Databricks recommends you implement this method only for data sources that natively support filtering, such as databases and GraphQL APIs.
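
As a sketch of how a subclass might use this hook, the reader below keeps simple equality filters for native evaluation and hands everything else back to Spark. The EqualTo filter name and the shape of the filter objects are assumptions based on the V1-style filters described above; check the Python data source API reference for the exact classes available in your runtime.

from typing import Iterable, List

from pyspark.sql.datasource import DataSourceReader


class MyApiReader(DataSourceReader):
    def __init__(self, options):
        self.options = options
        self.pushed = []  # filters this source will evaluate natively

    def pushFilters(self, filters: List["Filter"]) -> Iterable["Filter"]:
        for f in filters:
            # Hypothetical check: keep simple equality filters, return the rest
            # so Spark applies them after the scan.
            if type(f).__name__ == "EqualTo":
                self.pushed.append(f)
            else:
                yield f

    def read(self, partition):
        # Translate self.pushed into the native query here (not shown).
        ...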

Python UDF traceback improvement​

The Python UDF traceback now includes frames from both the driver and executor, along with client frames, resulting in better error messages with more relevant detail, such as the line content of frames inside a UDF.

UNION/EXCEPT/INTERSECT inside a view and EXECUTE IMMEDIATE now return correct results​

Queries for temporary and persistent view definitions with top-level UNION/EXCEPT/INTERSECT and un-aliased columns previously returned incorrect results because UNION/EXCEPT/INTERSECT keywords were considered aliases. Now those queries will correctly perform the whole set operation.

EXECUTE IMMEDIATE ... INTO with a top-level UNION/EXCEPT/INTERSECT and un-aliased columns also wrote an incorrect set-operation result into the specified variable because the parser interpreted these keywords as aliases. Similarly, SQL statements with invalid trailing text were previously accepted. Set operations in these cases now write the correct result into the specified variable, or fail for invalid SQL text.
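
A small sketch of the kind of statement that was affected; previously the set-operation keyword could be parsed as a column alias, truncating the result.

# A top-level UNION with un-aliased columns inside a view definition.
spark.sql("CREATE OR REPLACE TEMP VIEW v AS SELECT 1 UNION SELECT 2")

# Now returns both rows; previously the UNION could be misparsed as an alias.
spark.sql("SELECT * FROM v").show()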

Data source cached plan conf and migration guide​

Since DBR 16.4.0, reading from a file source table will correctly respect query options, e.g. delimiters. Previously, the first query plan was cached and subsequent option changes ignored. To restore the previous behavior, set spark.sql.legacy.readFileSourceTableCacheIgnoreOptions to true.
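
To opt back into the old behavior for a session, set the legacy flag named above:

# Ignore per-query options and reuse the first cached plan (pre-16.4 behavior).
spark.conf.set("spark.sql.legacy.readFileSourceTableCacheIgnoreOptions", "true")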

New listagg and string_agg functions​

Starting with this release you can use the listagg or string_agg functions to aggregate STRING and BINARY values within a group. See string_agg for more details.
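
A minimal sketch; the table and column names are hypothetical.

# Concatenate one string per group; listagg is an equivalent spelling.
spark.sql("""
    SELECT dept,
           string_agg(name, ', ') AS names
    FROM my_catalog.my_schema.employees
    GROUP BY dept
""").show()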

Support for MERGE INTO to tables with fine-grained access control on dedicated compute is now generally available (GA)​

In Databricks Runtime 16.3 and above, dedicated compute supports MERGE INTO to Unity Catalog tables that use fine-grained access control. This feature is now generally available.

See Fine-grained access control on dedicated compute.
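
A generic MERGE INTO sketch of the kind now supported on dedicated compute against tables with fine-grained access control; the catalog, schema, and table names are hypothetical.

spark.sql("""
    MERGE INTO my_catalog.sales.customers AS t
    USING customer_updates AS s
    ON t.customer_id = s.customer_id
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")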

DBR 16.4 LTS behavioral changes

Delta tables: DESCRIBE DETAIL will now show the clusterByAuto status of the table

DESCRIBE DETAIL {table} now shows the clusterByAuto status of the table (true or false) next to the current clustering columns. For more details on clusterByAuto, see Automatic liquid clustering.
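
A quick way to inspect the new field; the table name is hypothetical, and the clusteringColumns field name is an assumption based on existing DESCRIBE DETAIL output.

spark.sql("DESCRIBE DETAIL my_catalog.my_schema.events") \
    .select("clusteringColumns", "clusterByAuto") \
    .show(truncate=False)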

Fix to respect options for data source cached plans​

This update ensures table reads respect options set for all data source plans when cached, not just the first cached table read.

Previously, data source table reads cached the first plan but failed to account for different options in subsequent queries.

For example, the following query:

spark.sql("CREATE TABLE t(a string, b string) USING CSV".stripMargin)
spark.sql("INSERT INTO TABLE t VALUES ('a;b', 'c')")

spark.sql("SELECT * FROM t").show()
spark.sql("SELECT * FROM t WITH ('delimiter' = ';')")

would produce this output:

+----+----+
|   a|   b|
+----+----+
| a;b|   c|
+----+----+

+----+----+
|   a|   b|
+----+----+
| a;b|   c|
+----+----+

With this fix, it now returns the expected output:

+----+----+
|   a|   b|
+----+----+
| a;b|   c|
+----+----+

+----+----+
|   a|   b|
+----+----+
|   a| b,c|
+----+----+

If your workloads have dependencies on the previous incorrect behavior, you may see different results after this change.

Moved redaction rule from analyzer to optimizer​

The redaction rule has moved from the analyzer to the optimizer. Previously, saving a DataFrame that used valid SECRET SQL functions could create tables containing redacted values; with this change, redaction is no longer applied when saving DataFrames that have valid secret access to a table.

variant_get and get_json_object now consider leading spaces in paths in Apache Spark​

Prior to this change, leading whitespace and tabs in paths passed to the variant_get and get_json_object expressions were ignored when Photon was disabled. For example, select get_json_object('{" key": "value"}', "$[' key']") did not extract the value of the " key" field. With this change, such keys can now be extracted.

Enable flag to disallow disabling source materialization for MERGE operations​

Previously, users could disable source materialization in MERGE by setting merge.materializeSource to none. With the new flag enabled, this will be forbidden and cause an error. Databricks plans to enable the flag only for customers who haven't used this configuration flag before, so no customer should notice any change in behavior.

Move partition metadata log enablement anchor to table​

The partition metadata log setting is now anchored to the table: once a table is created with spark.databricks.nonDelta.partitionLog.enabled = true, that table keeps using partition metadata logging, so a cluster no longer needs to set spark.databricks.nonDelta.partitionLog.enabled = true for every table it processes.
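
A minimal sketch of anchoring the setting at table creation time. The table definition is hypothetical, and whether a session-level spark.conf.set is sufficient (versus a cluster-level Spark conf) should be confirmed against the partition metadata logging documentation.

# Enable partition metadata logging for tables created while this conf is set.
spark.conf.set("spark.databricks.nonDelta.partitionLog.enabled", "true")

# The table created here keeps using partition metadata logging afterwards,
# without requiring the conf on every cluster that reads it.
spark.sql("""
    CREATE TABLE my_catalog.my_schema.raw_events (id INT, dt STRING)
    USING parquet
    PARTITIONED BY (dt)
""")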

Update snowflake-jdbc 3.16.1 -> 3.22.0​

Updated the snowflake-jdbc dependency from 3.16.1 to 3.22.0. This may impact users who directly use the 3.16.1 version of the library.

Downgrade Json4s from 4.0.7 to 3.7.0-M11 in DBR 16.2 and 16.3​

Customers could not use databricks-connect 16.1+ and Apache Spark™ 3.5.x together in the same application because of significant discrepancies in API behavior between Json4s 3.7.0-M11 and 4.0.7. To address this, Databricks downgraded Json4s to 3.7.0-M11.

Library upgrades (applies to the Scala 2.12 image only)​

The Scala 2.13 Databricks Runtime release is considered a new version and may have different library versions from the Scala 2.12 release. Refer to the table below for the specific library versions in that release image. The Scala 2.13 image does not include sparklyr in this release.

Apache Spark​

Databricks Runtime 16.4 LTS includes Apache Spark 3.5.2. This release includes all Spark fixes and improvements included in Databricks Runtime 16.3, as well as additional bug fixes and improvements made to Spark.

Databricks ODBC/JDBC driver support​

Databricks supports ODBC and JDBC drivers released in the past two years. Download the most recently released drivers from the Databricks website.

System environment

Installed Python libraries

Installed R libraries

R libraries are installed from the Posit Package Manager CRAN snapshot on 2024-08-04.

note

sparklyr is only supported in the Databricks Runtime 16.4 LTS release image with support for Scala 2.12. It is not supported in the DBR 16.4 release image with Scala 2.13 support.

Installed Java and Scala libraries (Scala 2.13 cluster version)

Installed Java and Scala libraries (Scala 2.12 cluster version)
