The following release notes provide information about Databricks Runtime 16.4 LTS, powered by Apache Spark 3.5.2.
Databricks released this LTS version in May 2025. There are two variants of this release: one supporting Scala 2.12 and one supporting Scala 2.13.
Starting with DBR 17 (Spark 4), only Scala 2.13 will be supported. To help you transition, two images are available in 16.4 LTS: one with Scala 2.12 and one with Scala 2.13. Use the 2.13 image to test and update your code base to Scala 2.13 before migrating to DBR 17.
Scala 2.13 migration guidance

If your code uses the sparklyr R library, you must use the image that supports Scala 2.12 instead.
Databricks Runtime considers a breaking change major when it requires you to make significant code changes to support it.
Collection incompatibility: Read the official Scala docs page on migrating collections to Scala 2.13 for details. If your code was written against an earlier Scala version, collections are the primary source of incompatibilities, particularly in API parameters and return types.
Hash algorithm: When reviewing code built with Scala 2.12, do not rely on the implicit order of data structures that do not guarantee ordering, such as HashMap and Set. Such collections may iterate their elements in a different order under Scala 2.13 than under 2.12.
Databricks Runtime considers a breaking change minor when there is no version-specific error messaging in place, but your code would generally fail to compile under the new version. In this case, the compiler error messages may provide sufficient information for updating your code.
Examples include the + operator for string concatenation when used with a non-String type on the left, and the postfix operator (use dot notation instead). For more details, see Scala dropped features in the official Scala docs. Error messages may also be formatted differently; for example, f: Foo = Foo(1) may become f: Foo = Foo(i = 1) in some messages.

For the list of library versions included with each Scala version of Databricks Runtime 16.4 LTS, see the installed library sections later in these release notes.
New features and improvements

listagg and string_agg functions

MERGE INTO to tables with fine-grained access control on dedicated compute is now generally available (GA)

Dashboards, alerts, and queries are now supported as workspace files. You can now programmatically interact with these Databricks objects from anywhere the workspace filesystem is available, including writing, reading, and deleting them like any other file. To learn more, see What are workspace files? and Programmatically interact with workspace files.
Liquid clustering auto-compaction improvement

Unity Catalog-managed liquid clustering tables now trigger auto-compaction to automatically reduce small-file problems between OPTIMIZE runs.
For more details, see Auto compaction for Delta Lake on Databricks.
Auto Loader can now clean processed files in the source directory

Customers can now instruct Auto Loader to automatically move or delete files after they have been processed. Opt in to this feature by using the cloudFiles.cleanSource Auto Loader option.
For more details, see Auto Loader options under cloudFiles.cleanSource.
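For illustration, a minimal sketch of opting in. The source and checkpoint paths, the JSON format, and the MOVE mode with its cloudFiles.cleanSource.moveDestination companion option are assumptions for this example:

Python
df = (spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")
    # Opt in to cleanup: MOVE relocates processed files to the assumed archive location.
    .option("cloudFiles.cleanSource", "MOVE")
    .option("cloudFiles.cleanSource.moveDestination", "/Volumes/main/default/archive")
    .load("/Volumes/main/default/landing"))

(df.writeStream
    .option("checkpointLocation", "/Volumes/main/default/checkpoints/clean_source_demo")
    .trigger(availableNow=True)
    .toTable("main.default.bronze_events"))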
This release adds support for streaming from a Delta table that has type-widened column data, and for sharing a Delta table with type widening enabled using Databricks-to-Databricks Delta Sharing. The type widening feature is currently in Public Preview.
For more details, see Type widening.
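For context, a minimal sketch of enabling type widening on a Delta table before widening a column. The table and column names are placeholders, and the delta.enableTypeWidening table property is an assumption based on the type widening documentation:

Python
# Enable type widening on an existing Delta table (Public Preview).
spark.sql("ALTER TABLE main.default.events SET TBLPROPERTIES ('delta.enableTypeWidening' = 'true')")
# Widen an INT column to BIGINT; streaming readers and Delta Sharing recipients can consume the widened column.
spark.sql("ALTER TABLE main.default.events ALTER COLUMN quantity TYPE BIGINT")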
IDENTIFIER support now available in DBSQL for catalog operations

Databricks customers can now use the IDENTIFIER clause when performing the following catalog operations:
CREATE CATALOG
DROP CATALOG
COMMENT ON CATALOG
ALTER CATALOG
This new syntax allows customers to dynamically specify catalog names using parameters defined for these operations, enabling more flexible and reusable SQL workflows. As an example of the syntax, consider CREATE CATALOG IDENTIFIER(:param), where param is a parameter provided to specify a catalog name.
For more details, see IDENTIFIER clause.
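A minimal Python sketch, assuming a named parameter supplied through the args argument of spark.sql; the catalog name dev_catalog is a placeholder:

Python
# :param is resolved from the args mapping at execution time.
spark.sql("CREATE CATALOG IDENTIFIER(:param)", args={"param": "dev_catalog"})
spark.sql("COMMENT ON CATALOG IDENTIFIER(:param) IS 'created via a parameterized identifier'", args={"param": "dev_catalog"})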
Collated expressions now provide autogenerated transient aliases

Autogenerated aliases for collated expressions now always deterministically incorporate COLLATE information. Autogenerated aliases are transient (unstable) and should not be relied on. Instead, as a best practice, use expression AS alias consistently and explicitly.
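For instance, a hedged sketch of the recommended explicit alias on a collated expression; the table, column, and UNICODE_CI collation are placeholders for this example:

Python
# Use an explicit alias instead of relying on the autogenerated one.
spark.sql("SELECT name COLLATE UNICODE_CI AS name_ci FROM main.default.customers ORDER BY name_ci").show()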
In Databricks Runtime 16.4 LTS, Databricks added support for filter pushdown for Python data source batch reads, with an API similar to the SupportsPushDownFilters interface. You can implement DataSourceReader.pushFilters to receive the filters that may be pushed down, select which of them to push down, track them, and return the remaining filters for Apache Spark to apply.
Filter pushdown allows the data source to handle a subset of filters. This can improve performance by reducing the amount of data that needs to be processed by Spark.
Filter pushdown is supported only for batch reads, not for streaming reads. The new method must be added to DataSourceReader, not to DataSource or DataSourceStreamReader. Your implementation must interpret the list of filters as a logical AND of its elements. The method is called once during query planning. The default implementation returns all filters, indicating that no filters can be pushed down. Override this method in your subclass to implement your filter pushdown logic.
Initially, and to keep the API simple, Databricks supports only V1 filters that have a column, a comparison operator, and a literal value. The filter serialization is a placeholder and will be implemented in a future PR. For example:
Python
from abc import ABC
from typing import Iterable, List

class DataSourceReader(ABC):
    ...
    def pushFilters(self, filters: List["Filter"]) -> Iterable["Filter"]:
        ...
Databricks recommends you implement this method only for data sources that natively support filtering, such as databases and GraphQL APIs.
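To illustrate, a minimal hedged sketch of a reader that keeps equality filters and returns everything else to Spark. The source name, schema, and the EqualTo type-name check are assumptions for this example; only the pushFilters signature shown above comes from the documented API:

Python
from typing import Iterable, List

from pyspark.sql.datasource import DataSource, DataSourceReader

class MyReader(DataSourceReader):
    def __init__(self, options):
        self.options = options
        self.pushed = []  # filters this reader will apply natively

    def pushFilters(self, filters: List["Filter"]) -> Iterable["Filter"]:
        # Called once during planning with the AND-ed filters of the query.
        for f in filters:
            # Checking the class name avoids importing specific filter classes,
            # whose exact names are an assumption in this sketch.
            if type(f).__name__ == "EqualTo":
                self.pushed.append(f)   # handled by the data source
            else:
                yield f                 # left for Spark to apply

    def read(self, partition):
        # Placeholder: a real reader would fetch only rows matching self.pushed.
        yield from []

class MyDataSource(DataSource):
    @classmethod
    def name(cls):
        return "my_pushdown_source"

    def schema(self):
        return "id INT, value STRING"

    def reader(self, schema):
        return MyReader(self.options)

After registering the class with spark.dataSource.register(MyDataSource), a query such as spark.read.format("my_pushdown_source").load().filter("id = 1") may surface the equality predicate to pushFilters during planning.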
Python UDF traceback improvement

The Python UDF traceback now includes frames from both the driver and executor, along with client frames, resulting in better error messages that show more relevant detail (such as the line content of frames inside a UDF).
UNION/EXCEPT/INTERSECT inside a view and EXECUTE IMMEDIATE now return correct results

Queries over temporary and persistent view definitions with a top-level UNION/EXCEPT/INTERSECT and un-aliased columns previously returned incorrect results because the UNION/EXCEPT/INTERSECT keywords were treated as aliases. Those queries now correctly perform the whole set operation.
EXECUTE IMMEDIATE ... INTO with a top-level UNION/EXCEPT/INTERSECT and un-aliased columns also wrote an incorrect result of the set operation into the specified variable, because the parser interpreted these keywords as aliases. Similarly, SQL queries with invalid trailing text were allowed. Set operations in these cases now write the correct result into the specified variable, or fail for invalid SQL text.
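As a hedged illustration of the view case described above (the view name and literal values are placeholders):

Python
# The view body has a top-level UNION with un-aliased columns.
spark.sql("CREATE OR REPLACE TEMPORARY VIEW v AS SELECT 1, 2 UNION SELECT 3, 4")
# The full set operation is now performed instead of treating UNION as an alias.
spark.sql("SELECT * FROM v").show()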
Since DBR 16.4.0, reading from a file source table correctly respects query options, such as delimiters. Previously, the first query plan was cached and subsequent option changes were ignored. To restore the previous behavior, set spark.sql.legacy.readFileSourceTableCacheIgnoreOptions to true.
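For example, a minimal sketch of restoring the legacy behavior at the session level:

Python
# Opt back in to ignoring per-query options on cached file source table reads.
spark.conf.set("spark.sql.legacy.readFileSourceTableCacheIgnoreOptions", "true")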
listagg and string_agg functions

Starting with this release, you can use the listagg or string_agg functions to aggregate STRING and BINARY values within a group. See string_agg for more details.
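A short sketch of the syntax, assuming the optional delimiter argument; the inline VALUES data is invented for illustration:

Python
# Concatenate the SKU values per region into one delimited string.
spark.sql("""
    SELECT region, string_agg(sku, ', ') AS skus
    FROM VALUES ('west', 'a'), ('west', 'b'), ('east', 'c') AS t(region, sku)
    GROUP BY region
""").show()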
MERGE INTO to tables with fine-grained access control on dedicated compute is now generally available (GA)

In Databricks Runtime 16.3 and above, dedicated compute supports MERGE INTO to Unity Catalog tables that use fine-grained access control. This feature is now generally available.
See Fine-grained access control on dedicated compute.
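For context, a minimal MERGE INTO sketch; the catalog, schema, table, and column names are placeholders, and the target is assumed to be a Unity Catalog table with fine-grained access control:

Python
spark.sql("""
    MERGE INTO main.default.customers AS t
    USING main.default.customer_updates AS s
    ON t.customer_id = s.customer_id
    WHEN MATCHED THEN UPDATE SET t.email = s.email
    WHEN NOT MATCHED THEN INSERT (customer_id, email) VALUES (s.customer_id, s.email)
""")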
DBR 16.4 LTS behavioral changes

DESCRIBE DETAIL {table} now shows the clusterByAuto status of the table (true or false) next to the current clustering columns. For more details on clusterByAuto, see Automatic liquid clustering.
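For example, assuming the fields surface as columns named clusteringColumns and clusterByAuto in the detail output (the table name is a placeholder):

Python
detail = spark.sql("DESCRIBE DETAIL main.default.events")
# clusterByAuto is reported next to the current clustering columns.
detail.select("clusteringColumns", "clusterByAuto").show()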
This update ensures that table reads respect the options set on each query, even when the data source plan is cached, rather than only the options from the first cached read. Previously, data source table reads cached the first plan and ignored different options in subsequent queries.
For example, the following query:
spark.sql("CREATE TABLE t(a string, b string) USING CSV".stripMargin)
spark.sql("INSERT INTO TABLE t VALUES ('a;b', 'c')")
spark.sql("SELECT * FROM t").show()
spark.sql("SELECT * FROM t WITH ('delimiter' = ';')")
would produce this output:
+----+----+
|col1|col2|
+----+----+
| a;b| c |
+----+----+
+----+----+
|col1|col2|
+----+----+
| a;b| c |
+----+----+
With this fix, it now returns the expected output:
+----+----+
|col1|col2|
+----+----+
| a;b| c |
+----+----+
+----+----+
|col1|col2|
+----+----+
| a | b,c|
+----+----+
If your workloads have dependencies on the previous incorrect behavior, you may see different results after this change.
Moved redaction rule from analyzer to optimizer

Previously, DataFrames could create tables that contained redacted values when valid SECRET SQL functions were used. With this change, the redaction rule has moved from the analyzer to the optimizer, and redaction is no longer applied when saving DataFrames with valid secret access to a table.
variant_get and get_json_object now consider leading spaces in paths in Apache Spark

Prior to this change, leading whitespace and tabs in paths passed to the variant_get and get_json_object expressions were ignored when Photon was disabled. For example, select get_json_object('{" key": "value"}', "$[' key']") did not extract the value of " key". Users can now extract such keys.
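A hedged sketch of the example above, with the path quoting adjusted so that it runs as a Spark SQL string literal:

Python
# The bracket path quotes a key whose name begins with a space; it now matches " key".
spark.sql("""SELECT get_json_object('{" key": "value"}', "$[' key']") AS v""").show()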
Previously, users could disable source materialization in MERGE by setting merge.materializeSource to none. With the new flag enabled, this is forbidden and causes an error. Databricks plans to enable the flag only for customers who have not used this configuration, so no customer should notice any change in behavior.
The partition metadata log feature has been changed so that the setting is anchored to a table when the table is created with spark.databricks.nonDelta.partitionLog.enabled = true. A cluster that sets spark.databricks.nonDelta.partitionLog.enabled = true no longer applies it to all tables the cluster processes.
Updated the snowflake-jdbc dependency from 3.16.1 to 3.22.0. This may impact users who directly use the 3.16.1 version of the library.
Customers cannot use databricks-connect 16.1+ and Apache Spark™ 3.5.x together in the same application because of significant discrepancies in API behavior between Json4s version 3.7.0-M11 and version 4.0.7. To address this, Databricks has downgraded Json4s to 3.7.0-M11.
Library upgrades (applies to the Scala 2.12 image only)

The Scala 2.13 Databricks Runtime release is considered a new version and may include different library versions than the Scala 2.12 image. Refer to the table below for the specific library versions in that release image. The Scala 2.13 image does not include sparklyr in this release.
Databricks Runtime 16.4 LTS includes Apache Spark 3.5.2. This release includes all Spark fixes and improvements included in Databricks Runtime 16.3, as well as the following additional bug fixes and improvements made to Spark:
scala.collection.Set instead of Set in ValidateExternalType
Spark Master Environment page support filters
if branch from TaskSchedulerImpl#statusUpdate
ExplainUtils.generateFieldString to directly call QueryPlan.generateFieldString
StreamingPythonRunnerInitializationException to PySpark base exception
Unstable from SparkSessionExtensionsProvider trait
QueryExecutionMetering instantiation
nonEmpty/isEmpty for empty check for explicit Iterable
HigherOrderFunction
InSubquery in InTypeCoercion if there are no type changes
to_pandas on an empty table
ColumnDefinition.toV1Column should preserve EXISTS_DEFAULT resolution
operation_id
Databricks supports ODBC/JDBC drivers released in the past 2 years. Download the recently released drivers here:
System environment

R libraries are installed from the Posit Package Manager CRAN snapshot on 2024-08-04.

Note: sparklyr is only supported in the Databricks Runtime 16.4 LTS release image with support for Scala 2.12. It is not supported in the 16.4 release image with Scala 2.13 support.
Installed Java and Scala libraries (Scala 2.13 cluster version)

Installed Java and Scala libraries (Scala 2.12 cluster version)