Partitioned Data Manipulation Language (partitioned DML) is designed for the following types of bulk updates and deletes:
NULL
.UPDATE
statement to set a new column's value to False
where it is currently NULL
.Partitioned DML isn't suitable for small-scale transaction processing. If you want to run a statement on a few rows, use transactional DMLs with identifiable primary keys. For more information, see Using DML.
If you need to commit a large number of blind writes, but don't require an atomic transaction, you can bulk modify your Spanner tables using batch write. For more information, see Modify data using batch writes.
You can get insights on active partitioned DML queries and their progress from statistics tables in your Spanner database. For more information, see Active partitioned DMLs statistics.
DML and partitioned DMLSpanner supports two execution modes for DML statements:
DML, which is suitable for transaction processing. For more information, see Using DML.
Partitioned DML, which enables large-scale, database-wide operations with minimal impact on concurrent transaction processing by partitioning the key space and running the statement over partitions in separate, smaller-scoped transactions. For more information, see Using partitioned DML.
The following table highlights some of the differences between the two execution modes.
DML Partitioned DML Rows that don't match theWHERE
clause might be locked. Only rows that match the WHERE
clause are locked. Transaction size limits apply. Spanner handles the transaction limits and per-transaction concurrency limits. Statements don't need to be idempotent. A DML statement must be idempotent to ensure consistent results. A transaction can include multiple DML and SQL statements. A partitioned transaction can include only one DML statement. There are no restrictions on complexity of statements. Statements must be fully partitionable. You create read-write transactions in your client code. Spanner creates the transactions. Partitionable and idempotent
When a partitioned DML statement runs, rows in one partition don't have access to rows in other partitions, and you cannot choose how Spanner creates the partitions. Partitioning ensures scalability, but it also means that partitioned DML statements must be fully partitionable. That is, the partitioned DML statement must be expressible as the union of a set of statements, where each statement accesses a single row of the table and each statement accesses no other tables. For example, a DML statement that accesses multiple tables or performs a self-join is not partitionable. If the DML statement is not partitionable, Spanner returns the error BadUsage
.
These DML statements are fully partitionable, because each statement can be applied to a single row in the table:
UPDATE Singers SET LastName = NULL WHERE LastName = '';
DELETE FROM Albums WHERE MarketingBudget > 10000;
This DML statement is not fully partitionable, because it accesses multiple tables:
# Not fully partitionable
DELETE FROM Singers WHERE
SingerId NOT IN (SELECT SingerId FROM Concerts);
Spanner might execute a partitioned DML statement multiple times against some partitions due to network-level retries. As a result, a statement might be executed more than once against a row. The statement must therefore be idempotent to yield consistent results. A statement is idempotent if executing it multiple times against a single row leads to the same result.
This DML statement is idempotent:
UPDATE Singers SET MarketingBudget = 1000 WHERE true;
This DML statement is not idempotent:
UPDATE Singers SET MarketingBudget = 1.5 * MarketingBudget WHERE true;
Row locking
Spanner acquires a lock only if a row is a candidate for update or deletion. This behavior is different from DML execution, which might read-lock rows that don't match the WHERE
clause.
Whether a DML statement is partitioned or not depends on the client library method that you choose for execution. Each client library provides separate methods for DML execution and Partitioned DML execution.
You can execute only one partitioned DML statement in a call to the client library method.
Spanner does not apply the partitioned DML statements atomically across the entire table. Spanner does, however, apply partitioned DML statements atomically across each partition.
Partitioned DML does not support commit or rollback. Spanner executes and applies the DML statement immediately.
UNIQUE INDEX
, and violations of ON DELETE NO ACTION
. Depending on the point in time when the execution failed, the statement might have successfully run against some partitions, and might never have been run against other partitions.If the partitioned DML statement succeeds, then Spanner ran the statement at least once against each partition of the key range.
Count of modified rowsA partitioned DML statement returns a lower bound on the number of modified rows. It might not be an exact count of the number of rows modified, because there is no guarantee that Spanner counts all the modified rows.
Transaction limitsSpanner creates the partitions and transactions that it needs to execute a partitioned DML statement. Transaction limits or per-transaction concurrency limits apply, but Spanner attempts to keep the transactions within the limits.
Spanner allows a maximum of 20,000 concurrent partitioned DML statements per database.
Unsupported featuresSpanner does not support some features for partitioned DML:
INSERT
is not supported.For complex scenarios, such as moving a table or transformations that require joins across tables, consider using the Dataflow connector.
ExamplesThe following code example updates the MarketingBudget
column of the Albums
table.
You use the ExecutePartitionedDml()
function to execute a partitioned DML statement.
You use the ExecutePartitionedUpdateAsync()
method to execute a partitioned DML statement.
You use the PartitionedUpdate()
method to execute a partitioned DML statement.
You use the executePartitionedUpdate()
method to execute a partitioned DML statement.
You use the runPartitionedUpdate()
method to execute a partitioned DML statement.
You use the executePartitionedUpdate()
method to execute a partitioned DML statement.
You use the execute_partitioned_dml()
method to execute a partitioned DML statement.
You use the execute_partitioned_update()
method to execute a partitioned DML statement.
The following code example deletes rows from the Singers
table, based on the SingerId
column.
Learn how to modify data Using DML.
Learn about Data Manipulation Language (DML) best practices.
To learn about the differences between DML and mutations, see Compare DML and Mutations
Consider using the Dataflow connector for other data transformation scenarios.
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4