In this guide, you can learn how to use PyMongo to perform bulk operations. Bulk operations reduce the number of calls to the server by performing multiple write operations in a single method.
The Collection and MongoClient classes both provide a bulk_write() method. When calling bulk_write() on a Collection instance, you can perform multiple write operations on a single collection. When calling bulk_write() on a MongoClient instance, you can perform bulk writes across multiple namespaces. In MongoDB, a namespace consists of the database name and the collection name in the format <database>.<collection>.
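As a small illustration of the namespace format, the following sketch splits and joins namespace strings in plain Python. The helper names split_namespace and join_namespace are hypothetical and are not part of PyMongo. The sketch splits on the first dot only, because database names cannot contain a dot but collection names can.

```python
from typing import Tuple


def split_namespace(namespace: str) -> Tuple[str, str]:
    """Split '<database>.<collection>' into (database, collection)."""
    # Split on the first dot only: a collection name such as
    # "system.views" keeps its own dots.
    database, sep, collection = namespace.partition(".")
    if not sep or not database or not collection:
        raise ValueError(f"not a valid namespace: {namespace!r}")
    return database, collection


def join_namespace(database: str, collection: str) -> str:
    """Build a namespace string in the <database>.<collection> format."""
    return f"{database}.{collection}"


print(split_namespace("sample_restaurants.restaurants"))
print(join_namespace("sample_mflix", "movies"))
```

A string built this way is the kind of value the namespace arguments in this guide expect.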
To perform bulk operations on a MongoClient instance, ensure that your application meets the following requirements:
Uses PyMongo v4.9 or later
Connects to MongoDB Server v8.0 or later
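The two requirements above can be checked up front. The following is a minimal sketch, assuming you already have the driver and server versions as number sequences. The function name supports_client_bulk_write is hypothetical. In an application, the inputs might come from pymongo.version_tuple and from the versionArray field returned by client.server_info(); literal values are used here so that the sketch runs without a driver or a server.

```python
from typing import Sequence

MIN_DRIVER_VERSION = (4, 9)  # PyMongo v4.9 or later
MIN_SERVER_VERSION = (8, 0)  # MongoDB Server v8.0 or later


def supports_client_bulk_write(driver_version: Sequence[int],
                               server_version: Sequence[int]) -> bool:
    """Return True if both versions meet the documented minimums."""
    # Compare only (major, minor); patch components are ignored.
    return (tuple(driver_version[:2]) >= MIN_DRIVER_VERSION
            and tuple(server_version[:2]) >= MIN_SERVER_VERSION)


print(supports_client_bulk_write((4, 9, 1), [8, 0, 0, 0]))
print(supports_client_bulk_write((4, 8, 0), [8, 0, 0, 0]))
print(supports_client_bulk_write((4, 10, 0), [7, 0, 5, 0]))
```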
The examples in this guide use the sample_restaurants.restaurants and sample_mflix.movies collections from the Atlas sample datasets. To learn how to create a free MongoDB Atlas cluster and load the sample datasets, see the Get Started with PyMongo tutorial.
For each write operation you want to perform, create an instance of one of the following operation classes:
InsertOne
UpdateOne
UpdateMany
ReplaceOne
DeleteOne
DeleteMany
Then, pass a list of these instances to the bulk_write() method.
Ensure that you import the write operation classes into your application file, as shown in the following code:
from pymongo import InsertOne, UpdateOne, UpdateMany, ReplaceOne, DeleteOne, DeleteMany
The following sections show how to create instances of the preceding classes, which you can use to perform collection and client bulk operations.
To perform an insert operation, create an instance of InsertOne and specify the document you want to insert. Pass the following keyword arguments to the InsertOne constructor:
namespace: The namespace in which to insert the document. This argument is optional if you perform the bulk operation on a single collection.
document: The document to insert.
The following example creates an instance of InsertOne:
operation = InsertOne(
    namespace="sample_restaurants.restaurants",
    document={
        "name": "Mongo's Deli",
        "cuisine": "Sandwiches",
        "borough": "Manhattan",
        "restaurant_id": "1234"
    }
)
You can also create an instance of InsertOne by passing an instance of a custom class to the constructor. This provides additional type safety if you're using a type-checking tool. The instance you pass must inherit from the TypedDict class.
The TypedDict class is in the typing module, which is available only in Python 3.8 and later. To use the TypedDict class in earlier versions of Python, install the typing_extensions package.
The following example constructs an InsertOne instance by using a custom class for added type safety:
from typing import TypedDict

import pymongo

class Restaurant(TypedDict):
    name: str
    cuisine: str
    borough: str
    restaurant_id: str

operation = pymongo.InsertOne(Restaurant(
    name="Mongo's Deli",
    cuisine="Sandwiches",
    borough="Manhattan",
    restaurant_id="1234"
))
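To make the type-safety point concrete, the following sketch shows what a TypedDict provides on its own, without any driver involved. It uses only the standard library. The field types are checked by a static type checker rather than by the interpreter, so at runtime the instance is an ordinary dictionary.

```python
from typing import TypedDict


class Restaurant(TypedDict):
    name: str
    cuisine: str
    borough: str
    restaurant_id: str


doc = Restaurant(name="Mongo's Deli", cuisine="Sandwiches",
                 borough="Manhattan", restaurant_id="1234")

# At runtime a TypedDict instance is a plain dict, so it can be used
# anywhere a dictionary document is accepted.
print(type(doc) is dict)
print(doc["cuisine"])
```

A type checker run over this file would flag a call such as Restaurant(name="Mongo's Deli") for its missing keys, even though the interpreter would not.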
To insert multiple documents, create an instance of InsertOne for each document.
In a MongoDB collection, each document must contain an _id field with a unique value.
If you specify a value for the _id field, you must ensure that the value is unique across the collection. If you don't specify a value, the driver automatically generates a unique ObjectId value for the field.
We recommend letting the driver automatically generate _id values to ensure uniqueness. Duplicate _id values violate unique index constraints, which causes the driver to return an error.
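If you do set _id values yourself, a pre-check can catch duplicates inside a batch before you send it. The following is a minimal sketch in plain Python. The helper name find_duplicate_ids and the sample documents are hypothetical. It assumes hashable _id values such as strings or integers, and it only detects duplicates within the batch, not against documents already stored in the collection.

```python
from collections import Counter
from typing import Any, Dict, List


def find_duplicate_ids(documents: List[Dict[str, Any]]) -> List[Any]:
    """Return explicit _id values that occur more than once in the batch."""
    # Documents without an _id are skipped: the driver generates one for them.
    counts = Counter(doc["_id"] for doc in documents if "_id" in doc)
    return [value for value, count in counts.items() if count > 1]


batch = [
    {"_id": 1, "name": "Mongo's Deli"},
    {"_id": 2, "name": "Mongo's Pizza"},
    {"_id": 1, "name": "Mongo's Tacos"},  # hypothetical duplicate _id
    {"name": "Mongo's Burgers"},          # no _id set by the application
]
print(find_duplicate_ids(batch))
```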
To update a document, create an instance of UpdateOne and pass in the following arguments:
namespace: The namespace in which to perform the update. This argument is optional if you perform the bulk operation on a single collection.
filter: The query filter that specifies the criteria used to match documents in your collection.
update: The update you want to perform. For more information about update operations, see the Field Update Operators guide in the MongoDB Server manual.
UpdateOne updates the first document that matches your query filter.
The following example creates an instance of UpdateOne:
operation = UpdateOne(
    namespace="sample_restaurants.restaurants",
    filter={ "name": "Mongo's Deli" },
    update={ "$set": { "cuisine": "Sandwiches and Salads" }}
)
To update multiple documents, create an instance of UpdateMany and pass in the same arguments. UpdateMany updates all documents that match your query filter.
The following example creates an instance of UpdateMany:
operation = UpdateMany(
    namespace="sample_restaurants.restaurants",
    filter={ "name": "Mongo's Deli" },
    update={ "$set": { "cuisine": "Sandwiches and Salads" }}
)
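The difference between updating the first match and updating every match can be seen in a toy model that runs on an in-memory list. This sketch is not how the driver or server works. The functions toy_update_one and toy_update_many are hypothetical, the matcher supports only exact equality on top-level fields, and "first" here means first in list order.

```python
from typing import Any, Dict, List

Document = Dict[str, Any]


def _matches(doc: Document, query_filter: Document) -> bool:
    # Toy matcher: exact equality on top-level fields only.
    return all(doc.get(key) == value for key, value in query_filter.items())


def toy_update_one(docs: List[Document], query_filter: Document,
                   fields: Document) -> int:
    """Set fields on the first matching document; return how many were touched."""
    for doc in docs:
        if _matches(doc, query_filter):
            doc.update(fields)
            return 1
    return 0


def toy_update_many(docs: List[Document], query_filter: Document,
                    fields: Document) -> int:
    """Set fields on every matching document; return how many were touched."""
    touched = 0
    for doc in docs:
        if _matches(doc, query_filter):
            doc.update(fields)
            touched += 1
    return touched


def sample_docs() -> List[Document]:
    return [
        {"name": "Mongo's Deli", "borough": "Manhattan", "cuisine": "Sandwiches"},
        {"name": "Mongo's Deli", "borough": "Brooklyn", "cuisine": "Sandwiches"},
    ]


new_fields = {"cuisine": "Sandwiches and Salads"}
one, many = sample_docs(), sample_docs()
print(toy_update_one(one, {"name": "Mongo's Deli"}, new_fields), one[1]["cuisine"])
print(toy_update_many(many, {"name": "Mongo's Deli"}, new_fields), many[1]["cuisine"])
```

In the first call the Brooklyn document keeps its original cuisine value, while in the second call both documents change.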
A replace operation removes all fields and values of a specified document and replaces them with new ones. To perform a replace operation, create an instance of ReplaceOne and pass in the following arguments:
namespace: The namespace in which to perform the replace operation. This argument is optional if you perform the bulk operation on a single collection.
filter: The query filter that specifies the criteria used to match the document to replace.
replacement: The document that includes the new fields and values you want to store in the matching document.
The following example creates an instance of ReplaceOne:
operation = ReplaceOne(
    namespace="sample_restaurants.restaurants",
    filter={ "restaurant_id": "1234" },
    replacement={
        "name": "Mongo's Pizza",
        "cuisine": "Pizza",
        "borough": "Brooklyn",
        "restaurant_id": "5678"
    }
)
You can also create an instance of ReplaceOne by passing an instance of a custom class to the constructor. This provides additional type safety if you're using a type-checking tool. The instance you pass must inherit from the TypedDict class.
The TypedDict class is in the typing module, which is available only in Python 3.8 and later. To use the TypedDict class in earlier versions of Python, install the typing_extensions package.
The following example constructs a ReplaceOne instance by using a custom class for added type safety:
from typing import TypedDict

import pymongo

class Restaurant(TypedDict):
    name: str
    cuisine: str
    borough: str
    restaurant_id: str

operation = pymongo.ReplaceOne(
    { "restaurant_id": "1234" },
    Restaurant(name="Mongo's Pizza", cuisine="Pizza", borough="Brooklyn", restaurant_id="5678")
)
To replace multiple documents, you must create an instance of ReplaceOne for each document.
To learn more about type-checking tools available for Python, see Type Checkers on the Tools page.
To delete a document, create an instance of DeleteOne and pass in the following arguments:
namespace: The namespace in which to delete the document. This argument is optional if you perform the bulk operation on a single collection.
filter: The query filter that specifies the criteria used to match the document to delete.
DeleteOne removes only the first document that matches your query filter.
The following example creates an instance of DeleteOne:
operation = DeleteOne(
    namespace="sample_restaurants.restaurants",
    filter={ "restaurant_id": "5678" }
)
To delete multiple documents, create an instance of DeleteMany and pass in a namespace and a query filter that specifies the documents you want to delete. DeleteMany removes all documents that match your query filter.
The following example creates an instance of DeleteMany:
operation = DeleteMany(
    namespace="sample_restaurants.restaurants",
    filter={ "name": "Mongo's Deli" }
)
After you define a class instance for each operation you want to perform, pass a list of these instances to the bulk_write() method. Call the bulk_write() method on a Collection instance to write to a single collection or a MongoClient instance to write to multiple namespaces.
If any of the write operations called on a Collection fail, PyMongo raises a BulkWriteError. In an ordered bulk write, which is the default, PyMongo does not perform any further operations after the failure. BulkWriteError provides a details attribute that includes the operation that failed and details about the exception.
If any of the write operations called on a MongoClient fail, PyMongo raises a ClientBulkWriteException. In an ordered bulk write, which is the default, PyMongo does not perform any further operations after the failure. ClientBulkWriteException provides an error attribute that includes information about the exception.
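The details attribute of a BulkWriteError is a dictionary, so you can format it for logging. The following sketch assumes the dictionary contains a writeErrors list whose entries carry index, code, and errmsg keys. The helper name summarize_write_errors is hypothetical, and the sample dictionary is hand-written for illustration rather than captured from a server.

```python
from typing import Any, Dict, List


def summarize_write_errors(details: Dict[str, Any]) -> List[str]:
    """Turn the writeErrors entries of a details dictionary into log lines."""
    return [
        f"operation {err['index']} failed with code {err['code']}: {err['errmsg']}"
        for err in details.get("writeErrors", [])
    ]


# Hand-written sample in the assumed shape of BulkWriteError.details.
sample_details = {
    "writeErrors": [
        {"index": 1, "code": 11000,
         "errmsg": "duplicate key error (illustrative message)",
         "op": {"_id": 1, "name": "Mongo's Deli"}},
    ],
    "writeConcernErrors": [],
    "nInserted": 1,
}

# In an application, this would typically run inside an except block:
#     except BulkWriteError as exc:
#         for line in summarize_write_errors(exc.details): ...
for line in summarize_write_errors(sample_details):
    print(line)
```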
When PyMongo runs a bulk operation, it uses the write_concern of the collection or client on which the operation is running. You can also set a write concern for the operation when using the MongoClient.bulk_write() method. The driver reports all write concern errors after attempting all operations, regardless of execution order.
To learn more about write concerns, see Write Concern in the MongoDB Server manual.
The following example performs multiple write operations on the restaurants collection by using the bulk_write() method on a Collection instance. The synchronous version is shown first, followed by the asynchronous version:
Synchronous:
operations = [
    InsertOne(
        document={
            "name": "Mongo's Deli",
            "cuisine": "Sandwiches",
            "borough": "Manhattan",
            "restaurant_id": "1234"
        }
    ),
    InsertOne(
        document={
            "name": "Mongo's Deli",
            "cuisine": "Sandwiches",
            "borough": "Brooklyn",
            "restaurant_id": "5678"
        }
    ),
    UpdateMany(
        filter={ "name": "Mongo's Deli" },
        update={ "$set": { "cuisine": "Sandwiches and Salads" }}
    ),
    DeleteOne(
        filter={ "restaurant_id": "1234" }
    )
]
results = restaurants.bulk_write(operations)
print(results)
Asynchronous:
# Uses the same operations list as the synchronous version.
# Call from within an async function.
results = await restaurants.bulk_write(operations)
print(results)
Both versions print output similar to the following:
BulkWriteResult({'writeErrors': [], 'writeConcernErrors': [], 'nInserted': 2,
'nUpserted': 0, 'nMatched': 2, 'nModified': 2, 'nRemoved': 1, 'upserted': []},
acknowledged=True)
The following example performs multiple write operations on the sample_restaurants.restaurants and sample_mflix.movies namespaces by using the bulk_write() method on a MongoClient instance. The synchronous version is shown first, followed by the asynchronous version:
Synchronous:
operations = [
    InsertOne(
        namespace="sample_mflix.movies",
        document={
            "title": "Minari",
            "runtime": 217,
            "genres": ["Drama", "Comedy"]
        }
    ),
    UpdateOne(
        namespace="sample_mflix.movies",
        filter={ "title": "Minari" },
        update={ "$set": { "runtime": 117 }}
    ),
    DeleteMany(
        namespace="sample_restaurants.restaurants",
        filter={ "cuisine": "French" }
    )
]
results = client.bulk_write(operations)
print(results)
Asynchronous:
# Uses the same operations list as the synchronous version.
# Call from within an async function.
results = await client.bulk_write(operations)
print(results)
Both versions print output similar to the following:
ClientBulkWriteResult({'anySuccessful': True, 'error': None, 'writeErrors': [],
'writeConcernErrors': [], 'nInserted': 1, 'nUpserted': 0, 'nMatched': 1,
'nModified': 1, 'nDeleted': 344, 'insertResults': {}, 'updateResults': {},
'deleteResults': {}}, acknowledged=True, verbose=False)
The bulk_write() method optionally accepts additional parameters, which represent options you can use to configure the bulk write operation.
The following table describes the options you can pass to the Collection.bulk_write() method:
Property: Description
ordered: If True, the driver performs the write operations in the order provided. If an error occurs, the remaining operations are not attempted. If False, the driver performs the operations in an arbitrary order and attempts to perform all operations. Defaults to True.
bypass_document_validation: Specifies whether the operation bypasses document-level validation. For more information, see Schema Validation in the MongoDB Server manual. Defaults to False.
session: An instance of ClientSession. For more information, see the API documentation.
comment: A comment to attach to the operation. For more information, see the delete command fields guide in the MongoDB Server manual.
let: A map of parameter names and values. Values must be constant or closed expressions that don't reference document fields. For more information, see the let statement in the MongoDB Server manual.
The following example calls the bulk_write() method from the preceding Collection Bulk Write Example but sets the ordered option to False. The synchronous version is shown first, followed by the asynchronous version:
Synchronous:
results = restaurants.bulk_write(operations, ordered=False)
Asynchronous:
# Call from within an async function.
results = await restaurants.bulk_write(operations, ordered=False)
If any of the write operations in an unordered bulk write fail, PyMongo reports the errors only after attempting all operations.
Note: Unordered bulk operations do not guarantee order of execution. The order can differ from the way you list them to optimize the runtime.
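The practical difference between the two modes is what happens after the first failure. The following toy model illustrates it in plain Python. It is a sketch, not the driver's implementation: simulate_bulk and the sample callables are hypothetical, and the unordered branch keeps list order for simplicity even though a real unordered bulk write might reorder operations.

```python
from typing import Callable, List, Tuple


def simulate_bulk(operations: List[Callable[[], None]],
                  ordered: bool = True) -> Tuple[List[int], List[int]]:
    """Return (indexes attempted, indexes that failed) for a toy bulk write."""
    attempted: List[int] = []
    failed: List[int] = []
    for index, operation in enumerate(operations):
        attempted.append(index)
        try:
            operation()
        except Exception:
            failed.append(index)
            if ordered:
                break  # ordered mode: remaining operations are not attempted
    return attempted, failed


def succeed() -> None:
    pass


def fail() -> None:
    raise RuntimeError("simulated write error")


ops = [succeed, fail, succeed]
print(simulate_bulk(ops, ordered=True))
print(simulate_bulk(ops, ordered=False))
```

With ordered=True the third operation is never attempted, while with ordered=False all three are attempted and the failure is reported at the end.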
The following table describes the options you can pass to the MongoClient.bulk_write() method:
Property: Description
session: An instance of ClientSession. For more information, see the API documentation.
ordered: If True, the driver performs the write operations in the order provided. If an error occurs, the remaining operations are not attempted. If False, the driver performs the operations in an arbitrary order and attempts to perform all operations. Defaults to True.
verbose_results: Specifies whether the operation returns detailed results for each successful operation. Defaults to False.
bypass_document_validation: Specifies whether the operation bypasses document-level validation. For more information, see Schema Validation in the MongoDB Server manual. Defaults to False.
comment: A comment to attach to the operation. For more information, see the delete command fields guide in the MongoDB Server manual.
let: A map of parameter names and values. Values must be constant or closed expressions that don't reference document fields. For more information, see the let statement in the MongoDB Server manual.
write_concern: Specifies the write concern to use for the bulk operation. For more information, see Write Concern in the MongoDB Server manual.
The following example calls the bulk_write() method from the preceding Client Bulk Write Example but sets the verbose_results option to True. The synchronous version is shown first, followed by the asynchronous version:
Synchronous:
results = client.bulk_write(operations, verbose_results=True)
Asynchronous:
# Call from within an async function.
results = await client.bulk_write(operations, verbose_results=True)
Both versions return a result similar to the following:
ClientBulkWriteResult({'anySuccessful': True, 'error': None, 'writeErrors': [],
'writeConcernErrors': [], 'nInserted': 1, 'nUpserted': 0, 'nMatched': 1, 'nModified': 1,
'nDeleted': 344, 'insertResults': {0: InsertOneResult(ObjectId('...'),
acknowledged=True)}, 'updateResults': {1: UpdateResult({'ok': 1.0, 'idx': 1, 'n': 1,
'nModified': 1}, acknowledged=True)}, 'deleteResults': {2: DeleteResult({'ok': 1.0,
'idx': 2, 'n': 344}, acknowledged=True)}}, acknowledged=True, verbose=True)
This section describes the return value of the following bulk operation methods:
The Collection.bulk_write() method returns a BulkWriteResult object. The BulkWriteResult object contains the following properties:
Property: Description
acknowledged: Indicates if the server acknowledged the write operation.
bulk_api_result: The raw bulk API result returned by the server.
deleted_count: The number of documents deleted, if any.
inserted_count: The number of documents inserted, if any.
matched_count: The number of documents matched for an update, if applicable.
modified_count: The number of documents modified, if any.
upserted_count: The number of documents upserted, if any.
upserted_ids: A map of the operation's index to the _id of the upserted documents, if applicable.
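The count properties correspond to keys in the raw result that the earlier Collection example printed. The following sketch shows that correspondence by using a plain dictionary copied from that output. The mapping is an assumption inferred from the key names, and the helper name summarize_raw_result is hypothetical. In application code, read the properties of the BulkWriteResult object directly rather than parsing the raw dictionary.

```python
from typing import Any, Dict

# Assumed correspondence between raw result keys and result properties.
RAW_KEY_TO_PROPERTY = {
    "nInserted": "inserted_count",
    "nMatched": "matched_count",
    "nModified": "modified_count",
    "nRemoved": "deleted_count",
    "nUpserted": "upserted_count",
}


def summarize_raw_result(raw: Dict[str, Any]) -> Dict[str, int]:
    """Map raw bulk result keys to the property names used in this guide."""
    return {prop: raw.get(key, 0) for key, prop in RAW_KEY_TO_PROPERTY.items()}


# Raw dictionary copied from the Collection example output in this guide.
raw_result = {
    "writeErrors": [], "writeConcernErrors": [], "nInserted": 2,
    "nUpserted": 0, "nMatched": 2, "nModified": 2, "nRemoved": 1,
    "upserted": [],
}
print(summarize_raw_result(raw_result))
```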
The MongoClient.bulk_write() method returns a ClientBulkWriteResult object. The ClientBulkWriteResult object contains the following properties:
Property: Description
acknowledged: Indicates if the server acknowledged the write operation.
bulk_api_result: The raw bulk API result returned by the server.
delete_results: A map of any successful delete operations and their results.
deleted_count: The number of documents deleted, if any.
has_verbose_results: Indicates whether the returned results are verbose.
insert_results: A map of any successful insert operations and their results.
inserted_count: The number of documents inserted, if any.
matched_count: The number of documents matched for an update, if applicable.
modified_count: The number of documents modified, if any.
update_results: A map of any successful update operations and their results.
upserted_count: The number of documents upserted, if any.
If you don't add a type annotation for your MongoClient object, your type checker might report an error on code similar to the following:
from pymongo import MongoClient
client = MongoClient()
The solution is to annotate the MongoClient object as client: MongoClient or client: MongoClient[Dict[str, Any]].
If you specify MongoClient as a type hint but don't include data types for the document, keys, and values, your type checker might show an error similar to the following:
error: Dict entry 0 has incompatible type "str": "int"; expected "Mapping[str, Any]": "int"
The solution is to add the following type hint to your MongoClient object:
client: MongoClient[Dict[str, Any]]
To learn how to perform individual write operations, see the following guides:
To learn more about any of the methods or types discussed in this guide, see the following API Documentation: