This guide explains how to use PyMongo to encode and decode custom types.
You might need to define a custom type if you want to store a data type that the driver doesn't understand. For example, the BSON library's Decimal128
type is distinct from Python's built-in Decimal
type. Attempting to save an instance of Decimal
with PyMongo results in an InvalidDocument
exception, as shown in the following code example. Select the Synchronous or Asynchronous tab to see the corresponding code:
from decimal import Decimalnum = Decimal("45.321")db["coll"].insert_one({"num": num})
Traceback (most recent call last):...bson.errors.InvalidDocument: cannot encode object: Decimal('45.321'), oftype: <class 'decimal.Decimal'>
from decimal import Decimalnum = Decimal("45.321")await db["coll"].insert_one({"num": num})
Traceback (most recent call last):...bson.errors.InvalidDocument: cannot encode object: Decimal('45.321'), oftype: <class 'decimal.Decimal'>
The following sections show how to define a custom type for this Decimal
type.
To encode a custom type, you must first define a type codec. A type codec describes how an instance of a custom type is converted to and from a type that the bson
module can already encode.
When you define a type codec, your class must inherit from one of the base classes in the codec_options
module. The following table describes these base classes, and when and how to implement them:
Base Class
When to Use
Members to Implement
codec_options.TypeEncoder
Inherit from this class to define a codec that encodes a custom Python type to a known BSON type.
python_type
attribute: The custom Python type that's encoded by this type codec
transform_python()
method: Function that transforms a custom type value into a type that BSON can encode
codec_options.TypeDecoder
Inherit from this class to define a codec that decodes a specified BSON type into a custom Python type.
bson_type
attribute: The BSON type that's decoded by this type codec
transform_bson()
method: Function that transforms a standard BSON type value into the custom type
codec_options.TypeCodec
Inherit from this class to define a codec that can both encode and decode a custom type.
python_type
attribute: The custom Python type that's encoded by this type codec
bson_type
attribute: The BSON type that's decoded by this type codec
transform_bson()
method: Function that transforms a standard BSON type value into the custom type
transform_python()
method: Function that transforms a custom type value into a type that BSON can encode
Because the example Decimal
custom type can be converted to and from a Decimal128
instance, you must define how to encode and decode this type. Therefore, the Decimal
type codec class must inherit from the TypeCodec
base class:
from bson.decimal128 import Decimal128from bson.codec_options import TypeCodecclass DecimalCodec(TypeCodec): python_type = Decimal bson_type = Decimal128 def transform_python(self, value): return Decimal128(value) def transform_bson(self, value): return value.to_decimal()
After defining a custom type codec, you must add it to PyMongo's type registry, the list of types the driver can encode and decode. To do so, create an instance of the TypeRegistry
class, passing in an instance of your type codec class inside a list. If you create multiple custom codecs, you can pass them all to the TypeRegistry
constructor.
The following code examples adds an instance of the DecimalCodec
type codec to the type registry:
from bson.codec_options import TypeRegistrydecimal_codec = DecimalCodec()type_registry = TypeRegistry([decimal_codec])
Note
Once instantiated, registries are immutable and the only way to add codecs to a registry is to create a new one.
Finally, define a codec_options.CodecOptions
instance, passing your TypeRegistry
object as a keyword argument. Pass your CodecOptions
object to the get_collection()
method to obtain a collection that can use your custom type:
from bson.codec_options import CodecOptionscodec_options = CodecOptions(type_registry=type_registry)collection = db.get_collection("test", codec_options=codec_options)
You can then encode and decode instances of the Decimal
class. Select the Synchronous or Asynchronous tab to see the corresponding code:
import pprintcollection.insert_one({"num": Decimal("45.321")})my_doc = collection.find_one()pprint.pprint(my_doc)
{'_id': ObjectId('...'), 'num': Decimal('45.321')}
import pprintawait collection.insert_one({"num": Decimal("45.321")})my_doc = await collection.find_one()pprint.pprint(my_doc)
{'_id': ObjectId('...'), 'num': Decimal('45.321')}
To see how MongoDB stores an instance of the custom type, create a new collection object without the customized codec options, then use it to retrieve the document containing the custom type. The following example shows that PyMongo stores an instance of the Decimal
class as a Decimal128
value. Select the Synchronous or Asynchronous tab to see the corresponding code:
import pprintnew_collection = db.get_collection("test")pprint.pprint(new_collection.find_one())
{'_id': ObjectId('...'), 'num': Decimal128('45.321')}
import pprintnew_collection = db.get_collection("test")pprint.pprint(await new_collection.find_one())
{'_id': ObjectId('...'), 'num': Decimal128('45.321')}
You might also need to encode one or more types that inherit from your custom type. Consider the following subtype of the Decimal
class, which contains a method to return its value as an integer:
class DecimalInt(Decimal): def get_int(self): return int(self)
If you try to save an instance of the DecimalInt
class without first registering a type codec for it, PyMongo raises an error. Select the Synchronous or Asynchronous tab to see the corresponding code:
collection.insert_one({"num": DecimalInt("45.321")})
Traceback (most recent call last):...bson.errors.InvalidDocument: cannot encode object: Decimal('45.321'),of type: <class 'decimal.Decimal'>
await collection.insert_one({"num": DecimalInt("45.321")})
Traceback (most recent call last):...bson.errors.InvalidDocument: cannot encode object: Decimal('45.321'),of type: <class 'decimal.Decimal'>
To encode an instance of the DecimalInt
class, you must define a type codec for the class. This type codec must inherit from the parent class's codec, DecimalCodec
, as shown in the following example:
class DecimalIntCodec(DecimalCodec): @property def python_type(self): return DecimalInt
You can then add the subclass's type codec to the type registry and encode instances of the custom type. Select the Synchronous or Asynchronous tab to see the corresponding code:
import pprintfrom bson.codec_options import CodecOptionsdecimal_int_codec = DecimalIntCodec()type_registry = TypeRegistry([decimal_codec, decimal_int_codec])codec_options = CodecOptions(type_registry=type_registry)collection = db.get_collection("test", codec_options=codec_options)collection.insert_one({"num": DecimalInt("45.321")})my_doc = collection.find_one()pprint.pprint(my_doc)
{'_id': ObjectId('...'), 'num': Decimal('45.321')}
import pprintfrom bson.codec_options import CodecOptionsdecimal_int_codec = DecimalIntCodec()type_registry = TypeRegistry([decimal_codec, decimal_int_codec])codec_options = CodecOptions(type_registry=type_registry)collection = db.get_collection("test", codec_options=codec_options)await collection.insert_one({"num": DecimalInt("45.321")})my_doc = await collection.find_one()pprint.pprint(my_doc)
{'_id': ObjectId('...'), 'num': Decimal('45.321')}
Note
The transform_bson()
method of the DecimalCodec
class results in these values being decoded as Decimal
, not DecimalInt
.
You can also register a fallback encoder, a callable to encode types not recognized by BSON and for which no type codec has been registered. The fallback encoder accepts an unencodable value as a parameter and returns a BSON-encodable value.
The following fallback encoder encodes Python's Decimal
type to a Decimal128
:
def fallback_encoder(value): if isinstance(value, Decimal): return Decimal128(value) return value
After declaring a fallback encoder, perform the following steps:
Construct a new instance of the TypeRegistry
class. Use the fallback_encoder
keyword argument to pass in the fallback encoder.
Construct a new instance of the CodecOptions
class. Use the type_registry
keyword argument to pass in the TypeRegistry
instance.
Call the get_collection()
method. Use the codec_options
keyword argument to pass in the CodecOptions
instance.
The following code example shows this process:
type_registry = TypeRegistry(fallback_encoder=fallback_encoder)codec_options = CodecOptions(type_registry=type_registry)collection = db.get_collection("test", codec_options=codec_options)
You can then use this reference to a collection to store instances of the Decimal
class. Select the Synchronous or Asynchronous tab to see the corresponding code:
import pprintcollection.insert_one({"num": Decimal("45.321")})my_doc = collection.find_one()pprint.pprint(my_doc)
{'_id': ObjectId('...'), 'num': Decimal128('45.321')}
import pprintawait collection.insert_one({"num": Decimal("45.321")})my_doc = await collection.find_one()pprint.pprint(my_doc)
{'_id': ObjectId('...'), 'num': Decimal128('45.321')}
Note
Fallback encoders are invoked after attempts to encode the given value with standard BSON encoders and any configured type encoders have failed. Therefore, in a type registry configured with a type encoder and fallback encoder that both target the same custom type, the behavior specified in the type encoder takes precedence.
Because fallback encoders don't need to declare the types that they encode beforehand, you can use them in cases where a TypeEncoder
doesn't work. For example, you can use a fallback encoder to save arbitrary objects to MongoDB. Consider the following arbitrary custom types:
class MyStringType(object): def __init__(self, value): self.__value = value def __repr__(self): return "MyStringType('%s')" % (self.__value,)class MyNumberType(object): def __init__(self, value): self.__value = value def __repr__(self): return "MyNumberType(%s)" % (self.__value,)
You can define a fallback encoder that pickles the objects it receives and returns them as Binary
instances with a custom subtype. This custom subtype allows you to define a TypeDecoder
class that identifies pickled artifacts upon retrieval and transparently decodes them back into Python objects:
import picklefrom bson.binary import Binary, USER_DEFINED_SUBTYPEfrom bson.codec_options import TypeDecoderdef fallback_pickle_encoder(value): return Binary(pickle.dumps(value), USER_DEFINED_SUBTYPE)class PickledBinaryDecoder(TypeDecoder): bson_type = Binary def transform_bson(self, value): if value.subtype == USER_DEFINED_SUBTYPE: return pickle.loads(value) return value
You can then add the PickledBinaryDecoder
to the codec options for a collection and use it to encode and decode your custom types. Select the Synchronous or Asynchronous tab to see the corresponding code:
from bson.codec_options import CodecOptions,TypeRegistrycodec_options = CodecOptions( type_registry=TypeRegistry( [PickledBinaryDecoder()], fallback_encoder=fallback_pickle_encoder))collection = db.get_collection("test", codec_options=codec_options)collection.insert_one( {"_id": 1, "str": MyStringType("hello world"), "num": MyNumberType(2)})my_doc = collection.find_one()print(isinstance(my_doc["str"], MyStringType))print(isinstance(my_doc["num"], MyNumberType))
from bson.codec_options import CodecOptions,TypeRegistrycodec_options = CodecOptions( type_registry=TypeRegistry( [PickledBinaryDecoder()], fallback_encoder=fallback_pickle_encoder))collection = db.get_collection("test", codec_options=codec_options)await collection.insert_one( {"_id": 1, "str": MyStringType("hello world"), "num": MyNumberType(2)})my_doc = await collection.find_one()print(isinstance(my_doc["str"], MyStringType))print(isinstance(my_doc["num"], MyNumberType))
PyMongo type codecs and fallback encoders have the following limitations:
You can't customize the encoding behavior of Python types that PyMongo already understands, like int
and str
. If you try to instantiate a type registry with one or more codecs that act upon a built-in type, PyMongo raises a TypeError
. This limitation also applies to all subtypes of the standard types.
You can't chain type encoders. A custom type value, once transformed by a codec's transform_python()
method, must result in a type that is either BSON-encodable by default, or can be transformed by the fallback encoder into something BSON-encodable. It cannot be transformed a second time by a different type codec.
The Database.command()
method doesn't apply custom type decoders while decoding the command response document.
The gridfs
class doesn't apply custom type encoding or decoding to any documents it receives or returns.
For more information about encoding and decoding custom types, see the following API documentation:
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4