PEP 681 – Data Class Transforms
Author:
Erik De Bonte <erikd at microsoft.com>, Eric Traut <erictr at microsoft.com>
Sponsor:
Jelle Zijlstra <jelle.zijlstra at gmail.com>
Discussions-To:
Typing-SIG thread
Status:
Final
Type:
Standards Track
Topic:
Typing
Created:
02-Dec-2021
Python-Version:
3.11
Post-History:
24-Apr-2021, 13-Dec-2021, 22-Feb-2022
Resolution:
Python-Dev message
Abstract

PEP 557 introduced the dataclass to the Python stdlib. Several popular libraries have behaviors that are similar to dataclasses, but these behaviors cannot be described using standard type annotations. Such projects include attrs, pydantic, and object relational mapper (ORM) packages such as SQLAlchemy and Django.

Most type checkers, linters and language servers have full support for dataclasses. This proposal aims to generalize this functionality and provide a way for third-party libraries to indicate that certain decorator functions, classes, and metaclasses provide behaviors similar to dataclasses.

These behaviors include:

- Synthesizing an __init__ method based on declared data fields.
- Optionally synthesizing __eq__, __ne__, __lt__, __le__, __gt__ and __ge__ methods.
- Supporting “frozen” classes, a way to enforce immutability during static type checking.
- Supporting “field specifiers”, which describe attributes of individual fields that a static type checker must be aware of, such as whether a default value is provided for the field.

The full behavior of the stdlib dataclass is described in the Python documentation.

This proposal does not affect CPython directly except for the addition of a dataclass_transform decorator in typing.py.

Motivation

There is no existing, standard way for libraries with dataclass-like semantics to declare their behavior to type checkers. To work around this limitation, Mypy custom plugins have been developed for many of these libraries, but these plugins don’t work with other type checkers, linters or language servers. They are also costly to maintain for library authors, and they require that Python developers know about the existence of these plugins and download and configure them within their environment.

Rationale

The intent of this proposal is not to support every feature of every library with dataclass-like semantics, but rather to make it possible to use the most common features of these libraries in a way that is compatible with static type checking. If a user values these libraries and also values static type checking, they may need to avoid using certain features or make small adjustments to the way they use them. That’s already true for the Mypy custom plugins, which don’t support every feature of every dataclass-like library.

As new features are added to dataclasses in the future, we intend, when appropriate, to add support for those features on dataclass_transform as well. Keeping these two feature sets in sync will make it easier for dataclass users to understand and use dataclass_transform and will simplify the maintenance of dataclass support in type checkers.

Additionally, we will consider adding dataclass_transform support in the future for features that have been adopted by multiple third-party libraries but are not supported by dataclasses.

Specification

The dataclass_transform decorator

This specification introduces a new decorator function in the typing module named dataclass_transform. This decorator can be applied to a function that is itself a decorator, to a class, or to a metaclass. The presence of dataclass_transform tells a static type checker that the decorated function, class, or metaclass performs runtime “magic” that transforms a class, endowing it with dataclass-like behaviors.

If dataclass_transform is applied to a function, using the decorated function as a decorator is assumed to apply dataclass-like semantics. If the function has overloads, the dataclass_transform decorator can be applied to the implementation of the function or any one, but not more than one, of the overloads. When applied to an overload, the dataclass_transform decorator still impacts all usage of the function.
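As a minimal sketch (not one of this PEP’s examples), a hypothetical create_model decorator with two overloads might apply dataclass_transform to only the first overload:

from typing import Any, Callable, Optional, Type, TypeVar, overload
import typing

_T = TypeVar("_T")

# Applying dataclass_transform to a single overload (or to the
# implementation) is sufficient; it affects every use of create_model.
@overload
@typing.dataclass_transform()
def create_model(cls: Type[_T]) -> Type[_T]: ...

@overload
def create_model(*, frozen: bool = False) -> Callable[[Type[_T]], Type[_T]]: ...

def create_model(cls: Optional[Type[_T]] = None, *, frozen: bool = False) -> Any:
    ...  # runtime implementation omitted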

If dataclass_transform is applied to a class, dataclass-like semantics will be assumed for any class that directly or indirectly derives from the decorated class or uses the decorated class as a metaclass. Attributes on the decorated class and its base classes are not considered to be fields.

Examples of each approach are shown in the following sections. Each example creates a CustomerModel class with dataclass-like semantics. The implementation of the decorated objects is omitted for brevity, but we assume that they modify classes in the following ways:

- They synthesize an __init__ method using data fields declared within the class and its parent classes.
- They synthesize __eq__ and __ne__ methods.

Type checkers supporting this PEP will recognize that the CustomerModel class can be instantiated using the synthesized __init__ method:

# Using positional arguments
c1 = CustomerModel(327, "John Smith")

# Using keyword arguments
c2 = CustomerModel(id=327, name="John Smith")

# These calls will generate runtime errors and should be flagged as
# errors by a static type checker.
c3 = CustomerModel()
c4 = CustomerModel(327, first_name="John")
c5 = CustomerModel(327, "John Smith", 0)
Decorator function example
import typing
from typing import Type, TypeVar

_T = TypeVar("_T")

# The ``create_model`` decorator is defined by a library.
# This could be in a type stub or inline.
@typing.dataclass_transform()
def create_model(cls: Type[_T]) -> Type[_T]:
    cls.__init__ = ...
    cls.__eq__ = ...
    cls.__ne__ = ...
    return cls

# The ``create_model`` decorator can now be used to create new model
# classes, like this:
@create_model
class CustomerModel:
    id: int
    name: str
Class example
# The ``ModelBase`` class is defined by a library. This could be in
# a type stub or inline.
@typing.dataclass_transform()
class ModelBase: ...

# The ``ModelBase`` class can now be used to create new model
# subclasses, like this:
class CustomerModel(ModelBase):
    id: int
    name: str
Metaclass example
# The ``ModelMeta`` metaclass and ``ModelBase`` class are defined by
# a library. This could be in a type stub or inline.
@typing.dataclass_transform()
class ModelMeta(type): ...

class ModelBase(metaclass=ModelMeta): ...

# The ``ModelBase`` class can now be used to create new model
# subclasses, like this:
class CustomerModel(ModelBase):
    id: int
    name: str
Decorator function and class/metaclass parameters

A decorator function, class, or metaclass that provides dataclass-like functionality may accept parameters that modify certain behaviors. This specification defines the following parameters that static type checkers must honor if they are used by a dataclass transform. Each of these parameters accepts a bool argument, and it must be possible for the bool value (True or False) to be statically evaluated.

- eq, order, frozen, init and unsafe_hash are parameters supported in the stdlib dataclass, with meanings defined in PEP 557.
- kw_only, match_args and slots are parameters supported in the stdlib dataclass, first introduced in Python 3.10.
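As an illustrative sketch (the create_model stub below is hypothetical), only bool values that a type checker can evaluate statically, such as the literal True, are guaranteed to be honored:

import os
import typing
from typing import Callable, Type, TypeVar

_T = TypeVar("_T")

@typing.dataclass_transform()
def create_model(*, frozen: bool = False) -> Callable[[Type[_T]], Type[_T]]:
    def wrap(cls: Type[_T]) -> Type[_T]:
        return cls  # runtime transformation omitted
    return wrap

@create_model(frozen=True)            # OK: True can be statically evaluated
class FrozenCustomer:
    id: int

dynamic_frozen = os.environ.get("FROZEN") == "1"

@create_model(frozen=dynamic_frozen)  # a type checker cannot statically
class DynamicCustomer:                # evaluate dynamic_frozen
    id: int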

dataclass_transform parameters

Parameters to dataclass_transform allow for some basic customization of default behaviors:

_T = TypeVar("_T")

def dataclass_transform(
    *,
    eq_default: bool = True,
    order_default: bool = False,
    kw_only_default: bool = False,
    field_specifiers: tuple[type | Callable[..., Any], ...] = (),
    **kwargs: Any,
) -> Callable[[_T], _T]: ...

In the future, we may add additional parameters to dataclass_transform as needed to support common behaviors in user code. These additions will be made after reaching consensus on typing-sig rather than via additional PEPs.

The following sections provide additional examples showing how these parameters are used.

Decorator function example
# Indicate that the ``create_model`` function assumes keyword-only
# parameters for the synthesized ``__init__`` method unless it is
# invoked with ``kw_only=False``. It always synthesizes order-related
# methods and provides no way to override this behavior.
@typing.dataclass_transform(kw_only_default=True, order_default=True)
def create_model(
    *,
    frozen: bool = False,
    kw_only: bool = True,
) -> Callable[[Type[_T]], Type[_T]]: ...

# Example of how this decorator would be used by code that imports
# from this library:
@create_model(frozen=True, kw_only=False)
class CustomerModel:
    id: int
    name: str
Class example
# Indicate that classes that derive from this class default to
# synthesizing comparison methods.
@typing.dataclass_transform(eq_default=True, order_default=True)
class ModelBase:
    def __init_subclass__(
        cls,
        *,
        init: bool = True,
        frozen: bool = False,
        eq: bool = True,
        order: bool = True,
    ):
        ...

# Example of how this class would be used by code that imports
# from this library:
class CustomerModel(
    ModelBase,
    init=False,
    frozen=True,
    eq=False,
    order=False,
):
    id: int
    name: str
Metaclass example
# Indicate that classes that use this metaclass default to
# synthesizing comparison methods.
@typing.dataclass_transform(eq_default=True, order_default=True)
class ModelMeta(type):
    def __new__(
        cls,
        name,
        bases,
        namespace,
        *,
        init: bool = True,
        frozen: bool = False,
        eq: bool = True,
        order: bool = True,
    ):
        ...

class ModelBase(metaclass=ModelMeta):
    ...

# Example of how this class would be used by code that imports
# from this library:
class CustomerModel(
    ModelBase,
    init=False,
    frozen=True,
    eq=False,
    order=False,
):
    id: int
    name: str
Field specifiers

Most libraries that support dataclass-like semantics provide one or more “field specifier” types that allow a class definition to provide additional metadata about each field in the class. This metadata can describe, for example, default values, or indicate whether the field should be included in the synthesized __init__ method.

Field specifiers can be omitted in cases where additional metadata is not required:

@dataclass
class Employee:
    # Field with no specifier
    name: str

    # Field that uses field specifier class instance
    age: Optional[int] = field(default=None, init=False)

    # Field with type annotation and simple initializer to
    # describe default value
    is_paid_hourly: bool = True

    # Not a field (but rather a class variable) because type
    # annotation is not provided.
    office_number = "unassigned"
Field specifier parameters

Libraries that support dataclass-like semantics and support field specifier classes typically use common parameter names to construct these field specifiers. This specification formalizes the names and meanings of the parameters that must be understood for static type checkers. These standardized parameters must be keyword-only.

These parameters are a superset of those supported by dataclasses.field, excluding those that do not have an impact on type checking, such as compare and hash.

Field specifier classes are allowed to use other parameters in their constructors, and those parameters can be positional and may use other names.

It is an error to specify more than one of default, default_factory and factory.

This example demonstrates the above:

# Library code (within type stub or inline)
# In this library, passing a resolver means that init must be False,
# and the overload with Literal[False] enforces that.
@overload
def model_field(
        *,
        default: Optional[Any] = ...,
        resolver: Callable[[], Any],
        init: Literal[False] = False,
    ) -> Any: ...

@overload
def model_field(
        *,
        default: Optional[Any] = ...,
        resolver: None = None,
        init: bool = True,
    ) -> Any: ...

@typing.dataclass_transform(
    kw_only_default=True,
    field_specifiers=(model_field, ))
def create_model(
    *,
    init: bool = True,
) -> Callable[[Type[_T]], Type[_T]]: ...

# Code that imports this library:
@create_model(init=False)
class CustomerModel:
    id: int = model_field(resolver=lambda: 0)
    name: str
Runtime behavior

At runtime, the dataclass_transform decorator’s only effect is to set an attribute named __dataclass_transform__ on the decorated function or class to support introspection. The value of the attribute should be a dict mapping the names of the dataclass_transform parameters to their values.

For example:

{
  "eq_default": True,
  "order_default": False,
  "kw_only_default": False,
  "field_specifiers": (),
  "kwargs": {}
}
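A minimal sketch of this introspection (the create_model function is hypothetical; on Python 3.11 the attribute matches the dict shown above, and later versions may add keys such as frozen_default):

import typing

@typing.dataclass_transform(kw_only_default=True)
def create_model(cls):
    return cls  # runtime transformation omitted

# The decorator records its arguments on the decorated object.
print(create_model.__dataclass_transform__["kw_only_default"])  # True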
Dataclass semantics

Except where stated otherwise in this PEP, classes impacted by dataclass_transform, either by inheriting from a class that is decorated with dataclass_transform or by being decorated with a function decorated with dataclass_transform, are assumed to behave like stdlib dataclass.

This includes, but is not limited to, the semantics described for the stdlib dataclass in the Python documentation.

Undefined behavior

If multiple dataclass_transform decorators are found, either on a single function (including its overloads), a single class, or within a class hierarchy, the resulting behavior is undefined. Library authors should avoid these scenarios.

Reference Implementation

Pyright contains the reference implementation of type checker support for dataclass_transform. Pyright’s dataClasses.ts source file would be a good starting point for understanding the implementation.

The attrs and pydantic libraries are using dataclass_transform and serve as real-world examples of its usage.

Rejected Ideas

auto_attribs parameter

The attrs library supports an auto_attribs parameter that indicates whether class members declared with PEP 526 variable annotations but with no assignment should be treated as data fields.

We considered supporting auto_attribs and a corresponding auto_attribs_default parameter, but decided against this because it is specific to attrs.

Django does not support declaring fields using type annotations only, so Django users who leverage dataclass_transform should always supply assigned values.

cmp parameter

The attrs library supports a bool parameter cmp that is equivalent to setting both eq and order to True. We chose not to support a cmp parameter, since it only applies to attrs. Users can emulate the cmp behavior by using the eq and order parameter names instead.

Automatic field name aliasing

The attrs library performs automatic aliasing of field names that start with a single underscore, stripping the underscore from the name of the corresponding __init__ parameter.

This proposal omits that behavior since it is specific to attrs. Users can manually alias these fields using the alias parameter.
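For example, a sketch with a hypothetical model_field specifier that accepts the standardized alias parameter:

import typing
from typing import Any, Optional, Type, TypeVar

_T = TypeVar("_T")

def model_field(*, default: Any = ..., alias: Optional[str] = None) -> Any:
    ...  # hypothetical field specifier; runtime behavior omitted

@typing.dataclass_transform(field_specifiers=(model_field,))
def create_model(cls: Type[_T]) -> Type[_T]:
    return cls  # runtime synthesis omitted

@create_model
class CustomerModel:
    _id: int = model_field(alias="id")  # the __init__ parameter is named "id"

# A type checker will accept: CustomerModel(id=327)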

Alternate field ordering algorithms

The attrs library currently supports two approaches to ordering the fields within a class:

- Dataclass order: the same ordering used by dataclasses. This is the default behavior of the older APIs (e.g. attr.s).
- Method Resolution Order (MRO): this is the default behavior of the newer APIs (e.g. define, mutable, frozen).

The resulting field orderings can differ in certain diamond-shaped multiple inheritance scenarios.

For simplicity, this proposal does not support any field ordering other than that used by dataclasses.

Fields redeclared in subclasses

The attrs library differs from stdlib dataclasses in how it handles inherited fields that are redeclared in subclasses. The dataclass specification preserves the original order, but attrs defines a new order based on subclasses.

For simplicity, we chose to only support the dataclass behavior. Users of attrs who rely on the attrs-specific ordering will not see the expected order of parameters in the synthesized __init__ method.
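For reference, a small runnable sketch of the stdlib dataclass ordering that this PEP adopts:

from dataclasses import dataclass, fields

@dataclass
class Base:
    x: int
    y: int

@dataclass
class Sub(Base):
    x: float  # redeclared; keeps its original (first) position

# Fields, and therefore __init__ parameters, stay in the order first declared.
print([f.name for f in fields(Sub)])  # ['x', 'y']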

Django primary and foreign keys

Django applies additional logic for primary and foreign keys. For example, it automatically adds an id field (and __init__ parameter) if there is no field designated as a primary key.

As this is not broadly applicable to dataclass libraries, this additional logic is not accommodated by this proposal, so users of Django would need to explicitly declare the id field.

Class-wide default values

SQLAlchemy requested that we expose a way to specify that the default value of all fields in the transformed class is None. It is typical that all SQLAlchemy fields are optional, and None indicates that the field is not set.

We chose not to support this feature, since it is specific to SQLAlchemy. Users can manually set default=None on these fields instead.

Descriptor-typed field support

We considered adding a boolean parameter on dataclass_transform to enable better support for fields with descriptor types, which is common in SQLAlchemy. When enabled, the type of each parameter on the synthesized __init__ method corresponding to a descriptor-typed field would be the type of the value parameter to the descriptor’s __set__ method rather than the descriptor type itself. Similarly, when setting the field, the __set__ value type would be expected. And when getting the value of the field, its type would be expected to match the return type of __get__.

This idea was based on the belief that dataclass did not properly support descriptor-typed fields. In fact it does, but type checkers (at least mypy and pyright) did not reflect the runtime behavior, which led to our misunderstanding. For more details, see the Pyright bug.
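For context, a minimal sketch of a descriptor-typed field in a stdlib dataclass, loosely following the example in the dataclasses documentation; the __init__ argument flows through __set__ (which accepts a float here), while attribute access goes through __get__ (which returns an int):

from dataclasses import dataclass

class IntConversionDescriptor:
    def __set_name__(self, owner, name):
        self._name = "_" + name

    def __get__(self, obj, objtype=None) -> int:
        if obj is None:
            return 0  # default reported for the field
        return getattr(obj, self._name, 0)

    def __set__(self, obj, value: float) -> None:
        setattr(obj, self._name, int(value))

@dataclass
class InventoryItem:
    quantity_on_hand: IntConversionDescriptor = IntConversionDescriptor()

item = InventoryItem(quantity_on_hand=2.5)  # accepted; routed through __set__
print(item.quantity_on_hand)                # 2, via __get__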

converter field specifier parameter

The attrs library supports a converter field specifier parameter, which is a Callable that is called by the generated __init__ method to convert the supplied value to some other desired value. This is tricky to support since the parameter type in the synthesized __init__ method needs to accept unconverted values, but the resulting field is typed according to the output of the converter.

Some aspects of this issue are detailed in a Pyright discussion.

There may be no good way to support this because there’s not enough information to derive the type of the input parameter. One possible solution would be to add support for a converter field specifier parameter but then use the Any type for the corresponding parameter in the __init__ method.
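For context, a small runnable attrs sketch of the converter feature in question; statically, the __init__ parameter for quantity would need to accept the converter’s input type, which dataclass_transform has no way to express:

import attr

@attr.s
class Order:
    # The synthesized __init__ passes its argument through int() before
    # assigning it, so the field ends up as an int at runtime.
    quantity = attr.ib(converter=int)

o = Order("7")
print(o.quantity)  # 7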

Copyright

This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.

