Showing content from https://github.com/python/typing/issues/258 below:

syntax for variable and attribute annotations (PEP 526) · Issue #258 · python/typing · GitHub

Introduction

This issue is reserved for substantive work on PEP 526, "Syntax for Variable and Attribute Annotations". For textual nits please comment directly on the latest PR for this PEP in the peps repo.

I sent a strawman proposal to python-ideas. The feedback was mixed but useful -- people tried to poke holes in it from many angles.

In this issue I want to arrive at a more solid specification. I'm out of time right now, but here are some notes:

Class variables vs. instance variables
Specify instance variables in class body vs. in __init__ or __new__
Thinking with your runtime hat on vs. your type checking hat
Importance of a: <type> vs. how it strikes people the wrong way
Tuple unpacking is a mess, let's avoid it entirely
Collecting the types in something similar to __annotations__
Cost of doing that for locals
Cost of introducing a new keyword

Work in progress here!

I'm updating the issue description to avoid spamming subscribers to this tracker. I'll keep doing this until we have a reasonable discussion.

Basic proposal

My basic observation is that introducing a new keyword has two downsides: (a) choice of a good keyword is hard (e.g. it can't be 'var' because that is way too common a variable name, and it can't be 'local' if we want to use it for class variables or globals), and (b) no matter what we choose, we'll still need a __future__ import.

So I'm proposing something keyword-free:

a: List[int] = []
b: Optional[int] = None

The idea is that this is pretty easy to explain to someone who's already familiar with function annotations.

Multiple types/variables

An obvious question is whether to allow combining type declarations with tuple unpacking (e.g. a, b, c = x). This leads to (real or perceived) ambiguity, and I propose not to support this. If there's a type annotation there can only be one variable to its left, and one value to its right. This still allows tuple packing (just put the tuple in parentheses) but it disallows tuple unpacking. (It's been proposed to allow multiple parenthesized variable names, or types inside parentheses, but none of these look attractive to me.)

There's a similar question about what to do about the type of a = b = c = x. My answer to this is the same: Let's not go there; if you want to add a type you have to split it up.
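
A minimal sketch of these two restrictions, assuming an interpreter that accepts the proposed syntax (CPython 3.6 and later do). The helper `is_valid` is a hypothetical name introduced only for this illustration; it asks the compiler whether a snippet parses.

```python
from typing import Tuple


def is_valid(source: str) -> bool:
    """Return True if the compiler accepts the snippet (hypothetical helper)."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return False
    return True


# Tuple packing with a single annotated target is fine.
point: Tuple[int, int] = (1, 2)

packing_ok = is_valid("t: tuple = (1, 2)")
unpacking_ok = is_valid("a, b: int = 1, 2")   # several targets on the left
chained_ok = is_valid("a: int = b = 0")       # chained assignment
print(packing_ok, unpacking_ok, chained_ok)   # True False False
```

To annotate unpacked or chained names, each name gets its own annotated line.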

Omitting the initial value

My next step is to observe that sometimes it's convenient to decouple the type declaration from the initialization. One example is a variable that is initialized in each branch of a big sequence of if/elif/etc. blocks, where you want to declare its type before entering the first if, and there's no convenient initial value (e.g. None is not valid because the type is not Optional[...]). So I propose to allow leaving out the assignment:

log: Logger
if develop_mode():
    log = heavy_logger()
elif production_mode():
    log = fatal_only_logger()
else:
    log = default_logger()
log.info("Server starting up...")

The line log: Logger looks a little odd at first but I believe you can get used to it easily. Also, it is again similar to what you can do in function annotations. (However, don't hyper-generalize. A line containing just log by itself means something different -- it's probably a NameError.)
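
A runnable variant of the example above, as a sketch only. The functions `develop_mode`, `production_mode` and `start_server`, and the logger names, are hypothetical stand-ins; the standard `logging` module supplies the `Logger` type. It assumes an interpreter that accepts a value-less annotation.

```python
import logging
from logging import Logger


# Hypothetical stand-ins for the mode checks named in the example.
def develop_mode() -> bool:
    return False


def production_mode() -> bool:
    return True


def start_server() -> Logger:
    log: Logger  # declares the type only; nothing is bound yet
    if develop_mode():
        log = logging.getLogger("heavy")
    elif production_mode():
        log = logging.getLogger("fatal_only")
    else:
        log = logging.getLogger("default")
    log.info("Server starting up...")
    return log


chosen = start_server()
print(chosen.name)
```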

Note that this is something that you currently can't do with # type comments -- you currently have to put the type on the (lexically) first assignment, like this:

if develop_mode():
    log = heavy_logger()  # type: Logger
elif production_mode():
    log = fatal_only_logger()  # (No type declaration here!)
# etc.

(In this particular example, a type declaration may be needed because heavy_logger() returns a subclass of Logger, while other branches produce different subclasses; in general the type checker shouldn't just compute the common superclass, because then a genuine type error would go unreported and the checker would simply infer the type object.)

What about runtime

Suppose we have a: int -- what should this do at runtime? Is it ignored, or does it initialize a to None, or should we perhaps introduce something new like JavaScript's undefined? I feel quite strongly that it should leave a uninitialized, just as if the line was not there at all.
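
A small check of the "leave it uninitialized" option, assuming an interpreter that implements it that way (CPython 3.6 and later do). The snippets run in a scratch dictionary so the result does not depend on the surrounding module.

```python
namespace = {}

# A bare annotation creates no binding for the name.
exec("a: int", namespace)
declared_only = "a" in namespace

# An annotation with a value binds the name as usual.
exec("b: int = 3", namespace)
declared_and_assigned = "b" in namespace

print(declared_only, declared_and_assigned)  # False True
```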

Instance variables and class variables

Based on working with mypy since last December I feel strongly that it's very useful to be able to declare the types of instance variables in class bodies. In fact this is one place where I find the value-less notation (a: int) particularly useful, to declare instance variables that should always be initialized by __init__ (or __new__), e.g. variables whose type is mutable or cannot be None.

We still need a way to declare class variables, and here I propose some new syntax, prefixing the type with a class keyword:

class Starship:
    captain: str                      # instance variable without default
    damage: int = 0                   # instance variable with default (stored in class)
    stats: class Dict[str, int] = {}  # class variable with initialization

I do have to admit that this is entirely unproven. PEP 484 and mypy currently don't have a way to distinguish between instance and class variables, and it hasn't been a big problem (though I think I've seen a few mypy bug reports related to mypy's inability to tell the difference).
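
The two instance-variable forms can be tried out as follows. This is a sketch that leaves out the proposed class prefix and assumes an interpreter that accepts the annotation syntax: a declaration without a value creates nothing on the class, while a default is stored on the class and shadowed by the first assignment through an instance.

```python
class Starship:
    captain: str      # declared only; no class attribute is created
    damage: int = 0   # default value lives on the class


ship = Starship()
ship.damage = 5       # binds an instance attribute that shadows the class default

print("captain" in vars(Starship))   # False
print(Starship.damage, ship.damage)  # 0 5
```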

Capturing the declared types at runtime

For function annotations, the types are captured in the function's __annotations__ object. It would be an obvious extension of this idea to do the same thing for variable declarations. But where exactly would we store this info? A strawman proposal is to introduce __annotations__ dictionaries at various levels. At each level, the types would go into the __annotations__ dict at that same level. Examples:

Global variables:

players: Dict[str, Player]
print(__annotations__)

This would print {'players': Dict[str, Player]} (where the value is the runtime representation of the type Dict[str, Player]).

Class and instance variables:

class Starship:
    # Class variables
    hitpoints: class int = 50
    stats: class Dict[str, int] = {}
    # Instance variables
    damage: int = 0
    shield: int = 100
    captain: str  # no initial value
print(Starship.__annotations__)

This would print a dict with five keys, and corresponding values:

{'hitpoints': ClassVar[int],  # I'm making this up as a runtime representation of "class int"
 'stats': ClassVar[Dict[str, int]],
 'damage': int,
 'shield': int,
 'captain': str
}
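
The `class` prefix is not valid syntax in any released interpreter, so the following sketch substitutes `typing.ClassVar`, which does exist in the `typing` module, to approximate the dictionary described above. The variable name `declared` is an assumption made for the illustration.

```python
from typing import ClassVar, Dict


class Starship:
    # Class variables, spelled with ClassVar instead of the proposed prefix
    hitpoints: ClassVar[int] = 50
    stats: ClassVar[Dict[str, int]] = {}
    # Instance variables
    damage: int = 0
    shield: int = 100
    captain: str  # no initial value


declared = Starship.__annotations__
print(list(declared))
```

All five names appear in the dictionary, including `captain`, even though no `captain` attribute exists on the class.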

Finally, locals. Here I think we should not store the types -- the value of having the annotations available locally is just not enough to offset the cost of creating and populating the dictionary on each function call.

In fact, I don't even think that the type expression should be evaluated during the function execution. So for example:

def side_effect():
    print("Hello world")
def foo():
    a: side_effect()
    a = 12
    return a
foo()

should not print anything. (A type checker would also complain that side_effect() is not a valid type.)
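
The same behaviour can be checked without relying on printed output. This sketch assumes an interpreter that skips local annotations as proposed (CPython 3.6 and later do); the list `calls` is a hypothetical recorder added for the check.

```python
calls = []


def side_effect():
    calls.append("evaluated")
    return int


def foo():
    a: side_effect()   # local annotation: neither evaluated nor stored
    a = 12
    return a


result = foo()
print(result, calls, foo.__annotations__)  # 12 [] {}
```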

This is inconsistent with the behavior of

def foo(a: side_effect()):
    a = 12
    return a

which does print something (at function definition time). But there's a limit to how much consistency I am prepared to propose. (OTOH for globals and class/instance variables I think that there would be some cool use cases for having the information available.)

Effect of presence of a: <type>

The presence of a local variable declaration without initialization still has an effect: it ensures that the variable is considered to be a local variable, and it is given a "slot" as if it was assigned to. So, for example:

def foo():
    a: int
    print(a)
a = 42
foo()

will raise UnboundLocalError, not NameError. It's the same as if the code had read

def foo():
    if False: a = 0
    print(a)
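
A runnable form of the first snippet, as a sketch assuming an interpreter that treats the bare annotation as a local declaration (CPython 3.6 and later do). `UnboundLocalError` is caught by name because it is a subclass of `NameError`.

```python
a = 42  # module-level binding that foo() does not see


def foo():
    a: int       # marks `a` as local without binding it
    return a


try:
    foo()
except UnboundLocalError:
    outcome = "UnboundLocalError"
else:
    outcome = "no error"

print(outcome)  # UnboundLocalError
```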

Instance variables inside methods

Mypy currently supports # type comments on assignments to instance variables (and other things). At least for __init__ (and __new__, and functions called from either) this seems useful, in case you prefer a style where instance variables are declared in __init__ (etc.) rather than in the class body.

I'd like to support this, at least for cases that obviously refer to instance variables of self. In this case we should probably not update __annotations__.
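
A sketch of that style, assuming an interpreter that accepts annotations on attribute targets (CPython 3.6 and later do). The lookup uses `getattr` with a default because a class without class-body annotations may have no `__annotations__` attribute on some versions.

```python
class Starship:
    def __init__(self, captain: str) -> None:
        self.captain: str = captain   # annotated assignment to an attribute of self
        self.damage: int = 0


ship = Starship("Picard")
class_level = getattr(Starship, "__annotations__", {})
print(ship.captain, ship.damage, class_level)
```

The annotations are visible to a type checker reading the source, while the class-level dictionary stays free of `captain` and `damage`.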

What about global or nonlocal?

We should not change global and nonlocal. The reason is that those don't declare new variables, they declare that an existing variable is write-accessible in the current scope. Their type belongs in the scope where they are defined.
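
One way to picture this, sketched against how CPython 3.6 and later behave (an assumption relative to this proposal): the annotation sits on the definition at module level, and annotating a name that the same function declares global is rejected at compile time. The helper `compiles` and both source strings are hypothetical.

```python
accepted = """
x: int = 0

def bump():
    global x
    x += 1
"""

rejected = """
def bump():
    global x
    x: int = 1
"""


def compiles(source: str) -> bool:
    """Return True if the compiler accepts the source (hypothetical helper)."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return False
    return True


print(compiles(accepted), compiles(rejected))  # True False
```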

Redundant declarations

I propose that the Python compiler should ignore duplicate declarations of the same variable in the same scope. It should also not bother to validate the type expression (other than evaluating it when not in a local scope). It's up to the type checker to complain about this. The following nonsensical snippet should be allowed at runtime:

a: 2+2
b: int = 'hello'
if b:
    b: str
    a: str
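
The snippet can be run in a scratch namespace to confirm that nothing complains at runtime. This is a sketch assuming an interpreter that implements the rule as proposed (CPython 3.6 and later do).

```python
snippet = """
a: 2+2
b: int = 'hello'
if b:
    b: str
    a: str
"""

namespace = {}
exec(snippet, namespace)   # runs without raising

# `b` keeps its (mistyped) value; `a` was only ever declared, never bound.
print(namespace["b"], "a" in namespace)  # hello False
```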
