On 09/11/18 15:23, Victor Stinner wrote:
> Hi,
>
> Last week, I opened an issue to propose adding a new %T formatter to
> PyUnicode_FromFormatV() and so indirectly to PyUnicode_FromFormat()
> and PyErr_Format():
>
>    https://bugs.python.org/issue34595
>
> I merged my change, but then Serhiy Storchaka asked if we can add
> something to get the "fully qualified name" (FQN) of a type, e.g.
> "datetime.timedelta" (FQN) vs "timedelta" (what I call "short" name).
> I proposed a second pull request to add %t (short) in addition to %T
> (FQN).
>
> But then Petr Viktorin asked me to open a thread on python-dev to get
> a wider discussion. So here I am.
>
>
> The rationale for this change is to fix multiple issues:
>
> * C extensions use Py_TYPE(obj)->tp_name which returns a fully
> qualified name for C types, but the name (without the module) for
> Python types. Python modules use type(obj).__name__ which always
> returns the short name.

That might be a genuine problem, but I wonder if "%T" is fixing the
symptom rather than the cause here.
Or is this only an issue for PyUnicode_FromFormat()?

> * currently, many C extensions truncate the type name: use "%.80s"
> instead of "%s" to format a type name

That's an orthogonal issue -- you can change "%.80s" to "%s", and
presumably you could use "%.80t" as well.

> * "%s" with Py_TYPE(obj)->tp_name is used more than 200 times in the C
> code, and I dislike this complex pattern. IMHO "%t" with obj would be
> simpler to read, write and maintain.

I consider `Py_TYPE(obj)->tp_name` much more understandable than "%t".
It's longer to spell out, but it's quite self-documenting.

> * I want C extensions and Python modules to have the same behavior:
> respect the PEP 399. Petr considers that error messages are not part
> of the PEP 399, but the issue is wider than only error messages.

The other major use is for __repr__, which AFAIK we also don't guarantee
to be stable, so I don't think PEP 399 applies to it.
Having the same behavior between C and Python versions of a module is
nice, but PEP 399 doesn't prescribe it. There are other differences as
well -- for example, `_datetime.datetime` is immutable, and that's OK.
If error messages and __repr__s should be consistent between Python and
the C accelerator, are you planning to write tests for all the affected
modules when switching them to %T/%t?

> The main issue is that at the C level, Py_TYPE(obj)->tp_name is
> "usually" the fully qualified name for types defined in C, but it's
> only the "short" name for types defined in Python.
>
> For example, if you get the C accelerator "_datetime",
> Py_TYPE(obj)->tp_name of a datetime.timedelta object gives you
> "datetime.timedelta", but if you don't have the accelerator, tp_name
> is just "timedelta".
>
> Another example, this script displays "mytimedelta(0)" if you have the
> C accelerator, but "__main__.mytimedelta(0)" if you use the Python
> implementation:
> ---
> import sys
> #sys.modules['_datetime'] = None
> import datetime
>
> class mytimedelta(datetime.timedelta):
>     pass
>
> print(repr(mytimedelta()))
> ---
>
> So I would like to fix this kind of issue.
>
>
> Type names are mainly used for two purposes:
>
> * format an error message
> * obj.__repr__()
>
> It's unclear to me if we should use the "short" or the "fully
> qualified" name. It should maybe be decided on a case by case basis.
>
> There is also a 3rd usage: to implement __reduce__, here backward
> compatibility matters.
>
>
> Note: The discussion evolved since my first implementation of %T which
> just used the not well defined Py_TYPE(obj)->tp_name.
>
> --
>
> Petr asked me why not expose functions to get these names. For
> example, with my second PR (not merged), there are 3 (private)
> functions:
>
>     /* type.__name__ */
>     const char* _PyType_Name(PyTypeObject *type);
>     /* type.__qualname__ */
>     PyObject* _PyType_QualName(PyTypeObject *type);
>     /* type.__module__ "." type.__qualname__ (but type.__qualname__ for
>        builtin types) */
>     PyObject * _PyType_FullName(PyTypeObject *type);
>
> My concern here is that each caller has to handle errors:
>
>     PyErr_Format(PyExc_TypeError, "must be str, not %.100s",
>                  Py_TYPE(obj)->tp_name);
>
> would become:
>
>     PyObject *type_name = _PyType_FullName(Py_TYPE(obj));
>     if (type_name == NULL) { /* do something with this error ... */ }
>     PyErr_Format(PyExc_TypeError, "must be str, not %U", type_name);
>     Py_DECREF(type_name);
>
> When I report an error, I dislike having to handle *new* errors... I
> prefer that the error handling is done inside PyErr_Format() for me,
> to reduce the risk of additional bugs.
>
> --
>
> Serhiy also asked if we could expose the same feature at the *Python*
> level: provide something to get the fully qualified name of a type.
> It's not just f"{type(obj).__module__}.{type(obj).__name__}", but you
> have to skip the module for builtin types like "str" (not return
> "builtins.str").
>
> Maybe we can have "name: {0:t}, FQN: {0:T}".format(type(obj)). "t" for
> name and "T" for fully qualified name. We would only have to modify
> type.__format__().
>
> I'm not sure if we need to add new formatters to str % args.
>
> Example of Python code:
>
>     raise TypeError("must be str, not %s" % type(fmt).__name__)
>
> I'm not sure about Python changes. My first concern was just to avoid
> Py_TYPE(obj)->tp_name at the C level. But again, we should keep C and
> Python consistent. If the behavior of C extensions changes, Python
> modules should be adapted as well, to get the same behavior.
>
>
> Note: I reverted my change which added the %T formatter from
> PyUnicode_FromFormatV() to clarify the status of this issue.
>
> Victor
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/encukou%40gmail.com
>
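
For reference, the Python-level rule quoted above (module plus qualified name, but no module prefix for builtin types) could be sketched roughly like this. The name `fully_qualified_name` is made up for illustration and is not an existing or proposed API; the sketch only assumes the standard `__module__` and `__qualname__` attributes of type objects.

```python
import datetime


def fully_qualified_name(tp):
    """Hypothetical helper: "module.qualname", or just "qualname" for builtins."""
    module = tp.__module__
    qualname = tp.__qualname__
    # Skip the module for builtin types, so that str gives "str",
    # not "builtins.str".
    if module == "builtins":
        return qualname
    return f"{module}.{qualname}"


print(fully_qualified_name(str))                 # str
print(fully_qualified_name(datetime.timedelta))  # datetime.timedelta
```

Because this goes through `__module__` and `__qualname__` rather than `tp_name`, it should give the same answer whether or not the `_datetime` accelerator is in use.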