Stefan Behnel, 25.08.2011 23:30:
> Stefan Behnel, 25.08.2011 20:47:
>> "Martin v. Löwis", 24.08.2011 20:15:
>>> - issues to be considered (unclarities, bugs, limitations, ...)
>>
>> A problem of the current implementation is the need for calling
>> PyUnicode_(FAST_)READY(), and the fact that it can fail (e.g. due to
>> insufficient memory). Basically, this means that even something as trivial
>> as trying to get the length of a Unicode string can now result in an error.
>
> Oh, and the same applies to PyUnicode_AS_UNICODE() now. I doubt that there
> is *any* code out there that expects this macro to ever return NULL. This
> means that the current implementation has actually broken the old API. Just
> allocate an "80% of your memory" long string using the new API and then
> call PyUnicode_AS_UNICODE() on it to see what I mean.
>
> Sadly, a quick look at a couple of recent commits in the pep-393 branch
> suggested that it is not even always obvious to you as the authors which
> macros can be called safely and which cannot. I immediately spotted a bug
> in one of the updated core functions (unicode_repr, IIRC) where
> PyUnicode_GET_LENGTH() is called without a previous call to
> PyUnicode_FAST_READY().
>
> I find it everything but obvious that calling PyUnicode_DATA() and
> PyUnicode_KIND() is safe as long as the return value is being checked for
> errors, but calling PyUnicode_GET_LENGTH() is not safe unless there was a
> previous call to PyUnicode_Ready().

And, adding to my own mail yet another time, the current header file states
this:

"""
/* String contains only wstr byte characters.  This is only possible
   when the string was created with a legacy API and PyUnicode_Ready()
   has not been called yet.  Note that PyUnicode_KIND() calls
   PyUnicode_FAST_READY() so PyUnicode_WCHAR_KIND is only possible as a
   intialized value not as a result of PyUnicode_KIND(). */
#define PyUnicode_WCHAR_KIND 0
"""

From my understanding, this is incorrect. When I call PyUnicode_KIND() on an
old style object and it fails to allocate the string buffer, I would expect
that I actually get PyUnicode_WCHAR_KIND back as a result, as the
SSTATE_KIND_* value in the "state" field has not been initialised yet at
that point.

Stefan
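
To make the failure mode concrete, here is a rough toy model in plain C. It is
not the code from the pep-393 branch: every name in it (toy_str, toy_ready,
toy_kind_of, toy_length_checked, toy_fail_alloc) is hypothetical, and it
assumes that a failed conversion leaves the object state untouched, which is
my reading of the situation rather than something verified against the branch.
It only shows why a kind accessor that converts lazily could still report the
"wchar" kind, and why a length accessor would need an error path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Toy model only: every name below is hypothetical, none is CPython API. */
enum toy_kind { TOY_WCHAR_KIND = 0, TOY_1BYTE_KIND = 1 };

typedef struct {
    enum toy_kind kind;   /* stays TOY_WCHAR_KIND until toy_ready() succeeds */
    const wchar_t *wstr;  /* legacy buffer, borrowed from the caller */
    size_t wstr_length;
    unsigned char *data;  /* compact buffer, NULL until ready */
    size_t length;        /* only meaningful once ready */
} toy_str;

/* Switch used to simulate running out of memory. */
int toy_fail_alloc = 0;

void *toy_alloc(size_t n)
{
    return toy_fail_alloc ? NULL : malloc(n);
}

/* Create a "legacy" string: only the wchar_t buffer is set. */
void toy_init_legacy(toy_str *s, const wchar_t *w)
{
    s->kind = TOY_WCHAR_KIND;
    s->wstr = w;
    s->wstr_length = wcslen(w);
    s->data = NULL;
    s->length = 0;
}

/* Convert to the compact form. Returns 0 on success, -1 on failure.
   Assumes all characters fit in one byte, to keep the sketch short. */
int toy_ready(toy_str *s)
{
    size_t i;
    unsigned char *buf;

    if (s->kind != TOY_WCHAR_KIND)
        return 0;                       /* already converted */
    buf = toy_alloc(s->wstr_length + 1);
    if (buf == NULL)
        return -1;                      /* state is left untouched */
    for (i = 0; i < s->wstr_length; i++)
        buf[i] = (unsigned char)s->wstr[i];
    buf[s->wstr_length] = 0;
    s->data = buf;
    s->length = s->wstr_length;
    s->kind = TOY_1BYTE_KIND;
    return 0;
}

/* Kind accessor that tries to convert first and ignores the result. */
enum toy_kind toy_kind_of(toy_str *s)
{
    (void)toy_ready(s);
    return s->kind;
}

/* Length accessor that reports a failed conversion to the caller. */
int toy_length_checked(toy_str *s, size_t *out)
{
    if (toy_ready(s) < 0)
        return -1;
    *out = s->length;
    return 0;
}

/* Walk through the scenario; returns 0 if every assertion holds. */
int toy_demo(void)
{
    toy_str s;
    size_t n = 0;

    toy_init_legacy(&s, L"abc");
    toy_fail_alloc = 1;
    assert(toy_kind_of(&s) == TOY_WCHAR_KIND);   /* conversion failed */
    assert(toy_length_checked(&s, &n) == -1);    /* even "length" can fail */
    toy_fail_alloc = 0;
    assert(toy_length_checked(&s, &n) == 0 && n == 3);
    assert(toy_kind_of(&s) == TOY_1BYTE_KIND);
    free(s.data);
    return 0;
}
```

Calling toy_demo() from a main() function runs the scenario. The point of the
checked accessor is that the caller sees the failure, whereas a macro that
returns a bare length or pointer has no way to report it.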