On 22Oct2018 0928, Victor Stinner wrote:
>> Also, I'm
>> proposing keeping the 'kind' as UCS-2 when the string is created from
>> UCS-2 data that is likely to be used as UCS-2.
>
> Oh. That's a major change in the PEP 393 design. You would have to
> modify many functions in CPython. Currently, the PEP 393 requires that
> a string always use the most efficient storage, and many optimizations
> and code paths rely on that assumptions.

I don't know that it requires that many modifications - those functions
already have to handle UCS-2 content anyway (e.g. if I get a path from
scandir() that includes a non-ASCII character), and they're only using
the assumption of most efficient storage to determine the resulting
storage size of a string operation (which I'm proposing should also be
UCS-2 when the source strings are UCS-2, since that's the best indicator
we have that it'll be used as UCS-2 later, as well as being the current
implementation :) ).

> I'm against this change.
>
> Moreover, it's hard to guess how a string will be used later...

Agreed. There are some heuristics we can use, but it's definitely only a
guess. That's the nature of this problem - guessing that it *won't* be
used as UCS-2 later on is also a guess.

Cheers,
Steve
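
For readers who want to see the "most efficient storage" rule from the
quoted text in action, here is a minimal sketch, assuming CPython 3.3 or
later. The helper bytes_per_char is hypothetical (it is not part of
CPython), and it infers the storage width indirectly from
sys.getsizeof, which is an implementation detail rather than a
documented guarantee.

```python
import sys


def bytes_per_char(ch: str, n: int = 1000) -> int:
    """Estimate how many bytes CPython stores per character of `ch`.

    Hypothetical helper: compares the sizes of two strings of different
    lengths so that the fixed object header cancels out.
    """
    short = ch * n
    long_ = ch * (2 * n)
    return (sys.getsizeof(long_) - sys.getsizeof(short)) // n


if __name__ == "__main__":
    # One sample character from each PEP 393 storage width.
    for label, ch in [("ASCII", "a"), ("BMP", "\u0100"), ("astral", "\U00010000")]:
        print(label, bytes_per_char(ch))

    # A string containing one non-Latin-1 character is stored wide...
    n = 1000
    wide = "a" * n + "\u0100"
    # ...but slicing off that character yields a new string that is
    # narrowed again, so it is the same size as a plain ASCII string.
    narrowed = wide[:n]
    print(sys.getsizeof(narrowed) == sys.getsizeof("a" * n))
```

On a typical CPython build this should report widths of 1, 2 and 4
bytes, and the sliced string should be no larger than an ASCII string of
the same length. That narrowing step is the behaviour the proposal above
would change for strings created from UCS-2 data.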