Martin v. Löwis wrote:
> Shane Hathaway wrote:
>
>> I agree that UCS4 is needed. There is a balancing act here; UTF-16 is
>> widely used and takes less space, while UCS4 is easier to treat as an
>> array of characters. Maybe we can have both: unicode objects start with
>> an internal representation in UTF-16, but get promoted automatically to
>> UCS4 when you index or slice them. The difference will not be visible
>> to Python code. A compile-time switch will not be necessary. What do
>> you think?
>
> This breaks backwards compatibility with existing extension modules.
> Applications that do PyUnicode_AsUnicode get a Py_UNICODE*, and
> can use that to directly access the characters.

Py_UNICODE would always be 32 bits wide. PyUnicode_AsUnicode would cause
the unicode object to be promoted automatically. Extensions that break
as a result are technically broken already, aren't they? They're not
supposed to depend on the size of Py_UNICODE.

Shane
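
To make the promotion idea concrete, here is a rough Python sketch of it, not anything that exists in CPython. The class name LazyUnicode, its attributes, and the choice to promote on len() as well are all assumptions made for illustration. The object starts out holding UTF-16 code units and converts itself to a UCS4 array the first time it is indexed or sliced, which is roughly what PyUnicode_AsUnicode would trigger under the proposal.

```python
class LazyUnicode:
    """Hypothetical toy model: UTF-16 storage, promoted to UCS4 on demand."""

    def __init__(self, text):
        data = text.encode("utf-16-le")
        # Compact form: one list entry per 16-bit code unit.
        self._utf16 = [int.from_bytes(data[i:i + 2], "little")
                       for i in range(0, len(data), 2)]
        self._ucs4 = None  # filled in by _promote()

    @property
    def promoted(self):
        return self._ucs4 is not None

    def _promote(self):
        # One-time conversion: combine surrogate pairs into code points.
        if self._ucs4 is None:
            units, out, i = self._utf16, [], 0
            while i < len(units):
                u = units[i]
                if (0xD800 <= u <= 0xDBFF and i + 1 < len(units)
                        and 0xDC00 <= units[i + 1] <= 0xDFFF):
                    low = units[i + 1]
                    out.append(0x10000 + ((u - 0xD800) << 10) + (low - 0xDC00))
                    i += 2
                else:
                    out.append(u)
                    i += 1
            self._ucs4 = out
            self._utf16 = None  # the compact form is dropped after promotion
        return self._ucs4

    def __len__(self):
        # Assumption: length is counted in code points, so it promotes too.
        return len(self._promote())

    def __getitem__(self, key):
        cps = self._promote()
        if isinstance(key, slice):
            return "".join(chr(c) for c in cps[key])
        return chr(cps[key])
```

A string containing a character outside the BMP takes two code units before promotion and one array slot afterwards, so indexing sees whole characters either way. The cost is that the first index or slice is O(n) and the promoted object uses more memory from then on.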