Hi,

PEP 393 [1] deprecates some Unicode APIs relating to Py_UNICODE.
The PEP doesn't provide a schedule for removing them, but the APIs are
marked "will be removed in 4.0" in the documentation.

Once they are removed, we can drop the `wchar_t *` member of the unicode
object. It takes 8 bytes on 64-bit platforms.

[1]: "Flexible String Representation"
     https://www.python.org/dev/peps/pep-0393/

I thought Python 4.0 would be the version after 3.9, but Guido has a
different idea. He said the following in the Zulip chat (we're trying it
for now):

> No, 4.0 is not just what comes after 3.9 -- the major number change
> would indicate some kind of major change somewhere (like possibly the
> Gilectomy, which changes a lot of the C APIs). If we have more than 10
> 3.x versions, we'll just live with 3.10, 3.11 etc.

And he said about these APIs:

>> Unicode objects has some "Deprecated since version 3.3, will be
>> removed in version 4.0" APIs (pep-393).
>> When removing them, we can reduce PyUnicode size about 8~12byte.
>
> We should be able to deprecate these sooner by updating the docs.

So I want to reschedule the removal of these APIs.
Can we remove them in 3.8? 3.9? Or 3.10?
I prefer as soon as possible.

---

Slightly off topic: there is a 4-byte alignment gap in the unicode
object on 64-bit platforms.

    typedef struct {
        ....
        struct {
            unsigned int interned:2;
            unsigned int kind:3;
            unsigned int compact:1;
            unsigned int ascii:1;
            unsigned int ready:1;
            unsigned int :24;
        } state;        // 4 bytes
                        // implicit 4 bytes gap here.
        wchar_t *wstr;  // 8 bytes
    } PyASCIIObject;

So I think we can save 12 bytes instead of 8 bytes when removing wstr.
Or we can save 4 bytes sooner by moving `wstr` before `state`.

Of course, this requires siphash to support 4-byte aligned data instead
of 8-byte aligned data.

Regards,
-- 
INADA Naoki <songofacandy at gmail.com>
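
The alignment gap described above can be checked with a minimal sketch like the one below. It is not the real CPython header: the struct names (`tail_current`, `tail_reordered`, `tail_without_wstr`) and the helper functions are hypothetical, `intptr_t` is only a stand-in for the pointer-sized hash field, and the fields before the hash are left out on the assumption that they are pointer-sized and do not change the relative offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Same bit-field layout as the `state` member quoted above. */
struct state_bits {
    unsigned int interned:2;
    unsigned int kind:3;
    unsigned int compact:1;
    unsigned int ascii:1;
    unsigned int ready:1;
    unsigned int :24;
};

/* Hypothetical tail of the object in its current field order. */
struct tail_current {
    intptr_t hash;              /* stand-in for the pointer-sized hash */
    struct state_bits state;
    wchar_t *wstr;
};

/* Hypothetical tail with `wstr` moved before `state`. */
struct tail_reordered {
    intptr_t hash;
    wchar_t *wstr;
    struct state_bits state;
};

/* Hypothetical tail with `wstr` removed entirely. */
struct tail_without_wstr {
    intptr_t hash;
    struct state_bits state;
};

/* Padding bytes the compiler inserts between `state` and `wstr`. */
size_t gap_before_wstr(void) {
    return offsetof(struct tail_current, wstr)
         - (offsetof(struct tail_current, state) + sizeof(struct state_bits));
}

/* Offset just past the last member, ignoring trailing padding. */
size_t used_current(void) {
    return offsetof(struct tail_current, wstr) + sizeof(wchar_t *);
}

size_t used_reordered(void) {
    return offsetof(struct tail_reordered, state) + sizeof(struct state_bits);
}

size_t used_without_wstr(void) {
    return offsetof(struct tail_without_wstr, state) + sizeof(struct state_bits);
}

/* Sanity check: each proposed layout is no larger than the current one. */
void check_layout(void) {
    assert(used_without_wstr() < used_reordered());
    assert(used_reordered() <= used_current());
}
```

On a typical 64-bit ABI with 8-byte pointers and a 4-byte `unsigned int`, one would expect `gap_before_wstr()` to return 4, and the difference between `used_current()` and `used_without_wstr()` to be the 12 bytes mentioned above. The helpers measure offsets rather than `sizeof`, because `sizeof` rounds up to the struct's alignment and would hide the saving. That rounding is also why the character data following the struct would have to start at a 4-byte aligned offset.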