M.-A. Lemburg writes:
 > Such a buffer is needed to implement "s" and "s#" argument
 > parsing. It's a simple requirement to support those two
 > parsing markers -- there's not much to argue about, really...
 > unless, of course, you want to give up Unicode object support
 > for all APIs using these parsers.

  Perhaps I missed the agreement that these should always receive
UTF-8 from Unicode strings.  Was this agreed upon, or has it simply
not been argued over in favor of other topics?

  If this has indeed been agreed upon... at least it can be computed
on demand rather than at initialization!  Perhaps there should be two
pointers: one to the UTF-8 buffer and one to a PyObject; if the
PyObject is there it's an "old-style" string that's actually providing
the buffer.

  This may or may not be a good idea; there's a lot of memory expense
for long Unicode strings converted from UTF-8 that aren't ever
converted back to UTF-8 or accessed using "s" or "s#".  Ok, I've
talked myself out of that.  ;-)

  -Fred

--
Fred L. Drake, Jr.	  <fdrake@acm.org>
Corporation for National Research Initiatives
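
A minimal sketch of the "computed on demand rather than at initialization" idea from the message above, written in Python rather than in the C implementation under discussion. The class name `LazyUnicode`, the method `utf8_buffer`, and the `encode_calls` counter are all hypothetical and exist only for illustration; this is not how any actual Unicode object is laid out.

```python
class LazyUnicode:
    """Hypothetical stand-in for a Unicode object with an on-demand UTF-8 cache."""

    def __init__(self, text):
        self.text = text        # the Unicode data itself
        self._utf8 = None       # UTF-8 buffer: deliberately NOT built here
        self.encode_calls = 0   # counts real encodings (illustration only)

    def utf8_buffer(self):
        # Encode only when first requested, e.g. by an "s" or "s#" style
        # parser; later requests reuse the cached bytes.
        if self._utf8 is None:
            self._utf8 = self.text.encode("utf-8")
            self.encode_calls += 1
        return self._utf8


s = LazyUnicode("na\u00efve")
before = s.encode_calls   # nothing has been encoded yet
first = s.utf8_buffer()   # first request triggers the encoding
second = s.utf8_buffer()  # second request returns the cached buffer
```

Strings that are never passed to such a parser never pay for the extra buffer, which is the memory concern raised in the message.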