Guido van Rossum <guido@python.org> writes:

> Would it make sense to change the Unicode object to use pymalloc, and
> to change the UTF-8 codec to count the bytes if the shortest possible
> output would fit in a pymalloc block?  (I guess this means that the
> length of the Unicode string should be less than
> SMALL_REQUEST_THRESHOLD - currently 256.)

Given my measurements, that would make sense. I suspect that counting
small strings is quite efficient, so that the overhead of iterating over
the string twice hides in the noise of additional invocations.

Regards,
Martin
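
For illustration only, the "count first, then allocate exactly" idea discussed above could be sketched roughly as follows. This is a Python model, not the C codec itself: the helper names `utf8_length` and `plan_encode` are hypothetical, the threshold value is simply the 256 quoted in the message, and the worst-case factor of 4 bytes per character is an assumption made for the sketch.

```python
# Value quoted in the message; treated here as a plain constant.
SMALL_REQUEST_THRESHOLD = 256

def utf8_length(text):
    """Count the bytes needed to encode text as UTF-8 without encoding it."""
    total = 0
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            total += 1      # ASCII
        elif cp < 0x800:
            total += 2
        elif cp < 0x10000:
            total += 3
        else:
            total += 4      # outside the Basic Multilingual Plane
    return total

def plan_encode(text):
    """Hypothetical helper: choose an allocation strategy for the output.

    The shortest possible output is one byte per character, so only
    strings shorter than the threshold can possibly fit a small block.
    """
    if len(text) < SMALL_REQUEST_THRESHOLD:
        # First pass: count exactly, so the buffer never needs resizing.
        return ("exact", utf8_length(text))
    # Assumed worst case of 4 bytes per character; a real codec would
    # shrink the buffer after encoding.
    return ("overallocate", 4 * len(text))
```

The second pass over a short string is the extra cost being weighed in the message; the sketch only shows where that pass would sit, and says nothing about the actual timings.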