On 24/08/2011 04:41, Torsten Becker wrote:
> On Tue, Aug 23, 2011 at 18:27, Victor Stinner
> <victor.stinner at haypocalc.com> wrote:
>> I posted a patch to re-add it:
>> http://bugs.python.org/issue12819#msg142867
>
> Thank you for the patch! Note that this patch adds the fast path only
> to the helper function which determines the length of the string and
> the maximum character. The decoding part is still without a fast path
> for ASCII runs.

Ah? If utf8_max_char_size_and_has_errors() returns no error and
maxchar=127, memcpy() is used. You mean that memcpy() is too slow? :-)

    maxchar = utf8_max_char_size_and_has_errors(s, size,
                                                &unicode_size, &has_errors);
    if (has_errors) {
        ...
    }
    else {
        unicode = (PyUnicodeObject *)PyUnicode_New(unicode_size, maxchar);
        if (!unicode)
            return NULL;
        /* When the string is ASCII only, just use memcpy and return. */
        if (maxchar < 128) {
            assert(unicode_size == size);
            Py_MEMCPY(PyUnicode_1BYTE_DATA(unicode), s, unicode_size);
            return (PyObject *)unicode;
        }
        ...
    }

But yes, my patch only optimizes ASCII-only strings, not "mostly-ASCII"
strings (e.g. 100 ASCII characters + 1 Latin-1 character). It can be
optimized later.

I didn't benchmark my patch.

Victor
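
For the "mostly-ASCII" case mentioned above, a run-based fast path in the
decode loop could look roughly like the standalone sketch below. This is not
code from the patch; the function name, the Latin-1-only output, and the
simplified error handling are assumptions made here purely for illustration.

    #include <stddef.h>
    #include <string.h>

    /* Sketch: decode UTF-8 into a Latin-1 (one byte per character) buffer,
       copying runs of ASCII bytes with memcpy() instead of byte by byte.
       Returns the number of characters written, or -1 for malformed input
       or a code point that does not fit in Latin-1 (a real decoder would
       widen the buffer or report a decoding error instead). */
    static ptrdiff_t
    decode_utf8_latin1(const unsigned char *s, size_t size, unsigned char *out)
    {
        const unsigned char *p = s;
        const unsigned char *end = s + size;
        unsigned char *o = out;

        while (p < end) {
            /* Fast path: find the end of the current ASCII run ... */
            const unsigned char *run = p;
            while (run < end && *run < 0x80)
                run++;
            /* ... and copy the whole run in one memcpy(). */
            memcpy(o, p, (size_t)(run - p));
            o += run - p;
            p = run;
            if (p == end)
                break;

            /* Slow path: decode one non-ASCII sequence.  Only 2-byte
               sequences for U+0080..U+00FF fit in a Latin-1 buffer. */
            if (p + 1 < end && (p[0] == 0xC2 || p[0] == 0xC3)
                && (p[1] & 0xC0) == 0x80) {
                *o++ = (unsigned char)(((p[0] & 0x1F) << 6) | (p[1] & 0x3F));
                p += 2;
            }
            else {
                return -1;
            }
        }
        return o - out;
    }

The idea is that each iteration copies a whole ASCII run with a single
memcpy() and only drops to byte-by-byte decoding at the first non-ASCII
byte, so a string of 100 ASCII characters plus 1 Latin-1 character costs
roughly one memcpy() plus one slow-path step.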