This is a multi-part message in MIME format. ------=_NextPart_000_0071_01BFDADC.B3488D80 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable [as discussed on python-dev] amk wrote: > On Tue, Jun 20, 2000 at 12:15:01PM +0200, Fredrik Lundh wrote: > >for the record, my little benchmark now runs about five > >times faster than before -- which means that 16-bit find > >is now ~30% faster than 8-bit find (barry? ;-) >=20 > Speculation: the code in stringobject.c looks like this: > for (; i <=3D last; ++i) > if (s[i] =3D=3D sub[0] && > (n =3D=3D 1 || memcmp(&s[i+1], &sub[1], = n-1) =3D=3D 0)) > return (long)i; > =20 > It checks the first character, and then does the memcmp() skipping the > first character, or succeeds if the substring length is 1; the Unicode > find is simpler, just doing the full memcmp. I wonder if the extra n > =3D=3D 1 and i+1, n-1 arithmetic costs more than it saves? at least on my box (win95, msvc 5)... after simplifying the code, string.find is now faster than sre.search. it's still 15% slower than the 16-bit string find, but I don't have time to dig into that right now... patches are included. =20 </F> I confirm that, to the best of my knowledge and belief, this contribution is free of any claims of third parties under copyright, patent or other rights or interests ("claims"). To the extent that I have any such claims, I hereby grant to CNRI a nonexclusive, irrevocable, royalty-free, worldwide license to reproduce, distribute, perform and/or display publicly, prepare derivative versions, and otherwise use this contribution as part of the Python software and its related documentation, or any derivative versions thereof, at no cost to CNRI or its licensed users, and to authorize others to do so. I acknowledge that CNRI may, at its sole discretion, decide whether or not to incorporate this contribution in the Python software and its related documentation. I further grant CNRI permission to use my name and other identifying information provided to CNRI by me for use in connection with the Python software and its related documentation. ------=_NextPart_000_0071_01BFDADC.B3488D80 Content-Type: text/plain; name="string-patch.txt" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="string-patch.txt" Index: Objects/stringobject.c =================================================================== RCS file: /cvsroot/python/python/dist/src/Objects/stringobject.c,v retrieving revision 2.68 diff -u -r2.68 stringobject.c --- Objects/stringobject.c 2000/06/14 09:18:09 2.68 +++ Objects/stringobject.c 2000/06/20 15:07:51 @@ -651,7 +651,7 @@ i = j = 0; while (i+n <= len) { - if (s[i] == sub[0] && (n == 1 || memcmp(s+i, sub, n) == 0)) { + if (s[i] == sub[0] && memcmp(s+i, sub, n) == 0) { if (maxsplit-- <= 0) break; item = PyString_FromStringAndSize(s+j, (int)(i-j)); @@ -852,8 +852,7 @@ return (long)i; last -= n; for (; i <= last; ++i) - if (s[i] == sub[0] && - (n == 1 || memcmp(&s[i+1], &sub[1], n-1) == 0)) + if (s[i] == sub[0] && memcmp(s+i, sub, n) == 0) return (long)i; } else { @@ -862,8 +861,7 @@ if (n == 0 && i <= last) return (long)last; for (j = last-n; j >= i; --j) - if (s[j] == sub[0] && - (n == 1 || memcmp(&s[j+1], &sub[1], n-1) == 0)) + if (s[j] == sub[0] && memcmp(s+j, sub, n) == 0) return (long)j; } @@ -1415,9 +1413,7 @@ len -= pat_len; for (ii = 0; ii <= len; ii++) { - if (mem[ii] == pat[0] && - (pat_len == 1 || - memcmp(&mem[ii+1], &pat[1], pat_len-1) == 0)) { + if (mem[ii] == pat[0] && memcmp(mem+ii, pat, pat_len) == 0) { return ii; } } ------=_NextPart_000_0071_01BFDADC.B3488D80--
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4