On Tuesday, 23 August 2011 at 13:51 +0200, "Martin v. Löwis" wrote:
> > This optimization was done when trying to improve the speed of
> > text I/O.
>
> So what speedup did it achieve, for the kind of data you talked
> about?

Since I don't have the numbers anymore, I've just saved the contents of
https://linuxfr.org/news/le-noyau-linux-est-disponible-en-version%C2%A030
as a "linuxfr.html" file and then did:

$ ./python -m timeit "with open('linuxfr.html', encoding='utf8') as f: f.read()"
1000 loops, best of 3: 859 usec per loop

After disabling the fast path, I ran the micro-benchmark again:

$ ./python -m timeit "with open('linuxfr.html', encoding='utf8') as f: f.read()"
1000 loops, best of 3: 1.09 msec per loop

so that's a 20% speedup.

> > Do you have three copies of the UTF-8 decoder already, or do you
> > use a stringlib-like approach?
>
> It's a single implementation - see for yourself.

So why would you need three separate implementations of the unrolled
loop? You already have a macro named WRITE_FLEXIBLE_OR_WSTR.

Even without taking the unrolled loop into account, I wonder how much
slower UTF-8 decoding becomes with that approach, by the way. Instead
of testing the "kind" variable at each loop iteration, a stringlib-like
approach may be a better deal IMO.

Of course, we would first need various benchmark numbers once the
current PEP 393 implementation is complete.

Regards

Antoine.
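
For context, the "fast path" being benchmarked above refers to scanning
ASCII runs a machine word at a time inside the UTF-8 decoder, instead of
byte by byte. Below is a minimal sketch of that idea; it is hypothetical
illustration code, not CPython's actual implementation, and the names
copy_ascii_run and ASCII_MASK are invented here:

#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mask with the high bit of every byte set: if (word & ASCII_MASK) == 0,
   every byte in the word is ASCII. */
#define ASCII_MASK 0x8080808080808080ULL

size_t
copy_ascii_run(const unsigned char *src, size_t n, unsigned char *dst)
{
    size_t i = 0;
    /* Word-at-a-time: copy eight bytes per iteration as long as they
       are all ASCII. */
    while (i + sizeof(uint64_t) <= n) {
        uint64_t w;
        memcpy(&w, src + i, sizeof w);
        if (w & ASCII_MASK)
            break;              /* non-ASCII byte: leave the fast path */
        memcpy(dst + i, src + i, sizeof w);
        i += sizeof w;
    }
    /* Byte-at-a-time tail for whatever ASCII prefix remains. */
    while (i < n && src[i] < 0x80) {
        dst[i] = src[i];
        i++;
    }
    return i;   /* number of ASCII bytes copied */
}

Disabling such a fast path forces every byte through the generic decode
loop, which is consistent with the ~20% difference measured on the
mostly-ASCII HTML file above.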
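
Likewise, the trade-off raised at the end of the message - testing
"kind" on every iteration versus stringlib-style specialization - can be
sketched as follows. This is illustrative C only, assuming PEP 393's
three storage kinds (1, 2 or 4 bytes per character); the helper names
are made up here and this is not CPython's WRITE_FLEXIBLE_OR_WSTR macro:

#include <stddef.h>
#include <stdint.h>

/* (a) Run-time dispatch: the switch on "kind" executes for every
   character written, inside the hot loop. */
void
write_chars_dispatch(void *data, int kind, size_t pos,
                     const uint32_t *src, size_t n)
{
    for (size_t j = 0; j < n; j++) {
        switch (kind) {
        case 1: ((uint8_t *)data)[pos + j]  = (uint8_t)src[j];  break;
        case 2: ((uint16_t *)data)[pos + j] = (uint16_t)src[j]; break;
        case 4: ((uint32_t *)data)[pos + j] = src[j];           break;
        }
    }
}

/* (b) Stringlib-like specialization: the loop body is written once and
   instantiated three times (the real stringlib does this with an
   #include'd template), so each instantiation is branch-free inside
   the loop. */
#define DEFINE_WRITE_CHARS(NAME, TYPE)                          \
    static void NAME(TYPE *data, size_t pos,                    \
                     const uint32_t *src, size_t n)             \
    {                                                           \
        for (size_t j = 0; j < n; j++)                          \
            data[pos + j] = (TYPE)src[j];                       \
    }

DEFINE_WRITE_CHARS(write_chars_ucs1, uint8_t)
DEFINE_WRITE_CHARS(write_chars_ucs2, uint16_t)
DEFINE_WRITE_CHARS(write_chars_ucs4, uint32_t)

/* The caller selects the specialized loop once, outside the hot loop. */
void
write_chars_specialized(void *data, int kind, size_t pos,
                        const uint32_t *src, size_t n)
{
    switch (kind) {
    case 1: write_chars_ucs1((uint8_t *)data, pos, src, n);  break;
    case 2: write_chars_ucs2((uint16_t *)data, pos, src, n); break;
    case 4: write_chars_ucs4((uint32_t *)data, pos, src, n); break;
    }
}

Compiling the loop once per kind hoists the dispatch out of the inner
loop, which is the essence of the stringlib approach being suggested.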