[Thomas finds that on FreeBSD, getc() is faster than getc_unlocked().]

Thomas, I really don't understand it. The getc() source code you showed calls getc_unlocked(). So how can it be faster? The answer must be somewhere else... Cache line conflicts, the rewriting of the loop that I did, a compiler bug, the inlining, who knows. Can you compare the generated assembly code?

On other platforms, getc_unlocked() typically speeds the readline() test case up by a significant factor (as in your BSDI numbers, where it's almost 3x faster).

Could it be that you're mistaken and that somehow getc_unlocked() is *not* chosen on FreeBSD? Then I could believe it, the rewritten loop is so different that the optimizer might have done something different to it. (Check config.h. When all else fails, I put an #error in the #ifdef branch that I expect not to be taken.)

Could it be that somehow getc_unlocked() is later defined to be the same as getc(), so choosing it just adds the overhead of calling f[un]lockfile() for each line?

--Guido van Rossum (home page: http://www.python.org/~guido/)
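
For readers following along, here is a minimal sketch of the pattern the message refers to: lock the stream once per line, read characters with getc_unlocked(), and use an #error to confirm which #ifdef branch the build actually takes. It is an illustration only, not the actual readline() code under discussion. The macro HAVE_GETC_UNLOCKED is defined by hand here as a stand-in for whatever config.h provides, and read_line, GETC, FLOCKFILE and FUNLOCKFILE are hypothetical names chosen for the example.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for flockfile()/getc_unlocked() declarations */
#include <assert.h>
#include <stdio.h>

/* Assumption: stands in for a definition normally coming from config.h. */
#define HAVE_GETC_UNLOCKED 1

#ifdef HAVE_GETC_UNLOCKED
#define GETC(f)        getc_unlocked(f)
#define FLOCKFILE(f)   flockfile(f)
#define FUNLOCKFILE(f) funlockfile(f)
#else
/* The #error trick: uncomment the next line to make the build fail
 * if this branch, expected not to be taken, is selected after all. */
/* #error "getc_unlocked() branch was not selected" */
#define GETC(f)        getc(f)
#define FLOCKFILE(f)   ((void)0)
#define FUNLOCKFILE(f) ((void)0)
#endif

/* Read one line (including its '\n', if any) into buf, storing at most
 * size - 1 characters plus a terminating '\0'.  Returns the number of
 * characters stored; 0 means end of file (or a read error). */
size_t read_line(FILE *fp, char *buf, size_t size)
{
    size_t n = 0;
    int c;

    assert(fp != NULL && buf != NULL && size > 0);

    FLOCKFILE(fp);                 /* one lock per line, not per character */
    while (n + 1 < size && (c = GETC(fp)) != EOF) {
        buf[n++] = (char)c;
        if (c == '\n')
            break;
    }
    FUNLOCKFILE(fp);

    buf[n] = '\0';
    return n;
}
```

If a platform's getc_unlocked() turned out to be no cheaper than its getc(), the FLOCKFILE/FUNLOCKFILE pair in this sketch would be pure per-line overhead, which is the possibility raised in the last question above. Comparing the generated assembly for both branches (for example with the compiler's -S option) would show which function the loop really calls.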