[Tim speculates on getc_unlocked and his ms_getline_hack]:
>
> So ms_getline_hack is significantly faster on your box (I'm only
> looking at while_readline: 11 using getc_unlocked, 8.3 using
> ms_getline_hack). There are only two reasons I can imagine for that:
>
> 1. Your vendor optimizes the inner loop in fgets (as all vendors
>    should, but few do).

Digital engineering, Compaq management/marketing <0.6 wink>

>
> and/or
>
> 2. Despite the long average length of your lines, many of them are
>    nevertheless shorter than 200 chars, and so all the pain
>    ms_getline_hack endures to avoid a realloc pays off.
>
> Unfortunately, there's not enough info to figure out if either, both,
> or none of those are on-target. It's such a large percentage
> speedup, though, that my bet goes primarily to #1 -- unless realloc
> is really pig slow on your box.

The lines range in length from 96 to 747 characters, with 11% @ 233,
17% @ 252 and 52% @ 254 characters, so #1 looks promising - most lines
are long enough to trigger a realloc.

Cranking up INITBUFSIZE in ms_getline_hack to 260 from 200 improves
things again, by another 25%:

total 131426612 chars and 514216 lines
count_chars_lines       5.081   5.066
readlines_sizehint      3.743   3.717
using_fileinput        11.113  11.100
while_readline          6.100   6.083
for_xreadlines          3.027   3.033

Apart from the name <grin>, I like ms_getline_hack...

tho'-a-factor-of-100-makes-xreadlines-a-welcome-addition!-ly y'rs

-- 
Mark Favas - m.favas@per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA
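
The buffer-size arithmetic above can be checked with a rough sketch. It uses only the three line-length shares quoted in the message; the remaining 20% of lines are not broken down there, so the result is a lower and an upper bound rather than an exact figure. The name overflow_bounds and the rule "a line overflows when its length is at least the initial buffer size" are assumptions made for illustration, not taken from ms_getline_hack itself.

```python
# Line-length shares quoted in the message: length -> percent of all lines.
KNOWN_SHARES = {233: 11, 252: 17, 254: 52}


def overflow_bounds(initbufsize, shares=KNOWN_SHARES):
    """Return (low, high) percent of lines that outgrow the initial buffer.

    Assumption: a line needs a realloc when its length >= initbufsize.
    Lines whose lengths the message does not break down could fall on
    either side, so they only widen the upper bound.
    """
    known_overflow = sum(pct for length, pct in shares.items()
                         if length >= initbufsize)
    unknown = 100 - sum(shares.values())
    return known_overflow, known_overflow + unknown


for size in (200, 260):
    low, high = overflow_bounds(size)
    print(f"INITBUFSIZE={size}: between {low}% and {high}% of lines overflow")
```

Under these assumptions, a 200-byte initial buffer is outgrown by at least 80% of the lines, while a 260-byte buffer is outgrown by at most 20%, which fits the further speedup measured after raising INITBUFSIZE.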