> I've been following up a thread on python-list about lousy performance of
> urllib.urlopen(...).read() on FreeBSD 4.x compared to using wget to
> retrieve the same file.
>
> I've determined that the following patch (against 2.2.2) makes an enormous
> difference in throughput:
>
> -----8<-----8<-----8<-----
> *** Lib/httplib.py.orig Mon Oct 7 11:18:17 2002
> --- Lib/httplib.py Sun Nov 24 14:44:16 2002
> ***************
> *** 210,216 ****
>       # See RFC 2616 sec 19.6 and RFC 1945 sec 6 for details.
>
>       def __init__(self, sock, debuglevel=0, strict=0):
> !         self.fp = sock.makefile('rb', 0)
>           self.debuglevel = debuglevel
>           self.strict = strict
>
> --- 210,216 ----
>       # See RFC 2616 sec 19.6 and RFC 1945 sec 6 for details.
>
>       def __init__(self, sock, debuglevel=0, strict=0):
> !         self.fp = sock.makefile('rb', -1)
>           self.debuglevel = debuglevel
>           self.strict = strict
>
> -----8<-----8<-----8<-----
>
> Without this patch, d/l a 4MB file from localhost gets a bit over 110kB/s,
> with the patch gets 4-5.5MB/s on the same system (FBSD 4.4 SMP, 2xC300A,
> 128MB RAM, ATA66 HD).
>
> My question:
>
> - why is the socket.fp being set to unbuffered?

I can't make time for a full essay on the issue, but I believe that it
must be unbuffered because some applications want to read until the
end of the headers and then pass the file descriptor to a subprocess
or to code that uses the socket directly.

--Guido van Rossum (home page: http://www.python.org/~guido/)
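
A minimal sketch of the concern described above, assuming a modern
Python 3 interpreter rather than the 2.2.2 tree the patch targets. It
is not httplib's code: read_headers() and leftover_on_socket() are
hypothetical helpers, and the local socket pair merely stands in for a
real HTTP connection. It reads the headers through a file object made
with a given buffer size, then checks what is still available on the
raw socket.

```python
import socket


def read_headers(fp):
    """Read lines up to and including the blank line that ends the headers."""
    lines = []
    while True:
        line = fp.readline()
        lines.append(line)
        if line in (b"\r\n", b"\n", b""):
            return lines


def leftover_on_socket(bufsize):
    """Return what the raw socket still holds after the headers were read
    through a file object created with makefile('rb', bufsize)."""
    # Hypothetical stand-in for a client/server HTTP connection.
    server, client = socket.socketpair()
    try:
        server.sendall(b"HTTP/1.0 200 OK\r\nContent-Length: 5\r\n\r\nhello")
        server.shutdown(socket.SHUT_WR)  # so recv() below cannot block
        fp = client.makefile("rb", bufsize)
        try:
            read_headers(fp)
            # Code that uses the socket directly sees only what the
            # file object has not already pulled into its own buffer.
            return client.recv(1024)
        finally:
            fp.close()
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    print("unbuffered:", leftover_on_socket(0))
    print("buffered:  ", leftover_on_socket(-1))
```

With bufsize 0 the body should still be on the socket once the headers
are consumed. With -1 the file object has normally read ahead, so the
body sits in its buffer, and anything handed the bare socket or file
descriptor misses it. The unbuffered variant pays for this by asking
the socket for one byte at a time, which is consistent with the
throughput difference reported in the quoted message.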