On Tue, 17 Jan 2006, Thomas Mangin wrote:
[...]
> I have hit a bug with python 2.4.2 (on Mandriva 2006) using urllib2.
> The code which trigger the bug is as follow..
>
> import urllib2
> req = urllib2.Request("http://66.117.37.13/")
>
> # makes no difference ..
> req.add_header('Connection', 'close')
>
> handle = urllib2.urlopen(req)
> data = handle.read()
> print data
>
> using a timeout on the socket does not work neither.

This is a real bug, I think.  I filed a report on the SF bug tracker:

http://python.org/sf/1411097

The problem seems to be the (ab)use of socket._fileobject in urllib2 (I
believe this was introduced when urllib2 switched to using
httplib.HTTPConnection).  The purpose of the hack (as commented in
AbstractHTTPHandler.do_open()) is to provide .readline() and
.readlines() methods on the response object returned by
urllib2.urlopen().

Workaround if you're not using .readline() or .readlines() (against
2.4.2, but should apply against current SVN):

--- urllib2.py.orig	Fri Jan 20 20:10:56 2006
+++ urllib2.py	Fri Jan 20 20:12:07 2006
@@ -1006,8 +1006,7 @@
         # XXX It might be better to extract the read buffering code
         # out of socket._fileobject() and into a base class.
 
-        r.recv = r.read
-        fp = socket._fileobject(r)
+        fp = r.fp
 
         resp = addinfourl(fp, r.msg, req.get_full_url())
         resp.code = r.status

Not sure yet what the actual problem/cure is...


John
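
For readers unfamiliar with the hack discussed above: the general idea is to give an object that only has read() a buffering wrapper, so that it gains .readline() and .readlines(). A rough sketch of that idea in modern Python 3 follows. It is an illustration only, not the urllib2 code: socket._fileobject does not exist in Python 3, so io.BufferedReader is swapped in here, and ReadOnlyResponse, ReadAdapter and with_readline are hypothetical names made up for the example. It does not reproduce or diagnose the hang.

```python
import io


class ReadOnlyResponse:
    """Hypothetical stand-in for a response object that only offers read()."""

    def __init__(self, payload):
        self._buf = io.BytesIO(payload)

    def read(self, amt=-1):
        return self._buf.read(amt)


class ReadAdapter(io.RawIOBase):
    """Expose any object with read(amt) as a raw, readable stream."""

    def __init__(self, source):
        super().__init__()
        self._source = source

    def readable(self):
        return True

    def readinto(self, b):
        # Pull at most len(b) bytes from the wrapped object's read().
        data = self._source.read(len(b))
        b[:len(data)] = data
        return len(data)  # 0 signals end of stream


def with_readline(source):
    # The buffering layer is what supplies readline() and readlines().
    return io.BufferedReader(ReadAdapter(source))


resp = with_readline(ReadOnlyResponse(b"first line\nsecond line\n"))
first = resp.readline()
rest = resp.readlines()
```

The wrapper only works as well as the read() underneath it: the buffering layer asks for a block of bytes at a time and relies on the wrapped call returning promptly with whatever it has.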