Hello,

I cannot understand why I get the following error.

1) Here is an extract of the web page content I want to parse:

-------------
function RedirectPage()
{
document.location ="http://www.nicewebsite.com";
}
-------------

2) Here is the program I use to extract the URL to which I need to redirect my browsing (ugly, I agree):

-------------
import re, urllib

q=re.compile("document.location =.*")
t2=q.search(f3)   # f3 contains the full string of the web page
b= t2.group()     # the whole "document.location =..." line
q1=re.compile('[^document.location ="].*[^";]')
t4=q1.search(b)
redirectURL=t4.group()
f4=urllib.urlopen(redirectURL).read()
--------------

The program successfully extracts the URL. I can print the exact text "http://www.nicewebsite.com" (without the quotes).

3) And finally, here is the traceback I get:

-----------------------
Traceback (innermost last):
  File "./MyProgram", line 157, in ?
    f4=urllib.urlopen(redirectURL).read()
  File "/usr/lib/python1.5/urllib.py", line 59, in urlopen
    return _urlopener.open(url)
  File "/usr/lib/python1.5/urllib.py", line 157, in open
    return getattr(self, name)(url)
  File "/usr/lib/python1.5/urllib.py", line 272, in open_http
    return self.http_error(url, fp, errcode, errmsg, headers)
  File "/usr/lib/python1.5/urllib.py", line 285, in http_error
    result = method(url, fp, errcode, errmsg, headers)
  File "/usr/lib/python1.5/urllib.py", line 456, in http_error_302
    return self.open(newurl, data)
  File "/usr/lib/python1.5/urllib.py", line 157, in open
    return getattr(self, name)(url)
  File "/usr/lib/python1.5/urllib.py", line 247, in open_http
    if not host: raise IOError, ('http error', 'no host given')
IOError: [Errno http error] no host given
---------------------------

For professional reasons, I had to change the web site name above. Could urllib be sensitive to the content of the requested URL (which is full of "&" chars)?

Thank you for any help.

---
Patrick Bussi
patrick.bussi at space.alcatel.fr
Any opinions expressed are my own and not necessarily those of my Company.
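
One way to narrow the problem down is to check the extracted string before handing it to urllib. The sketch below is written for present-day Python 3 rather than the Python 1.5 shown in the traceback, and the names `LOCATION_RE`, `extract_redirect_url` and `has_host` are hypothetical. It assumes the page always uses the quoted `document.location ="..."` form from the extract. It swaps the two-step pattern for a single capture group, because `[^document.location ="]` is a negated character class (any one character not in that set) and not a literal prefix. It then tests whether a URL carries a scheme and a host. The traceback passes through `http_error_302`, so the URL without a host may be the redirect target sent back by the server rather than `redirectURL` itself; that is an inference from the traceback, not something verified here.

```python
import re
from urllib.parse import urlsplit

# Hypothetical pattern: assumes the quoted form shown in the page extract.
# The parentheses capture only the text between the double quotes.
LOCATION_RE = re.compile(r'document\.location\s*=\s*"([^"]*)"')


def extract_redirect_url(page_text):
    """Return the quoted URL assigned to document.location, or None."""
    match = LOCATION_RE.search(page_text)
    if match is None:
        return None
    # strip() guards against stray whitespace inside the quotes
    return match.group(1).strip()


def has_host(url):
    """Return True if the URL has both a scheme and a host part."""
    parts = urlsplit(url)
    return bool(parts.scheme) and bool(parts.netloc)


# Sample input copied from the page extract in the post
page = 'function RedirectPage()\n{\ndocument.location ="http://www.nicewebsite.com";\n}\n'
redirect_url = extract_redirect_url(page)
print(repr(redirect_url), has_host(redirect_url))
```

Running `has_host` on a relative target such as `/somewhere?a=1&b=2` returns False, which is the kind of value that would have no host to connect to.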