
Fri Sep 23 01:36:30 EDT 2005 · http://mail.python.org/pipermail/python-list/2005-September/337410.html

Hi,
   I've run into a problem matching a regular expression in Python. I
hope one of you can help me. Here are the details:

   I have many tags like this:
      xxx<a href="http://xxx.xxx.xxx" xxx>xxx
      xxx<a href="wap://xxx.xxx.xxx" xxx>xxx
      xxx<a href="http://xxx.xxx.xxx" xxx>xxx
      .....
   I want to extract every "http://xxx.xxx.xxx" URL, so I do it
like this:
      httpPat = re.compile("(<a )(href=\")(http://.*)(\")")
      result = httpPat.findall(data)
   I print the results like this:
      for i in result:
         print i[2]
   Surprisingly, I get some output like this:
      http://xxx.xxx.xxx">xxx</a>xxx
   It comes from this kind of source:
      <a href="http://xxx.xxx.xxx">xxx</a>xxx"
   Some of the results are correct, though. How can I get all of the
results to come out clean, like "http://xxx.xxx.xxx"? Thanks for your help.

Regards,
Johnny
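
A likely explanation, offered as a sketch rather than a definitive answer: the `.*` in the pattern is greedy, so it runs to the last double quote on the line instead of the first one after the URL. The example below reproduces that on made-up sample data (the `example.com` hosts and `class="x"` attributes are assumptions, since the real input is not shown) and compares two common alternatives, a non-greedy `.*?` and a negated character class `[^"]*`.

```python
import re

# Hypothetical sample data modelled on the post; the real input is not shown.
data = (
    'xxx<a href="http://a.example.com" class="x">xxx</a>xxx"\n'
    'xxx<a href="wap://b.example.com" class="x">xxx\n'
    'xxx<a href="http://c.example.com">xxx</a>xxx"\n'
)

# Pattern from the post: the greedy .* extends to the LAST '"' on the line.
greedy = re.compile(r'(<a )(href=")(http://.*)(")')

# Non-greedy variant: .*? stops at the FIRST closing quote.
lazy = re.compile(r'<a href="(http://.*?)"')

# Negated character class: match any run of characters that are not a quote.
negated = re.compile(r'<a href="(http://[^"]*)"')

# findall returns tuples when the pattern has several groups,
# and plain strings when it has exactly one.
greedy_urls = [groups[2] for groups in greedy.findall(data)]
lazy_urls = lazy.findall(data)
negated_urls = negated.findall(data)

for label, urls in (("greedy", greedy_urls),
                    ("lazy", lazy_urls),
                    ("negated", negated_urls)):
    print(label, urls)
```

Both alternatives use a single capturing group, so `findall` returns the URLs directly and no tuple indexing is needed. The negated class is often preferred because it states the intent (stop at the quote) without relying on backtracking.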


Help on regular expression match