mal wrote:

> > > "one, two and three".tokenize([",", "and"])
> > > -> ["one", " two ", "three"]
> > >
> > > I like this method -- should I review the code and then check it in ?
> >
> > -1.  method bloat.  not exactly something you do every day, and
> > when you do, it's a one-liner:
> >
> > def tokenize(string, ignore):
> >     [word for word in re.findall("\w+", string) if not word in ignore]
>
> This is not the same as what .tokenize() does: it cuts at each
> occurrence of a substring rather than at words as in your example.

oh, I didn't see the spaces.

splitting on all substrings is even easier (but perhaps a bit more
obscure, at least when written on one line):

    def tokenize(string, seps):
        return re.split("|".join(map(re.escape, seps)), string)

Cheers /F
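[Editor's note: a minimal, self-contained sketch of the re.split approach from the post above. The function name tokenize and the sample input come from the thread; the printed result is what re.split actually returns for that input, which keeps the surrounding whitespace.]

    import re

    def tokenize(string, seps):
        # Build an alternation of the escaped literal separators
        # and split the string at every occurrence of any of them.
        pattern = "|".join(map(re.escape, seps))
        return re.split(pattern, string)

    print(tokenize("one, two and three", [",", "and"]))
    # -> ['one', ' two ', ' three']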