On 28 October 2017 at 16:48, Steven D'Aprano <steve at pearwood.info> wrote:

> On Sun, Oct 29, 2017 at 12:31:01AM +0100, MRAB wrote:
>
> > Not that I'm planning on making any further additions, just bug fixes
> > and updates to follow the Unicode updates. I think I've crammed enough
> > into it already. There's only so much you can do with the regex syntax
> > with its handful of metacharacters and possible escape sequences...
>
> What do you think of the Perl 6 regex syntax?
>
> https://en.wikipedia.org/wiki/Perl_6_rules#Changes_from_Perl_5

If you're going to change the notation, why not use notations similar to what linguists use for FSTs (finite-state transducers)? These allow building FSTs (with operations such as adding/subtracting/composing/projecting FSTs) with millions of states, and there are some impressive optimisers for them also, so that encoding a dictionary with inflections is both more compact and faster than a hash of just the words without inflections.

Some of this work is open source, but I haven't kept up with it. If you're interested, you can start here:

http://web.stanford.edu/~laurik/
http://web.stanford.edu/~laurik/publications/TR-2010-01.pdf
http://web.stanford.edu/group/cslipublications/cslipublications/site/1575864347.shtml
etc. ;)
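To make "composing" and "projecting" a bit more concrete, here is a toy sketch in plain Python. It is not the notation or the tooling from the links above; the `FST` class, `compose`, `project_output` and the tiny cat/dog lexicon are all made up for illustration, and the sketch assumes epsilon-free transducers whose symbols are whole morphemes rather than characters.

```python
class FST:
    """Toy epsilon-free finite-state transducer (hypothetical, for illustration)."""

    def __init__(self, start, finals, arcs):
        self.start = start
        self.finals = set(finals)
        # (state, input symbol) -> set of (output symbol, next state)
        self.arcs = {}
        for src, sym_in, sym_out, dst in arcs:
            self.arcs.setdefault((src, sym_in), set()).add((sym_out, dst))

    def transduce(self, symbols):
        """Return the set of output tuples produced for an input sequence."""
        frontier = {(self.start, ())}
        for sym in symbols:
            frontier = {
                (dst, out + (sym_out,))
                for state, out in frontier
                for sym_out, dst in self.arcs.get((state, sym), ())
            }
        return {out for state, out in frontier if state in self.finals}


def compose(a, b):
    """Build a transducer that feeds the output of `a` into `b`."""
    arcs = []
    for (p, sym_in), a_targets in a.arcs.items():
        for mid, p_next in a_targets:
            for (q, b_in), b_targets in b.arcs.items():
                if b_in != mid:
                    continue  # b must read exactly what a wrote
                for sym_out, q_next in b_targets:
                    arcs.append(((p, q), sym_in, sym_out, (p_next, q_next)))
    finals = {(p, q) for p in a.finals for q in b.finals}
    return FST((a.start, b.start), finals, arcs)


def project_output(t):
    """Keep only the output side, giving an acceptor for the output language."""
    arcs = [
        (src, sym_out, sym_out, dst)
        for (src, _sym_in), targets in t.arcs.items()
        for sym_out, dst in targets
    ]
    return FST(t.start, t.finals, arcs)


# Lexical form -> surface form: a stem, optionally followed by a plural tag.
lexicon = FST(0, {1, 2}, [
    (0, "cat", "cat", 1),
    (0, "dog", "dog", 1),
    (1, "+PL", "s", 2),
])

# Surface form -> upper-cased surface form, one state that loops on itself.
shout = FST(0, {0}, [
    (0, "cat", "CAT", 0),
    (0, "dog", "DOG", 0),
    (0, "s", "S", 0),
])

pipeline = compose(lexicon, shout)
surface = project_output(lexicon)
```

With these definitions, `pipeline.transduce(["cat", "+PL"])` gives `{("CAT", "S")}`, and `surface` accepts `["cat", "s"]` but rejects `["+PL"]`. Real toolkits add epsilon handling, determinisation and minimisation on top, and those are what make the large dictionaries compact; none of that is attempted here.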