[Swede]
> >in the current 're' engine, a newline is chr(10) and nothing
> >else.
> >
> >however, in the new unicode aware engine, I used the new
> >LINEBREAK predicate instead, but it turned out to break one
> >of the tests in the current test suite:
> >
> >    sre.match('a\rb', 'a.b') => None
> >
> >(unicode adds chr(13), chr(28), chr(29), chr(30), and also
> >unichr(133), unichr(8232), and unichr(8233) to the list of
> >line breaking codes)
> >
> >what's the best way to deal with this?  I see three alter-
> >natives:
> >
> >a) stick to the old definition, and use chr(10) also for
> >   unicode strings

[Finn]
> In the ORO matcher that comes with jpython, the dot matches all but
> chr(10). But that is bad IMO. Unicode should use the LINEBREAK
> predicate.

There's no need for invention. We're supposed to be as close to Perl as
reasonable. What does Perl do?

--Guido van Rossum (home page: http://www.python.org/~guido/)
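
For readers who want to try the distinction discussed above, here is a minimal sketch, assuming a present-day Python 3 interpreter rather than the engines in the thread. It compares what '.' in the standard `re` module refuses to match with what `str.splitlines()` treats as a line boundary, for the code points listed in the quoted message. The helper names `dot_matches` and `splits_line` are hypothetical and exist only for this illustration.

```python
import re

# Code points the quoted message lists as line-breaking codes.
LINEBREAKS = [10, 13, 28, 29, 30, 133, 8232, 8233]


def dot_matches(code):
    """Return True if '.' (without re.DOTALL) matches chr(code)."""
    return re.match("a.b", "a" + chr(code) + "b") is not None


def splits_line(code):
    """Return True if str.splitlines() breaks the string at chr(code)."""
    return len(("a" + chr(code) + "b").splitlines()) == 2


if __name__ == "__main__":
    for code in LINEBREAKS:
        print(code, dot_matches(code), splits_line(code))
```

On such an interpreter, '.' declines to match only chr(10), which corresponds to alternative a) in the quoted message, while `splitlines()` breaks on every code point in the list.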