On Tue, Jun 3, 2014 at 7:32 PM, Chris Angelico <rosuav at gmail.com> wrote:

> On Wed, Jun 4, 2014 at 11:17 AM, Steven D'Aprano <steve at pearwood.info>
> wrote:
> > * Having a build-time option to restrict all strings to ASCII-only.
> >
> >   (I think what they mean by that is that strings will be like Python 2
> >   strings, ASCII-plus-arbitrary-bytes, not actually ASCII.)
>
> What I was actually suggesting along those lines was that the str type
> still be notionally a Unicode string, but that any codepoints >127
> would either raise an exception or blow an assertion, and all the code
> to handle multibyte representations would be compiled out.

That would be a pretty lousy option.

> So there'd
> still be a difference between strings of text and streams of bytes,
> but all encoding and decoding to/from ASCII-compatible encodings would
> just point to the same bytes in RAM.

I suppose this is why you propose to reject 128-255?

> Risk: Someone would implement that with assertions, then compile with
> assertions disabled, test only with ASCII, and have lurking bugs.

Never mind disabling assertions -- even with enabled assertions you'd have
to expect most Python programs to fail with non-ASCII input.

Then again the UTF-8 option would be pretty devastating too for anything
manipulating strings (especially since many Python APIs are defined using
indexes, e.g. the re module).

Why not support variable-width strings like CPython 3.4?

-- 
--Guido van Rossum (python.org/~guido)
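
The indexing concern about a UTF-8 internal representation can be illustrated with a small sketch. This is not code from the thread: `utf8_offset` is a hypothetical helper, and the sample string is made up. It assumes only that Python-level APIs such as the re module report positions as code point indexes, so an implementation storing UTF-8 internally would have to translate each index into a byte offset, which in the naive case means scanning from the start of the string.

```python
import re


def utf8_offset(data: bytes, index: int) -> int:
    """Return the byte offset of the index-th code point in UTF-8 data.

    Hypothetical helper: it scans from the start, so each lookup is O(n).
    """
    if index < 0:
        raise IndexError(index)
    seen = 0
    for offset, byte in enumerate(data):
        # Bytes of the form 10xxxxxx are continuation bytes; any other
        # byte starts a new code point.
        if byte & 0xC0 != 0x80:
            if seen == index:
                return offset
            seen += 1
    raise IndexError(index)


text = "na\u00efve caf\u00e9"   # 10 code points, two of them non-ASCII
data = text.encode("utf-8")     # 12 bytes

# The re module reports a code point index ...
match_index = re.search("v", text).start()

# ... which no longer equals the byte offset once a non-ASCII
# character has appeared earlier in the string.
byte_offset = utf8_offset(data, match_index)

print(len(text), len(data), match_index, byte_offset)
```

With ASCII-only input the two numbers always agree, which is why testing only with ASCII would not expose the difference.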