On 13 March 2017 at 18:37, INADA Naoki <songofacandy at gmail.com> wrote:

> But locale coercing works nice on platforms like android.
> So how about simplified version of PEP 538? Just adding configure
> option for locale coercing
> which is disabled by default. No envvar options and no warnings.
>

That doesn't solve my original Linux distro problem, where locale
misconfiguration problems show up as "Python 2 works, Python 3 doesn't
work" behaviour and bug reports.

The problem is that where Python 2 was largely locale-independent by
default (just passing raw bytes through) such that you'd only get immediate
encoding or decoding errors if you had a Unicode literal or a decode() call
somewhere in your code and would otherwise pass data corruption problems
further down the chain, Python 3 is locale-*aware* by default, and eagerly
decodes:

- command line parameters
- environment variables
- responses from operating system API calls
- standard stream input
- file contents

You *can* still write locale-independent Python 3 applications, but they
involve sprinkling liberal doses of "b" prefixes and suffixes and mode
settings and "surrogateescape" error handler declarations in various places
- you can't just run python-modernize over a pre-existing Python 2
application and expect it to behave the same way in the C locale as it did
before.

Once implemented, PEP 540 will partially solve the problem by introducing a
locale independent UTF-8 mode, but that still leaves the inconsistency with
other locale-aware components that are needing to deal with Python 3 API
calls that accept or return Unicode objects where Python 2 allowed the use
of 8-bit strings.

Folks that really want the old behaviour back will be able to set
PYTHONCOERCECLOCALE=0 (as that no longer emits any warnings), or else build
their own CPython from source using `--without-c-locale-coercion` and
`--without-c-locale-warning`.
However, they'll also get the explicit support notification from PEP 11
that any Unicode handling bugs they run into in those configurations are
entirely their own problem - we won't fix them, because we consider those
configurations unsupportable in the general case.

That puts the additional self-support burden on folks doing something
unusual (i.e. insisting on running an ASCII-only environment in 2017),
rather than on those with a more conventional use case (i.e. running an up
to date *nix OS using UTF-8 or another universal encoding for both local
and remote interfaces).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
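
For readers unfamiliar with the "surrogateescape" error handler mentioned
above, here is a minimal sketch of what it does when text has to survive an
ASCII-only (C locale) decode. The sample bytes and variable names are
illustrative assumptions only; they are not taken from PEP 538 or PEP 540.

```python
# Raw bytes as they might arrive from the OS, e.g. a Latin-1 encoded file
# name. The byte 0xE9 is outside ASCII (hypothetical sample data).
raw = b"caf\xe9.txt"

# A strict ASCII decode, as in a misconfigured C locale, fails outright.
try:
    raw.decode("ascii")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# "surrogateescape" maps each undecodable byte to a lone surrogate code
# point instead of raising, so the data can pass through str-based APIs.
text = raw.decode("ascii", errors="surrogateescape")

# Encoding with the same handler turns the surrogates back into the
# original bytes, so nothing is lost on the round trip.
round_tripped = text.encode("ascii", errors="surrogateescape")

print(strict_ok, ascii(text), round_tripped == raw)
```

On POSIX systems, os.fsdecode() and os.fsencode() apply this same handler
using the filesystem encoding, which is one of the places where such
declarations end up being needed in otherwise locale-independent code.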