On 6 May 2017 at 18:00, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 5 March 2017 at 17:50, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> Hi folks,
>>
>> Late last year I started working on a change to the CPython CLI (*not* the
>> shared library) to get it to coerce the legacy C locale to something based
>> on UTF-8 when a suitable locale is available.
>>
>> After a couple of rounds of iteration on linux-sig and python-ideas, I'm now
>> bringing it to python-dev as a concrete proposal for Python 3.7.
>>
>> For most folks, reading the Abstract plus the draft docs updates in the
>> reference implementation will tell you everything you need to know (if the
>> C.UTF-8, C.utf8 or UTF-8 locales are available, the CLI will automatically
>> attempt to coerce the legacy C locale to one of those rather than persisting
>> with the latter's default assumption of ASCII as the preferred text
>> encoding).
>
> I've just pushed a significant update to the PEP based on the
> discussions in this thread:
> https://github.com/python/peps/commit/2fb53e7c1bbb04e1321bca11cc0112aec69f6398
>
> The main change at the technical level is to modify the handling of
> the coercion target locales such that they *always* lead to
> "surrogateescape" being used by default on the standard streams. That
> means we don't need to call "Py_SetStandardStreamEncoding" during
> startup, that subprocesses will behave the same way as their parent
> processes, and that Python in Linux containers will behave
> consistently regardless of whether the container locale is set to
> "C.UTF-8" explicitly, or is set to "C" and then coerced to "C.UTF-8"
> by CPython.

Working on the revised implementation for this, I've ended up
refactoring it so that all the heavy lifting is done by a single
function exported from the shared library: "_Py_CoerceLegacyLocale()".

The CLI code then just contains the check that says "Are we running in
the legacy C locale? If so, call _Py_CoerceLegacyLocale()", with all
the details of how the coercion actually works being hidden away
inside pylifecycle.c.

That seems like a potential opportunity to make the 3.7 version of
this a public API, using the following pattern:

    if (Py_LegacyLocaleDetected()) {
        Py_CoerceLegacyLocale();
    }

That way applications embedding CPython that wanted to implement the
same locale coercion logic would have an easy way to do so.

Thoughts?

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
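For readers who want to see the shape of the "detect, then coerce" pattern without reading pylifecycle.c, here is a rough Python-level sketch using the standard `locale` module. It is an illustration only, not the C implementation or the proposed API: the function names `legacy_locale_detected` and `coerce_legacy_locale` are hypothetical stand-ins for the C functions discussed above, and the sketch assumes that trying the three target locales named in the quoted text, in that order, is a fair approximation of the coercion step. It only changes LC_CTYPE for the current process.

```python
import locale

# Coercion targets named in the quoted proposal, tried in order.
COERCION_TARGETS = ("C.UTF-8", "C.utf8", "UTF-8")


def legacy_locale_detected():
    """Return True if LC_CTYPE is currently the legacy C locale.

    Hypothetical stand-in for the C-level check; passing None to
    setlocale queries the current setting without changing it.
    """
    return locale.setlocale(locale.LC_CTYPE, None) == "C"


def coerce_legacy_locale(targets=COERCION_TARGETS):
    """Try each target locale in turn.

    Returns the name of the first target the platform accepts, or None
    if none of them is available (the locale is then left unchanged).
    """
    for target in targets:
        try:
            locale.setlocale(locale.LC_CTYPE, target)
        except locale.Error:
            # This target is not installed on this platform; try the next.
            continue
        return target
    return None


if __name__ == "__main__":
    # Same pattern as the C snippet above: detect first, then coerce.
    if legacy_locale_detected():
        print("coerced to:", coerce_legacy_locale())
```

Whether any target is accepted depends on the platform, so the sketch reports the outcome rather than assuming one.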