On Fri, Jan 10, 2014 at 4:35 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 10 January 2014 13:32, Lennart Regebro <regebro at gmail.com> wrote:
>> No, because your environment have a default language. And Python has a
>> default encoding. You only get problems when some file doesn't use the
>> default encoding.
>
> The reason Python 3 currently tries to rely on the POSIX locale
> encoding is that during the Python 3 development process it was
> pointed out that ShiftJIS, ISO-2022 and various CJK codec are in
> widespread use in Asia, since Asian users needed solutions to the
> problem of representing kana, ideographs and other non-Latin
> characters long before the Unicode Consortium existed.
>
> This creates a problem for Python 3, as assuming utf-8 means we have a
> high risk of corrupting user's data at least in Asian locales, as well
> as anywhere else where non-UTF-8 encodings are common (especially when
> encodings that aren't ASCII compatible are involved).

From my experience, the concept of a default locale is deeply flawed.
What if I log into a (Linux) machine using an old Latin-1 PuTTY from the
Windows XP era, and have most file names and contents in UTF-8 encoding,
except for one directory where people from Eastern Europe upload files
via FTP in whatever encoding they choose? What should the "default"
encoding be now?

That's why I make it a principle to always unset all LC_* and LANG
variables, except when working locally, which happens rather rarely.
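
To make the mixed-encoding scenario concrete, here is a minimal Python 3 sketch. The sample word, the choice of ISO-8859-2 for the uploaded file, and the helper name try_decode are all assumptions made for illustration and are not taken from the thread. It only shows that one byte string can be correct, silently wrong, or undecodable depending on which "default" is assumed.

```python
import locale

def try_decode(data, encoding):
    """Hypothetical helper: return the decoded text, or None on failure."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

# Assumed sample: a Polish word stored two ways on the same machine.
text = "żółw"
utf8_bytes = text.encode("utf-8")         # most files on the machine
legacy_bytes = text.encode("iso-8859-2")  # the FTP upload directory

# Whatever the current environment claims the default is.
print("locale encoding:", locale.getpreferredencoding(False))

for name, data in [("utf-8 file", utf8_bytes), ("legacy file", legacy_bytes)]:
    for enc in ("utf-8", "iso-8859-2", "latin-1"):
        # None marks a hard failure; a wrong string is silent corruption.
        print(name, enc, repr(try_decode(data, enc)))
```

No single value in the inner loop decodes both files correctly, which is the situation described above: the right encoding is a property of each file, not of the login session.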