Guido van Rossum wrote:
>
> >     Guido> I think I can propose a compromise though: there may be two
> >     Guido> default encodings, one used for Python source code, and one
> >     Guido> for data.
> >
> > [Stephen J. Turnbull]
> > Why go in this direction?  It's better to allow each individual stream
> > to specify a codec to be implicitly applied, I think.  Consider Emacs,
> > for example, which allows specification of default codecs for (1) file
> > contents (2) names of file system objects (3) process I/O (but not I
> > and O and E separately, which has caused problems!) (4) console input
> > and (5) console output.  All of those are plausible candidates for
> > having separate defaults in Python as well.
> >
> > For example, in Japan it's easy to imagine a program with local file
> > contents defaulting to UTF-8 (for cross-system portability) needing to
> > access the Windows 9x console and file system in Shift JIS, while
> > process (eg, network) I/O might be EUC-JP if the server were Unix.
> > (Yes, I'm straining, but not much.)
> >
> > But if you allow codecs for each stream, people who want to have
> > different defaults for certain classes of stream would just derive
> > classes which initialized the default codec appropriately.
>
> Attaching codecs to streams is currently pretty painful AFAICT (I've
> never tried it :-), but I think your idea has merit: there are
> sufficiently many different contexts where an encoding must be
> specified that it makes sense to allow setting different defaults for
> the different contexts.  The issue of filename encoding is one with
> which we (well, some of us) have struggled recently.
>
> We'd have to
> think more about which contexts exactly to consider; for now I can
> come up with:
>
> - file I/O;
>
> - OS filenames;
>
> - implicit mixing of 8-bit and Unicode strings;
>
> - invocation of unicode(s) or u.decode() without an encoding.
>
> I see your proposal as a possible future generalization of my
> two-encodings proposal, not as an incompatible alternative.

My position on this is *not* to introduce more defaults -- explicit
is better than implicit and in this particular case (encodings)
it'll result in a net win.

> In the light of the post by Atsuo Ishimoto and the responses from both
> Marc-Andre Lemburg and Martin von Loewis, however, I'm not sure
> whether Suzuki Hisao's response represents the Japanese community, and
> it's possible that nothing needs to be done.

Well, users using non-ASCII coding in their source files should start
to be explicit about the encoding (in phase 1 they'll get a warning
printed which makes them aware of the problem), but other than that,
I don't see a need for changes to the strategy.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
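
For readers who want to see the "derive classes which initialized the
default codec" idea quoted above in concrete form, here is a minimal
sketch.  It assumes the Python 3 io module, which postdates this thread,
and the class and function names (Utf8FileWriter, ShiftJISConsoleWriter,
encode_with) are hypothetical, made up for illustration only.  It is not
what anyone in the thread proposed to implement; it only shows that each
stream class can carry its own default while the codec stays explicit
and overridable at the call site.

```python
import io


class Utf8FileWriter(io.TextIOWrapper):
    """Hypothetical stream class: file contents default to UTF-8."""

    def __init__(self, buffer, encoding="utf-8", **kwargs):
        # The default is set here, but a caller can still pass encoding=...
        super().__init__(buffer, encoding=encoding, **kwargs)


class ShiftJISConsoleWriter(io.TextIOWrapper):
    """Hypothetical stream class: console output defaults to Shift JIS."""

    def __init__(self, buffer, encoding="shift_jis", **kwargs):
        super().__init__(buffer, encoding=encoding, **kwargs)


def encode_with(stream_cls, text, **kwargs):
    """Write text through stream_cls and return the bytes it produced."""
    raw = io.BytesIO()
    stream = stream_cls(raw, **kwargs)
    stream.write(text)
    stream.flush()
    data = raw.getvalue()
    stream.detach()  # release the underlying buffer without closing it
    return data


if __name__ == "__main__":
    sample = "\u65e5\u672c\u8a9e"  # three kanji, written as escapes
    print(encode_with(Utf8FileWriter, sample))
    print(encode_with(ShiftJISConsoleWriter, sample))
    # Overriding the class default explicitly for one stream:
    print(encode_with(Utf8FileWriter, sample, encoding="euc_jp"))
```

The design choice mirrors both sides of the discussion: the per-class
default covers Stephen's "different defaults for certain classes of
stream", while the encoding keyword keeps the choice visible wherever
the default does not fit.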