On Wed, Jan 19, 2011 at 07:11:52PM -0500, James Y Knight wrote:
> On Jan 19, 2011, at 6:44 PM, Toshio Kuratomi wrote:
> > This problem of which encoding to use is a problem that can be
> > seen on UNIX systems even now.  Try this:
> >
> >   echo 'print("hi")' > café.py
> >   convmv -f utf-8 -t latin1 café.py
> >   python3 -c 'import café'
> >
> > ASCII seems very sensible to me when faced with these ambiguities.
> >
> > Other options I can brainstorm that could be explored:
> >
> > * Specify an encoding per platform and stick to that.  (So, for instance,
> >   all module names on posix platforms would have to be utf-8).  Force
> >   translation between encoding when installing packages (But that doesn't
> >   help for people that are creating their modules using their own build
> >   scripts rather than distutils, copying the files using raw tar, etc.)
> > * Change import semantics to allow specifying the encoding of the module on
> >   the filesystem (seems really icky).
>
> None of this is unique to import -- the same exact issue occurs with
> open(u'café'). I don't see any reason why import café should be thought of
> as more of a problem, or treated any differently.
>
It's unique in several ways:

1) With open, you can specify a byte string::

    open(b'caf\xe9.py').read()

   I don't know of any way to do that with import.  This is needed when the
   filename is not compatible with your current locale.

2) import assigns a name to the module that it imports whereas open lets the
   programmer assign the name.  So even if you can specify what to use as
   a byte string for this filename on this particular filesystem you'd still
   end up with some ugly pseudo-representation of bytes when attempting to
   access it in code::

    import caf\xe9
    caf\xe9.do_something()

-Toshio
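
Editorial illustration of the two points above, as a minimal sketch rather than anything taken from the thread. It assumes a POSIX filesystem that accepts arbitrary bytes in file names; the helper names `roundtrip_by_bytes` and `importable_name` are hypothetical. The first helper addresses a file purely by its byte name, as open() allows. The second checks whether the same bytes, decoded under a given encoding, would form a valid identifier that an import statement could spell::

```python
import os
import tempfile

# The Latin-1 bytes of the file name from the example; the lone byte
# 0xE9 is not valid UTF-8.
LATIN1_NAME = b"caf\xe9.py"


def roundtrip_by_bytes(directory: str, name: bytes, text: str) -> str:
    """Write and then read a file addressed only by its byte name.

    Assumption: the filesystem accepts arbitrary bytes in names
    (typical on Linux; not guaranteed elsewhere).
    """
    path = os.path.join(os.fsencode(directory), name)
    with open(path, "w", encoding="ascii") as handle:
        handle.write(text)
    with open(path, "r", encoding="ascii") as handle:
        return handle.read()


def importable_name(stem: bytes, encoding: str) -> bool:
    """Return True if `stem` decodes under `encoding` to a valid identifier.

    Undecodable bytes become lone surrogates (surrogateescape), which
    are not legal identifier characters.
    """
    decoded = stem.decode(encoding, errors="surrogateescape")
    return decoded.isidentifier()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(roundtrip_by_bytes(tmp, LATIN1_NAME, 'print("hi")\n'))
    print(importable_name(b"caf\xe9", "latin-1"))
    print(importable_name(b"caf\xe9", "utf-8"))
```

The byte name works with open() regardless of locale, while the identifier check succeeds only when the decoding happens to match the encoding the file name was written in.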