Recently, "M.-A. Lemburg" <mal@lemburg.com> said:
> Jack Jansen wrote:
> >
> > Off on a slight tangent:
> > On Mac OS X the default 8-bit encoding is UTF8. os.listdir() handles
> > this fine and so does open(). The OS does all the hard work for
> > you [...]
> > But in Python (unix-Python we're talking here, not MacPython),
> > unicode(filename) fails, because site.encoding is "ascii".
> >
> > Would it be safe to set site.encoding to utf8 on Mac OS X by default?
>
> I'd rather suggest to use UTF-8 as default encoding in the
> subsystem layer I was talking about.

Uhm... Do you mean Py_FileSystemDefaultEncoding? Otherwise: what do
you mean?

And, if you do mean Py_FSDE, would that also work for listdir()? No, I
guess it can't because listdir() returns simple strings, so by the
time I pass them to unicode() all knowledge that they came from
listdir is gone...

Hmm, shouldn't StringObjects themselves carry an encoding field
(defaulting to sys.encoding)? That would solve quite a few issues.
read() from a binary file would return the special encoding "binary",
for instance, and then the "u" and "u#" formats could make a
distinction between character strings (which would be converted to
unicode using the encoding they carry) and binary strings (which would
be interpreted as 16-bit chars). But interning may be a showstopper,
now that I think of it...

> Making UTF-8 the default Python system encoding would have many other
> consequences -- and you'd probably lose a great deal of portability
> since UTF-8 conversion (nearly) always will succeed while ASCII can
> easily fail on other systems which use e.g. Latin-1 as native
> encoding.

What are your reasons for asserting this? If I read this correctly
this would make Python compatible to the least common denominator of
all platforms, while I think I would prefer it to allow access to all
the niceties a platform gives. On Unix you really don't have a good
guess for the encoding, but on MacOS and Windows you do...
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.cwi.nl/~jack        | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm
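
The decoding behaviour discussed in the message can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses modern Python 3, where bytes.decode() stands in for the unicode(filename) call from the thread, and the filename and the helper try_decode() are hypothetical and not part of the original discussion.

```python
# Illustration only: Python 3 bytes.decode() stands in for the
# Python 2-era unicode(filename) call discussed in the thread.
# The filename and the try_decode() helper are hypothetical.

def try_decode(raw, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None

# A filename as a UTF-8 filesystem would hand it back as a byte string.
utf8_name = "r\u00e9sum\u00e9.txt".encode("utf-8")
# The same name as a system using Latin-1 natively would store it.
latin1_name = "r\u00e9sum\u00e9.txt".encode("latin-1")

cases = [
    ("UTF-8 bytes", utf8_name, "ascii"),
    ("UTF-8 bytes", utf8_name, "utf-8"),
    ("Latin-1 bytes", latin1_name, "utf-8"),
    ("Latin-1 bytes", latin1_name, "latin-1"),
]

for label, raw, codec in cases:
    outcome = "ok" if try_decode(raw, codec) is not None else "fails"
    print(label, "decoded as", codec, "->", outcome)
```

With an "ascii" default the UTF-8 filename cannot be converted, which is the failure described for Mac OS X, while bytes written on a Latin-1 system are rejected when UTF-8 is assumed. That second case is one way a single hard-wired default can behave differently across platforms.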