> > > Mostly because there is no guarantee that every .write method will
> > > support Unicode objects.  I see two options: either a stream might
> > > declare itself as supporting unicode on output (by, say, providing
> > > a unicode attribute), or all streams are required by BDFL
> > > pronouncement to accept Unicode objects.
> >
> > I think the latter option would go a long way: many file-like
> > objects are written in C and will use the C parser markers.  These
> > can handle Unicode without problem (issuing an exception in case
> > the conversion to ASCII fails).
>
> Agreed, but BDFL pronouncement doesn't make it so: individual modules
> still have to be modified if they don't do the right thing (especially
> 3rd party modules -- we have no control there).
>
> And then, what's the point of handling Unicode if we only accept
> Unicode-encoded ASCII strings?

By accepting Unicode, I would specifically require that they, at least:

- do not crash the interpreter when being passed Unicode objects;
- attempt to perform some conversion if they do not support Unicode
  directly; if they don't know any specific conversion, the default
  conversion should be used (i.e. they should not raise a TypeError).

With these assumptions, it is possible to allow print to pass Unicode
objects to the file's write method, instead of converting the Unicode
itself.  This, in turn, enables users to replace sys.stdout with
something that supports a different encoding.

Of course, you may still get Unicode errors, since some streams may not
support all Unicode characters (e.g. because the terminal does not
support them).

Regards,
Martin
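As an illustration of the idea, here is a minimal sketch (in modern
Python, so strings are already Unicode) of a file-like wrapper whose
write method accepts text objects, attempts a conversion rather than
raising a TypeError, and encodes to a chosen encoding before passing the
bytes on.  The EncodedWriter class and its parameter names are
hypothetical; the standard library's codecs.StreamWriter and
io.TextIOWrapper play essentially this role.

```python
import io


class EncodedWriter:
    """Hypothetical file-like wrapper: write() accepts text objects,
    converts non-text arguments via str(), and encodes to a chosen
    encoding before writing to the underlying byte stream."""

    def __init__(self, raw, encoding="utf-8", errors="replace"):
        self.raw = raw            # underlying byte stream
        self.encoding = encoding
        self.errors = errors

    def write(self, obj):
        # Attempt some conversion instead of raising a TypeError.
        if not isinstance(obj, str):
            obj = str(obj)
        # Characters the target encoding cannot represent are
        # replaced rather than crashing (errors="replace").
        self.raw.write(obj.encode(self.encoding, self.errors))


# Usage: stand in for sys.stdout with an ASCII-only stream.
buf = io.BytesIO()
out = EncodedWriter(buf, encoding="ascii", errors="replace")
out.write("h\u00e9llo")   # the non-ASCII character becomes '?'
print(buf.getvalue())      # b'h?llo'
```

With errors="strict" instead, the same write would raise a
UnicodeEncodeError, matching the point above that some streams may
simply not support all Unicode characters.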