Showing content from https://mail.python.org/pipermail/python-dev/1999-November/001303.html below:

[Python-Dev] just say no...

M.-A. Lemburg mal@lemburg.com
Tue, 16 Nov 1999 14:06:39 +0100
Greg Stein wrote:
> 
> On Mon, 15 Nov 1999, M.-A. Lemburg wrote:
> > Guido van Rossum wrote:
> >...
> > > t# refers to byte-encoded data.  Multibyte encodings are explicitly
> > > designed to be passed cleanly through processing steps that handle
> > > single-byte character data, as long as they are 8-bit clean and don't
> > > do too much processing.
> >
> > Ah, ok. I interpreted 8-bit to mean: 8 bits in length, not
> > "8-bit clean" as you obviously did.
> 
> Hrm. That might be dangerous. Many of the functions that use "t#" assume
> that each character is 8 bits long, i.e. the returned length == the number
> of characters.
> 
> I'm not sure what the implications would be if you interpret the semantics
> of "t#" as multi-byte characters.

FYI, the next version of the proposal now says "s#" gives you
UTF-16 and "t#" returns UTF-8. File objects opened in text mode
will use "t#" and binary ones use "s#".
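
To make the length concern quoted above concrete, here is a minimal
sketch in present-day Python 3 (not the 1999 C-level "t#"/"s#" parser
markers, which it does not model). The sample string and variable names
are made up for illustration. It only shows that the byte length of an
encoded buffer need not equal the number of characters:

```python
# Hypothetical sample text: "naive" with a diaeresis, a space, and a euro sign.
text = "na\u00efve \u20ac"

utf8 = text.encode("utf-8")        # variable-width: 1 to 4 bytes per character
utf16 = text.encode("utf-16-le")   # 2 bytes per character for this sample (no BOM)

char_count = len(text)   # number of characters
utf8_len = len(utf8)     # number of bytes in the UTF-8 buffer
utf16_len = len(utf16)   # number of bytes in the UTF-16 buffer

print(char_count, utf8_len, utf16_len)
```

Code that treats the returned byte length as a character count would be
off for either encoding as soon as non-ASCII data shows up.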

I'll just use explicit u.encode('utf-8') calls if I want to write
UTF-8 to binary files -- perhaps everyone else should too ;-)
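
A rough sketch of that explicit approach, written for present-day
Python 3 rather than the interpreter under discussion here. The file
name and sample string are assumptions chosen for the example:

```python
import os
import tempfile

# Hypothetical sample text containing two non-ASCII characters.
u = "Gr\u00fc\u00dfe"

# Encode explicitly, so the bytes written do not depend on any
# implicit conversion done by the file object.
data = u.encode("utf-8")

# Write the encoded bytes to a file opened in binary mode.
path = os.path.join(tempfile.mkdtemp(), "example.bin")
with open(path, "wb") as f:
    f.write(data)

# Read the raw bytes back and decode them explicitly as well.
with open(path, "rb") as f:
    raw = f.read()
roundtrip = raw.decode("utf-8")
```

The point of doing it this way is that the encoding is named at the
call site instead of being implied by the file mode.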

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    45 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/




