One recommendation: for starters, I'd much rather see the bytes type
standardized without a literal notation. There should be lots of
ways to create bytes objects from string objects, with specific
explicit encodings, and those should suffice, at least initially.

I also wonder if having a b"..." literal would just add more confusion
-- bytes are not characters, but b"..." makes it appear as if they
are.

--Guido

On 2/11/06, Bengt Richter <bokr at oz.net> wrote:
> On Fri, 10 Feb 2006 21:35:26 -0800, Guido van Rossum <guido at python.org> wrote:
>
> >> On Sat, 11 Feb 2006 05:08:09 +0000 (UTC), Neil Schemenauer <nas at arctrix.com>
> >The backwards compatibility problems *seem* to be relatively minor.
> >> >I only found one instance of breakage in the standard library. Note
> >> >that my patch does not change PyObject_Str(); that would break
> >> >massive amounts of code. Instead, I introduce a new function:
> >> >PyString_New(). I'm not crazy about the name but I couldn't think
> >> >of anything better.
> >
> >On 2/10/06, Bengt Richter <bokr at oz.net> wrote:
> >> Should this not be coordinated with PEP 332?
> >
> >Probably.. But that PEP is rather incomplete. Wanna work on fixing that?
> >
> I'd be glad to add my thoughts, but first of course it's Skip's PEP,
> and Martin casts a long shadow when it comes to character coding issues
> that I suspect will have to be considered.
>
> (E.g., if there is a b'...' literal for bytes, the actual characters of
> the source code itself that the literal is being expressed in could be ascii
> or latin-1 or utf-8 or utf16le a la Microsoft, etc. UIAM, I read that the source
> is at least temporarily normalized to Unicode, and then re-encoded (except now
> for string literals?) per coding cookie or other encoding inference. (I may be
> out of date, gotta catch up).
>
> If one way or the other a string literal is in Unicode, then presumably so is
> a byte string b'...' literal -- i.e. internally u"b'...'" just before
> being turned into bytes.
>
> Should that then be an internal straight u"b'...'".encode('byte') with default ascii + escapes
> for non-ascii and non-printables, to define the full 8 bits without encoding error?
> Should unicode be encodable into byte via a specific encoding? E.g., u'abc'.encode('byte','latin1'),
> to distinguish producing a mutable byte string vs an immutable str type as with u'abc'.encode('latin1').
> (but how does this play with str being able to produce unicode? And when do these changes happen?)
> I guess I'm getting ahead of myself ;-)
>
> So I would first ask Skip what he'd like to do, and Martin for some hints on reading, to avoid
> going down paths he already knows lead to brick walls ;-) And I need to think more about PEP 349.
>
> I would propose to do the reading they suggest, and edit up a new version of pep-0332.txt
> that anyone could then improve further. I don't know about an early deadline. I don't want
> to over-commit, as time and energies vary. OTOH, as you've noticed, I could be spending my
> time more effectively ;-)
>
> I changed the thread title, and will wait for some signs from you, Skip, Martin, Neil, and I don't
> know who else might be interested...
>
> Regards,
> Bengt Richter
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
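For illustration only: a minimal sketch of creating bytes objects from string objects with explicit encodings, written in present-day Python 3 spellings rather than anything available when this thread was written. The sample text and variable names are assumptions for the example; the 'byte' codec floated in the quoted message is hypothetical and is not used here.

```python
# Sketch only: Python 3 spellings, which postdate this 2006 thread.
text = "caf\u00e9"  # four characters, the last one non-ASCII

# An explicit encoding turns a string into an immutable bytes object.
as_latin1 = text.encode("latin-1")   # one byte per character here
as_utf8 = bytes(text, "utf-8")       # the last character takes two bytes

# bytearray is the mutable counterpart and is built the same way.
mutable = bytearray(text, "latin-1")
mutable[0] = ord("C")                # in-place change, not possible on bytes

# Indexing yields small integers, not characters.
first = as_latin1[0]

print(len(as_latin1), len(as_utf8), mutable, first)
```

The same text gives different byte sequences under different encodings, which is why the encoding is always named explicitly at the point of conversion.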