> Greg Stein <gstein@lyra.org> wrote:
> > > > > > >>> import unicode
> > > > > > >>> import marshal
> > > > > > >>> u = unicode.unicode
> > > > > > >>> s = u("foo")
> > > > > > >>> data = marshal.dumps(s)
> > > > > > >>> marshal.loads(data)
> > > > > > 'f\000o\000o\000'
> > > > > > >>> type(marshal.loads(data))
> > > > > > <type 'string'>
> > > >
> > > > Why do Unicode objects implement the bf_getcharbuffer slot ?  I thought
> > > > that unicode objects use a two-byte character representation.
> >
> > Unicode objects should *not* implement the getcharbuffer slot.  Only
> > read, write, and segcount.
>
> unicode objects do not implement the getcharbuffer slot.
> here's the relevant descriptor:
>
>     static PyBufferProcs unicode_as_buffer = {
>         (getreadbufferproc) unicode_buffer_getreadbuf,
>         (getwritebufferproc) unicode_buffer_getwritebuf,
>         (getsegcountproc) unicode_buffer_getsegcount
>     };
>
> the array module uses a similar descriptor.
>
> maybe the unicode class shouldn't implement the
> buffer interface at all?  sure looks like the best way
> to avoid trivial mistakes (the current behaviour of
> fp.write(unicodeobj) is even more serious than the
> marshal glitch...)
>
> or maybe the buffer design needs an overhaul?

I think most places that should use the charbuffer interface actually
use the readbuffer interface.  This is what should be fixed.

--Guido van Rossum (home page: http://www.python.org/~guido/)
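
The glitch in the session above comes from marshal reading the Unicode object's raw internal buffer (two bytes per character) via the readbuffer slot, then handing those bytes back as an 8-bit string. The pre-2.0 `unicode` module no longer exists, but the effect can be re-created on modern Python, where encoding as UTF-16-LE yields the same two-byte-per-character layout. A minimal sketch, not the original code path:

```python
# Re-creation of the readbuffer glitch on modern Python (Python 3).
# Old marshal saw u"foo" through its raw buffer: 'f', NUL, 'o', NUL, 'o', NUL.
# UTF-16-LE encoding produces exactly that byte layout today.
s = "foo"
raw = s.encode("utf-16-le")
print(raw)            # b'f\x00o\x00o\x00' -- the '\000' bytes from the session
print(len(raw))       # 6: two bytes per character, not three
```

The fix Guido suggests is the reverse: consumers that want *text* should ask for the charbuffer interface, which a Unicode object (correctly) does not provide, rather than the readbuffer interface, which exposes raw internal bytes.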