Victor Stinner <victor.stinner at gmail.com> wrote:
> > 'c' -> UCS1
> > 'u' -> UCS2
> > 'w' -> UCS4
>
> A Unicode string is an array of code point. Another approach is to
> expose such string as an array of uint8/uint16/uint32 integers. I
> don't know if you expect to get a character / a substring when you
> read the buffer of a string object. Using Python 3.2, I get:
>
> >>> memoryview(b"abc")[0]
> b'a'
>
> ... but using Python 3.3 I get a number :-)

Yes, that's changed because officially (see struct module) the format
is unsigned bytes, which are integers in struct module syntax:

>>> unsigned_bytes = memoryview(b"abc")
>>> unsigned_bytes.format
'B'
>>> char_array = unsigned_bytes.cast('c')
>>> char_array.format
'c'
>>> char_array[0]
b'a'

Possibly the uint8/uint16/uint32 integer approach that you mention
would make more sense.


Stefan Krah
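
For illustration only, here is a rough sketch of what the
uint8/uint16/uint32 view could look like with the existing memoryview
API in Python 3.3 or later. It does not expose a str buffer directly;
it assumes the string is first encoded to native-endian UTF-16 or
UTF-32 bytes and that the native 'H' and 'I' formats are 2 and 4 bytes
wide, as on common platforms. The variable names are made up for the
example.

```python
import sys

# Bytes objects export format 'B' (unsigned bytes), so indexing gives integers.
raw = memoryview(b"abc")
first_as_int = raw[0]

# Casting to 'c' restores the Python 3.2 behaviour of length-1 bytes objects.
chars = raw.cast("c")
first_as_char = chars[0]

# Assumption: encode with the machine's byte order so that the native
# 'H' (uint16) and 'I' (uint32) formats read the code units correctly.
suffix = "le" if sys.byteorder == "little" else "be"

utf16 = "abc".encode("utf-16-" + suffix)
code_units_16 = memoryview(utf16).cast("H").tolist()

utf32 = "abc".encode("utf-32-" + suffix)
code_units_32 = memoryview(utf32).cast("I").tolist()

print(first_as_int, first_as_char, code_units_16, code_units_32)
```

With this approach every element read from the buffer is an integer
code unit, never a character or substring.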