M.-A. Lemburg wrote:
> Travis E. Oliphant wrote:
>> M.-A. Lemburg wrote:
>>> Travis E. Oliphant wrote:
>>>> ------------------------------------------------------------------------
>>>>
>>>> PEP: <unassigned>
>>>> Title: Adding data-type objects to the standard library
>>>> Attributes
>>>>
>>>>    kind -- returns the basic "kind" of the data-type. The basic kinds
>>>>        are:
>>>>    't' - bit,
>>>>    'b' - bool,
>>>>    'i' - signed integer,
>>>>    'u' - unsigned integer,
>>>>    'f' - floating point,
>>>>    'c' - complex floating point,
>>>>    'S' - string (fixed-length sequence of char),
>>>>    'U' - fixed length sequence of UCS4,
>>> Shouldn't this read "fixed length sequence of Unicode" ?!
>>> The underlying code unit format (UCS2 and UCS4) depends on the
>>> Python version.
>> Well, in NumPy 'U' always means UCS4. So, I just copied that over. See
>> my questions at the bottom which talk about how to handle this. A
>> data-format does not necessarily have to correspond to something Python
>> represents with an Object.
>
> Ok, but why are you being specific about UCS4 (which is an internal
> storage format), while you are not specific about e.g. the
> internal bit size of the integers (which could be 32 or 64 bit) ?
>

The 'kind' does not specify how "big" the data-type (data-format) is. A
number is needed to represent the number of bytes. In this case, the
'kind' does not specify how large the data-type is. You can have 'u1',
'u2', 'u4', etc.

The same is true with Unicode. You can have 10-character unicode
elements, 20-character, etc. But, we have to be clear about what a
"character" is in the data-format.

-Travis
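
To illustrate the point about 'kind' versus size, here is a minimal sketch in plain Python. It is not code from the PEP or from NumPy: the helper name `parse_typestr` and the constant `UCS4_BYTES_PER_CHAR` are hypothetical, and the sketch assumes the convention described above, where the number after the kind is a byte count, except for 'U', where it counts UCS4 characters of 4 bytes each. The bit kind 't' is left out because it is not byte-sized.

```python
import re

# Hypothetical helper for illustration only; not part of the PEP or NumPy.
# Matches one kind character followed by a decimal count, e.g. 'u4' or 'U10'.
_TYPESTR = re.compile(r"^([biufcSU])(\d+)$")

# Assumption: a 'U' element is a sequence of UCS4 characters, 4 bytes each.
UCS4_BYTES_PER_CHAR = 4


def parse_typestr(typestr):
    """Split a string such as 'u4' or 'U10' into (kind, count, nbytes)."""
    match = _TYPESTR.match(typestr)
    if match is None:
        raise ValueError(f"unrecognised type string: {typestr!r}")
    kind = match.group(1)
    count = int(match.group(2))
    # For 'U' the count is in characters; for every other kind it is in bytes.
    nbytes = count * UCS4_BYTES_PER_CHAR if kind == "U" else count
    return kind, count, nbytes


if __name__ == "__main__":
    for ts in ("u1", "u2", "u4", "U10", "U20"):
        print(ts, parse_typestr(ts))
```

The kind alone ('u' or 'U') says nothing about storage size. The count supplies that, and for 'U' the result depends on what a "character" is taken to be, which is why the format has to state it.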