Travis E. Oliphant wrote:
> Currently that means that they are "unicode" strings of basic size UCS2
> or UCS4 depending on the platform. It is this duality that has some
> people concerned. For all other data-types, NumPy allows the user to
> explicitly request a bit-width for the data-type.

Why is that a desirable property? Also: why does NumPy have support for
Unicode arrays in the first place?

> Before embarking on this journey, however, we are seeking advice from
> individuals wiser to the way of Unicode on this list.

My initial reaction is: use whatever Python uses in "NumPy Unicode". Upon
closer inspection, it is not all that clear what operations are supported
on a Unicode array, and how these operations relate to the Python Unicode
type.

In any case, I think NumPy should have only a single "Unicode array" type
(please do explain why having zero of them is insufficient).

If the purpose of the type is to interoperate with a Python unicode object,
it should use the same width (as this will allow for memcpy).

If the purpose is to support arbitrary Unicode characters, it should use
4 bytes (as two bytes are insufficient to represent arbitrary Unicode
characters).

If the purpose is something else, please explain what the purpose is.

Regards,
Martin
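
To illustrate the point that two bytes are insufficient for arbitrary Unicode characters, here is a minimal sketch. It assumes a modern Python 3 interpreter rather than the UCS2/UCS4 builds discussed above, and the variable names are illustrative only. It encodes one character from outside the Basic Multilingual Plane and counts how many 16-bit and 32-bit code units it needs.

```python
# U+1D11E (MUSICAL SYMBOL G CLEF) lies above U+FFFF, so it cannot be
# stored in a single 16-bit code unit.
ch = "\U0001D11E"

utf16 = ch.encode("utf-16-le")  # 16-bit units: requires a surrogate pair
utf32 = ch.encode("utf-32-le")  # 32-bit units: one unit per character

units16 = len(utf16) // 2  # number of 2-byte code units
units32 = len(utf32) // 4  # number of 4-byte code units

print(hex(ord(ch)), units16, units32)  # prints: 0x1d11e 2 1
```

With a fixed 2-byte element, such a character would occupy two array slots, whereas a 4-byte element holds any code point in one slot.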