On May 6, 2005, at 2:49 PM, Nicholas Bastin wrote:

> If this is the case, then we're clearly misleading users. If the
> configure script says UCS-2, then as a user I would assume that
> surrogate pairs would *not* be encoded, because I chose UCS-2, and it
> doesn't support that. I would assume that any UTF-16 string I would
> read would be transcoded into the internal type (UCS-2), and
> information would be lost. If this is not the case, then what does the
> configure option mean?

It means all the string operations treat strings as if they were UCS-2,
but that in actuality, they are UTF-16. Same as the case in the Windows
APIs and Java. That is, all string operations are essentially broken,
because they're operating on encoded bytes, not characters, but claim
to be operating on characters.

James
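
To make the point above concrete, here is a minimal sketch of what "treating UTF-16 as if it were UCS-2" looks like. It is not taken from the interpreter source. It assumes a current Python 3 (3.3 or later, where `str` is indexed by code point) and simulates a UCS-2 build by working on raw 16-bit code units. The helper names `utf16_code_units` and `decode_units` are hypothetical, introduced only for this illustration.

```python
import struct


def utf16_code_units(text):
    """Return the 16-bit code units of text encoded as UTF-16 (hypothetical helper)."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def decode_units(units):
    """Decode a list of 16-bit code units as strict UTF-16 (hypothetical helper)."""
    data = struct.pack("<%dH" % len(units), *units)
    return data.decode("utf-16-le")


# 'a', MUSICAL SYMBOL G CLEF (U+1D11E, outside the BMP), 'b'
s = "a\U0001D11Eb"
units = utf16_code_units(s)

# A "UCS-2" view counts code units, so the one non-BMP character counts twice.
code_unit_length = len(units)   # 4
code_point_length = len(s)      # 3

# Slicing by code unit can cut a surrogate pair in half.
broken = units[:2]              # 'a' plus a lone high surrogate
try:
    decode_units(broken)
    slice_is_valid = True
except UnicodeDecodeError:
    slice_is_valid = False

print(code_unit_length, code_point_length, slice_is_valid)
```

The slice `units[:2]` is the kind of result a length or index operation hands back when it counts code units while claiming to count characters: the data is still UTF-16, but the operation has split one character into an unpaired surrogate.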