"Fred L. Drake" wrote: > > On Tue, 23 May 2000, M.-A. Lemburg wrote: > > The problem is that "s" and "t" return C pointers to some > > internal data structure of the object. It has to be assured > > that this data remains intact at least as long as the object > > itself exists. > > > > AFAIK, this cannot be fixed without creating a memory leak. > > > > The "es" parser marker uses a different strategy, BTW: the > > data is copied into a buffer, thus detaching the object > > from the data. > > > > > > C APIs which want to support Unicode should be fixed to use > > > > "es" or query the object directly and then apply proper, possibly > > > > OS dependent conversion. > > > > > > for convenience, it might be a good idea to have a "wide system > > > encoding" too, and special parser markers for that purpose. > > > > > > or can we assume that all wide system API's use unicode all the > > > time? > > > > At least in all references I've seen (e.g. ODBC, wchar_t > > implementations, etc.) "wide" refers to Unicode. > > On Linux, wchar_t is 4 bytes; that's not just Unicode. Doesn't ISO > 10646 require a 32-bit space? It is, Unicode is definitely moving in the 32-bit direction. > I recall a fair bit of discussion about wchar_t when it was introduced > to ANSI C, and the character set and encoding were specifically not made > part of the specification. Making a requirement that wchar_t be Unicode > doesn't make a lot of sense, and opens up potential portability issues. > > -1 on any assumption that wchar_t is usefully portable. Ok... so could be that Fredrik has a point there, but I'm not deep enough into this to be able to comment. -- Marc-Andre Lemburg ______________________________________________________________________ Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/