"Fred L. Drake, Jr." wrote: > > M.-A. Lemburg writes: > > Such a buffer is needed to implement "s" and "s#" argument > > parsing. It's a simple requirement to support those two > > parsing markers -- there's not much to argue about, really... > > unless, of course, you want to give up Unicode object support > > for all APIs using these parsers. > > Perhaps I missed the agreement that these should always receive > UTF-8 from Unicode strings. Was this agreed upon, or has it simply > not been argued over in favor of other topics? It's been in the proposal since version 0.1. The idea is to provide a decent way of making existing script Unicode aware. > If this has indeed been agreed upon... at least it can be computed > on demand rather than at initialization! This is what I intended to implement. The <defencbuf> buffer will be filled upon the first request to the UTF-8 encoding. "s" and "s#" are examples of such requests. The buffer will remain intact until the object is destroyed (since other code could store the pointer received via e.g. "s"). > Perhaps there should be two > pointers: one to the UTF-8 buffer and one to a PyObject; if the > PyObject is there it's a "old-style" string that's actually providing > the buffer. This may or may not be a good idea; there's a lot of > memory expense for long Unicode strings converted from UTF-8 that > aren't ever converted back to UTF-8 or accessed using "s" or "s#". > Ok, I've talked myself out of that. ;-) Note that Unicode object are completely different beast ;-) String object are not touched in any way by the proposal. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 49 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/