On Mon, 2004-06-21 at 12:43, Raymond Hettinger wrote:
> Without expanding the size of the string structure, I think it is
> possible to implement something equivalent to the following proposal
> (which for clarity, does add two words to the structure). Roughly, the
> idea is:

I have written many applications that store millions of strings. I don't
know how keen I am on adding eight more bytes of storage. An eight-byte
string currently consumes 28 bytes of storage; the proposal would bump it
up to 36 bytes.

> Add two PyObject pointers to the structure, component0 and component1,
> and initialize them to NULL.
>
> Alter string_concat(a, b) to return a new string object with:
>     ob_sval = '\0'
>     ob_size = len(a) + len(b)
>     component0 = a    // be sure to INCREF(a)
>     component1 = b    // be sure to INCREF(b)

It sounds like you're proposing a string implementation known as "ropes."
See:

http://www.sgi.com/tech/stl/ropeimpl.html
http://citeseer.ist.psu.edu/boehm95ropes.html

Or did I misunderstand?

I think there's some merit to the idea. My initial reaction is that the
data structure seems a bit complex. Strings are nice and simple. Another
potential problem is that a lot of code purports to understand the
internal representation of strings; that is, it uses PyString_AS_STRING
and expects a pointer to a contiguous character buffer.

It would be pretty easy to develop an alternative string implementation
and do some performance tests without integrating it into the core. That
would identify the gross characteristics. I assume most of the strings
used internally, e.g. variable and attribute names, would almost always be
simple strings and, thus, wouldn't be affected much by a different
implementation.

Jeremy
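
P.S. To make the idea concrete, here is a minimal sketch of the kind of standalone experiment I have in mind, written in pure Python rather than C. It is only an illustration, not the proposed patch: the class name Rope and the method flatten are hypothetical, and the _left/_right fields merely stand in for the component0/component1 pointers in the quoted proposal. Concatenation just records the two operands, and the contiguous buffer is built only when something asks for it, which is roughly what code using PyString_AS_STRING would force.

```python
class Rope:
    """Hypothetical lazy-concatenation string: a leaf holds a str,
    an interior node holds two child ropes (cf. component0/component1)."""

    __slots__ = ("_flat", "_left", "_right", "_size")

    def __init__(self, flat=None, left=None, right=None):
        self._flat = flat      # contiguous text, or None until flattened
        self._left = left      # stands in for component0
        self._right = right    # stands in for component1
        if flat is not None:
            self._size = len(flat)
        else:
            self._size = len(left) + len(right)

    def __len__(self):
        # The length is known without touching the character data.
        return self._size

    def __add__(self, other):
        if not isinstance(other, Rope):
            return NotImplemented
        # Constant-time concatenation: just remember both operands.
        return Rope(left=self, right=other)

    def flatten(self):
        """Build (and cache) the contiguous string on first use."""
        if self._flat is None:
            parts = []
            stack = [self]
            # Iterative left-to-right walk, so deep chains of
            # concatenations do not hit the recursion limit.
            while stack:
                node = stack.pop()
                if node._flat is not None:
                    parts.append(node._flat)
                else:
                    stack.append(node._right)
                    stack.append(node._left)
            self._flat = "".join(parts)
            # Drop the children so their storage can be reclaimed.
            self._left = self._right = None
        return self._flat


r = Rope("spam") + Rope(" and ") + Rope("eggs")
text = r.flatten()
```

Timing something like this against plain string concatenation in a loop would only show the gross behaviour, and a pure Python version says nothing about the per-object memory cost that worries me above.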