
Showing content from https://mail.python.org/pipermail/python-dev/2000-February/002326.html below:

how to build strings from lots of slices?

[Python-Dev] RFD: how to build strings from lots of slices?
Tim Peters tim_one@email.msn.com
Sun, 27 Feb 2000 18:19:52 -0500

[/F, upon the reinvention of substring descriptors]
> ...
> a) bad memory behaviour if you slice small strings out
> of huge input strings -- which may surprise newbies.

Experts too.  Dragon has gobs of code that copies little strings via loops
in Java and C++, because Java's and MFC's descriptor-based string classes
routinely keep a megabyte string alive after you've sliced out the 3 bytes
<0.5 wink> you needed.  Last year my group finally wrote its own string
classes, to just copy the damn things.  Performance improvement was
significant (both space & time).
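
A minimal sketch of the effect, written for modern Python 3 rather than the
Python of this thread: `memoryview` stands in here for a descriptor-based
slice (an assumption for illustration, not the string classes discussed
above). The 3-byte view holds a reference to the whole megabyte buffer,
while an ordinary slice is an independent copy.

```python
# Illustration only: memoryview plays the role of a "substring descriptor".
big = bytes(1_000_000)             # the huge input string

view = memoryview(big)[10:13]      # descriptor-style slice: no bytes copied
copy = big[10:13]                  # ordinary slice: a new 3-byte object

# The view records its parent buffer, so `big` cannot be freed while
# `view` is alive, even though only 3 bytes are of interest.
parent_is_big = view.obj is big

# The copy has no link back to `big`; once `big` is unreferenced, the
# megabyte can be reclaimed and only the 3-byte copy remains.
copy_len = len(copy)
```

Dropping every other reference to `big` would still leave the megabyte
allocated for as long as `view` exists; that is the surprise described
above.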

Boehm's "cords"/"ropes" (he's the primary author of both pkgs JC mentioned)
were specifically designed to support efficient random & repeated editing of
giant mutable strings -- agree with Guido that it's overall major loss for
pedestrian uses.  Heck, why not implement strings as giant B-trees like the
Tcl text widget does <wink>.

> b) harder to interface to underlying C libraries -- the
> current string implementation guarantees that a Python
> string is also a C string (with a trailing null).

c) For apps that use oodles of short strings, the space overhead of
maintaining descriptors exceeds that of making copies.  A buddy in Sun's
Java development group tells me Java is despised for this by Major Players
in the DB world; so don't be surprised if Java eventually drops the
descriptor idea too (or, more Java-like, introduces 5 new flavors of strings
<0.7 wink>).
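
Point (c) can be checked roughly in modern CPython, again using `memoryview`
as a stand-in for a descriptor (an assumption; exact sizes vary by version
and platform, so none are quoted here). For a short slice, the bookkeeping
object is larger than a plain copy of the bytes.

```python
import sys

source = b"a moderately long source string to slice from"

view = memoryview(source)[2:5]   # descriptor-style 3-byte slice
copy = source[2:5]               # copied 3-byte slice

# sys.getsizeof reports the size of each object itself; for the view that
# is pure bookkeeping (pointer, offsets, shape), not the 3 bytes of data.
view_size = sys.getsizeof(view)
copy_size = sys.getsizeof(copy)
descriptor_costs_more = view_size > copy_size
```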

So there's no pure win here.  Python's current scheme is at least
predictable, and by everyone, with finite effort.  Agree you have a
particularly good but limited use for it, though, and Greg's suggestion of
using buffer objects under the covers is almost certainly "the right" idea.
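
One way the buffer idea might look in modern Python 3, where `memoryview`
is the successor to the old buffer object: take the slices as views and
copy each byte range only once, at join time. The helper name
`build_from_slices` and its `(start, stop)` span format are hypothetical,
chosen for this sketch; it is not the design proposed in the thread.

```python
def build_from_slices(source, spans):
    """Concatenate source[start:stop] for each (start, stop) in spans.

    Hypothetical helper: slices are taken as memoryviews, so no
    intermediate bytes objects are created; bytes.join does the one copy.
    """
    view = memoryview(source)
    return b"".join(view[start:stop] for start, stop in spans)


result = build_from_slices(b"hello world", [(0, 5), (6, 11)])
```

The views exist only while the join runs, so the result is an ordinary
independent string and nothing keeps `source` alive afterwards.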




