[Eric S. Raymond]
> ...
> What you get by going with a dictionary representation is that
> membership test becomes close to constant-time, while insertion and
> deletion become sometimes cheap and sometimes quite expensive
> (depending of course on whether you have to allocate a new
> hash bucket).

Note that Python's dicts aren't vulnerable to that: they use open addressing in a contiguous, preallocated vector. There are no malloc()s or free()s going on for lookups, deletes, or inserts, unless an insert happens to hit a "time to double the size of the vector" boundary. Deletes never cost more than a lookup; inserts never more unless the table-size boundary is hit (one in 2**N unique inserts, at which point N goes up too).

> ...
> "works for everybody" isn't really possible here. So my solution
> does the next best thing -- pick a choice of tradeoffs that isn't
> obviously worse than the alternatives and keeps things bog-simple.

I agree that this shouldn't be an either/or choice, but if it's going to be forced into that mold I have to protest that the performance of unordered lists would kill most of the set applications I've ever had. I typically have a small number of very large sets (and I'm talking not 100s, but often 100s of 1000s of elements). The relatively large memory burden of a dict representation wouldn't bother me unless I instead had 100s of 1000s of very small sets.

which-may-happen-in-my-next-life-but-not-in-this-one-ly y'rs - tim
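To make the point concrete, here is a minimal open-addressing set sketch in Python. It is not CPython's actual dict implementation (CPython uses a perturbed probe sequence and more careful tombstone accounting); it only illustrates the property Tim describes: all slots live in one contiguous preallocated list, so lookups, inserts, and deletes do no per-entry allocation, and only crossing the load-factor boundary triggers a resize. The class name and 2/3 load factor are illustrative choices, not anything from the thread.

```python
_EMPTY = object()   # slot never used
_DUMMY = object()   # tombstone for a deleted key, so probe chains stay intact

class OpenAddressingSet:
    """Toy open-addressing hash set: one contiguous slot vector."""

    def __init__(self):
        self._size = 8                      # power of two, preallocated
        self._slots = [_EMPTY] * self._size
        self._used = 0

    def _probe(self, key):
        # Linear probing for simplicity (CPython perturbs the probe).
        # Returns (slot_index, key_was_found).
        i = hash(key) % self._size
        free = None
        while True:
            slot = self._slots[i]
            if slot is _EMPTY:
                return (i if free is None else free), False
            if slot is _DUMMY:
                if free is None:
                    free = i                # remember first reusable slot
            elif slot == key:
                return i, True
            i = (i + 1) % self._size

    def __contains__(self, key):
        return self._probe(key)[1]          # never costs more than a lookup

    def add(self, key):
        i, found = self._probe(key)
        if not found:
            self._slots[i] = key
            self._used += 1
            # Only here can an insert be expensive: doubling the vector.
            if self._used * 3 >= self._size * 2:    # 2/3 load factor
                self._resize()

    def discard(self, key):
        i, found = self._probe(key)
        if found:
            self._slots[i] = _DUMMY         # no free(): just mark the slot

    def _resize(self):
        old = self._slots
        self._size *= 2
        self._slots = [_EMPTY] * self._size
        self._used = 0
        for slot in old:                    # rehash live keys only
            if slot is not _EMPTY and slot is not _DUMMY:
                i, _ = self._probe(slot)
                self._slots[i] = slot
                self._used += 1
```

A fuller version would also count tombstones toward the load factor so that heavy delete/insert churn still triggers rebuilds, but the shape above is enough to show why deletes never allocate and why only one insert in a while pays the resize cost.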