--tKW2IUtsqtDRztdT Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Guido van Rossum <guido@digicool.com>: > I understand that you have probably given this more thought than I > have recently, so I'd like to see your more detailed analysis of what > you do and don't like about Greg's proposal! I've already covered my big objection, the fact that it doesn't support the degree of polymorphic crossover one might expect with sequence types (and Greg has agreed that I have a point there). Another problem is the lack of support for mutable elements (and yes, I'm quite aware of the problems with this.) One thing I do like is the proposal for an actual set input syntax. Of course this would require that the set type become one of the builtins, with compiler support. > I haven't read your docs yet (and no time because Digital Creations is > requiring my attention all of today), but I expect that designing a > universal set type, one that is good enough to be used in all sorts of > applications, is very difficult. For "difficult" read "can't be done". This is one of those cases where no matter what implementation you choose, some of the operations you want to be cheap will be worst-case quadratic. Life is like that. So I chose a dead-simple representation and accepted quadratic times for union/intersection. > > 2. It's simple for application programmers to use. No extension module > > to integrate. > > This is a silly argument for wanting something to be added to the > core. If it's part of the core, the need for an extension is > immaterial because that extension will always be available. So > I conclude that your module is set up perfectly for a popular module > in the Vaults. :-) Reasonable point. > > 3. It's unsurprising. My set objects behave almost exactly like other > > mutable sequences, with all the same built-in methods working, except for > > the fact that you can't introduce duplicates with the mutators. > > Ah, so you see a set as an extension of a sequence. That may be the > big rift between your version and Greg's PEP: are sets more like > sequences or more like dictionaries? Indeed it is. > > 5. It's simple enough not to cause you maintainance hassles down the > > road, and even if it did the maintainer is unlikely to disappear :-). > > I'll be the judge of that, and since you prefer not to show your > source code (why is that?), I can't tell yet. No nefarious concealment going on here here :-), I've sent versions of the code to Greg and Ping already. I'll shoot you a copy too. > Having just skimmed your docs, I'm disappointed that you choose lists > as your fundamental representation type -- this makes it slow to test > for membership and hence makes intersection and union slow. Not quite. Membership test is still linear-time; so is adding and deleting elements. It's true that union and intersection are quadratic, but see below. > I suppose > that you have evidence from using this that those operations aren't > used much, or not for large sets? Exactly! In my experience the usage pattern of a class like this runs heavily to small sets (usually < 64 elements); membership tests dominate usage, with addition and deletion of elements running second and the "classical" boolean operations like union and intersection being uncommon. What you get by going with a dictionary representation is that membership test becomes close to constant-time, while insertion and deletion become sometimes cheap and sometimes quite expensive (depending of course on whether you have to allocate a new hash bucket). Given the usage pattern I described, the overall difference in performance is marginal. > This is one of the problems with > coming up with a set type for the core: it has to work for (nearly) > everybody. As I pointed out above (and someone else on the list had made the same point earlier), "works for everbody" isn't really possible here. So my solution does the next best thing -- pick a choice of tradeoffs that isn't obviously worse than the alternatives and keeps things bog-simple. -- <a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a> Alcohol still kills more people every year than all `illegal' drugs put together, and Prohibition only made it worse. Oppose the War On Some Drugs! --tKW2IUtsqtDRztdT Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="set.py" """ A set-algebra module for Python. The functions work on any sequence type and return lists. The set methods can take a set or any sequence type as an argument. They are insensitive to the types of the elements. Lists are used rather than dictionaries so the elements can be mutable. """ # Design and implementation by ESR, January 2001. def setify(list1): # Used by set constructor "Remove duplicates in sequence." res = [] for i in range(len(list1)): duplicate = 0 for j in range(i): if list1[i] == list1[j]: duplicate = 1 break if not duplicate: res.append(list1[i]) return res def union(list1, list2): # Used for | "Compute set intersection of sequences." res = list1[:] for x in list2: if not x in list1: res.append(x) return res def intersection(list1, list2): # Used for & "Compute set intersection of sequences." res = [] for x in list1: if x in list2: res.append(x) return res def difference(list1, list2): # Used for - "Compute set difference of sequences." res = [] for x in list1: if not x in list2: res.append(x) return res def symmetric_difference(list1, list2): # Used for ^ "Compute set symmetric-difference of sequences." res = [] for x in list1: if not x in list2: res.append(x) for x in list2: if not x in list1: res.append(x) return res def cartesian(list1, list2): # Used for * "Cartesian product of sequences considered as sets." res = [] for x in list1: for y in list2: res.append((x,y)) return res def equality(list1, list2): "Test sequences considered as sets for equality." if len(list1) != len(list2): return 0 for x in list1: if not x in list2: return 0 for x in list2: if not x in list1: return 0 return 1 def proper_subset(list1, list2): "Return 1 if first argument is a proper subset of second, 0 otherwise." if not len(list1) < len(list2): return 0 for x in list1: if not x in list2: return 0 return 1 def subset(list1, list2): "Return 1 if first argument is a subset of second, 0 otherwise." if not len(list1) <= len(list2): return 0 for x in list1: if not x in list2: return 0 return 1 def powerset(base): "Compute the set of all subsets of a set." powerset = [] for n in xrange(2 ** len(base)): subset = [] for e in xrange(len(base)): if n & 2 ** e: subset.append(base[e]) powerset.append(subset) return powerset class set: "Lists with set-theoretic operations." def __init__(self, value): self.elements = setify(value) def __len__(self): return len(self.elements) def __getitem__(self, ind): return self.elements[ind] def __setitem__(self, ind, val): if val not in self.elements: self.elements[ind] = val def __delitem__(self, ind): del self.elements[ind] def list(self): return self.elements def append(self, new): if new not in self.elements: self.elements.append(new) def extend(self, new): self.elements.extend(new) self.elements = setify(self.elements) def count(self, x): self.elements.count(x) def index(self, x): self.elements.index(x) def insert(self, i, x): if x not in self.elements: self.elements.index(i, x) def pop(self, i=None): self.elements.pop(i) def remove(self, x): self.elements.remove(x) def reverse(self): self.elements.reverse() def sort(self, cmp=None): self.elements.sort(cmp) def __or__(self, other): if type(other) == type(self): other = other.elements return set(union(self.elements, other)) __add__ = __or__ def __and__(self, other): if type(other) == type(self): other = other.elements return set(intersection(self.elements, other)) def __sub__(self, other): if type(other) == type(self): other = other.elements return set(difference(self.elements, other)) def __xor__(self, other): if type(other) == type(self): other = other.elements return set(symmetric_difference(self.elements, other)) def __mul__(self, other): if type(other) == type(self): other = other.elements return set(cartesian(self.elements, other)) def __eq__(self, other): if type(other) == type(self): other = other.elements return self.elements == other def __ne__(self, other): if type(other) == type(self): other = other.elements return self.elements != other def __lt__(self, other): if type(other) == type(self): other = other.elements return proper_subset(self.elements, other) def __le__(self, other): if type(other) == type(self): other = other.elements return subset(self.elements, other) def __gt__(self, other): if type(other) == type(self): other = other.elements return proper_subset(other, self.elements) def __ge__(self, other): if type(other) == type(self): other = other.elements return subset(other, self.elements) def __str__(self): res = "{" for x in self.elements: res = res + str(x) + ", " res = res[0:-2] + "}" return res def __repr__(self): return repr(self.elements) if __name__ == '__main__': a = set([1, 2, 3, 4]) b = set([1, 4]) c = set([5, 6]) d = [1, 1, 2, 1] print `d`, "setifies to", set(d) print `a`, "|", `b`, "is", `a | b` print `a`, "^", `b`, "is", `a ^ b` print `a`, "&", `b`, "is", `a & b` print `b`, "*", `c`, "is", `b * c` print `a`, '<', `b`, "is", `a < b` print `a`, '>', `b`, "is", `a > b` print `b`, '<', `c`, "is", `b < c` print `b`, '>', `c`, "is", `b > c` print "Power set of", `c`, "is", powerset(c) # end --tKW2IUtsqtDRztdT--
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4