From gstein@lyra.org Wed Mar 1 00:12:29 2000
From: gstein@lyra.org (Greg Stein)
Date: Tue, 29 Feb 2000 16:12:29 -0800 (PST)
Subject: [Python-Dev] breaking list.append()
In-Reply-To:
Message-ID:

On Wed, 1 Mar 2000, Mark Hammond wrote:
> > Why don't we simply move forward with the assumption that PythonWin
> > and Scintilla will be updated?
>
> Done :-)

hehe...

> However, I think dropping it now _is_ a little heavy handed. I decided
> to do a wider search and found a few in, e.g., Sam Rushing's
> calldll-based ODBC package.
>
> Personally, I would much prefer a warning now, and drop it later.
> _Then_ we can say we have made enough noise about it. It was only 2
> years ago that I became aware that this "feature" of append was not a
> feature at all - up until then I used it purposely, and habits are
> sometimes hard to change :-)

What's the difference between a warning and an error? If you're running
a program and it suddenly spits out a warning about a misuse of
list.append, I'd certainly see that as "the program did something
unexpected; that is an error."

But this is all moot. Guido has already said that he would be amenable
to a warning/error infrastructure which list.append could use. His
description used some awkward sentences, so I'm not sure (without
spending some brain cycles to parse the email) exactly what his desired
defaults and behavior are. But hey... the possibility is there, and is
just waiting for somebody to code it.

IMO, Guido has left an out for people that are upset with the current
hard-line approach. One of those people just needs to spend a bit of
time coming up with a patch :-)

And yes, Guido is also the Benevolent Dictator and can certainly have
his mind changed, so people can definitely continue pestering him to
back away from the hard-line approach...

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From ping@lfw.org Wed Mar 1 00:20:07 2000
From: ping@lfw.org (Ka-Ping Yee)
Date: Tue, 29 Feb 2000 18:20:07 -0600 (CST)
Subject: [Python-Dev] breaking list.append()
In-Reply-To:
Message-ID:

On Tue, 29 Feb 2000, Greg Stein wrote:
>
> What's the difference between a warning and an error? If you're
> running a program and it suddenly spits out a warning about a misuse
> of list.append, I'd certainly see that as "the program did something
> unexpected; that is an error."

A big, big difference. Perhaps to one of us, it's the minor
inconvenience of reading the error message and inserting a couple of
parentheses in the appropriate file -- but to the end user, it's the
difference between the program working (albeit noisily) and *not*
working. When the program throws an exception and stops, it is safe to
say most users will declare it broken and give up. We can't assume that
they're going to be able to figure out what to edit (or be brave enough
to try) just by reading the error message... or even what interpreter
flag to give, if errors (rather than warnings) are the default
behaviour.

-- ?!ng
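For readers joining the thread here: the change under discussion drops
the old, undocumented multi-argument form of list.append(). A minimal
sketch of the difference, per the 1.5.2/1.6 behavior discussed in this
thread:

    x = []
    x.append(1, 2)      # 1.5.2: silently appended the tuple (1, 2);
                        # 1.6: raises TypeError
    x.append((1, 2))    # the explicit spelling that 1.6 requires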
From klm@digicool.com Wed Mar 1 00:37:09 2000
From: klm@digicool.com (Ken Manheimer)
Date: Tue, 29 Feb 2000 19:37:09 -0500 (EST)
Subject: [Python-Dev] breaking list.append()
In-Reply-To:
Message-ID:

On Wed, 1 Mar 2000, Mark Hammond wrote:
> > Why don't we simply move forward with the assumption that PythonWin
> > and Scintilla will be updated?
>
> Done :-)
>
> However, I think dropping it now _is_ a little heavy handed. I decided
> to do a wider search and found a few in, e.g., Sam Rushing's
> calldll-based ODBC package.
>
> Personally, I would much prefer a warning now, and drop it later.
> _Then_ we can say we have made enough noise about it. It was only 2
> years ago that I became aware that this "feature" of append was not a
> feature at all - up until then I used it purposely, and habits are
> sometimes hard to change :-)

I agree with Mark. Why the sudden rush?? It seems to me to be unfair to
make such a change - one that will break people's code - without advance
warning, which typically is handled by a deprecation period. There *are*
going to be people who won't be informed of the change in the short span
of less than a single release. Just because it won't cause you pain
isn't a good reason to disregard the pain of those that will suffer,
particularly when you can do something relatively low-cost to avoid it.

Ken
klm@digicool.com

From gstein@lyra.org Wed Mar 1 00:57:56 2000
From: gstein@lyra.org (Greg Stein)
Date: Tue, 29 Feb 2000 16:57:56 -0800 (PST)
Subject: [Python-Dev] breaking list.append()
In-Reply-To:
Message-ID:

On Tue, 29 Feb 2000, Ken Manheimer wrote:
>...
> I agree with Mark. Why the sudden rush?? It seems to me to be unfair
> to make such a change - one that will break people's code - without
> advance warning, which typically is handled by a deprecation period.
> There *are* going to be people who won't be informed of the change in
> the short span of less than a single release. Just because it won't
> cause you pain isn't a good reason to disregard the pain of those that
> will suffer, particularly when you can do something relatively
> low-cost to avoid it.

Sudden rush?!?

Mark said he knew about it for a couple years. Same here. It was a long
while ago that .append()'s semantics were specified to "no longer"
accept multiple arguments.

I see in the HISTORY file that changes were made to Python 1.4 (October,
1996) to avoid calling append() with multiple arguments.

So, that is over three years that append() has had multiple args
deprecated. There was probably discussion even before that, but I can't
seem to find something to quote. Seems like plenty of time -- far from
rushed.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From klm@digicool.com Wed Mar 1 01:02:02 2000
From: klm@digicool.com (Ken Manheimer)
Date: Tue, 29 Feb 2000 20:02:02 -0500 (EST)
Subject: [Python-Dev] breaking list.append()
In-Reply-To:
Message-ID:

On Tue, 29 Feb 2000, Greg Stein wrote:
> On Tue, 29 Feb 2000, Ken Manheimer wrote:
> >...
> > I agree with Mark. Why the sudden rush?? It seems to me to be unfair
> > to make such a change - one that will break people's code - without
> > advance warning, which typically is handled by a deprecation period.
> > There *are* going to be people who won't be informed of the change
> > in the short span of less than a single release. Just because it
> > won't cause you pain isn't a good reason to disregard the pain of
> > those that will suffer, particularly when you can do something
> > relatively low-cost to avoid it.
>
> Sudden rush?!?
>
> Mark said he knew about it for a couple years. Same here. It was a
> long while ago that .append()'s semantics were specified to "no
> longer" accept multiple arguments.
>
> I see in the HISTORY file that changes were made to Python 1.4
> (October, 1996) to avoid calling append() with multiple arguments.
>
> So, that is over three years that append() has had multiple args
> deprecated. There was probably discussion even before that, but I
> can't seem to find something to quote. Seems like plenty of time --
> far from rushed.

None the less, for those practicing it, the incorrectness of it will be
fresh news. I would be less sympathetic with them if there was recent
warning, e.g., the schedule for changing it in the next release was part
of the current release. But if you tell somebody you're going to change
something, and then don't for a few years, you probably need to renew
the warning before you make the change. Don't you think so? Why not?

Ken
klm@digicool.com

From paul@prescod.net Wed Mar 1 02:56:33 2000
From: paul@prescod.net (Paul Prescod)
Date: Tue, 29 Feb 2000 18:56:33 -0800
Subject: [Python-Dev] breaking list.append()
References:
Message-ID: <38BC86E1.53F69776@prescod.net>

Software configuration management is HARD. Every sudden
backwards-incompatible change (warranted or not) makes it harder.
Multi-arg append is not hurting anyone as much as a sudden change to it
would. It would be better to leave append() alone and publicize its
near-term removal rather than cause random, part-time supported modules
to stop working because their programmers may be too busy to update them
right now.

So no, I'm not stepping up to do it. But I'm also saying that the better
"lazy" option is to put something in a prominent place in the
documentation and otherwise leave it alone.

As far as I am concerned, a formal warning-based deprecation mechanism
is necessary for Python's continued evolution. Perhaps we can even
expose the deprecation flag to the programmer so we can say:

    if deprecation:
        print "This module isn't supported anymore."

    if deprecation:
        print "Use method FooEx instead."

If we had a deprecation mechanism, maybe introducing new keywords would
not be quite so painful. Version x deprecates, version y adds the
keyword. Mayhap we should also deprecate implicit truncating integral
division while we are at it...

--
Paul Prescod - ISOGEN Consulting Engineer speaking for himself
"The calculus and the rich body of mathematical analysis to which it
gave rise made modern science possible, but it was the algorithm that
made possible the modern world."
 - from "Advent of the Algorithm" David Berlinski
http://www.opengroup.com/mabooks/015/0151003386.shtml
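A minimal sketch of how the deprecation flag Paul describes could work,
assuming a hypothetical interpreter switch that sets sys.deprecation
(nothing like this existed in Python at the time; the module and
function names here are made up for illustration):

    import sys

    # Hypothetical: suppose a -d command-line switch set sys.deprecation.
    try:
        deprecation = sys.deprecation
    except AttributeError:
        deprecation = 0

    if deprecation:
        print "foomodule: this module isn't supported anymore."

    def foo(*args):
        if deprecation:
            print "foo() is deprecated; use FooEx() instead."
        # ... old behavior preserved during the deprecation period ...

The point of routing everything through one flag is that warnings can be
turned off, or escalated into errors, in a single place -- the
infrastructure Guido is said above to be amenable to.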
From guido@python.org Wed Mar 1 04:11:02 2000
From: guido@python.org (Guido van Rossum)
Date: Tue, 29 Feb 2000 23:11:02 -0500
Subject: [Python-Dev] breaking list.append()
In-Reply-To: Your message of "Tue, 29 Feb 2000 18:56:33 PST." <38BC86E1.53F69776@prescod.net>
References: <38BC86E1.53F69776@prescod.net>
Message-ID: <200003010411.XAA12988@eric.cnri.reston.va.us>

> Software configuration management is HARD. Every sudden
> backwards-incompatible change (warranted or not) makes it harder.
> Multi-arg append is not hurting anyone as much as a sudden change to
> it would. It would be better to leave append() alone and publicize its
> near-term removal rather than cause random, part-time supported
> modules to stop working because their programmers may be too busy to
> update them right now.

I'm tired of this rhetoric. It's not like I'm changing existing Python
installations retroactively. I'm planning to release a new version of
Python which no longer supports certain long-obsolete and undocumented
behavior. If you maintain a non-core Python module, you should test it
against the new release and fix anything that comes up. This is why we
have an alpha and beta test cycle and even before that the CVS version.
If you are a Python user who depends on a 3rd party module, you need to
find out whether the new version is compatible with the 3rd party code
you are using, or whether there's a newer version available that solves
the incompatibility.

There are people who still run Python 1.4 (really!) because they haven't
upgraded. I don't have a problem with that -- they don't get much
support, but it's their choice, and they may not need the new features
introduced since then. I expect that lots of people won't upgrade their
Python 1.5.2 to 1.6 right away -- they'll wait until the other
modules/packages they need are compatible with 1.6. Multi-arg append
probably won't be the only reason why e.g. Digital Creations may need to
release an update to Zope for Python 1.6. Zope comes with its own
version of Python anyway, so they have control over when they make the
switch.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim_one@email.msn.com Wed Mar 1 05:04:35 2000
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 1 Mar 2000 00:04:35 -0500
Subject: [Python-Dev] Size of int across machines (was RE: Blowfish in Python?)
In-Reply-To:
Message-ID: <000201bf833b$a3b01bc0$412d153f@tim>

[Markus Stenberg]
> ...
> speed was horrendous.
>
> I think the main reason was the fact that I had to use _long ints_ for
> calculations, as the normal ints are signed, and apparently the
> bitwise operators do not work as advertised when bit 32 is set
> (= number is negative).

[Tim, takes "bitwise operators" to mean & | ^ ~, and expresses surprise]

[Markus, takes umbrage, and expresses umbrage]
> Hmm.. As far as I'm concerned, shifts for example do screw up.

Do you mean "for example" as in "there are so many let's just pick one
at random", or as in "this is the only one I've stumbled into"
<0.9 wink>?

> i.e.
>
>     0xffffffff >> 30
>
> [64bit Python: 3]
> [32bit Python: -1]
>
> As far as I'm concerned, that should _not_ happen. Or maybe it's just
> me.

I could not have guessed that your complaint was about 64-bit Python
from your "when bit 32 is set (= number is negative)" description.

The behavior shown in a Python compiled under a C in which
sizeof(long) == 4 matches the Reference Manual (see the "Integer and
long integer literals" and "shifting operations" sections). So that
can't be considered broken (you may not *like* it, but it's functioning
as designed & as documented).

The behavior under a sizeof(long) == 8 C seems more of an ill-documented
(and debatable to me too) feature. The possibility is mentioned in the
"The standard type hierarchy" section (under Numbers -> Integers ->
Plain integers) but really not fleshed out, and the "Integer and long
integer literals" section plainly contradicts it.

Python's going to have to clean up its act here -- 64-bit machines are
getting more common. There's a move afoot to erase the distinction
between Python ints and longs (in the sense of auto-converting from one
to the other under the covers, as needed). In that world, your example
would work like the "64bit Python" one. There are certainly
compatibility issues, though, in that int left shifts are end-off now,
and on a 32-bit machine any int for which i & 0x80000000 is true "is
negative" (and so sign-extends on a right shift; note that Python
guarantees sign-extending right shifts *regardless* of what the platform
C does (C doesn't define what happens here -- Python does)).

[description of pain getting a fast C-like "mod 2**32 int +" to work
too]

Python really wasn't designed for high-performance bit-fiddling, so
you're (as you've discovered) swimming upstream with every stroke. Given
that you can't write a C module here, there's nothing better than to do
the ^ & | ~ parts with ints, and fake the rest slowly & painfully.

Note that you can at least determine the size of a Python int via
inspecting sys.maxint.

sympathetically-unhelpfully y'rs  - tim
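A sketch of the slow-but-portable workaround Tim alludes to, in the
Python 1.5 syntax of the day: do the arithmetic in longs and mask back
to 32 bits, so results match on 32- and 64-bit builds (the function
names are invented for illustration):

    MASK32 = 0xffffffffL                 # i.e. (1L << 32) - 1

    def urshift32(x, n):
        # mask first, so a "negative" 32-bit int doesn't sign-extend
        return (long(x) & MASK32) >> n

    def uadd32(x, y):
        # C-like "mod 2**32" addition
        return (long(x) + long(y)) & MASK32

    print urshift32(0xffffffffL, 30)     # 3 (a long) on any platform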
From guido@python.org Wed Mar 1 05:44:10 2000
From: guido@python.org (Guido van Rossum)
Date: Wed, 01 Mar 2000 00:44:10 -0500
Subject: [Python-Dev] Re: [Patches] Reference cycle collection for Python
In-Reply-To: Your message of "Tue, 29 Feb 2000 15:34:21 MST." <20000229153421.A16502@acs.ucalgary.ca>
References: <20000229153421.A16502@acs.ucalgary.ca>
Message-ID: <200003010544.AAA13155@eric.cnri.reston.va.us>

[I don't like to cross-post to patches and python-dev, but I think this
belongs in patches because it's a followup to Neil's post there and also
in -dev because of its longer-term importance.]

Thanks for the new patches, Neil!

We had a visitor here at CNRI today, Eric Tiedemann, who had a look at
your patches before. Eric knows his way around the Scheme, Lisp and GC
literature, and presented a variant on your approach which takes the
bite out of the recursive passes.

Eric had commented earlier on Neil's previous code, and I had used the
morning to make myself familiar with Neil's code. This was relatively
easy because Neil's code is very clear.

Today, Eric proposed to do away with Neil's hash table altogether -- as
long as we're wasting memory, we might as well add 3 fields to each
container object rather than allocating the same amount in a separate
hash table. Eric expects that this will run faster, although this
obviously needs to be tried.

Container types are: dict, list, tuple, class, instance; plus
potentially user-defined container types such as kjbuckets. I have a
feeling that function objects should also be considered container types,
because of the cycle involving globals.

Eric's algorithm, then, consists of the following parts.

Each container object has three new fields: gc_next, gc_prev, and
gc_refs. (Eric calls the gc_refs "refcount-zero".)

We color objects white (initial), gray (root), black (scanned root).
(The terms are explained later; we believe we don't actually need bits
in the objects to store the color; see later.)

All container objects are chained together in a doubly-linked list --
this is the same as Neil's code except Neil does it only for dicts.
(Eric postulates that you need a list header.)

When GC is activated, all objects are colored white; we make a pass over
the entire list and set gc_refs equal to the refcount for each object.

Next, we make another pass over the list to collect the internal
references. Internal references are (just like in Neil's version)
references from other container types. In Neil's version, this was
recursive; in Eric's version, we don't need recursion, since the list
already contains all containers. So we simply visit the containers in
the list in turn, and for each one we go over all the objects it
references and subtract one from *its* gc_refs field. (Eric left out the
little detail that we need to be able to distinguish between container
and non-container objects amongst those references; this can be a flag
bit in the type field.)

Now, similar to Neil's version, all objects for which gc_refs == 0 have
only internal references, and are potential garbage; all objects for
which gc_refs > 0 are "roots". These have references to them from other
places, e.g. from globals or stack frames in the Python virtual machine.

We now start a second list, to which we will move all roots. The way to
do this is to go over the first list again and to move each object that
has gc_refs > 0 to the second list. Objects placed on the second list in
this phase are considered colored gray (roots).

Of course, some roots will reference some non-roots, which keeps those
non-roots alive. We now make a pass over the second list, where for each
object on the second list, we look at every object it references. If a
referenced object is a container and is still in the first list (colored
white) we *append* it to the second list (colored gray). Because we
append, objects thus added to the second list will eventually be
considered by this same pass; when we stop finding objects that are
still white, we stop appending to the second list, and we will
eventually terminate this pass. Conceptually, objects on the second list
that have been scanned in this pass are colored black (scanned root);
but there is no need to actually make the distinction.

(How do we know whether an object pointed to is white (in the first
list) or gray or black (in the second)? We could use an extra bitfield,
but that's a waste of space. Better: we could set gc_refs to a magic
value (e.g. 0xffffffff) when we move the object to the second list.
During the meeting, I proposed to set the back pointer to NULL; that
might work too but I think the gc_refs field is more elegant. We could
even just test for a non-zero gc_refs field; the roots moved to the
second list initially all have a non-zero gc_refs field already, and for
the objects with a zero gc_refs field we could indeed set it to
something arbitrary.)

Once we reach the end of the second list, all objects still left in the
first list are garbage. We can destroy them in a way similar to what
Neil does in his code. Neil calls PyDict_Clear on the dictionaries, and
ignores the rest. Under Neil's assumption that all cycles (that he
detects) involve dictionaries, that is sufficient. In our case, we may
need a type-specific "clear" function for containers in the type object.

We discussed more things, but not as thoroughly. Eric & Eric stressed
the importance of making excellent statistics available about the rate
of garbage collection -- probably as data structures that Python code
can read rather than debugging print statements. Eric T also sketched an
incremental version of the algorithm, usable for real-time applications.
This involved keeping the gc_refs field ("external" reference counts)
up-to-date at all times, which would require two different versions of
the INCREF/DECREF macros: one for adding/deleting a reference from a
container, and another for adding/deleting a root reference. Also, a 4th
color (red) was added, to distinguish between scanned roots and scanned
non-roots. We decided not to work this out in more detail because the
overhead cost appeared to be much higher than for the previous
algorithm; instead, we recommend that for real-time requirements the
whole GC is disabled (there should be run-time controls for this, not
just compile-time).

We also briefly discussed possibilities for generational schemes.

The general opinion was that we should first implement and test the
algorithm as sketched above, and then changes or extensions could be
made.

I was pleasantly surprised to find Neil's code in my inbox when we came
out of the meeting; I think it would be worthwhile to compare and
contrast the two approaches. (Hm, maybe there's a paper in it?)

The rest of the afternoon was spent discussing continuations, coroutines
and generators, and the fundamental reason why continuations are so hard
(the C stack getting in the way everywhere). But that's a topic for
another mail, maybe.

--Guido van Rossum (home page: http://www.python.org/~guido/)
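To make the passes concrete, here is a Python-flavored sketch of the
algorithm as described above. The helpers all_containers(), references()
and is_container(), and the attribute names, are invented for
illustration; a real implementation would live in C and walk the
doubly-linked list directly:

    MOVED = 0xffffffffL                # "magic value" marking moved objects

    def collect():
        first = all_containers()       # every container, via the linked list
        for c in first:                # pass 1: copy refcounts
            c.gc_refs = c.ob_refcnt
        for c in first:                # pass 2: subtract internal references
            for r in references(c):
                if is_container(r):
                    r.gc_refs = r.gc_refs - 1
        second = []                    # pass 3: move roots (color them gray)
        for c in first[:]:
            if c.gc_refs > 0:
                first.remove(c)
                c.gc_refs = MOVED
                second.append(c)
        i = 0                          # pass 4: scan; appending avoids recursion
        while i < len(second):
            for r in references(second[i]):
                if is_container(r) and r.gc_refs != MOVED:
                    first.remove(r)    # white -> gray
                    r.gc_refs = MOVED
                    second.append(r)
            i = i + 1
        return first                   # still white: only internal refs; garbage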
From tim_one@email.msn.com Wed Mar 1 05:57:49 2000
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 1 Mar 2000 00:57:49 -0500
Subject: need .append patch (was RE: [Python-Dev] Re: Python-checkins digest, Vol 1 #370 - 8 msgs)
In-Reply-To: <200002291302.IAA04581@eric.cnri.reston.va.us>
Message-ID: <000601bf8343$13575040$412d153f@tim>

[Tim, runs checkappend.py over the entire CVS tree, comes up with
surprisingly many remaining problems, and surprisingly few false hits]

[Guido fixes mailerdaemon.py, and argues for nuking
    Demo\tkinter\www\ (the whole directory)
    Demo\sgi\video\VcrIndex.py (unclear whether the dir or just the file)
    Demo\sgi\gl\glstdwin\glstdwin.py (stdwin-related)
    Demo\ibrowse\ibrowse.py (stdwin-related)
> All these are stdwin-related. Stdwin will also go out of service per
> 1.6.
]

Then the sooner someone nukes them from the CVS tree, the sooner my
automated hourly checkappend complaint generator will stop pestering
Python-Dev about them.

> (Conclusion: most multi-arg append() calls are *very* old,

But part of that is because we went thru this exercise a couple years
ago too, and you repaired all the ones in the less obscure parts of the
distribution then.

> or contributed by others. Sigh. I must've given bad examples long
> ago...)

Na, I doubt that. Most people will not read a language defn, at least
not until "something doesn't work". If the compiler accepts a thing,
they simply *assume* it's correct. It's pretty easy (at least for me!)
to make this particular mistake as a careless typo, so I assume that's
the "source origin" for many of these too. As soon as you *notice*
you've done it, and that nothing bad happened, the natural tendencies
are to (a) believe it's OK, and (b) save 4 keystrokes (incl. the SHIFTs)
over & over again in the glorious indefinite future.

Reminds me of a c.l.py thread a while back, wherein someone did stuff
like

    None, x, y, None = function_returning_a_4_tuple

to mean that they didn't care what the 1st & 4th values were. It
happened to work, so they did it more & more. Eventually a function
containing this mistake needed to reference None after that line, and
"suddenly for no reason at all Python stopped working".

To the extent that you're serious about CP4E, you're begging for more of
this, not less.

newbies-even-keep-on-doing-things-that-*don't*-work!-ly y'rs  - tim

From tim_one@email.msn.com Wed Mar 1 06:50:44 2000
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 1 Mar 2000 01:50:44 -0500
Subject: [Python-Dev] Unicode mapping tables
In-Reply-To: <38BBD1A2.CD29AADD@lemburg.com>
Message-ID: <000701bf834a$77acdfe0$412d153f@tim>

[M.-A. Lemburg]
> ...
> Currently, mapping tables map characters to Unicode characters and
> vice-versa. Now the .translate method will use a different kind of
> table: mapping integer ordinals to integer ordinals.

You mean that if I want to map u"a" to u"A", I have to set up some sort
of dict mapping ord(u"a") to ord(u"A")? I simply couldn't follow this.

> Question: What is more efficient: having lots of integers in a
> dictionary or lots of characters ?

My bet is "lots of integers", to reduce both space use and comparison
time.

> ...
> Something else that changed is the way .capitalize() works. The
> Unicode version uses the Unicode algorithm for it (see TechRep. 13 on
> the www.unicode.org site).

#13 is "Unicode Newline Guidelines". I assume you meant #21 ("Case
Mappings").

> Here's the new doc string:
>
>  S.capitalize() -> unicode
>
>  Return a capitalized version of S, i.e. words start with title case
>  characters, all remaining cased characters have lower case.
>
> Note that *all* characters are touched, not just the first one. The
> change was needed to get it in sync with the .iscapitalized() method
> which is based on the Unicode algorithm too.
>
> Should this change be propagated to the string implementation ?

Unicode makes distinctions among "upper case", "lower case" and "title
case", and you're trying to get away with a single "capitalize"
function. Java has separate toLowerCase, toUpperCase and toTitleCase
methods, and that's the way to do it. Whatever you do, leave .capitalize
alone for 8-bit strings -- there's no reason to break code that
currently works. "capitalize" seems a terrible choice of name for a
titlecase method anyway, because of its baggage connotations from 8-bit
strings. Since this stuff is complicated, I say it would be much better
to use the same names for these things as the Unicode and Java folk do:
there's excellent documentation elsewhere for all this stuff, and it's
Bad to make users mentally translate unique Python terminology to make
sense of the official docs.

So my vote is: leave capitalize the hell alone. Do not implement
capitalize for Unicode strings. Introduce a new titlecase method for
Unicode strings. Add a new titlecase method to 8-bit strings too.
Unicode strings should also have methods to get at uppercase and
lowercase (as Unicode defines those).

From tim_one@email.msn.com Wed Mar 1 07:36:03 2000
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 1 Mar 2000 02:36:03 -0500
Subject: [Python-Dev] Re: Python / Haskell (fwd)
In-Reply-To:
Message-ID: <000801bf8350$cc4ec580$412d153f@tim>

[Greg Wilson, quoting Philip Wadler]
> Well, what I most want is typing. But you already know that.

So invite him to contribute to the Types-SIG <0.5 wink>.

> Next after typing? Full lexical scoping for closures. I want to write:
>
>     fun x: fun y: x+y
>
> Not:
>
>     fun x: fun y, x=x: x+y
>
> Lexically scoped closures would be a big help for the embedding
> technique I described [GVW: in a posting to the Software Carpentry
> discussion list, archived at
>
> http://software-carpentry.codesourcery.com/lists/sc-discuss/msg00068.html
>
> which discussed how to build a flexible 'make' alternative in Python].

So long as we're not deathly concerned over saving a few lines of easy
boilerplate code, Python already supports this approach wonderfully well
-- but via using classes with __call__ methods instead of lexical
closures. I can't make time to debate this now, but suffice it to say
dozens on c.l.py would be delighted to. Philip is understandably
attached to the "functional way of spelling things", but Python's way is
at least as usable for this (and many -- including me -- would say more
so).

> Next after closures? Disjoint sums. E.g.,
>
>     fun area(shape) :
>         switch shape:
>             case Circle(r):
>                 return pi*r*r
>             case Rectangle(h,w):
>                 return h*w
>
> (I'm making up a Python-like syntax.) This is an alternative to the OO
> approach. With the OO approach, it is hard to add area, unless you
> modify the Circle and Rectangle class definitions.

Python allows adding new methods to classes dynamically "from the
outside" -- the original definitions don't need to be touched (although
it's certainly preferable to add new methods directly!). Take this
complaint to the extreme, and I expect you end up reinventing
multimethods (suppose you need to add an intersection(shape1, shape2)
method: N**2 nesting of "disjoint sums" starts to appear ludicrous).

In any case, the Types-SIG already seems to have decided that some form
of "typecase" stmt will be needed; see the archives for that; I expect
the use above would be considered abuse, though; Python has no "switch"
stmt of any kind today, and the use above can already be spelled via

    if isinstance(shape, Circle):
        etc
    elif isinstance(shape, Rectangle):
        etc
    else:
        raise TypeError(etc)
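Tim's last spelling, filled out as runnable code (a sketch; the Circle
and Rectangle classes are invented here to make the dispatch concrete):

    import math

    class Circle:
        def __init__(self, r):
            self.r = r

    class Rectangle:
        def __init__(self, h, w):
            self.h = h
            self.w = w

    def area(shape):
        # the "disjoint sum" spelled as isinstance() dispatch
        if isinstance(shape, Circle):
            return math.pi * shape.r * shape.r
        elif isinstance(shape, Rectangle):
            return shape.h * shape.w
        else:
            raise TypeError("unknown shape: %s" % shape)

    print area(Circle(1))           # 3.14159...
    print area(Rectangle(2, 3))     # 6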
From gstein@lyra.org Wed Mar 1 07:51:29 2000
From: gstein@lyra.org (Greg Stein)
Date: Tue, 29 Feb 2000 23:51:29 -0800 (PST)
Subject: [Python-Dev] breaking list.append()
In-Reply-To:
Message-ID:

On Tue, 29 Feb 2000, Ken Manheimer wrote:
>...
> None the less, for those practicing it, the incorrectness of it will
> be fresh news. I would be less sympathetic with them if there was
> recent warning, e.g., the schedule for changing it in the next release
> was part of the current release. But if you tell somebody you're going
> to change something, and then don't for a few years, you probably need
> to renew the warning before you make the change. Don't you think so?
> Why not?

I agree. Note that Guido posted a note to c.l.py on Monday. I believe
that meets your notification criteria.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From gstein@lyra.org Wed Mar 1 08:10:28 2000
From: gstein@lyra.org (Greg Stein)
Date: Wed, 1 Mar 2000 00:10:28 -0800 (PST)
Subject: [Python-Dev] breaking list.append()
In-Reply-To: <200003010411.XAA12988@eric.cnri.reston.va.us>
Message-ID:

On Tue, 29 Feb 2000, Guido van Rossum wrote:
> I'm tired of this rhetoric. It's not like I'm changing existing Python
> installations retroactively. I'm planning to release a new version of
> Python which no longer supports certain long-obsolete and undocumented
> behavior. If you maintain a non-core Python module, you should test it
> against the new release and fix anything that comes up. This is why we
> have an alpha and beta test cycle and even before that the CVS
> version. If you are a Python user who depends on a 3rd party module,
> you need to find out whether the new version is compatible with the
> 3rd party code you are using, or whether there's a newer version
> available that solves the incompatibility.
>
> There are people who still run Python 1.4 (really!) because they
> haven't upgraded. I don't have a problem with that -- they don't get
> much support, but it's their choice, and they may not need the new
> features introduced since then. I expect that lots of people won't
> upgrade their Python 1.5.2 to 1.6 right away -- they'll wait until the
> other modules/packages they need are compatible with 1.6. Multi-arg
> append probably won't be the only reason why e.g. Digital Creations
> may need to release an update to Zope for Python 1.6. Zope comes with
> its own version of Python anyway, so they have control over when they
> make the switch.

I wholeheartedly support his approach. Just ask Mark Hammond :-) how
many times I've said "let's change the code to make it Right; people
aren't required to upgrade [and break their code]."

Of course, his counter is that people need to upgrade to fix other,
unrelated problems. So I relax and try again later :-). But I still
maintain that they can independently grab the specific fixes and leave
the other changes we make.

Maybe it is grey, but I think this change is quite fine. Especially
given Tim's tool.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From tim_one@email.msn.com Wed Mar 1 08:22:06 2000
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 1 Mar 2000 03:22:06 -0500
Subject: [Python-Dev] breaking list.append()
In-Reply-To:
Message-ID: <000b01bf8357$3af08d60$412d153f@tim>

[Greg Stein]
> ...
> Maybe it is grey, but I think this change is quite fine. Especially
> given Tim's tool.

What the heck does Tim's one-eyed trouser snake have to do with this? I
know *it* likes to think it's the measure of all things, but, frankly,
my tool barely affects the world at all a mere two feet beyond its base.

tim-and-his-tool-think-the-change-is-a-mixed-thing-but-on-balance-
the-best-thing-ly y'rs  - tim

From fredrik@pythonware.com (Fredrik Lundh)
Subject: [Python-Dev] breaking list.append()
Message-ID: <00fb01bf8359$c8196a20$34aab5d4@hagrid>

Greg Stein wrote:
> Note that Guido posted a note to c.l.py on Monday. I believe that
> meets your notification criteria.

ahem. do you seriously believe that everyone in the Python universe
reads comp.lang.python?

afaik, most Python programmers don't.

...

so as far as I'm concerned, this was officially deprecated with Guido's
post. afaik, no official python documentation has explicitly mentioned
this (and the fact that it doesn't explicitly allow it doesn't really
matter, since the docs don't explicitly allow the x[a, b, c] syntax
either. both work in 1.5.2).

has anyone checked the recent crop of Python books, btw? the eff-bot
guide uses old syntax in two examples out of 320. how about the others?

...

sigh. running checkappend over a 50k LOC application, I just realized
that it doesn't catch a very common append pydiom.

how fun. even though 99% of all append calls are "legal", this "minor"
change will break every single application and library we have :-(

oh, wait. xmlrpclib isn't affected. always something!

From gstein@lyra.org Wed Mar 1 08:43:02 2000
From: gstein@lyra.org (Greg Stein)
Date: Wed, 1 Mar 2000 00:43:02 -0800 (PST)
Subject: [Python-Dev] breaking list.append()
In-Reply-To: <00fb01bf8359$c8196a20$34aab5d4@hagrid>
Message-ID:

On Wed, 1 Mar 2000, Fredrik Lundh wrote:
> Greg Stein wrote:
> > Note that Guido posted a note to c.l.py on Monday. I believe that
> > meets your notification criteria.
>
> ahem. do you seriously believe that everyone in the Python universe
> reads comp.lang.python?
>
> afaik, most Python programmers don't.

Now you're simply taking my comments out of context. Not a proper thing
to do. Ken said that he wanted notification along certain guidelines. I
said that I believed Guido's post did just that. Period.

Personally, I think it is fine. I also think that a CHANGES file that
arrives with 1.6 that points out the incompatibility is also fine.

>...
> sigh. running checkappend over a 50k LOC application, I just realized
> that it doesn't catch a very common append pydiom.

And which is that? Care to help out? Maybe just a little bit?

Or do you just want to talk about how bad this change is? :-(

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From gstein@lyra.org Wed Mar 1 09:01:52 2000
From: gstein@lyra.org (Greg Stein)
Date: Wed, 1 Mar 2000 01:01:52 -0800 (PST)
Subject: [Python-Dev] breaking list.append()
In-Reply-To: <000b01bf8357$3af08d60$412d153f@tim>
Message-ID:

On Wed, 1 Mar 2000, Tim Peters wrote:
> [Greg Stein]
> > ...
> > Maybe it is grey, but I think this change is quite fine. Especially
> > given Tim's tool.
>
> What the heck does Tim's one-eyed trouser snake have to do with this?
> I know *it* likes to think it's the measure of all things, but,
> frankly, my tool barely affects the world at all a mere two feet
> beyond its base.
>
> tim-and-his-tool-think-the-change-is-a-mixed-thing-but-on-balance-
> the-best-thing-ly y'rs  - tim
Heh. Now how is one supposed to respond to *that* ??!

All right. Fine. +3 cool points go to Tim. :-)

--
Greg Stein, http://www.lyra.org/

From gstein@lyra.org Wed Mar 1 09:03:32 2000
From: gstein@lyra.org (Greg Stein)
Date: Wed, 1 Mar 2000 01:03:32 -0800 (PST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src Makefile.in,1.82,1.83
In-Reply-To: <14523.56638.286603.340358@weyr.cnri.reston.va.us>
Message-ID:

On Tue, 29 Feb 2000, Fred L. Drake, Jr. wrote:
> Guido van Rossum writes:
> > You can already extract this from the updated documentation on the
> > website (which has a list of obsolete modules).
> >
> > But you're right, it would be good to be open about this. I'll think
> > about it.
>
> Note that the updated documentation isn't yet "published"; there are
> no links to it and it hasn't been checked as much as I need it to be
> before announcing it.

Isn't the documentation better than what has been released? In other
words, if you release now, how could you make things worse? If something
does turn up during a check, you can always release again...

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From fredrik@pythonware.com (Fredrik Lundh)
Subject: [Python-Dev] breaking list.append()
Message-ID: <011001bf835e$600d1da0$34aab5d4@hagrid>

Greg Stein wrote:
> On Wed, 1 Mar 2000, Fredrik Lundh wrote:
> > Greg Stein wrote:
> > > Note that Guido posted a note to c.l.py on Monday. I believe that
> > > meets your notification criteria.
> >
> > ahem. do you seriously believe that everyone in the Python universe
> > reads comp.lang.python?
> >
> > afaik, most Python programmers don't.
>
> Now you're simply taking my comments out of context. Not a proper
> thing to do. Ken said that he wanted notification along certain
> guidelines. I said that I believed Guido's post did just that. Period.

my point was that most Python programmers won't see that notification.
when these people download 1.6 final and find that all their apps just
broke, they probably won't be happy with a pointer to dejanews.

> And which is that? Care to help out? Maybe just a little bit?

this rather common pydiom:

    append = list.append
    for x in something:
        append(...)

it's used a lot where performance matters.

> Or do you just want to talk about how bad this change is? :-(

yes, I think it's bad. I've been using Python since 1.2, and no other
change has had the same consequences (wrt. time/money required to fix
it)

call me a crappy programmer if you want, but I'm sure there are others
out there who are nearly as bad. and lots of them won't be aware of this
change until someone upgrades the python interpreter on their server.

From mal@lemburg.com Wed Mar 1 08:38:52 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 01 Mar 2000 09:38:52 +0100
Subject: [Python-Dev] Unicode mapping tables
References: <000701bf834a$77acdfe0$412d153f@tim>
Message-ID: <38BCD71C.3592E6A@lemburg.com>

Tim Peters wrote:
>
> [M.-A. Lemburg]
> > ...
> > Currently, mapping tables map characters to Unicode characters and
> > vice-versa. Now the .translate method will use a different kind of
> > table: mapping integer ordinals to integer ordinals.
>
> You mean that if I want to map u"a" to u"A", I have to set up some
> sort of dict mapping ord(u"a") to ord(u"A")? I simply couldn't follow
> this.

I meant:

    'a': u'A'

vs.

    ord('a'): ord(u'A')

The latter wins ;-)

Reasoning for the first was that it allows character sequences to be
handled by the same mapping algorithm. I decided to leave those
techniques to some future implementation, since mapping integers has the
nice side-effect of also allowing sequences to be used as mapping
tables... resulting in some speedup at the cost of memory consumption.

BTW, there are now three different ways to do char translations:

1. char -> unicode (char mapping codec's decode)
2. unicode -> char (char mapping codec's encode)
3. unicode -> unicode (unicode's .translate() method)

> > Question: What is more efficient: having lots of integers in a
> > dictionary or lots of characters ?
>
> My bet is "lots of integers", to reduce both space use and comparison
> time.

Right. That's what I found too... it's "lots of integers" now :-)

> > ...
> > Something else that changed is the way .capitalize() works. The
> > Unicode version uses the Unicode algorithm for it (see TechRep. 13
> > on the www.unicode.org site).
>
> #13 is "Unicode Newline Guidelines". I assume you meant #21 ("Case
> Mappings").

Dang. You're right. Here's the URL in case someone wants to join in:

    http://www.unicode.org/unicode/reports/tr21/tr21-2.html

> > Here's the new doc string:
> >
> >  S.capitalize() -> unicode
> >
> >  Return a capitalized version of S, i.e. words start with title case
> >  characters, all remaining cased characters have lower case.
> >
> > Note that *all* characters are touched, not just the first one. The
> > change was needed to get it in sync with the .iscapitalized() method
> > which is based on the Unicode algorithm too.
> >
> > Should this change be propagated to the string implementation ?
>
> Unicode makes distinctions among "upper case", "lower case" and "title
> case", and you're trying to get away with a single "capitalize"
> function. Java has separate toLowerCase, toUpperCase and toTitleCase
> methods, and that's the way to do it.

The Unicode implementation has the corresponding:

    .upper(), .lower() and .capitalize()

They work just like .toUpperCase, .toLowerCase, .toTitleCase resp. (well
at least they should ;).

> Whatever you do, leave .capitalize alone for 8-bit strings -- there's
> no reason to break code that currently works. "capitalize" seems a
> terrible choice of name for a titlecase method anyway, because of its
> baggage connotations from 8-bit strings. Since this stuff is
> complicated, I say it would be much better to use the same names for
> these things as the Unicode and Java folk do: there's excellent
> documentation elsewhere for all this stuff, and it's Bad to make users
> mentally translate unique Python terminology to make sense of the
> official docs.

Hmm, that's an argument but it breaks the current method naming scheme
of all lowercase letters. Perhaps I should simply provide a new method
for .toTitleCase(), e.g. .title(), and leave the previous definition of
.capitalize() intact...

> So my vote is: leave capitalize the hell alone. Do not implement
> capitalize for Unicode strings. Introduce a new titlecase method for
> Unicode strings. Add a new titlecase method to 8-bit strings too.
> Unicode strings should also have methods to get at uppercase and
> lowercase (as Unicode defines those).

...looks like you're more or less on the same wave length here ;-)

Here's what I'll do:

* implement .capitalize() in the traditional way for Unicode objects
  (simply convert the first char to uppercase)
* implement u.title() to mean the same as Java's toTitleCase()
* don't implement s.title(): the reasoning here is that it would confuse
  the user when she gets different return values for the same string
  (titlecase chars usually live in higher Unicode code ranges not
  reachable in Latin-1)

Thanks for the feedback,

--
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/
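For reference, the behavior this split settles on, shown with the
Unicode string methods as they later shipped (an illustrative
interpreter transcript, not part of the original thread):

    >>> u"monty python".capitalize()    # touches only the first character
    u'Monty python'
    >>> u"monty python".title()         # TR #21-style word-by-word titlecase
    u'Monty Python'
    >>> u"monty python".upper()
    u'MONTY PYTHON'
    >>> u"abc".translate({ord(u'a'): ord(u'A')})   # ordinal-to-ordinal table
    u'Abc'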
From tim_one@email.msn.com Wed Mar 1 10:06:58 2000
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 1 Mar 2000 05:06:58 -0500
Subject: [Python-Dev] breaking list.append()
In-Reply-To: <00fb01bf8359$c8196a20$34aab5d4@hagrid>
Message-ID: <000e01bf8365$e1e0b9c0$412d153f@tim>

[/F]
> ...
> so as far as I'm concerned, this was officially deprecated with
> Guido's post. afaik, no official python documentation has explicitly
> mentioned this (and the fact that it doesn't explicitly allow it
> doesn't really matter, since the docs don't explicitly allow the
> x[a, b, c] syntax either. both work in 1.5.2).

The "Subscriptions" section of the Reference Manual explicitly allows
for

    dict[a, b, c]

and explicitly does not allow for

    sequence[a, b, c]

The "Mapping Types" section of the Library Ref does not explicitly allow
for it, though, and if you read it as implicitly allowing for it (based
on the Reference Manual's clarification of "key" syntax), you would also
have to read the Library Ref as allowing for

    dict.has_key(a, b, c)

Which 1.5.2 does allow, but which Guido very recently patched to treat
as a syntax error.

> ...
> sigh. running checkappend over a 50k LOC application, I just realized
> that it doesn't catch a very common append pydiom.

[And, later, after prodding by GregS]

> this rather common pydiom:
>
>     append = list.append
>     for x in something:
>         append(...)

This limitation was pointed out in checkappend's module docstring.
Doesn't make it any easier for you to swallow, but I needed to point out
that you didn't *have* to stumble into this the hard way.

> how fun. even though 99% of all append calls are "legal", this "minor"
> change will break every single application and library we have :-(
>
> oh, wait. xmlrpclib isn't affected. always something!

What would you like to do, then? The code will be at least as broken a
year from now, and probably more so -- unless you fix it. So this sounds
like an indirect argument for never changing Python's behavior here.
Frankly, I expect you could fix the 50K LOC in less time than it took me
to write this naggy response <0.50K wink>.

embrace-change-ly y'rs  - tim

From tim_one@email.msn.com Wed Mar 1 10:31:12 2000
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 1 Mar 2000 05:31:12 -0500
Subject: [Python-Dev] breaking list.append()
In-Reply-To: <000e01bf8365$e1e0b9c0$412d153f@tim>
Message-ID: <001001bf8369$453e9fc0$412d153f@tim>

[Tim, needing sleep]
> dict.has_key(a, b, c)
>
> Which 1.5.2 does allow, but which Guido very recently patched to treat
> as a syntax error.

No, a runtime error. haskeynanny.py, anyone?

not-me-ly y'rs  - tim
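The subscription distinction Tim cites, spelled out (a short sketch: in
a subscription, a comma-separated expression list is sugar for a tuple,
which is a fine dict key but not a valid list index):

    d = {}
    d[1, 2, 3] = "x"        # same as d[(1, 2, 3)]: the key is a tuple
    print d[(1, 2, 3)]      # prints: x

    s = [10, 20, 30]
    s[1, 2]                 # raises TypeError: lists take integer indices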
From fredrik@pythonware.com Wed Mar 1 11:14:18 2000
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 1 Mar 2000 12:14:18 +0100
Subject: [Python-Dev] breaking list.append()
References: <000e01bf8365$e1e0b9c0$412d153f@tim>
Message-ID: <002101bf836f$4a012220$f29b12c2@secret.pythonware.com>

Tim Peters wrote:
> The "Subscriptions" section of the Reference Manual explicitly allows
> for
>
>     dict[a, b, c]
>
> and explicitly does not allow for
>
>     sequence[a, b, c]

I'd thought we'd agreed that nobody reads the reference manual ;-)

> What would you like to do, then?

more time to fix it, perhaps? it's surely a minor code change, but
fixing it can be harder than you think (just witness Gerrit's bogus
patches).

after all, python might be free, but more and more people are investing
lots of money in using it [1].

> The code will be at least as broken a year from now, and probably more
> so -- unless you fix it.

sure. we've already started. but it's a lot of work, and it's quite
likely that it will take a while until we can be 100% confident that all
the changes are properly done. (not all software has a 100% complete
test suite that simply says "yes, this works" or "no, it doesn't")

1) fwiw, some poor soul over here posted a short note to the pythonworks
mailing list, mentioning that we've now fixed the price. a major
flamewar erupted, and my mailbox is now full of mail from unknowns
telling me that I must be a complete moron that doesn't understand that
Python is just a toy system, which everyone uses just because they
cannot afford anything better...

From tim_one@email.msn.com Wed Mar 1 11:26:21 2000
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 1 Mar 2000 06:26:21 -0500
Subject: [Python-Dev] Re: [Patches] Reference cycle collection for Python
In-Reply-To: <200003010544.AAA13155@eric.cnri.reston.va.us>
Message-ID: <001101bf8370$f881dfa0$412d153f@tim>

Very briefly:

[Guido]
> ...
> Today, Eric proposed to do away with Neil's hash table altogether --
> as long as we're wasting memory, we might as well add 3 fields to each
> container object rather than allocating the same amount in a separate
> hash table. Eric expects that this will run faster, although this
> obviously needs to be tried.

No, it doesn't: it will run faster.

> Container types are: dict, list, tuple, class, instance; plus
> potentially user-defined container types such as kjbuckets. I have a
> feeling that function objects should also be considered container
> types, because of the cycle involving globals.

Note that the list-migrating steps you sketch later are basically the
same as (but hairier than) the ones JimF and I worked out for M&S-on-RC
a few years ago, right down to using appending to effect a breadth-first
traversal without requiring recursion -- except M&S doesn't have to
bother accounting for sources of refcounts. Since *this* scheme does
more work per item per scan, to be as fast in the end it has to touch
less stuff than M&S. But the more kinds of types you track, the more
stuff this scheme will have to chase.

The tradeoffs are complicated & unclear, so I'll just raise an
uncomfortable meta-point: you balked at M&S the last time around because
of the apparent need for two link fields + a bit or two per object of a
"chaseable type". If that's no longer perceived as being a showstopper,
M&S should be reconsidered too. I happen to be a fan of both approaches.

The worst part of M&S-on-RC (== the one I never had a good answer for)
is that a non-cooperating extension type E can't be chased, hence
objects reachable only from objects of type E never get marked, so are
vulnerable to bogus collection. In the Neil/Toby scheme, objects of type
E merely act as sources of "external" references, so the scheme fails
safe (in the sense of never doing a bogus collection due to
non-cooperating types).

Hmm ... if both approaches converge on keeping a list of all chaseable
objects, and being careful of uncooperating types, maybe the only real
difference in the end is whether the root set is given explicitly (as in
traditional M&S) or inferred indirectly (but where "root set" has a
different meaning in the scheme you sketched).

> ...
> In our case, we may need a type-specific "clear" function for
> containers in the type object.

I think definitely, yes.

full-speed-sideways-ly y'rs  - tim

From mal@lemburg.com Wed Mar 1 10:40:36 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 01 Mar 2000 11:40:36 +0100
Subject: [Python-Dev] breaking list.append()
References: <011001bf835e$600d1da0$34aab5d4@hagrid>
Message-ID: <38BCF3A4.1CCADFCE@lemburg.com>

Fredrik Lundh wrote:
>
> Greg Stein wrote:
> > On Wed, 1 Mar 2000, Fredrik Lundh wrote:
> > > Greg Stein wrote:
> > > > Note that Guido posted a note to c.l.py on Monday. I believe
> > > > that meets your notification criteria.
> > >
> > > ahem. do you seriously believe that everyone in the Python
> > > universe reads comp.lang.python?
> > >
> > > afaik, most Python programmers don't.
> >
> > Now you're simply taking my comments out of context. Not a proper
> > thing to do. Ken said that he wanted notification along certain
> > guidelines. I said that I believed Guido's post did just that.
> > Period.
>
> my point was that most Python programmers won't see that notification.
> when these people download 1.6 final and find that all their apps just
> broke, they probably won't be happy with a pointer to dejanews.

Ditto. Anyone remember the str(2L) == '2' change, BTW? That one will
cost lots of money in case someone implemented an eShop using the common
str(2L)[:-1] idiom...

There will need to be a big warning sign somewhere that people see
*before* finding the download link. (IMHO, anyways.)

> > And which is that? Care to help out? Maybe just a little bit?
>
> this rather common pydiom:
>
>     append = list.append
>     for x in something:
>         append(...)
>
> it's used a lot where performance matters.

Same here. checkappend.py doesn't find these (a great tool BTW, thanks
Tim; I noticed that it leaks memory badly though).

> > Or do you just want to talk about how bad this change is? :-(
>
> yes, I think it's bad. I've been using Python since 1.2, and no other
> change has had the same consequences (wrt. time/money required to fix
> it)
>
> call me a crappy programmer if you want, but I'm sure there are others
> out there who are nearly as bad. and lots of them won't be aware of
> this change until someone upgrades the python interpreter on their
> server.

--
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/
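The idiom MAL is worried about, spelled out (a sketch of the behavior
change; in 1.5.2 str() of a long kept the trailing 'L', which 1.6
drops):

    # Python 1.5.2:
    #   str(2L)       -> '2L'
    #   str(2L)[:-1]  -> '2'    (the idiom: strip the trailing 'L')
    # Python 1.6:
    #   str(2L)       -> '2'
    #   str(2L)[:-1]  -> ''     (the same idiom now silently eats a digit)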
<000601bf8343$13575040$412d153f@tim> References: <000601bf8343$13575040$412d153f@tim> Message-ID: <200003011207.HAA13342@eric.cnri.reston.va.us> > To the extent that you're serious about CP4E, you're begging for more of > this, not less . Which is exactly why I am breaking multi-arg append now -- this is my last chance. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Mar 1 12:27:10 2000 From: guido@python.org (Guido van Rossum) Date: Wed, 01 Mar 2000 07:27:10 -0500 Subject: [Python-Dev] Unicode mapping tables In-Reply-To: Your message of "Wed, 01 Mar 2000 09:38:52 +0100." <38BCD71C.3592E6A@lemburg.com> References: <000701bf834a$77acdfe0$412d153f@tim> <38BCD71C.3592E6A@lemburg.com> Message-ID: <200003011227.HAA13396@eric.cnri.reston.va.us> > Here's what I'll do: > > * implement .capitalize() in the traditional way for Unicode > objects (simply convert the first char to uppercase) > * implement u.title() to mean the same as Java's toTitleCase() > * don't implement s.title(): the reasoning here is that it would > confuse the user when she get's different return values for > the same string (titlecase chars usually live in higher Unicode > code ranges not reachable in Latin-1) Huh? For ASCII at least, titlecase seems to map to ASCII; in your current implementation, only two Latin-1 characters (u'\265' and u'\377', I have no easy way to show them in Latin-1) map outside the Latin-1 range. Anyway, I would suggest to add a title() call to 8-bit strings as well; then we can do away with string.capwords(), which does something similar but different, mostly by accident. --Guido van Rossum (home page: http://www.python.org/~guido/) From jack@oratrix.nl Wed Mar 1 12:34:42 2000 From: jack@oratrix.nl (Jack Jansen) Date: Wed, 01 Mar 2000 13:34:42 +0100 Subject: [Python-Dev] Re: A warning switch? In-Reply-To: Message by Guido van Rossum , Mon, 28 Feb 2000 12:35:12 -0500 , <200002281735.MAA27771@eric.cnri.reston.va.us> Message-ID: <20000301123442.7DEF8371868@snelboot.oratrix.nl> > > What about adding a command-line switch for enabling warnings, as has > > been suggested long ago? The .append() change could then print a > > warning in 1.6alphas (and betas?), but still run, and be turned into > > an error later. > > That's better. I propose that the warnings are normally on, and that > there are flags to turn them off or thrn them into errors. Can we then please have an interface to the "give warning" call (in stead of a simple fprintf)? On the mac (and possibly also in PythonWin) it's probably better to pop up a dialog (possibly with a "don't show again" button) than do a printf which may get lost. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From guido@python.org Wed Mar 1 12:55:42 2000 From: guido@python.org (Guido van Rossum) Date: Wed, 01 Mar 2000 07:55:42 -0500 Subject: [Python-Dev] Re: A warning switch? In-Reply-To: Your message of "Wed, 01 Mar 2000 13:34:42 +0100." <20000301123442.7DEF8371868@snelboot.oratrix.nl> References: <20000301123442.7DEF8371868@snelboot.oratrix.nl> Message-ID: <200003011255.HAA13489@eric.cnri.reston.va.us> > Can we then please have an interface to the "give warning" call (in > stead of a simple fprintf)? 
On the mac (and possibly also in > PythonWin) it's probably better to pop up a dialog (possibly with a > "don't show again" button) than do a printf which may get lost. Sure. All you have to do is code it (or get someone else to code it). <0.9 wink> --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Wed Mar 1 13:32:02 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 01 Mar 2000 14:32:02 +0100 Subject: [Python-Dev] Unicode mapping tables References: <000701bf834a$77acdfe0$412d153f@tim> <38BCD71C.3592E6A@lemburg.com> <200003011227.HAA13396@eric.cnri.reston.va.us> Message-ID: <38BD1BD2.792E9B73@lemburg.com> Guido van Rossum wrote: > > > Here's what I'll do: > > > > * implement .capitalize() in the traditional way for Unicode > > objects (simply convert the first char to uppercase) > > * implement u.title() to mean the same as Java's toTitleCase() > > * don't implement s.title(): the reasoning here is that it would > > confuse the user when she get's different return values for > > the same string (titlecase chars usually live in higher Unicode > > code ranges not reachable in Latin-1) > > Huh? For ASCII at least, titlecase seems to map to ASCII; in your > current implementation, only two Latin-1 characters (u'\265' and > u'\377', I have no easy way to show them in Latin-1) map outside the > Latin-1 range. You're right, sorry for the confusion. I was thinking of other encodings like e.g. cp437 which have corresponding characters in the higher Unicode ranges. > Anyway, I would suggest to add a title() call to 8-bit strings as > well; then we can do away with string.capwords(), which does something > similar but different, mostly by accident. Ok, I'll do it this way then: s.title() will use C's toupper() and tolower() for case mapping and u.title() the Unicode routines. This will be in sync with the rest of the 8-bit string world (which is locale aware on many platforms AFAIK), even though it might not return the same string as the corresponding u.title() call. u.capwords() will be disabled in the Unicode implemetation... it wasn't even implemented for the string implementetation, so there's no breakage ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From akuchlin@mems-exchange.org Wed Mar 1 14:59:07 2000 From: akuchlin@mems-exchange.org (Andrew M. Kuchling) Date: Wed, 1 Mar 2000 09:59:07 -0500 (EST) Subject: [Python-Dev] breaking list.append() In-Reply-To: <011001bf835e$600d1da0$34aab5d4@hagrid> References: <011001bf835e$600d1da0$34aab5d4@hagrid> Message-ID: <14525.12347.120543.804804@amarok.cnri.reston.va.us> Fredrik Lundh writes: >yes, I think it's bad. I've been using Python since 1.2, >and no other change has had the same consequences >(wrt. time/money required to fix it) There are more things in 1.6 that might require fixing existing code: str(2L) returning '2', the int/long changes, the Unicode changes, and if it gets added, garbage collection -- and bugs caused by those changes might not be catchable by a nanny. IMHO it's too early to point at the .append() change as breaking too much existing code; there may be changes that break a lot more. I'd wait and see what happens once the 1.6 alphas become available; if c.l.p is filled with shrieks and groans, GvR might decide to back the offending change out. (Or he might not...) -- A.M. Kuchling http://starship.python.net/crew/amk/ I have no skills with machines. 
I fear them, and because I cannot help attributing human qualities to them, I suspect that they hate me and will kill me if they can. -- Robertson Davies, "Reading"

From klm@digicool.com Wed Mar 1 15:37:49 2000 From: klm@digicool.com (Ken Manheimer) Date: Wed, 1 Mar 2000 10:37:49 -0500 (EST) Subject: [Python-Dev] breaking list.append() In-Reply-To: Message-ID: On Tue, 29 Feb 2000, Greg Stein wrote: > On Tue, 29 Feb 2000, Ken Manheimer wrote: > >... > > None the less, for those practicing it, the incorrectness of it will be > > fresh news. I would be less sympathetic with them if there was recent > > warning, eg, the schedule for changing it in the next release was part of > > the current release. But if you tell somebody you're going to change > > something, and then don't for a few years, you probably need to renew the > > warning before you make the change. Don't you think so? Why not? > > I agree. > > Note that Guido posted a note to c.l.py on Monday. I believe that meets > your notification criteria. Actually, by "part of the current release", I meant having the deprecation/impending-deletion warning in the release notes for the release before the one where the deletion happens - saying it's being deprecated now, will be deleted next time around. Ken klm@digicool.com I mean, you tell one guy it's blue. He tells his guy it's brown, and it lands on the page sorta purple. Wavy Gravy/Hugh Romney

From Vladimir.Marangozov@inrialpes.fr Wed Mar 1 17:07:07 2000 From: Vladimir.Marangozov@inrialpes.fr (Vladimir Marangozov) Date: Wed, 1 Mar 2000 18:07:07 +0100 (CET) Subject: [Python-Dev] Re: [Patches] Reference cycle collection for Python In-Reply-To: <200003010544.AAA13155@eric.cnri.reston.va.us> from "Guido van Rossum" at Mar 01, 2000 12:44:10 AM Message-ID: <200003011707.SAA01310@python.inrialpes.fr> Guido van Rossum wrote: > > Thanks for the new patches, Neil! Thanks from me too! I notice, however, that hash_resize() still uses a malloc call instead of PyMem_NEW. Neil, please correct this in your version immediately ;-) > > We had a visitor here at CNRI today, Eric Tiedemann > , who had a look at your patches before. Eric > knows his way around the Scheme, Lisp and GC literature, and presented > a variant on your approach which takes the bite out of the recursive > passes. Avoiding the recursion is valuable, as long as we're optimizing the implementation of one particular scheme. It doesn't bother me that Neil's scheme is recursive, because I still perceive his code as a proof of concept. You're presenting here another scheme based on refcount arithmetic, generalized for all container types. The linked list implementation of this generalized scheme is not directly related to the logic. I have some suspicions about the logic, so you'll probably want to elaborate a bit more on it, and convince me that this scheme would actually work. > Today, Eric proposed to do away with Neil's hash table altogether -- > as long as we're wasting memory, we might as well add 3 fields to each > container object rather than allocating the same amount in a separate > hash table. I cannot agree so easily with this statement, but you should have expected this from me :-) If we're about to optimize storage, I have good reasons to believe that we don't need 3 additional slots per container (but 1 for gc_refs, yes). We could certainly envision allocating the containers within memory pools of 4K (just as it is done in pymalloc, and close to what we have for ints & floats). These pools would be labeled as "container's memory", they would obviously be under our control, and we'd have additional slots per pool, not per object. As long as we isolate the containers from the rest, we can enumerate them easily by walking through the pools. But I'm willing to defer this question for now, as it involves the object allocators (the builtin allocators + PyObject_NEW for extension types E -- user objects of type E would be automatically taken into account for GC if there's a flag in the type struct which identifies them as containers). > Eric expects that this will run faster, although this obviously needs > to be tried. Definitely, although I trust Eric & Tim :-) > > Container types are: dict, list, tuple, class, instance; plus > potentially user-defined container types such as kjbuckets. I have a > feeling that function objects should also be considered container > types, because of the cycle involving globals. + other extension container types. And I insist. Don't forget that we're planning to merge types and classes... > > Eric's algorithm, then, consists of the following parts. > > Each container object has three new fields: gc_next, gc_prev, and > gc_refs. (Eric calls the gc_refs "refcount-zero".) > > We color objects white (initial), gray (root), black (scanned root). > (The terms are explained later; we believe we don't actually need bits > in the objects to store the color; see later.) > > All container objects are chained together in a doubly-linked list -- > this is the same as Neil's code except Neil does it only for dicts. > (Eric postulates that you need a list header.) > > When GC is activated, all objects are colored white; we make a pass > over the entire list and set gc_refs equal to the refcount for each > object. Step 1: for all containers, c->gc_refs = c->ob_refcnt > > Next, we make another pass over the list to collect the internal > references. Internal references are (just like in Neil's version) > references from other container types. In Neil's version, this was > recursive; in Eric's version, we don't need recursion, since the list > already contains all containers. So we simply visit the containers in > the list in turn, and for each one we go over all the objects it > references and subtract one from *its* gc_refs field. (Eric left out > the little detail that we need to be able to distinguish between > container and non-container objects amongst those references; this can > be a flag bit in the type field.) Step 2: c->gc_refs = c->gc_refs - Nb_referenced_containers_from_c I guess that you realize that after this step, gc_refs can be zero or negative. I'm not sure that you collect "internal" references here (references from other container types). A list referencing 20 containers, being itself referenced by one container + one static variable + two times from the runtime stack, has an initial refcount == 4, so we'll end up with gc_refs == -16. A tuple referencing 1 list, referenced once by the stack, will end up with gc_refs == 0. Neil's scheme doesn't seem to have this "property". > > Now, similar to Neil's version, all objects for which gc_refs == 0 > have only internal references, and are potential garbage; all objects > for which gc_refs > 0 are "roots". These have references to them from > other places, e.g. from globals or stack frames in the Python virtual > machine. > Agreed, some roots have gc_refs > 0. I'm not sure that all of them have it, though... Do they? > We now start a second list, to which we will move all roots. The way > to do this is to go over the first list again and to move each object > that has gc_refs > 0 to the second list. Objects placed on the second > list in this phase are considered colored gray (roots). > Step 3: Roots with gc_refs > 0 go to the 2nd list. All c->gc_refs <= 0 stay in the 1st list. > Of course, some roots will reference some non-roots, which keeps those > non-roots alive. We now make a pass over the second list, where for > each object on the second list, we look at every object it references. > If a referenced object is a container and is still in the first list > (colored white) we *append* it to the second list (colored gray). > Because we append, objects thus added to the second list will > eventually be considered by this same pass; when we stop finding > objects that are still white, we stop appending to the second list, > and we will eventually terminate this pass. Conceptually, objects on > the second list that have been scanned in this pass are colored black > (scanned root); but there is no need to actually make the > distinction. > Step 4: Closure on reachable containers which are all moved to the 2nd list. (Assuming that the objects are checked only via their type, without involving gc_refs) > (How do we know whether an object pointed to is white (in the first > list) or gray or black (in the second)? Good question? :-) > We could use an extra bitfield, but that's a waste of space. > Better: we could set gc_refs to a magic value (e.g. 0xffffffff) when > we move the object to the second list. I doubt that this would work for the reasons mentioned above. > During the meeting, I proposed to set the back pointer to NULL; that > might work too but I think the gc_refs field is more elegant. We could > even just test for a non-zero gc_refs field; the roots moved to the > second list initially all have a non-zero gc_refs field already, and > for the objects with a zero gc_refs field we could indeed set it to > something arbitrary.) Not sure that "arbitrary" is a good choice if the differentiation is based solely on gc_refs. > > Once we reach the end of the second list, all objects still left in > the first list are garbage. We can destroy them in a way similar to the > way Neil does this in his code. Neil calls PyDict_Clear on the > dictionaries, and ignores the rest. Under Neil's assumption that all > cycles (that he detects) involve dictionaries, that is sufficient. In > our case, we may need a type-specific "clear" function for containers > in the type object. Couldn't this be done in the object's dealloc function? Note that both Neil's and this scheme assume that garbage _detection_ and garbage _collection_ are one atomic operation. I must say that I don't mind having some living garbage if it doesn't hurt my work. IOW, the criterion used for triggering the detection phase _may_ eventually differ from the one used for the collection phase. But this is where we reach the incremental approaches, implying different reasoning as a whole. My point is that the introduction of a "clear" function depends on the adopted scheme, whose logic depends on pertinent statistics on memory consumption of the cyclic garbage. To make it simple, we first need stats on memory consumption, then we can discuss objectively how to implement some particular GC scheme. I second Eric on the need for excellent statistics. > > The general opinion was that we should first implement and test the > algorithm as sketched above, and then changes or extensions could be > made.
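To make the quoted algorithm easier to follow, here is a rough, untested Python sketch of the four steps. It is purely illustrative -- the real thing would manipulate C structs with gc_next/gc_prev/gc_refs fields and a type-flag bit, not Python lists -- and the names (Container, collect) are invented for the sketch:

    # Illustrative sketch of the two-list scheme; not real code.
    class Container:
        def __init__(self, refcount):
            self.ob_refcnt = refcount   # simulated "true" refcount
            self.refs = []              # containers this one references
            self.gc_refs = 0

    def collect(containers):
        # Step 1: for all containers, c->gc_refs = c->ob_refcnt.
        for c in containers:
            c.gc_refs = c.ob_refcnt
        # Step 2: subtract one from the gc_refs of each *referenced*
        # container, cancelling out the internal references.
        for c in containers:
            for r in c.refs:
                r.gc_refs = r.gc_refs - 1
        # Step 3: gc_refs > 0 now means "referenced from outside";
        # such containers are roots (gray), the rest stay white.
        white = []
        gray = []
        for c in containers:
            if c.gc_refs > 0:
                gray.append(c)
            else:
                white.append(c)
        # Step 4: anything a root references is moved (appended) to
        # the gray list; appending means this same loop scans it too.
        for c in gray:
            for r in c.refs:
                if r in white:
                    white.remove(r)
                    gray.append(r)
        # Everything still white is unreachable cyclic garbage.
        return white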
I'd like to see it discussed first in conjunction with (1) the possibility of having a proprietary malloc, (2) the envisioned type/class unification. Perhaps I'm getting too deep, but once something gets in, it's difficult to take it out, even when a better solution is found subsequently. Although I'm enthusiastic about this work on GC, I'm not in a position to evaluate the true benefits of the proposed schemes, as I still don't have a basis for evaluating how much garbage my program generates and whether it hurts the interpreter compared to its overall memory consumption. > > I was pleasantly surprised to find Neil's code in my inbox when we > came out of the meeting; I think it would be worthwhile to compare and > contrast the two approaches. (Hm, maybe there's a paper in it?) I'm all for it! -- Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252

From jeremy@cnri.reston.va.us Wed Mar 1 17:53:13 2000 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Wed, 1 Mar 2000 12:53:13 -0500 (EST) Subject: [Python-Dev] Re: [Patches] Reference cycle collection for Python In-Reply-To: <200003011707.SAA01310@python.inrialpes.fr> References: <200003010544.AAA13155@eric.cnri.reston.va.us> <200003011707.SAA01310@python.inrialpes.fr> Message-ID: <14525.22793.963077.707198@goon.cnri.reston.va.us> >>>>> "VM" == Vladimir Marangozov writes: [">>" == Guido explaining Eric Tiedemann's GC design] >> Next, we make another pass over the list to collect the internal >> references. Internal references are (just like in Neil's >> version) references from other container types. In Neil's >> version, this was recursive; in Eric's version, we don't need >> recursion, since the list already contains all containers. So we >> simply visit the containers in the list in turn, and for each one >> we go over all the objects it references and subtract one from >> *its* gc_refs field. (Eric left out the little detail that we >> need to be able to distinguish between container and >> non-container objects amongst those references; this can be a >> flag bit in the type field.) VM> Step 2: c->gc_refs = c->gc_refs - VM> Nb_referenced_containers_from_c VM> I guess that you realize that after this step, gc_refs can be VM> zero or negative. I think Guido's explanation is slightly ambiguous. When he says, "subtract one from *its* gc_refs field," he means subtract one from the _contained_ object's gc_refs field. VM> I'm not sure that you collect "internal" references here VM> (references from other container types). A list referencing 20 VM> containers, being itself referenced by one container + one VM> static variable + two times from the runtime stack, has an VM> initial refcount == 4, so we'll end up with gc_refs == -16. The strategy is not that the container's gc_refs is decremented once for each object it contains. Rather, the container decrements each contained object's gc_refs by one. So you should never end up with gc_refs < 0. >> During the meeting, I proposed to set the back pointer to NULL; >> that might work too but I think the gc_refs field is more >> elegant. We could even just test for a non-zero gc_refs field; >> the roots moved to the second list initially all have a non-zero >> gc_refs field already, and for the objects with a zero gc_refs >> field we could indeed set it to something arbitrary.) I believe we discussed this further and concluded that setting the back pointer to NULL would not work. If we make the second list doubly-linked (like the first one), it is trivial to end GC by swapping the first and second lists. If we've zapped the NULL pointer, then we have to go back and re-set them all. Jeremy

From mal@lemburg.com Wed Mar 1 18:44:58 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 01 Mar 2000 19:44:58 +0100 Subject: [Python-Dev] Unicode Snapshot 2000-03-01 Message-ID: <38BD652A.EA2EB0A3@lemburg.com> There is a new Unicode implementation snapshot available at the secret URL. It contains quite a few small changes to the internal APIs, doc strings for all methods and some new methods (e.g. .title()) on the Unicode and the string objects. The code page mappings are now integer->integer which should make them more performant. Some of the C codec APIs have changed, so you may need to adapt code that already uses these (Fredrik ?!). Still missing is an MSVC project file... haven't gotten around yet to build one. The code does compile on WinXX though, as Finn Bock told me in private mail. Please try out the new stuff... Most interesting should be the code in Lib/codecs.py as it provides a very high level interface to all those builtin codecs. BTW: I would like to implement a .readline() method using only the .read() method as a basis. Does anyone have a good idea on how this could be done without buffering ? (Unicode has a slightly larger choice of line break chars than C; the .splitlines() method will deal with these) Gotta run... -- Marc-Andre Lemburg ______________________________________________________________________ Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

From Fredrik Lundh" <011001bf835e$600d1da0$34aab5d4@hagrid> <14525.12347.120543.804804@amarok.cnri.reston.va.us> Message-ID: <034a01bf83b3$e97c8620$34aab5d4@hagrid> Andrew M. Kuchling wrote: > There are more things in 1.6 that might require fixing existing code: > str(2L) returning '2', the int/long changes, the Unicode changes, and > if it gets added, garbage collection -- and bugs caused by those > changes might not be catchable by a nanny. hey, you make it sound like "1.6" should really be "2.0" ;-)

From nascheme@enme.ucalgary.ca Wed Mar 1 19:29:02 2000 From: nascheme@enme.ucalgary.ca (nascheme@enme.ucalgary.ca) Date: Wed, 1 Mar 2000 12:29:02 -0700 Subject: [Python-Dev] Re: [Patches] Reference cycle collection for Python In-Reply-To: <200003011707.SAA01310@python.inrialpes.fr>; from marangoz@python.inrialpes.fr on Wed, Mar 01, 2000 at 06:07:07PM +0100 References: <200003010544.AAA13155@eric.cnri.reston.va.us> <200003011707.SAA01310@python.inrialpes.fr> Message-ID: <20000301122902.B7773@acs.ucalgary.ca> On Wed, Mar 01, 2000 at 06:07:07PM +0100, Vladimir Marangozov wrote: > Guido van Rossum wrote: > > Once we reach the end of the second list, all objects still left in > > the first list are garbage. We can destroy them in a way similar to the > > way Neil does this in his code. Neil calls PyDict_Clear on the > > dictionaries, and ignores the rest. Under Neil's assumption that all > > cycles (that he detects) involve dictionaries, that is sufficient. In > > our case, we may need a type-specific "clear" function for containers > > in the type object. > > Couldn't this be done in the object's dealloc function? No, I don't think so. The object still has references to it. You have to be careful about how you break cycles so that memory is not accessed after it is freed. Neil -- "If elected mayor, my first act will be to kill the whole lot of you, and burn your town to cinders!" -- Groundskeeper Willie

From gvwilson@nevex.com Wed Mar 1 20:19:30 2000 From: gvwilson@nevex.com (gvwilson@nevex.com) Date: Wed, 1 Mar 2000 15:19:30 -0500 (EST) Subject: [Python-Dev] DDJ article on Python GC Message-ID: Jon Erickson (editor-in-chief) of "Doctor Dobb's Journal" would like an article on what's involved in adding garbage collection to Python. Please email me if you're interested in tackling it... Thanks, Greg

From fdrake@acm.org Wed Mar 1 20:37:49 2000 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 1 Mar 2000 15:37:49 -0500 (EST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src Makefile.in,1.82,1.83 In-Reply-To: References: <14523.56638.286603.340358@weyr.cnri.reston.va.us> Message-ID: <14525.32669.909212.716484@weyr.cnri.reston.va.us> Greg Stein writes: > Isn't the documentation better than what has been released? In other > words, if you release now, how could you make things worse? If something > does turn up during a check, you can always release again... Releasing is still somewhat tedious, and I don't want to ask people to do several substantial downloads & installs. So far, a major navigation bug has been found in the test version I posted (just now fixed online); *that's* why I don't like to release too hastily! I don't think waiting two more weeks is a problem. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives

From guido@python.org Wed Mar 1 22:53:26 2000 From: guido@python.org (Guido van Rossum) Date: Wed, 01 Mar 2000 17:53:26 -0500 Subject: [Python-Dev] DDJ article on Python GC In-Reply-To: Your message of "Wed, 01 Mar 2000 15:19:30 EST." References: Message-ID: <200003012253.RAA16056@eric.cnri.reston.va.us> > Jon Erickson (editor-in-chief) of "Doctor Dobb's Journal" would like an > article on what's involved in adding garbage collection to Python. Please > email me if you're interested in tackling it... I might -- although I should get Neil, Eric and Tim as co-authors. I'm halfway implementing the scheme that Eric showed yesterday. It's very elegant, but I don't have an idea about its impact on performance yet. Say hi to Jon -- we've met a few times. I liked his March editorial, having just read the same book and had the same feeling of "wow, an open source project in the 19th century!" --Guido van Rossum (home page: http://www.python.org/~guido/)

From mhammond@skippinet.com.au Wed Mar 1 23:09:23 2000 From: mhammond@skippinet.com.au (Mark Hammond) Date: Thu, 2 Mar 2000 10:09:23 +1100 Subject: [Python-Dev] Re: A warning switch? In-Reply-To: <200003011255.HAA13489@eric.cnri.reston.va.us> Message-ID: > > Can we then please have an interface to the "give warning" call > > (instead of a simple fprintf)? On the mac (and possibly also in > > PythonWin) it's probably better to pop up a dialog (possibly with a > > "don't show again" button) than do a printf which may get lost. > > Sure. All you have to do is code it (or get someone else to code it). How about just having either a "sys.warning" function, or maybe even a sys.stdwarn stream? Then a simple C API to call this, and we are done :-) sys.stdwarn sounds OK - it just defaults to sys.stdout, so the Mac and Pythonwin etc should "just work" by sending the output wherever sys.stdout goes today... Mark.
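For concreteness, here is roughly how little code the sys.stdwarn idea would take (sys.stdwarn and warn() are hypothetical names sketching the proposal, not an actual 1.6 API):

    import sys

    # Default the warning stream to stdout; a GUI such as PythonWin or
    # the Mac IDE could rebind sys.stdwarn to any object with a write()
    # method, e.g. one that pops up a dialog instead.
    if not hasattr(sys, 'stdwarn'):
        sys.stdwarn = sys.stdout

    def warn(message):
        sys.stdwarn.write("warning: %s\n" % message)

    warn("list.append() with more than one argument is deprecated")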
From tim_one@email.msn.com Thu Mar 2 05:08:39 2000 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 2 Mar 2000 00:08:39 -0500 Subject: [Python-Dev] breaking list.append() In-Reply-To: <38BCF3A4.1CCADFCE@lemburg.com> Message-ID: <001001bf8405$5f9582c0$732d153f@tim>

[/F]
> append = list.append
> for x in something:
>     append(...)

[M.-A. Lemburg]
> Same here. checkappend.py doesn't find these

As detailed in a c.l.py posting, I have yet to find a single instance of this actually called with multiple arguments. Pointing out that it's *possible* isn't the same as demonstrating it's an actual problem. I'm quite willing to believe that it is, but haven't yet seen evidence of it. For whatever reason, people seem much (and, in my experience so far, infinitely ) more prone to make the list.append(1, 2, 3) error than the maybethisisanappend(1, 2, 3) error.

> (a great tool BTW, thanks Tim; I noticed that it leaks memory badly
> though).

Which Python? Which OS? How do you know? What were you running it over? Using 1.5.2 under Win95, according to wintop, & over the whole CVS tree, the total (code + data) virtual memory allocated to it peaked at about 2Mb a few seconds into the run, and actually decreased as time went on. So, akin to the bound method multi-argument append problem, the "checkappend leak problem" is something I simply have no reason to believe . Check your claim again? checkappend.py itself obviously creates no cycles or holds on to any state across files, so if you're seeing a leak it must be a bug in some other part of the version of Python + std libraries you're using. Maybe a new 1.6 bug? Something you did while adding Unicode? Etc. Tell us what you were running. Has anyone else seen a leak?

From tim_one@email.msn.com Thu Mar 2 05:50:19 2000 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 2 Mar 2000 00:50:19 -0500 Subject: [Python-Dev] str vs repr at prompt again (FW: String printing behavior?) Message-ID: <001401bf840b$3177ba60$732d153f@tim> Another unsolicited testimonial that countless users are oppressed by auto-repr (as opposed to auto-str) at the interpreter prompt. Just trying to keep a once-hot topic from going stone cold forever . -----Original Message----- From: python-list-admin@python.org [mailto:python-list-admin@python.org] On Behalf Of Ted Drain Sent: Wednesday, March 01, 2000 5:42 PM To: python-list@python.org Subject: String printing behavior? Hi all, I've got a question about the string printing behavior. If I define a function as:

    >>> def foo():
    ...     return "line1\nline2"
    >>> foo()
    'line1\012line2'
    >>> print foo()
    line1
    line2
    >>>

It seems to me that the default printing behavior for strings should match the behavior of the print routine. I realize that some people may want to see embedded control codes, but I would advocate a separate method for printing raw byte sequences. We are using the python interactive prompt as a pseudo-matlab like user interface and the current printing behavior is very confusing to users. It also means that functions that return text (like help routines) must print the string rather than returning it. Returning the string is much more flexible because it allows the string to be captured easily and redirected. Any thoughts? Ted -- Ted Drain Jet Propulsion Laboratory Ted.Drain@jpl.nasa.gov -- http://www.python.org/mailman/listinfo/python-list

From mal@lemburg.com Thu Mar 2 07:42:33 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 02 Mar 2000 08:42:33 +0100 Subject: [Python-Dev] breaking list.append() References: <001001bf8405$5f9582c0$732d153f@tim> Message-ID: <38BE1B69.E0B88B41@lemburg.com> Tim Peters wrote:
>
> [/F]
> > append = list.append
> > for x in something:
> >     append(...)
>
> [M.-A. Lemburg]
> > Same here. checkappend.py doesn't find these
>
> As detailed in a c.l.py posting, I have yet to find a single instance of
> this actually called with multiple arguments. Pointing out that it's
> *possible* isn't the same as demonstrating it's an actual problem. I'm
> quite willing to believe that it is, but haven't yet seen evidence of it.

Haven't had time to check this yet, but I'm pretty sure there are some instances of this idiom in my code. Note that I did in fact code like this on purpose: it saves a tuple construction for every append, which can make a difference in tight loops... > For whatever reason, people seem much (and, in my experience so far, > infinitely ) more prone to make the > > list.append(1, 2, 3) > > error than the > > maybethisisanappend(1, 2, 3) > > error. Of course... still there are hidden instances of the problem which are yet to be revealed. For my own code the situation is even worse, since I sometimes did:

    add = list.append
    for x in y:
        add(x,1,2)

> > (a great tool BTW, thanks Tim; I noticed that it leaks memory badly > > though). > > Which Python? Which OS? How do you know? What were you running it over? That's Python 1.5 on Linux2. I let the script run over a large lib directory and my projects directory. In the projects directory the script consumed as much as 240MB of process size. > Using 1.5.2 under Win95, according to wintop, & over the whole CVS tree, the > total (code + data) virtual memory allocated to it peaked at about 2Mb a few > seconds into the run, and actually decreased as time went on.
So, akin to > the bound method multi-argument append problem, the "checkappend leak > problem" is something I simply have no reason to believe . Check your > claim again? checkappend.py itself obviously creates no cycles or holds on > to any state across files, so if you're seeing a leak it must be a bug in > some other part of the version of Python + std libraries you're using. > Maybe a new 1.6 bug? Something you did while adding Unicode? Etc. Tell us > what you were running. I'll try the same thing again using Python 1.5.2 and the CVS version. -- Marc-Andre Lemburg ______________________________________________________________________ Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

From mal@lemburg.com Thu Mar 2 07:46:49 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 02 Mar 2000 08:46:49 +0100 Subject: [Python-Dev] breaking list.append() References: <001001bf8405$5f9582c0$732d153f@tim> <38BE1B69.E0B88B41@lemburg.com> Message-ID: <38BE1C69.C8A9E6B0@lemburg.com> "M.-A. Lemburg" wrote: > > > > (a great tool BTW, thanks Tim; I noticed that it leaks memory badly > > > though). > > > > Which Python? Which OS? How do you know? What were you running it over? > > That's Python 1.5 on Linux2. I let the script run over > a large lib directory and my projects directory. In the > projects directory the script consumed as much as 240MB > of process size. > > > Using 1.5.2 under Win95, according to wintop, & over the whole CVS tree, the > > total (code + data) virtual memory allocated to it peaked at about 2Mb a few > > seconds into the run, and actually decreased as time went on. So, akin to > > the bound method multi-argument append problem, the "checkappend leak > > problem" is something I simply have no reason to believe . Check your > > claim again? checkappend.py itself obviously creates no cycles or holds on > > to any state across files, so if you're seeing a leak it must be a bug in > > some other part of the version of Python + std libraries you're using. > > Maybe a new 1.6 bug? Something you did while adding Unicode? Etc. Tell us > > what you were running. > > I'll try the same thing again using Python 1.5.2 and the CVS version. Using the Unicode patched CVS version there's no leak anymore. Couldn't find a 1.5.2 version on my machine... I'll build one later. -- Marc-Andre Lemburg ______________________________________________________________________ Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

From guido@python.org Thu Mar 2 15:32:32 2000 From: guido@python.org (Guido van Rossum) Date: Thu, 02 Mar 2000 10:32:32 -0500 Subject: [Python-Dev] Design question: call __del__ only after successful __init__? Message-ID: <200003021532.KAA17088@eric.cnri.reston.va.us> I was looking at the code that invokes __del__, with the intent to implement a feature from Java: in Java, a finalizer is only called once per object, even if calling it makes the object live longer. To implement this, we need a flag in each instance that means "__del__ was called". I opened the creation code for instances, looking for the right place to set the flag. I then realized that it might be smart, now that we have this flag anyway, to set it to "true" during initialization. There are a number of exits from the initialization where the object is created but not fully initialized, where the new object is DECREF'ed and NULL is returned. When such an exit is taken, __del__ is called on an incompletely initialized object! Example:

    >>> class C:
    ...     def __del__(self):
    ...         print "deleting", self
    ...
    >>> x = C(1)
    !--> deleting <__main__.C instance at 1686d8>
    Traceback (innermost last):
      File "<stdin>", line 1, in ?
    TypeError: this constructor takes no arguments
    >>>

Now I have a choice to make. If the class has an __init__, should I clear the flag only after __init__ succeeds? This means that if __init__ raises an exception, __del__ is never called. This is an incompatibility. It's possible that someone has written code that relies on __del__ being called even when __init__ fails halfway, and then their code would break. But it is just as likely that calling __del__ on a partially uninitialized object is a bad mistake, and I am doing all these cases a favor by not calling __del__ when __init__ failed! Any opinions? If nobody speaks up, I'll make the change. --Guido van Rossum (home page: http://www.python.org/~guido/)

From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Thu Mar 2 16:44:00 2000 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Thu, 2 Mar 2000 11:44:00 -0500 (EST) Subject: [Python-Dev] Design question: call __del__ only after successful __init__? References: <200003021532.KAA17088@eric.cnri.reston.va.us> Message-ID: <14526.39504.36065.657527@anthem.cnri.reston.va.us> >>>>> "GvR" == Guido van Rossum writes: GvR> Now I have a choice to make. If the class has an __init__, GvR> should I clear the flag only after __init__ succeeds? This GvR> means that if __init__ raises an exception, __del__ is never GvR> called. This is an incompatibility. It's possible that GvR> someone has written code that relies on __del__ being called GvR> even when __init__ fails halfway, and then their code would GvR> break. It reminds me of the separation between object allocation and initialization in ObjC. GvR> But it is just as likely that calling __del__ on a partially GvR> uninitialized object is a bad mistake, and I am doing all GvR> these cases a favor by not calling __del__ when __init__ GvR> failed! GvR> Any opinions? If nobody speaks up, I'll make the change. I think you should set the flag right before you call __init__(), i.e. after (nearly all) the C level initialization has occurred. Here's why: your "favor" can easily be accomplished by Python constructs in the __init__():

    class MyBogo:
        def __init__(self):
            self.get_delified = 0
            do_sumtin_exceptional()
            self.get_delified = 1

        def __del__(self):
            if self.get_delified:
                ah_sweet_release()

-Barry

From gstein@lyra.org Thu Mar 2 17:14:35 2000 From: gstein@lyra.org (Greg Stein) Date: Thu, 2 Mar 2000 09:14:35 -0800 (PST) Subject: [Python-Dev] Design question: call __del__ only after successful __init__? In-Reply-To: <200003021532.KAA17088@eric.cnri.reston.va.us> Message-ID: On Thu, 2 Mar 2000, Guido van Rossum wrote: >... > But it is just as likely that calling __del__ on a partially > uninitialized object is a bad mistake, and I am doing all these cases > a favor by not calling __del__ when __init__ failed! > > Any opinions? If nobody speaks up, I'll make the change. +1 on calling __del__ IFF __init__ completes successfully. Cheers, -g -- Greg Stein, http://www.lyra.org/

From jeremy@cnri.reston.va.us Thu Mar 2 17:15:14 2000 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Thu, 2 Mar 2000 12:15:14 -0500 (EST) Subject: [Python-Dev] str vs repr at prompt again (FW: String printing behavior?) In-Reply-To: <001401bf840b$3177ba60$732d153f@tim> References: <001401bf840b$3177ba60$732d153f@tim> Message-ID: <14526.41378.374653.497993@goon.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: TP> Another unsolicited testimonial that countless users are TP> oppressed by auto-repr (as opposed to auto-str) at the TP> interpreter prompt. Just trying to keep a once-hot topic from TP> going stone cold forever . [Signature from the included message:] >> -- Ted Drain Jet Propulsion Laboratory Ted.Drain@jpl.nasa.gov -- This guy is probably a rocket scientist. We want the language to be useful for everybody, not just rocket scientists. Jeremy

From guido@python.org Thu Mar 2 22:45:37 2000 From: guido@python.org (Guido van Rossum) Date: Thu, 02 Mar 2000 17:45:37 -0500 Subject: [Python-Dev] Design question: call __del__ only after successful __init__? In-Reply-To: Your message of "Thu, 02 Mar 2000 11:44:00 EST." <14526.39504.36065.657527@anthem.cnri.reston.va.us> References: <200003021532.KAA17088@eric.cnri.reston.va.us> <14526.39504.36065.657527@anthem.cnri.reston.va.us> Message-ID: <200003022245.RAA20265@eric.cnri.reston.va.us> > >>>>> "GvR" == Guido van Rossum writes: > > GvR> Now I have a choice to make. If the class has an __init__, > GvR> should I clear the flag only after __init__ succeeds? This > GvR> means that if __init__ raises an exception, __del__ is never > GvR> called. This is an incompatibility. It's possible that > GvR> someone has written code that relies on __del__ being called > GvR> even when __init__ fails halfway, and then their code would > GvR> break. [Barry] > It reminds me of the separation between object allocation and > initialization in ObjC. Is that good or bad?
> GvR> But it is just as likely that calling __del__ on a partially > GvR> uninitialized object is a bad mistake, and I am doing all > GvR> these cases a favor by not calling __del__ when __init__ > GvR> failed! > > GvR> Any opinions? If nobody speaks up, I'll make the change. > > I think you should set the flag right before you call __init__(), > i.e. after (nearly all) the C level initialization has occurred. > Here's why: your "favor" can easily be accomplished by Python > constructs in the __init__():
>
>     class MyBogo:
>         def __init__(self):
>             self.get_delified = 0
>             do_sumtin_exceptional()
>             self.get_delified = 1
>
>         def __del__(self):
>             if self.get_delified:
>                 ah_sweet_release()

But the other behavior (call __del__ even when __init__ fails) can also easily be accomplished in Python:

    class C:
        def __init__(self):
            try:
                ...stuff that may fail...
            except:
                self.__del__()
                raise

        def __del__(self):
            ...cleanup...

I believe that in almost all cases the programmer would be happier if __del__ wasn't called when their __init__ fails. This makes it easier to write a __del__ that can assume that all the object's fields have been properly initialized. In my code, typically when __init__ fails, this is a symptom of a really bad bug (e.g. I just renamed one of __init__'s arguments and forgot to fix all references), and I don't care much about cleanup behavior. --Guido van Rossum (home page: http://www.python.org/~guido/)

From bwarsaw@cnri.reston.va.us Thu Mar 2 22:52:31 2000 From: bwarsaw@cnri.reston.va.us (bwarsaw@cnri.reston.va.us) Date: Thu, 2 Mar 2000 17:52:31 -0500 (EST) Subject: [Python-Dev] Design question: call __del__ only after successful __init__? References: <200003021532.KAA17088@eric.cnri.reston.va.us> <14526.39504.36065.657527@anthem.cnri.reston.va.us> <200003022245.RAA20265@eric.cnri.reston.va.us> Message-ID: <14526.61615.362973.624022@anthem.cnri.reston.va.us> >>>>> "GvR" == Guido van Rossum writes: GvR> But the other behavior (call __del__ even when __init__ GvR> fails) can also easily be accomplished in Python: It's a fair cop. GvR> I believe that in almost all cases the programmer would be GvR> happier if __del__ wasn't called when their __init__ fails. GvR> This makes it easier to write a __del__ that can assume that GvR> all the object's fields have been properly initialized. That's probably fine; I don't have strong feelings either way. -Barry P.S. Interesting what X-Oblique-Strategy was randomly inserted in this message (but I'm not sure which approach is more "explicit" :). -Barry

From tim_one@email.msn.com Fri Mar 3 05:38:59 2000 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 3 Mar 2000 00:38:59 -0500 Subject: [Python-Dev] Design question: call __del__ only after successful __init__? In-Reply-To: <200003021532.KAA17088@eric.cnri.reston.va.us> Message-ID: <000001bf84d2$c711e2e0$092d153f@tim> [Guido] > I was looking at the code that invokes __del__, with the intent to > implement a feature from Java: in Java, a finalizer is only called > once per object, even if calling it makes the object live longer. Why? That is, in what way is this an improvement over current behavior? Note that Java is a bit subtle: a finalizer is only called once by magic; explicit calls "don't count". The Java rules add up to quite a confusing mish-mash. Python's rules are *currently* clearer. I deal with possible exceptions in Python constructors the same way I do in C++ and Java: if there's a destructor, don't put anything in __init__ that may raise an uncaught exception. Anything dangerous is moved into a separate .reset() (or .clear() or ...) method. This works well in practice. > To implement this, we need a flag in each instance that means "__del__ > was called". At least . > I opened the creation code for instances, looking for the right place > to set the flag. I then realized that it might be smart, now that we > have this flag anyway, to set it to "true" during initialization. There > are a number of exits from the initialization where the object is created > but not fully initialized, where the new object is DECREF'ed and NULL is > returned. When such an exit is taken, __del__ is called on an > incompletely initialized object! I agree *that* isn't good. Taken on its own, though, it argues for adding an "instance construction completed" flag that __del__ later checks, as if its body were:

    if self.__instance_construction_completed:
        body

That is, the problem you've identified here could be addressed directly. > Now I have a choice to make. If the class has an __init__, should I > clear the flag only after __init__ succeeds? This means that if > __init__ raises an exception, __del__ is never called. This is an > incompatibility. It's possible that someone has written code that > relies on __del__ being called even when __init__ fails halfway, and > then their code would break. > > But it is just as likely that calling __del__ on a partially > uninitialized object is a bad mistake, and I am doing all these cases > a favor by not calling __del__ when __init__ failed! > > Any opinions? If nobody speaks up, I'll make the change. I'd be in favor of fixing the actual problem; I don't understand the point to the rest of it, especially as it has the potential to break existing code and I don't see a compensating advantage (surely not compatibility w/ JPython -- JPython doesn't invoke __del__ methods at all by magic, right? or is that changing, and that's what's driving this?). too-much-magic-is-dizzying-ly y'rs - tim

From bwarsaw@cnri.reston.va.us Fri Mar 3 05:50:16 2000 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 3 Mar 2000 00:50:16 -0500 (EST) Subject: [Python-Dev] Design question: call __del__ only after successful __init__? References: <200003021532.KAA17088@eric.cnri.reston.va.us> <000001bf84d2$c711e2e0$092d153f@tim> Message-ID: <14527.21144.9421.958311@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: TP> (surely not compatibility w/ JPython -- JPython doesn't invoke TP> __del__ methods at all by magic, right? or is that changing, TP> and that's what's driving this?). No, JPython doesn't invoke __del__ methods by magic, and I don't have any plans to change that. -Barry

From ping@lfw.org Fri Mar 3 09:00:21 2000 From: ping@lfw.org (Ka-Ping Yee) Date: Fri, 3 Mar 2000 01:00:21 -0800 (PST) Subject: [Python-Dev] Design question: call __del__ only after successful __init__? In-Reply-To: Message-ID: On Thu, 2 Mar 2000, Greg Stein wrote: > On Thu, 2 Mar 2000, Guido van Rossum wrote: > >... > > But it is just as likely that calling __del__ on a partially > > uninitialized object is a bad mistake, and I am doing all these cases > > a favor by not calling __del__ when __init__ failed! > > > > Any opinions? If nobody speaks up, I'll make the change. > > +1 on calling __del__ IFF __init__ completes successfully. That would be my vote as well.
What convinced me of this is the following: If it's up to the implementation of __del__ to deal with a problem that happened during initialization, you only know about the problem with very coarse granularity. It's a pain (or even impossible) to then rediscover the information you need to recover adequately. If on the other hand you deal with the problem in __init__, then you have much better control over what is happening, because you can position try/except blocks precisely where you need them to deal with specific potential problems. Each block can take care of its case appropriately, and re-raise if necessary. In general, it seems to me that what you want to do when __init__ runs afoul is going to be different from what you want to do to take care of object cleanup in __del__. So it doesn't belong there -- it belongs in an except: clause in __init__. Even though it's an incompatibility, i really think this is the right behaviour. -- ?!ng "To be human is to continually change. Your desire to remain as you are is what ultimately limits you." -- The Puppet Master, Ghost in the Shell

From guido@python.org Fri Mar 3 16:13:16 2000 From: guido@python.org (Guido van Rossum) Date: Fri, 03 Mar 2000 11:13:16 -0500 Subject: [Python-Dev] Design question: call __del__ only after successful __init__? In-Reply-To: Your message of "Fri, 03 Mar 2000 00:38:59 EST." <000001bf84d2$c711e2e0$092d153f@tim> References: <000001bf84d2$c711e2e0$092d153f@tim> Message-ID: <200003031613.LAA21571@eric.cnri.reston.va.us> > [Guido] > > I was looking at the code that invokes __del__, with the intent to > > implement a feature from Java: in Java, a finalizer is only called > > once per object, even if calling it makes the object live longer. [Tim] > Why? That is, in what way is this an improvement over current behavior? > > Note that Java is a bit subtle: a finalizer is only called once by magic; > explicit calls "don't count". Of course. Same in my proposal. But I wouldn't call it "by magic" -- just "on behalf of the garbage collector". > The Java rules add up to quite a confusing mish-mash. Python's rules are > *currently* clearer. I don't find the Java rules confusing. It seems quite useful that the GC promises to call the finalizer at most once -- this can simplify the finalizer logic. (Otherwise it may have to ask itself, "did I clean this already?" and leave notes for itself.) Explicit finalizer calls are always a mistake and thus "don't count" -- the response to that should in general be "don't do that" (unless you have particularly stupid callers -- or very fearful lawyers :-). > I deal with possible exceptions in Python constructors the same way I do in > C++ and Java: if there's a destructor, don't put anything in __init__ that > may raise an uncaught exception. Anything dangerous is moved into a > separate .reset() (or .clear() or ...) method. This works well in practice. Sure, but the rule "if __init__ fails, __del__ won't be called" means that we don't have to program our __init__ or __del__ quite so defensively. Most people who design a __del__ probably assume that __init__ has run to completion. The typical scenario (which has happened to me! And I *implemented* the damn thing!) is this: __init__ opens a file and assigns it to an instance variable; __del__ closes the file. This is tested a few times and it works great. Now in production the file somehow unexpectedly fails to be openable. Sure, the programmer should've expected that, but she didn't. Now, at best, the failed __del__ creates an additional confusing error message on top of the traceback generated by IOError. At worst, the failed __del__ could wreck the original traceback. Note that I'm not proposing to change the C level behavior; when a Py
