I'm bringing the issue on python-dev, because it may change your coding habits in the future. If you have any comments or something... ---------------------------------------------------------------------- Subject: Re: [Patches] Re: Garbage collection patches for Python To: nascheme@enme.ucalgary.ca Date: Wed, 9 Feb 2000 15:59:48 +0100 (CET) Cc: Vladimir.Marangozov@inrialpes.fr, toby@puckish.demon.co.uk, pygc@freemail.hu, patches@python.org (Python Patches), guido@cnri.reston.va.us (Guido van Rossum) In-Reply-To: <20000205092822.A2204@acs.ucalgary.ca> from "nascheme@enme.ucalgary.ca" at Feb 05, 2000 09:28:23 AM Reply-To: Vladimir.Marangozov@inrialpes.fr X-Mailer: ELM [version 2.5 PL1] Content-Length: 6224 nascheme@enme.ucalgary.ca wrote: > > On Sat, Feb 05, 2000 at 03:00:59PM +0100, Vladimir Marangozov wrote: > > Neil, could you please resubmit your 1st GC patch with the disclaimer, > > so that we can focus on this 1st step, test the patch and fold it into > > CVS? > > No problem. I've included my Boehm-Demers-Weiser garbage > collector patch too as I have improved it to use autoconfig. The > gc patch is small compared to the malloc cleanup. Thanks. After a closer look at the patch, I see a couple of problems with it. In particular, it mixes different API families. Let me start from the start: The goal is to remove Python's dependency on the standard POSIX interface (malloc/realloc/free) so that we can cleanly and easily plug in the future a "proprietary" mem manager, other than the one in the C library. For this purpose, the Python core should be patched and "cleaned" to use one or more of the following APIs: 1) PyMem_MALLOC 2) PyMem_NEW PyMem_REALLOC ==> PyMem_RESIZE PyMem_FREE PyMem_DEL PyMem_XDEL The proposed Guido's augmented version of 1) which is raw mem interface. more "Pythonic" and contains some safety additions, like _PyMem_EXTRA). This one (in mymalloc.h) should be defined in terms of 1). 3) Py_Malloc 4) PyMem_Malloc 5) _PyObject_New Py_Realloc PyMem_Realloc PyObject_NEW Py_Free PyMem_Free PyObject_NEW_VAR, ... These are implemented using 1) and/or 2) It seems to me that presently nobody uses the wrappers 3) and 4), (except one single occurence of Py_Malloc, lost in _localemodule.c), because these wrappers cause an additional func call overhead... All these APIs are certainly redundant and the added value of each of them has to be weighted one again, but for now, let's assume that we have them all (without more argumentation - we could throw away some of them later). <EXTRABOLD> The rule of thumb in this situation is: Every chunk of memory must be manupulated via the same malloc family. </EXTRABOLD> That is, if one gets some piece of mem through PyMem_MALLOC (1), s/he must release that memory with PyMem_FREE (1). Accordingly, if one gets a chunk via PyMem_MALLOC (1), that chunk *should not* be released with PyMem_DEL (2). (which is what Neil's patch does, not to mention that (2) is not defined in terms of (1). So we must be careful here. In particular, when patching a file, we must figure out which family (-ies) is (are) used in order to perform the "malloc cleanup" the right way. That is, we have to inspect every file one by one, once again. ----------- With the above theory in mind, I see that there are a couple of "problematic" files in the current snapshot. I remember that I have identified them with my allocator (pymalloc) either. (I'll use some excerpts of Neil's patch to illustrate the problems I'm talking about) 1) pypcre.c This one is "buggy" for sure. The PCRE code allocates memory through the proprietary functions "pcre_malloc/pcre_free" (which default to malloc/free), so I really don't see why there's code inside mentioning "free" and not "pcre_free". a) Would this code be dropped with the inclusion of /F's re engine in 1.6? b) If the answer to a) is negative, I hope AMK & the String-SIG will help us on this (it's a hairy code :-). > diff -cr Python-cvs/Modules/pypcre.c Python-gc/Modules/pypcre.c > *** Python-cvs/Modules/pypcre.c Sat Jul 17 12:13:10 1999 > --- Python-gc/Modules/pypcre.c Sat Feb 5 09:00:47 2000 > *************** > *** 3057,3068 **** > static int free_stack(match_data *md) > { > /* Free any stack space that was allocated by the call to match(). */ > ! if (md->off_num) free(md->off_num); > ! if (md->offset_top) free(md->offset_top); > ! if (md->r1) free(md->r1); > ! if (md->r2) free(md->r2); > ! if (md->eptr) free((char *)md->eptr); > ! if (md->ecode) free((char *)md->ecode); > return 0; > } > > --- 3057,3068 ---- > static int free_stack(match_data *md) > { > /* Free any stack space that was allocated by the call to match(). */ > ! if (md->off_num) PyMem_DEL(md->off_num); > ! if (md->offset_top) PyMem_DEL(md->offset_top); > ! if (md->r1) PyMem_DEL(md->r1); > ! if (md->r2) PyMem_DEL(md->r2); > ! if (md->eptr) PyMem_DEL((char *)md->eptr); > ! if (md->ecode) PyMem_DEL((char *)md->ecode); > return 0; > } 2) _tkinter.c I remember from my experiments that this one was really a mess from a malloc point of view because it caused lots of mixed API calls (once the core allocator was changed). I have to look at it carefully once again... 3) readline.c Neil, what's this? Could you elaborate on this one? > > diff -cr Python-cvs/Modules/readline.c Python-gc/Modules/readline.c > *** Python-cvs/Modules/readline.c Sat Feb 5 09:00:14 2000 > --- Python-gc/Modules/readline.c Sat Feb 5 09:00:47 2000 > *************** > *** 41,46 **** > --- 41,63 ---- > extern int (*PyOS_InputHook)(); > extern char *(*PyOS_ReadlineFunctionPointer) Py_PROTO((char *)); > > + /* convert a string allocated by malloc to one allocated by PyMem_MALLOC */ > + static char * > + malloc2PyMem(char *p) > + { > + char *py; > + int n; > + if (p == NULL) { > + return p; > + } else { > + n = strlen(p); > + py = PyMem_MALLOC(n); > + strncpy(py, p, n+1); > + free(p); > + return py; > + } > + } > + All in all, this patch isn't acceptable for me and I suggest that we prepare another one which respects "the rule", after demystifying some of the issues mentioned above. Note that introducing PyMem_MALLOC would constitute an additional constraint for C Python coders, who're used to malloc/free. In order to pave the way for alternatives to libc's malloc, we must propose a clean solution. Hence it would be good to collect as much opinions/reactions on the subject as possible and settle on an interface which would be convenient for everybody. -- Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4