Showing content from http://mail.python.org/pipermail/python-dev/2002-March/021540.html below:

[Python-Dev] Debug entry points for PyMalloc

Tim Peters tim.one@comcast.net
Thu, 21 Mar 2002 01:18:24 -0500
The thing I've dreaded most about switching to pymalloc is losing the
invaluable memory-corruption clues supplied by the Microsoft debug-build
malloc.  On more than one occasion, they've found wild stores, out-of-bounds
reads, reads of uninitialized memory, and reads of free()ed memory in
Python.  It does this by spraying special bytes all over malloc'ed memory at
various times, then checking the bytes for sanity at free() and realloc()
times.

This kind of stuff is going to be pure hell under pymalloc, because there's
no padding at all between chunks pymalloc passes out, and pymalloc stores
valid addresses at the start of free()ed memory.  So a wild store probably
can't be detected as any sort of memory corruption, it will simply overwrite
part of some other end-user object -- or corrupt pymalloc's internal
pointers linking free()ed memory (and pymalloc simply goes nuts then).
Several months ago Martin and I took turns thinking about a memory overwrite
problem in the Unicode stuff that showed up under pymalloc, and it was
indeed pure hell to track it down.

Following is a sketch for teaching pymalloc how to do something similar to
the MS scheme.  A twist over the MS scheme is adding a "serial number" to
the pad bytes, incremented by one for each malloc/realloc.  At a crude
level, this gives a sense of age to the eyeball; for reproducible memory
nightmares, it gives an exact way to set a data, or counting, breakpoint (on
the next run) to capture the instant at which a doomed-to-go-bad memory
block first gets passed out.  I hope that addresses the worst problem the MS
scheme still leaves untouched:  you can catch memory corruption pretty well
with it, but all you know then is that "the byte at this address is bad",
and you have no idea what the memory's original purpose in life was.
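To make the serial-number idea concrete, here is a minimal sketch in C.  It is not code from obmalloc.c, and the names serialno, watch_serialno, watch_hits, serial_hit and bump_serialno are all hypothetical.  The assumption is that every debug malloc/realloc funnels through one function, so a plain breakpoint in serial_hit (or a conditional breakpoint on serialno) fires at the instant the block with a given number is passed out.

```c
#include <assert.h>
#include <stdint.h>

/* All names here are hypothetical, for illustration only. */
static uint32_t serialno = 0;        /* bumped once per debug malloc/realloc */
static uint32_t watch_serialno = 0;  /* 0 = watch nothing; set from a debugger */
static uint32_t watch_hits = 0;      /* times the watched number came up */

/* Set a breakpoint here to stop when the watched block is handed out. */
static void serial_hit(void)
{
    ++watch_hits;
}

/* Call once at the start of every debug malloc and debug realloc. */
uint32_t bump_serialno(void)
{
    ++serialno;
    /* This sketch treats wrapping a 32-bit counter as a bug. */
    assert(serialno != 0);
    if (watch_serialno != 0 && serialno == watch_serialno)
        serial_hit();
    return serialno;
}
```

On a reproducible failure, read the serial number out of the damaged block, set watch_serialno to it on the next run, and break in serial_hit.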

Sketch of Debug Mode for PyMalloc

+ Three new entry points in obmalloc.c (note:  stop #include'ing this;
  hiding code in include files sucks, and naming an include file .c
  compounds the confusion):

DL_IMPORT(void *) _PyMalloc_DebugMalloc(size_t nbytes);
DL_IMPORT(void *) _PyMalloc_DebugRealloc(void *p, size_t nbytes);
DL_IMPORT(void) _PyMalloc_DebugFree(void *p);

+ When WITH_PYMALLOC and PYMALLOC_DEBUG are #define'd, these are
  mapped to in the obvious way from _PyMalloc_{MALLOC, REALLOC, FREE}:

#ifdef WITH_PYMALLOC
DL_IMPORT(void *) _PyMalloc_Malloc(size_t nbytes);
DL_IMPORT(void *) _PyMalloc_Realloc(void *p, size_t nbytes);
DL_IMPORT(void) _PyMalloc_Free(void *p);

DL_IMPORT(void *) _PyMalloc_DebugMalloc(size_t nbytes);
DL_IMPORT(void *) _PyMalloc_DebugRealloc(void *p, size_t nbytes);
DL_IMPORT(void) _PyMalloc_DebugFree(void *p);

#ifdef PYMALLOC_DEBUG
#define _PyMalloc_MALLOC _PyMalloc_DebugMalloc
#define _PyMalloc_REALLOC _PyMalloc_DebugRealloc
#define _PyMalloc_FREE _PyMalloc_DebugFree

#else   /* WITH_PYMALLOC && !PYMALLOC_DEBUG */
#define _PyMalloc_MALLOC _PyMalloc_Malloc
#define _PyMalloc_REALLOC _PyMalloc_Realloc
#define _PyMalloc_FREE _PyMalloc_Free

#endif  /* PYMALLOC_DEBUG */

#else   /* !WITH_PYMALLOC */
#define _PyMalloc_MALLOC PyMem_MALLOC
#define _PyMalloc_REALLOC PyMem_REALLOC
#define _PyMalloc_FREE PyMem_FREE
#endif  /* WITH_PYMALLOC */

+ A debug build implies PYMALLOC_DEBUG, but PYMALLOC_DEBUG can
  be forced in a release build.

+ No changes to the guts of _PyMalloc_{Malloc, Realloc, Free}.  Keep
  them as lean and as clear of #ifdef obscurity as they are now.

+ Define three special bit patterns.  In hex, they all end with B
  (for deBug <wink>), and begin with a vaguely mnemonic letter.
  Strings of these are unlikely to be legit memory addresses, ints,
  7-bit ASCII, or floats:

#define PYMALLOC_CLEANBYTE      0xCB    /* uninitialized memory */
#define PYMALLOC_DEADBYTE       0xDB    /* free()ed memory */
#define PYMALLOC_FORBIDDENBYTE  0xFB    /* unusable memory */

  The debug malloc/free/realloc use these as follows.  Note that this
  stuff is done regardless of whether PyMalloc handles the request
  directly or passes it on to the platform malloc (in fact, the debug
  entry points won't know and won't care).
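As a quick illustration of why runs of these bytes stand out, the sketch below fills an int and a double with PYMALLOC_CLEANBYTE and looks at the result.  The helper names are hypothetical, and it assumes a two's-complement 32-bit int32_t and an 8-byte IEEE-754 double.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PYMALLOC_CLEANBYTE 0xCB    /* uninitialized memory */

/* Hypothetical helper: what an int32_t holds if never initialized.
 * Every byte is 0xCB, so byte order does not matter. */
int32_t clean_as_int32(void)
{
    int32_t v;
    memset(&v, PYMALLOC_CLEANBYTE, sizeof v);
    return v;   /* 0xCBCBCBCB, i.e. -875836469 */
}

/* Hypothetical helper: the same thing for a double. */
double clean_as_double(void)
{
    double d;
    assert(sizeof d == 8);   /* assumption: 8-byte IEEE-754 double */
    memset(&d, PYMALLOC_CLEANBYTE, sizeof d);
    return d;   /* sign bit set, large exponent: a huge negative number */
}

/* The high bit is set, so the byte is never 7-bit ASCII. */
int clean_is_ascii(void)
{
    return (PYMALLOC_CLEANBYTE & 0x80) == 0;
}
```

The same reasoning applies to 0xDB and 0xFB, since all three have the high bit set.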

+ The Debug malloc asks for 16 extra bytes and fills them with
  useful stuff:

  p[0:4]
      Number of bytes originally asked for.  4-byte unsigned integer,
      big-endian (easier to read in a memory dump).
  p[4:8]
      Copies of PYMALLOC_FORBIDDENBYTE.  Used to catch under-writes
      and under-reads.
  p[8:8+n]
      The requested memory, filled with copies of PYMALLOC_CLEANBYTE.
      Used to catch reference to uninitialized memory.
      &p[8] is returned.  Note that this is 8-byte aligned if PyMalloc
      handled the request itself.
  p[8+n:8+n+4]
      Copies of PYMALLOC_FORBIDDENBYTE.  Used to catch over-writes
      and over-reads.
  p[8+n+4:8+n+8]
      A serial number, from a PyMalloc file static, incremented by 1
      on each call to _PyMalloc_DebugMalloc and _PyMalloc_DebugRealloc.
      4-byte unsigned integer, big-endian.
      If "bad memory" is detected later, the serial number gives an
      excellent way to set a breakpoint on the next run, to capture the
      instant at which this block was passed out.
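The layout above could be sketched roughly as follows.  This is an illustration under assumptions, not the proposed implementation: the platform malloc stands in for _PyMalloc_Malloc, and sketch_debug_malloc, write4 and serialno are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PYMALLOC_CLEANBYTE      0xCB    /* uninitialized memory */
#define PYMALLOC_FORBIDDENBYTE  0xFB    /* unusable memory */

static uint32_t serialno = 0;   /* hypothetical file static */

/* Store n as a 4-byte big-endian integer, most significant byte first. */
static void write4(unsigned char *p, uint32_t n)
{
    p[0] = (unsigned char)(n >> 24);
    p[1] = (unsigned char)(n >> 16);
    p[2] = (unsigned char)(n >> 8);
    p[3] = (unsigned char)n;
}

/* Assumption: plain malloc() plays the role of _PyMalloc_Malloc here. */
void *sketch_debug_malloc(size_t nbytes)
{
    unsigned char *p;

    if (nbytes > UINT32_MAX - 16)       /* size must fit in p[0:4] */
        return NULL;
    p = malloc(nbytes + 16);
    if (p == NULL)
        return NULL;
    ++serialno;
    assert(serialno != 0);              /* wraparound is a bug in this sketch */

    write4(p, (uint32_t)nbytes);                        /* p[0:4]         */
    memset(p + 4, PYMALLOC_FORBIDDENBYTE, 4);           /* p[4:8]         */
    memset(p + 8, PYMALLOC_CLEANBYTE, nbytes);          /* p[8:8+n]       */
    memset(p + 8 + nbytes, PYMALLOC_FORBIDDENBYTE, 4);  /* p[8+n:8+n+4]   */
    write4(p + 8 + nbytes + 4, serialno);               /* p[8+n+4:8+n+8] */
    return p + 8;                                       /* &p[8]          */
}
```

A caller sees only the n requested bytes; the 8 bytes on each side stay reachable from the returned pointer for later checking.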

+ The Debug free first uses the address to find the number of bytes
  originally asked for, then checks the 8 bytes on each end for
  sanity (in particular, that the PYMALLOC_FORBIDDENBYTEs are still
  intact).
  XXX Make this checking a distinct entry point.
  XXX In case an error is found, print informative stuff, but then what?
  XXX Die or keep going?  Fatal error is probably best.
  Then fills the original N bytes with PYMALLOC_DEADBYTE.  This is to
  catch references to free()ed memory.  The forbidden bytes are left
  intact.
  Then calls _PyMalloc_Free.
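A rough sketch of the check and free steps, assuming the 16-byte layout described for the Debug malloc.  The names sketch_debug_check, sketch_mark_dead and sketch_debug_free are hypothetical, the platform free stands in for _PyMalloc_Free, and aborting on damage is just one answer to the open XXX question.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PYMALLOC_DEADBYTE       0xDB    /* free()ed memory */
#define PYMALLOC_FORBIDDENBYTE  0xFB    /* unusable memory */

/* Read a 4-byte big-endian integer. */
static uint32_t read4(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Return 0 if both pads are intact, -1 if the leading pad is damaged,
 * -2 if the trailing pad is damaged.  userp is the address the debug
 * malloc returned, i.e. &p[8]. */
int sketch_debug_check(const void *userp)
{
    const unsigned char *q = userp;
    uint32_t nbytes;
    int i;

    assert(q != NULL);
    for (i = 1; i <= 4; ++i)
        if (q[-i] != PYMALLOC_FORBIDDENBYTE)
            return -1;
    nbytes = read4(q - 8);      /* bytes originally asked for */
    for (i = 0; i < 4; ++i)
        if (q[nbytes + i] != PYMALLOC_FORBIDDENBYTE)
            return -2;
    return 0;
}

/* Overwrite the user bytes with DEADBYTE; the pads are left intact. */
void sketch_mark_dead(void *userp)
{
    unsigned char *q = userp;
    memset(q, PYMALLOC_DEADBYTE, read4(q - 8));
}

/* Assumption: plain free() plays the role of _PyMalloc_Free here. */
void sketch_debug_free(void *userp)
{
    int rc;

    if (userp == NULL)
        return;
    rc = sketch_debug_check(userp);
    if (rc != 0) {
        fprintf(stderr, "debug free: pad bytes damaged near %p (code %d)\n",
                userp, rc);
        abort();                /* one possible policy: treat as fatal */
    }
    sketch_mark_dead(userp);
    free((unsigned char *)userp - 8);
}
```

Keeping the check in its own function matches the first XXX note: it can then be called on any live block, not only at free time.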

+ The Debug realloc first calls _PyMalloc_DebugMalloc with the new
  request size.
  Then copies over the original bytes.
  Then calls _PyMalloc_DebugFree on the original bytes.
  XXX This could, and probably should, be optimized to avoid copying
  XXX every time.
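The unoptimized malloc-copy-free realloc might look like the sketch below.  It is self-contained, so it carries cut-down stand-ins for the debug malloc and free (dbg_malloc and dbg_free, both hypothetical names, built on the platform malloc/free).  Two details are assumptions not spelled out above: only min(old size, new size) bytes are copied, and the original block is left alone if the new allocation fails.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CLEANBYTE      0xCB
#define DEADBYTE       0xDB
#define FORBIDDENBYTE  0xFB

static uint32_t serialno = 0;   /* hypothetical file static */

static void write4(unsigned char *p, uint32_t n)
{
    p[0] = (unsigned char)(n >> 24);
    p[1] = (unsigned char)(n >> 16);
    p[2] = (unsigned char)(n >> 8);
    p[3] = (unsigned char)n;
}

static uint32_t read4(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Cut-down debug malloc: size, pad, clean bytes, pad, serial number. */
static void *dbg_malloc(size_t n)
{
    unsigned char *p;

    if (n > UINT32_MAX - 16 || (p = malloc(n + 16)) == NULL)
        return NULL;
    write4(p, (uint32_t)n);
    memset(p + 4, FORBIDDENBYTE, 4);
    memset(p + 8, CLEANBYTE, n);
    memset(p + 8 + n, FORBIDDENBYTE, 4);
    write4(p + 8 + n + 4, ++serialno);
    return p + 8;
}

/* Cut-down debug free: spot-check the pads, mark dead, release. */
static void dbg_free(void *userp)
{
    unsigned char *q = userp;
    uint32_t n = read4(q - 8);

    assert(q[-1] == FORBIDDENBYTE && q[n] == FORBIDDENBYTE);
    memset(q, DEADBYTE, n);
    free(q - 8);
}

void *sketch_debug_realloc(void *userp, size_t nbytes)
{
    unsigned char *q = userp;
    unsigned char *fresh;
    size_t original;

    if (q == NULL)
        return dbg_malloc(nbytes);
    fresh = dbg_malloc(nbytes);         /* also bumps the serial number */
    if (fresh == NULL)
        return NULL;                    /* original block stays valid */
    original = read4(q - 8);
    memcpy(fresh, q, original < nbytes ? original : nbytes);
    dbg_free(q);
    return fresh;
}
```

Because every call moves the block, stale pointers into the old block land on DEADBYTE right away, which is a debugging benefit that an in-place optimization would partly give up.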
