Bugs item #493183, was opened at 2001-12-13 21:02 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=105470&aid=493183&group_id=5470 Category: Python Interpreter Core Group: Platform-specific Status: Open Resolution: None Priority: 5 Submitted By: Jason R. Mastaler (jasonrm) Assigned to: Nobody/Anonymous (nobody) Summary: FreeBSD/Python 2.1.1 Segmentation fault Initial Comment: This is Python 2.1.1 on FreeBSD 4.4-RELEASE/i386. I have a reproducible way to make Python 2.1.1 segfault here, but I'm not sure the best way to go about debugging it. I'll save you the gory details unless you request them, but basically I use a Python program called "tmda-inject" to send my e-mail. It's core dumping when fed this one particular message (mess1.txt), but not others. Further, Python 1.5.2, 2.0 and 2.2b2 on the same machine have no such trouble. It only appears with Python 2.1.1. I can also reproduce this problem a complately different machine also running FreeBSD 4.4 and Python 2.1.1, but not with other operating systems. The problem appears to be specifically with FreeBSD and Python 2.1.1. I've attached the "mess1.txt" e-mail message, and also python2.1.core. If you'd like to look at tmda-inject, it's part of my TMDA package available at http://sf.net/projects/tmda/. Let me know if I can do anything further. [jasonrm@nightshade jasonrm]$ cat mess1.txt | python2.1 cvs/tmda/dist/bin/tmda-inject Segmentation fault (core dumped) [jasonrm@nightshade jasonrm]$ cat mess1.txt | python1.5 cvs/tmda/dist/bin/tmda-inject [jasonrm@nightshade jasonrm]$ cat mess1.txt | python2.0 cvs/tmda/dist/bin/tmda-inject [jasonrm@nightshade jasonrm]$ cat mess1.txt | python2.2 cvs/tmda/dist/bin/tmda-inject [jasonrm@nightshade jasonrm]$ gdb /usr/local/bin/python2.1 python2.1.core GNU gdb 4.18 Copyright 1998 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-unknown-freebsd"...(no debugging symbols found)... Core was generated by `python2.1'. Program terminated with signal 11, Segmentation fault. Reading symbols from /usr/lib/libutil.so.3...(no debugging symbols found)...done. Reading symbols from /usr/lib/libm.so.2...(no debugging symbols found)...done. Reading symbols from /usr/lib/libc_r.so.4...(no debugging symbols found)...done. Reading symbols from /usr/local/lib/python2.1/lib-dynload/strop.so...(no debugging symbols found)...done. Reading symbols from /usr/local/lib/python2.1/lib-dynload/time.so...(no debugging symbols found)...done. Reading symbols from /usr/local/lib/python2.1/lib-dynload/binascii.so...(no debugging symbols found)...done. Reading symbols from /usr/local/lib/python2.1/lib-dynload/sha.so...(no debugging symbols found)...done. Reading symbols from /usr/local/lib/python2.1/site-packages/cdbmodule.so...(no debugging symbols found)...done. Reading symbols from /usr/libexec/ld-elf.so.1...(no debugging symbols found)...done. #0 0x28177e0e in _thread_sys_fsync () from /usr/lib/libc_r.so.4 (gdb) where #0 0x28177e0e in _thread_sys_fsync () from /usr/lib/libc_r.so.4 #1 0x2818ea24 in _thread_autoinit_dummy_decl () from /usr/lib/libc_r.so.4 #2 0x28179222 in vfprintf () from /usr/lib/libc_r.so.4 #3 0x28169f75 in sprintf () from /usr/lib/libc_r.so.4 #4 0x80934ae in _PyString_FormatLong () #5 0x8093d48 in PyString_Format () #6 0x807b868 in PyNumber_Remainder () #7 0x80556a2 in PyEval_EvalCode () #8 0x8058d95 in PyEval_CallObjectWithKeywords () #9 0x80574cd in PyEval_EvalCode () #10 0x8058d95 in PyEval_CallObjectWithKeywords () #11 0x80574cd in PyEval_EvalCode () #12 0x8058d95 in PyEval_CallObjectWithKeywords () #13 0x80574cd in PyEval_EvalCode () #14 0x8058d95 in PyEval_CallObjectWithKeywords () #15 0x80574cd in PyEval_EvalCode () #16 0x8058d95 in PyEval_CallObjectWithKeywords () #17 0x80574cd in PyEval_EvalCode () #18 0x8058d95 in PyEval_CallObjectWithKeywords () #19 0x80574cd in PyEval_EvalCode () #20 0x80548a8 in PyEval_EvalCode () #21 0x806d75d in PyRun_FileExFlags () #22 0x806d70e in PyRun_FileExFlags () #23 0x806d6de in PyRun_FileExFlags () #24 0x806cab0 in PyRun_SimpleFileExFlags () #25 0x806c5b0 in PyRun_AnyFileExFlags () #26 0x8051795 in Py_Main () #27 0x80511c2 in main () #28 0x8051111 in _start () (gdb) ---------------------------------------------------------------------- >Comment By: Jason R. Mastaler (jasonrm) Date: 2001-12-17 14:07 Message: Logged In: YES user_id=85984 Your prediction is correct: [root@nightshade Python-2.1.1]# ./python df Python 2.1.1 (#1, Dec 17 2001, 15:01:24) [GCC 2.95.3 20010315 (release) [FreeBSD]] on freebsd4 Type "copyright", "credits" or "license" for more information. >>> ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-12-16 21:13 Message: Logged In: YES user_id=6380 Add these three lines to Py_Main() in Modules/main.c, after the line saying "PyCompilerFlags cf;": char buffer[120]; sprintf(buffer, "%.1lx", 223L); printf("%s\n", buffer); and rebuild Python. When you run Python, it should print "df" before the standard banner. If it crashes, we've proved that Python is not the cause of the problem. But I predict that it'll work. :-( ---------------------------------------------------------------------- Comment By: Jason R. Mastaler (jasonrm) Date: 2001-12-16 21:01 Message: Logged In: YES user_id=85984 As far as the standalone program, I'm not a C programmer, (I've become Python-dependent :-), but if you can give me an idea of what it should contain, I'll give it a try. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-12-16 20:23 Message: Logged In: YES user_id=6380 Sigh. I think it's a dead end. What about a stand-alone program? ---------------------------------------------------------------------- Comment By: Jason R. Mastaler (jasonrm) Date: 2001-12-16 19:09 Message: Logged In: YES user_id=85984 Tim asked whather I can verify whether I'm mixing the threaded and non-threaded versions of libc. Is the following enough to tell? $ ldd /usr/local/src/Python-2.1.1/python /usr/local/src/Python-2.1.1/python: libc_r.so.4 => /usr/lib/libc_r.so.4 (0x280ca000) libutil.so.3 => /usr/lib/libutil.so.3 (0x28182000) libm.so.2 => /usr/lib/libm.so.2 (0x2818b000) libc.so.4 => /usr/lib/libc.so.4 (0x281a7000) libc is the non-threaded library, and libc_r is the threaded version, so it looks like they are being mixed, yes? However, my Python 2.0.1 also looks the same way, and I'm not having any problems with it. $ ldd /usr/local/src/Python-2.0.1/python /usr/local/src/Python-2.0.1/python: libc_r.so.4 => /usr/lib/libc_r.so.4 (0x2816c000) libutil.so.3 => /usr/lib/libutil.so.3 (0x28224000) libm.so.2 => /usr/lib/libm.so.2 (0x2822d000) libc.so.4 => /usr/lib/libc.so.4 (0x28249000) Next, Guido asked me to check some local variables in that stack frame. See attached backtrace3.txt. And to answer your question, yes I built all the Python versions I'm testing myself. I even rebuilt 2.0.1 and 2.1.1 in the simplest manner to make sure I'm comparing them equally. (Just ./configure && make). ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-12-16 17:47 Message: Logged In: YES user_id=31435 BTW, a google search on _thread_autoinit_dummy_decl turns up another FreeBSD bug report of a death deep in libc_r starting with a sprintf() call. It had to do with mysql, not with Python. The tracebacks in both cases look worthless to me, because (for example) _thread_autoinit_dummy_decl is an extern int in the FreeBSD source, not a function at all. This increases the suspicion that libc versions are getting mixed up (and that gdb is looking up addresses in "the wrong" libc). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-12-16 17:20 Message: Logged In: YES user_id=6380 OK, it's the second sprintf call in formatint. Can you check some local variables in that stack frame? I expect fmt to be "%.1lx", but would like to see that confirmed; x should be one of the values from your binary string: 206, 199, 166. Can you see which one? It would then be interesting to find out what a small C program with that same sprintf() does. Make sure that the C program is compiled and linked in exactly the same way as python. (What I do in such cases is I put the experimental statement in Python's main() and rebuild Python. :-) The build procedure of each Python version is somewhat different, and it's likely that each version picks slightly different compile/link options. Also, did you build all the Python versions you tried yourself? ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-12-16 17:17 Message: Logged In: YES user_id=31435 The new backtrace is helpful (thanks!), but it just makes it harder to believe that Python is at fault. The segfault occurs four levels deep in system library functions, and from inspection there's nothing even remotely suspicious about the sprintf call that kicks it off. As Guido suggested, can you verify you're not mixing the threaded and non-threaded versions of libc? Short of that, you really need a FreeBSD libc author to help with this. ---------------------------------------------------------------------- Comment By: Jason R. Mastaler (jasonrm) Date: 2001-12-16 15:26 Message: Logged In: YES user_id=85984 Alright, I rebuilt Python 2.1.1 with -g, so the gdb backtrace now contains some more information. I attached backtrace2.txt which is a backtrace of the program failing when calling the hexlify() function. I tried compiling with `--with-pydebug', but then I can't reproduce the core dump for some reason. As you mention, this could have something to do with FreeBSD's libc. However, does that explain why I only have this problem with 2.1.1? On the same system, Python 1.5.2, 2.0, and 2.2c1 compiled in the same manner (simply ./configure && gmake) don't produce a core dump. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-12-16 11:24 Message: Logged In: YES user_id=6380 See the repeated "(no debugging symbols found)..." in the gdb output? try recompiling with -g. I strongly suspect that the traceback is lying, and we're in fact not in PyString_AsStringAndSize() when we call sprintf -- there's no sprintf call in that function (I checked the 2.1.1 source) but there is one two static functions down, in string_repr(), which is indeed called when you pickle a string object. This would match the original gdb stacktrace you reported: there's no sprintf call in _PyString_FormatLong(), but there are two in the next static function, formatint(), and it does make sense that that function would be called by the %02x format. All this makes no sense, however, because I can see nothing wrong with the arguments to sprintf here! I've heard of problems on FreeBSD having to do with different (threaded and unthreaded?) versions of libc. Could that be the issue here? Can you rebuild Python 2.1.1 with debugg flags on? (./configure --with-pydebug should do it for 2.2; I don't recall if the 2.2.1 configure supports that though -- try --help first.) ---------------------------------------------------------------------- Comment By: Jason R. Mastaler (jasonrm) Date: 2001-12-15 23:49 Message: Logged In: YES user_id=85984 The argument to hexlify() is 3 bytes of raw binary. It's simply an HMAC of '%d' % time.time() stored in binary. Any suggestions on how to visualize, analyze, or store this argument? There is something about it that Python really doesn't like. I even tried to pickle it, and that too made Python core dump (see attached backtrace_pickle.txt). Strangely, binascii.hexlify has no problem handling it. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-12-15 17:18 Message: Logged In: YES user_id=6380 Very strange. You could try setting a breakpoint in PyString_Format. But I would first try to boil down the program to a smaller example. For example, figure out what the argument to hexlify() is, and see if you get the same crash when you call hexlify() on its own with that argument. I find it curious why the core dump is in _PyString_FormatLong(), since hexlify doesn't pass longs to the % operator -- it passes a tuple of small ints. ---------------------------------------------------------------------- Comment By: Jason R. Mastaler (jasonrm) Date: 2001-12-15 16:50 Message: Logged In: YES user_id=85984 BTW, if you need access to a FreeBSD machine, Sourceforge's "Compile Farm" has one available. ssh to `cf.sourceforge.net' ---------------------------------------------------------------------- Comment By: Jason R. Mastaler (jasonrm) Date: 2001-12-15 16:47 Message: Logged In: YES user_id=85984 Alright, through print statements, I've been able to pinpoint exactly where the program is failing causing Python to core dump. It is when the following function is called: def hexlify(b): return "%02x"*len(b) % tuple(map(ord, b)) Ironically, I got this function from you Guido. :-) (http://groups.google.com/groups?hl=en&selm=3924E204.CE01CA0A%40python.org) I only have to replace the function body with: return "foo" and the program no longer core dumps. The difficult thing has been trying to reproduce this problem with a smaller piece of code. I've been trying for hours and can't do it. As I said in my original message, I've never had this problem before this particular message. Even though its contents look harmless, there is something there that is triggering this problem. It seems pretty complex to me. However, I can run Python from inside gdb and get the program to crash from there. I changed the program to open "mess1.txt" directly. Not being experienced with gdb though, I'm not sure what kind of "digging around" to do. Could you give me some direction here? I've attached "backtrace1.txt" as a starting point. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-12-13 21:25 Message: Logged In: YES user_id=6380 Thanks for all the info, but I'm afraid you're going to have to work harder. We don't have FreeBSD here, so your core file won't do us any good. Can you use gdb to get a traceback and do some more digging? (It would be best not to do this from the core file, but to run the program under gdb until it crashes -- that way the stacktrace is more reliable and you can do more digging around.) Also, it would be really helpful if you could boil this down to a smaller piece of Python code. Maybe you can put print statements in your Python program that give you an idea of which operation is failing. As it is, the problem is too hard to reproduce to be worth debugging. The sample message contains nothing objectionable, so I think this must be a matter of something the program does that reads or write a bad pointer, where the value of the bad pointer depends on what was previously on the stack or what is in some spot of the heap. ---------------------------------------------------------------------- Comment By: Jason R. Mastaler (jasonrm) Date: 2001-12-13 21:13 Message: Logged In: YES user_id=85984 Oops, the core file was too large to upload (1.5MB). If you want to see it, you can download it at: http://www.mastaler.com/tmp/2001-12-13/python2.1.core ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=105470&aid=493183&group_id=5470
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4