A bug was reported against the csv module, claiming (rightly so) that the
csv module was not properly parsing files which use Mac line endings. I
tracked the problem down to an apparent deficiency in
readahead_get_line_skip() in fileobject.c. It believes that only \n can
terminate a line.

The patch below fixes my csv module problem, but I wonder if it's the
correct fix. Suppose you're using Mac line endings and encounter a \n
before a \r? This function will return a too-short line. (Of course, it
would without the patch as well.) I don't know how (or if) this should
work with universal newline support. We expect files to be opened in
binary mode, so I don't know if universal newline support applies.

In short, does this look like the correct patch, closer to the correct
behavior than the current setup, or no improvement at all?

Thx,

Skip

cvs diff fileobject.c
Index: fileobject.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/fileobject.c,v
retrieving revision 2.179
diff -c -r2.179 fileobject.c
*** fileobject.c        18 May 2003 12:56:25 -0000      2.179
--- fileobject.c        18 Aug 2003 03:13:38 -0000
***************
*** 1803,1810 ****
                return (PyStringObject *)
                        PyString_FromStringAndSize(NULL, skip);
        bufptr = memchr(f->f_bufptr, '\n', len);
        if (bufptr != NULL) {
!               bufptr++;                       /* Count the '\n' */
                len = bufptr - f->f_bufptr;
                s = (PyStringObject *)
                        PyString_FromStringAndSize(NULL, skip+len);
--- 1803,1812 ----
                return (PyStringObject *)
                        PyString_FromStringAndSize(NULL, skip);
        bufptr = memchr(f->f_bufptr, '\n', len);
+       if (bufptr == NULL)
+               bufptr = memchr(f->f_bufptr, '\r', len);
        if (bufptr != NULL) {
!               bufptr++;                       /* Count the '\n' or '\r' */
                len = bufptr - f->f_bufptr;
                s = (PyStringObject *)
                        PyString_FromStringAndSize(NULL, skip+len);
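
For comparison, here is a minimal sketch of a different approach to the mixed-endings worry: instead of falling back to '\r' only when no '\n' is in the buffer, scan for both and stop at whichever comes first, treating "\r\n" as a single terminator. This is only an illustration and not a proposed patch. The helper name find_line_end() is made up for this example and does not exist in fileobject.c, and whether this behavior is actually wanted for files opened in binary mode is the same open question as above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of fileobject.c): return a pointer just
   past the first line terminator in buf[0..len), or NULL if the buffer
   holds no terminator.  "\n", "\r" and "\r\n" all count as terminators. */
static const char *
find_line_end(const char *buf, size_t len)
{
    const char *nl = memchr(buf, '\n', len);
    const char *cr = memchr(buf, '\r', len);

    if (cr != NULL && (nl == NULL || cr < nl)) {
        /* A '\r' comes first.  If a '\n' follows it directly, swallow
           both so a DOS "\r\n" pair ends one line, not two. */
        if (cr + 1 < buf + len && cr[1] == '\n')
            return cr + 2;
        /* Caveat: a '\r' in the last byte of the buffer may be the
           first half of a "\r\n" split across two reads; this sketch
           does not handle that case. */
        return cr + 1;
    }
    if (nl != NULL)
        return nl + 1;      /* Plain Unix terminator. */
    return NULL;            /* No terminator in this buffer. */
}
```

With this, a buffer holding "ab\rcd\n" yields the line "ab\r" first, where the patch above would return the whole "ab\rcd\n" because it finds the '\n' before it ever looks for a '\r'. The cost is a second memchr() pass over the buffer on every call.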