Andrew McNamara wrote:
>>>Yes, although it would be nice to also retain the 8-bit versions as well.
>>
>>You can do so by using latin-1 as default encoding. Works great !
>
> Yep, although that means we wear the cost of decoding and encoding for
> all 8 bit input.

Right, but it makes the code very clean and straight forward.
Again, it depends on what you need. If performance is critical
then you probably need a C version written using the same trick
as _sre.c...

> What does the _sre.c code do?

It comes in two versions: one for 8-bit the other for Unicode.

>>Depends on your needs: CSV files tend to be small enough
>>to do the decoding in one call in memory.
>
> We are routinely dealing with multi-gigabyte csv files - which is why the
> original 2001 vintage csv module was written as a C state machine.

I see, but are you sure that the typical Python user will have
the same requirements to make it worth the effort (and complexity) ?

I've written a few CSV parsers and writers myself over the years
and the requirements were different every time, in terms of
being flexible in the parsing phase, the interfaces and the
performance needs. Haven't yet found a one fits all solution
and don't really expect to any more :-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 05 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
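The latin-1 suggestion in the message above can be sketched roughly as follows. This is only an illustration written for a current Python 3 and its standard csv module, not the 2005 code under discussion. The helper names read_csv_bytes and cell_to_bytes are made up for the example. It relies on latin-1 mapping every byte value 0-255 to the code point with the same number, so arbitrary 8-bit input survives a decode/encode round trip unchanged.

```python
import csv
import io


def read_csv_bytes(data: bytes, encoding: str = "latin-1") -> list:
    """Hypothetical helper: decode the whole buffer in one call, then parse as text."""
    text = data.decode(encoding)
    # newline="" leaves line endings untouched so the csv module handles them itself.
    return list(csv.reader(io.StringIO(text, newline="")))


def cell_to_bytes(cell: str, encoding: str = "latin-1") -> bytes:
    """Hypothetical helper: recover the original 8-bit value of a parsed cell."""
    return cell.encode(encoding)


# Sample 8-bit input (an assumption for the demo): 0xE9 and 0xF6 are non-ASCII bytes.
raw = b"name,city\r\nJos\xe9,K\xf6ln\r\n"
rows = read_csv_bytes(raw)
original_bytes = cell_to_bytes(rows[1][0])
```

As the thread notes, this pays for a full decode (and an encode on the way back) and holds the entire input in memory, so it suits small files rather than the multi-gigabyte case.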