Andrew McNamara wrote:
>>>Andrew McNamara wrote:
>>>
>>>>There's a bunch of jobs we (CSV module maintainers) have been putting
>>>>off - attached is a list (in no particular order):
>>>>* unicode support (this will probably uglify the code considerably).
>>>
>>Martin v. Löwis wrote:
>>
>>>Can you please elaborate on that? What needs to be done, and how is
>>>that going to be done? It might be possible to avoid considerable
>>>uglification.
>
> I'm not altogether sure there. The parsing state machine is all written in
> C, and deals with signed chars - I expect we'll need two versions of that
> (or one version that's compiled twice using pre-processor macros). Quite
> a large job. Suggestions gratefully received.
>
> M.-A. Lemburg wrote:
>
>>Indeed. The trick is to convert to Unicode early and to use Unicode
>>literals instead of string literals in the code.
>
> Yes, although it would be nice to also retain the 8-bit versions as well.

You can do so by using latin-1 as default encoding. Works great!

>>Note that the only real-life Unicode format in use is UTF-16
>>(with BOM mark) written by Excel. Note that there's no standard
>>for specifying the encoding in CSV files, so this is also the only
>>feasible format.
>
> Yes - that's part of the problem I hadn't really thought about yet - the
> csv module currently interacts directly with files as iterators, but it's
> clear that we'll need to decode as we go.

Depends on your needs: CSV files tend to be small enough to do the
decoding in one call in memory.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 05 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
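
For illustration only, here is a minimal sketch of the "decode in one call
in memory" approach and of the latin-1 point. It assumes a present-day
Python 3 interpreter, where the csv module already works on text; it is not
the C state machine discussed in this thread. The helper name
read_utf16_csv and the sample data are hypothetical.

```python
import csv
import io


def read_utf16_csv(raw_bytes):
    """Decode a whole UTF-16 CSV payload in memory, then parse it."""
    # The "utf-16" codec reads the BOM, picks the byte order, and drops the BOM.
    text = raw_bytes.decode("utf-16")
    # newline="" leaves line endings untouched so the csv parser handles them.
    return list(csv.reader(io.StringIO(text, newline="")))


# Sample payload (made up): encoding with "utf-16" prepends a BOM.
sample = "name,city\r\nLöwis,Berlin\r\n".encode("utf-16")
rows = read_utf16_csv(sample)

# latin-1 maps each byte 0..255 to the code point of the same value,
# so arbitrary 8-bit data survives a decode/encode round trip.
all_bytes = bytes(range(256))
round_trip = all_bytes.decode("latin-1").encode("latin-1")
```

Decoding everything up front keeps the parser working on a single text
type, which is the point of converting to Unicode early. For files too
large to hold in memory, decoding incrementally would be needed instead.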