Martin v. Löwis wrote:
> Walter Dörwald wrote:
>
>> The register command in 2.4 (and current CVS) simply does a
>>    value = str(value)
>> in post_to_server() so the encoded bytes sent depend on the
>> default encoding. Would it be sufficient to change this to
>>    value = unicode(value).encode("utf-8")
>
> Indeed. I think this can go into 2.4.2.

OK, I've checked this into HEAD and release24-maint (including the change to the Content-Type header).

>> Another solution might be to include the encoding in the Content-type
>> header of the request. IMHO the best solution would be to do both:
>> Always use UTF-8 as the encoding and include this in the Content-type
>> header in the request. PyPI should honor this encoding when it finds
>> it and should fall back to whatever it used before if it doesn't.
>
> Yeah, well :-) Content-type in form upload is a mess, as you certainly
> know. It should be honored, but commonly isn't. This, in turn, causes
> browsers to ignore it.

Fortunately we have both ends under control (except for old Python versions).

> PyPI uses the CGI module. It currently decodes anything that doesn't
> have a filename attribute to UTF-8, causing rejection of anything
> that doesn't send UTF-8. This could be fixed/extended, but I think that
> would be best done in the CGI module, for consumption by any application
> that uses form upload. For example, doing
>
>    cgi.FieldStorage(..., encoding="UTF-8")
>
> should cause
>
> a) decoding of every field that has an encoding= in its content
>    type
> b) decoding of every field that is not a file to UTF-8. It is a
>    file if it
>    I) has a filename, or
>    II) cannot be decoded to the target decoding
>
> For backwards compatibility, a) can only be enabled if the CGI
> application explicitly tells what encoding it expects.
>
> I'd like to state "contributions are welcome", although others
> may think differently.

OK, I'll see, if I can give this a try.

Bye,
   Walter Dörwald
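
For illustration, here is a rough sketch of the two halves discussed above: the client always encoding values as UTF-8, and the server applying the a)/b) rules with the I)/II) file test. It is written for current Python 3 rather than 2.4, and the names encode_value() and decode_field() are hypothetical helpers made up for this sketch; neither is part of distutils or the cgi module. How a declared charset and the expected encoding should interact is only one possible reading of the proposal.

```python
# Sketch only: encode_value() and decode_field() are hypothetical helpers,
# not functions from distutils or the cgi module.

def encode_value(value):
    """Client side: always send UTF-8 bytes, whatever the default encoding is."""
    return str(value).encode("utf-8")


def decode_field(raw, filename=None, declared=None, expected="utf-8"):
    """Server side: return ("text", str) or ("file", bytes) for one form field.

    raw      -- the field's bytes as received
    filename -- the field's filename attribute, if any
    declared -- charset named in the field's content type, if any
    expected -- encoding the application says it expects
    """
    # I) a field with a filename is a file and is left undecoded
    if filename is not None:
        return ("file", raw)
    # a) prefer the charset the field declares; b) otherwise use the
    # encoding the application asked for
    encoding = declared or expected
    try:
        return ("text", raw.decode(encoding))
    except (UnicodeDecodeError, LookupError):
        # II) data that cannot be decoded is treated as a file
        return ("file", raw)
```

With this reading, a Latin-1 encoded value from an old client that declares no charset would come back as ("file", ...) instead of being rejected, so the application could still apply whatever fallback it used before.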