On Wed, Jun 23, 2010 at 09:36:45PM +0200, Antoine Pitrou wrote:
> I don't think you can't claim, though, that Python 3 makes things
> significantly harder for these frameworks. The proof is that many of
> them already give the user unicode strings in Python 2.x. They must
> have somehow got the decoding right.

Well... Frameworks usually 'simplify' the problem by partly ignoring it.
By default they assume that the data in the request is in UTF-8. You can
specify an alternative encoding in most of them. Django [1], Werkzeug [2],
and WebOb [3] do that.

The problem with this approach is that you still have to deal with weird
requests where one thing is unicode and another is latin-1. Sometimes you
can even have two different encodings in a single header, such as Cookie.
There's no general solution to this problem; it has to be solved on a
case-by-case basis.

There was a big discussion a while ago on web-sig. I think the consensus
was that WSGI for Python 3 should assume that the data is encoded in
latin-1, since it's the default encoding according to the RFC.

[1] http://docs.djangoproject.com/en/dev/ref/request-response/#django.http.HttpRequest.encoding
[2] http://werkzeug.pocoo.org/documentation/dev/unicode.html#request-and-response-objects
[3] http://pythonpaste.org/webob/reference.html#unicode-variables

--
Henry Prêcheur
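
To make the "assume UTF-8, fall back to latin-1" approach concrete, here is a minimal sketch in Python 3. It is not taken from Django, Werkzeug, or WebOb; the function names `decode_request_value` and `recode_wsgi_string` are hypothetical and only illustrate the idea. The second function assumes the server handed the application a string that was decoded as latin-1, as discussed on web-sig.

```python
def decode_request_value(raw: bytes, preferred: str = "utf-8") -> str:
    """Decode raw request bytes (hypothetical helper).

    Try the preferred encoding first. If the bytes are not valid in that
    encoding, fall back to latin-1, which maps every byte to a code point
    and therefore never raises.
    """
    try:
        return raw.decode(preferred)
    except UnicodeDecodeError:
        return raw.decode("latin-1")


def recode_wsgi_string(native: str, preferred: str = "utf-8") -> str:
    """Re-decode a string that was decoded as latin-1 (hypothetical helper).

    Encoding back to latin-1 recovers the original bytes exactly, so the
    application can then apply whatever encoding it actually expects.
    """
    return decode_request_value(native.encode("latin-1"), preferred)


if __name__ == "__main__":
    utf8_bytes = "café".encode("utf-8")
    latin1_bytes = "café".encode("latin-1")
    print(decode_request_value(utf8_bytes))
    print(decode_request_value(latin1_bytes))
    print(recode_wsgi_string(utf8_bytes.decode("latin-1")))
```

The fallback is only a heuristic: a latin-1 byte sequence that happens to be valid UTF-8 will be decoded as UTF-8, which is one reason the mixed-encoding cases still need handling case by case.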