Fredrik's bug report made me dive a little deeper into compares and
contains tests.

Here is a snapshot of what my current version does:

>>> '1' == None
0
>>> u'1' == None
0
>>> '1' == 'aäöü'
0
>>> u'1' == 'aäöü'
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: UTF-8 decoding error: invalid data

>>> '1' in ('a', None, 1)
0
>>> u'1' in ('a', None, 1)
0
>>> '1' in (u'aäöü', None, 1)
0
>>> u'1' in ('aäöü', None, 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: UTF-8 decoding error: invalid data

The decoding errors occur because 'aäöü' is not a valid UTF-8
string (Unicode comparisons coerce both arguments to Unicode by
interpreting normal strings as UTF-8 encodings of Unicode).

Question: is this behaviour acceptable, or should I go even further
and mask decoding errors during compares and contains tests too?

--
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/
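
The coercion rule described above (normal strings interpreted as UTF-8
before a Unicode comparison) and the masking asked about in the question
can be sketched in present-day Python 3. This is only an illustration
under assumptions, not the actual patch: `bytes` stands in for the 8-bit
strings of the interpreter shown in the session, and the helper names
`coerce`, `strict_equal`, `masked_equal` and `masked_contains` are
hypothetical.

```python
# Hypothetical sketch in Python 3; bytes objects play the role of the
# "normal strings" in the session above.

# 'aäöü' stored in a single-byte encoding such as Latin-1 is not valid UTF-8.
latin1_bytes = "aäöü".encode("latin-1")  # b'a\xe4\xf6\xfc'


def coerce(value):
    """Interpret byte strings as UTF-8; leave every other object untouched."""
    if isinstance(value, bytes):
        # Raises UnicodeDecodeError (a UnicodeError subclass) on invalid data.
        return value.decode("utf-8")
    return value


def strict_equal(a, b):
    """Compare after coercion, letting decoding errors propagate."""
    return coerce(a) == coerce(b)


def masked_equal(a, b):
    """Compare after coercion, treating a decoding error as 'not equal'."""
    try:
        return strict_equal(a, b)
    except UnicodeDecodeError:
        return False


def masked_contains(item, container):
    """Contains test built on the masked comparison."""
    return any(masked_equal(item, element) for element in container)
```

With these helpers, `strict_equal("1", latin1_bytes)` raises, which
mirrors the tracebacks in the session, while `masked_equal` and
`masked_contains` simply report no match. The trade-off is the one the
question raises: masking keeps compares and contains tests from failing,
but it also hides the fact that the data was not valid UTF-8.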