At 8:30 AM -0400 02-05-2000, Guido van Rossum wrote:
>I think /F's point was that the Unicode standard prescribes different
>behavior here: for UTF-8, a missing or lone continuation byte is an
>error; for Unicode, accents are separate characters that may be
>inserted and deleted in a string but whose display is undefined under
>certain conditions.
>
>(I just noticed that this doesn't work in Tkinter but it does work in
>wish. Strange.)
>
>> FYI: Normalization is needed to make comparing Unicode
>> strings robust, e.g. u"é" should compare equal to u"e\u0301".
>
>Aha, then we'll see u == v even though type(u) is type(v) and len(u)
>!= len(v). /F's world will collapse. :-)

Does the Unicode spec *really* specify that u should compare equal to v?
This behavior would be the responsibility of a layout engine, a role
which is way beyond the scope of Unicode support in Python, as it is
language- and script-dependent.

Just
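
For reference, the comparison under discussion can be sketched as follows. This is a minimal illustration only, assuming a modern Python 3 interpreter and its standard `unicodedata.normalize` function, neither of which is what the thread had at hand; the helper name `nfc_equal` is hypothetical and not part of any proposal made here.

```python
import unicodedata

# Two spellings of the same visible character (assumed example values):
composed = "\u00e9"     # precomposed LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT


def nfc_equal(a, b):
    """Compare two strings after normalizing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Plain comparison works on code points, so the strings differ.
print(composed == decomposed)

# The lengths differ as well: one code point versus two.
print(len(composed), len(decomposed))

# Only after an explicit normalization step do they compare equal.
print(nfc_equal(composed, decomposed))
```

In this sketch the plain `==` stays a code-point comparison, and equality under normalization is something the caller asks for explicitly.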