On Wed, 3 May 2000, Tim Peters wrote:

[Moshe Zadka]
> ...
> I'd much prefer Python to reflect a fundamental truth about Unicode,
> which at least makes sure binary-goop can pass through Unicode and
> remain unharmed, than to reflect a nasty problem with UTF-8 (not
> everything is legal).

[Tim Peters]
> Then you don't want Unicode at all, Moshe. All the official encoding
> schemes for Unicode 3.0 suffer illegal byte sequences

Of course I don't, and of course you're right. But what I do want is for
my binary goop to pass unharmed through the evil Unicode forest. That is
why I don't want Python to treat my goop as a byte sequence it must decode;
I want the numeric values of my bytes to pass through to Unicode
unharmed. That means Latin-1, because of the second design decision of
the horribly western-specific Unicode: the first 256 characters are the
same as Latin-1. If it were up to me, I'd use Latin-3, but it wasn't, so
it's not.

> (for example, 0xffff
> is illegal in UTF-16 (whether BE or LE)

Tim, one of us must have cracked a chip. 0xffff is the same in BE and LE
-- isn't it?

--
Moshe Zadka <moshez@math.huji.ac.il>
http://www.oreilly.com/news/prescod_0300.html
http://www.linux.org.il -- we put the penguin in .com
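
The Latin-1 argument in the message above can be checked with a small sketch. It assumes a present-day Python 3 interpreter rather than the Python under discussion in this thread, and the names `goop` and `utf8_accepts` are made up for illustration; they do not come from the original message.

```python
# "Binary goop": one byte for every possible value 0..255.
goop = bytes(range(256))

# Latin-1 maps byte value N to code point N, so decoding cannot fail
# and the numeric values survive unchanged.
text = goop.decode("latin-1")
code_points = [ord(ch) for ch in text]
round_tripped = text.encode("latin-1")


def utf8_accepts(data: bytes) -> bool:
    """Return True if data is a well-formed UTF-8 byte sequence."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The two bytes of 0xffff are identical, so byte order cannot matter.
same_in_both_orders = (0xFFFF).to_bytes(2, "big") == (0xFFFF).to_bytes(2, "little")
```

Here `round_tripped` compares equal to `goop`, while `utf8_accepts(goop)` is false because bytes such as 0x80 cannot start a UTF-8 sequence. That is the asymmetry the message relies on: Latin-1 assigns a character to every byte value, and UTF-8 does not accept every byte sequence.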