On Tue, 28 Apr 2009 at 20:29, Glenn Linderman wrote:
> On approximately 4/28/2009 7:40 PM, came the following characters from the
> keyboard of R. David Murray:
>> On Tue, 28 Apr 2009 at 13:37, Glenn Linderman wrote:
>> > C. File on disk with the invalid surrogate code, accessed via the str
>> > interface, no decoding happens, matches in memory the file on disk with
>> > the byte that translates to the same surrogate, accessed via the bytes
>> > interface. Ambiguity.
>>
>> Unless I'm missing something, one of these is type str, and the other is
>> type bytes, so no ambiguity.
>
>
> You are missing that the bytes value would get decoded to a str; thus both
> are str; so ambiguity is possible.

Only if you as the programmer decode it. Now, I don't understand the
subtleties of Unicode enough to know if Martin has already successfully
addressed this concern in another fashion, but personally I think that
if you as a programmer are comparing funnydecoded-str strings gotten
via a string interface with normal-decoded strings gotten via a bytes
interface, that we could claim that your program has a bug.

--David
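
To make the two positions above concrete, here is a minimal sketch in Python. It assumes the `surrogateescape` error handler as it later shipped in Python 3.1; the draft being discussed in this thread may have differed in detail. The sample bytes and the helper name `decode_both_ways` are hypothetical, chosen only for illustration.

```python
def decode_both_ways(raw: bytes):
    """Decode the same bytes the two ways the thread contrasts.

    Hypothetical helper for illustration only.
    """
    # "Funny" decoding: bytes that are not valid UTF-8 are smuggled into
    # the str as lone surrogates (U+DC80..U+DCFF), as a str interface would.
    funny = raw.decode("utf-8", "surrogateescape")
    # "Normal" decoding: the programmer takes the bytes from a bytes
    # interface and decodes them with a codec of their own choosing.
    normal = raw.decode("latin-1")
    return funny, normal


raw = b"caf\xe9"  # Latin-1 encoded name; not valid UTF-8
funny, normal = decode_both_ways(raw)

# The two strs differ, so comparing them directly is the bug David describes.
print(funny == normal)  # False

# The funny-decoded str round-trips back to the original bytes.
print(funny.encode("utf-8", "surrogateescape") == raw)  # True

# Glenn's concern: a str that already holds the lone surrogate U+DCE9
# is indistinguishable from the funny-decoded one.
print(funny == "caf\udce9")  # True
```

The strings are compared rather than printed because a str containing a lone surrogate generally cannot be encoded for output with a strict codec.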