I now think it is a bad idea to encourage using unicode (str) for binary
data via the latin-1 encoding or the surrogateescape error handler.

Handling binary data in the str type using latin-1 is just a hack.
Surrogateescape is just a workaround to keep undecodable bytes in text.

Encouraging binary data in the str type with latin-1 or surrogateescape
means encouraging the mixing of binary and text data. That is worse than
Python 2.

So Python should encourage handling binary data in the bytes type.

On Fri, Jan 10, 2014 at 11:28 PM, Matěj Cepl <matej at ceplovi.cz> wrote:

> On 2014-01-10, 12:19 GMT, you wrote:
> > Using the 'latin-1' to mean unknown encoding can easily result
> > in Mojibake (unreadable text) entering your application with
> > dangerous effects on your other text data.
> >
> > E.g. "Marc-André" read using 'latin-1' if the string itself
> > is encoded as UTF-8 will give you "Marc-AndrÃ©" in your
> > application. (Yes, I see that a lot in applications
> > and websites I use ;-))
>
> I am afraid that for most 'latin-1' is just another attempt to
> make Unicode complexity go away and the way how to ignore it.
>
> Matěj

--
INADA Naoki <songofacandy at gmail.com>
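
For illustration only, here is a minimal Python 3 sketch of the behaviours discussed in this message: the latin-1 Mojibake from the quoted example, what surrogateescape actually stores in a str, and the bytes-only alternative. The variable names are hypothetical and do not come from the thread.

```python
# Hypothetical names throughout; this is a sketch, not code from the thread.

# 1. The quoted example: UTF-8 bytes read as latin-1.
name = "Marc-André"
raw = name.encode("utf-8")

# latin-1 maps every byte to a code point, so decoding never fails,
# but the result is silently wrong text (Mojibake).
mojibake = raw.decode("latin-1")
print(mojibake)  # Marc-AndrÃ©

# 2. surrogateescape: undecodable bytes are smuggled into the str
# as lone surrogates (here byte 0xE9 becomes U+DCE9).
undecodable = b"caf\xe9"  # latin-1 encoded, not valid UTF-8
escaped = undecodable.decode("utf-8", errors="surrogateescape")

# The original bytes come back only if the same handler is used again.
restored = escaped.encode("utf-8", errors="surrogateescape")

# A strict encode of that str fails, which shows that it is not
# really text: binary data is now mixed into a text object.
try:
    escaped.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# 3. Keeping binary data in bytes avoids the ambiguity altogether.
payload = bytes([0x00, 0xFF, 0xE9])
```

In the first two cases the str object looks like ordinary text to the rest of the program, which is the mixing of binary and text data that the message argues against.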