Showing content from https://mail.python.org/pipermail/python-dev/2005-April/052524.html below:

[Python-Dev] Unicode byte order mark decoding

"Martin v. Löwis" martin at v.loewis.de
Tue Apr 5 10:03:15 CEST 2005
Stephen J. Turnbull wrote:
> So there is a standard for the UTF-8 signature, and I know of
> applications which produce it.  While I agree with you that Python's
> codecs shouldn't produce it (by default), providing an option to strip
> is a good idea.

I would personally like to see a "utf-8-bom" codec (perhaps better
named "utf-8-sig"), which strips the BOM on reading (if present)
and generates it on writing.
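
For illustration, a codec with this behaviour is available in later Python
releases under the name "utf-8-sig". The following is a minimal sketch,
assuming Python 3; the sample string is made up for the example.

```python
import codecs

text = "hello"

# Encoding with "utf-8-sig" prepends the UTF-8 signature (bytes EF BB BF).
encoded = text.encode("utf-8-sig")
assert encoded == codecs.BOM_UTF8 + b"hello"

# Decoding strips the signature if it is present...
assert encoded.decode("utf-8-sig") == "hello"
# ...and also accepts input that has no signature.
assert b"hello".decode("utf-8-sig") == "hello"

# The plain "utf-8" codec leaves the signature in place as U+FEFF.
assert encoded.decode("utf-8") == "\ufeffhello"
```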

> However, this option should be part of the initialization of an IO
> stream which produces Unicodes, _not_ an operation on arbitrary
> internal strings (whether raw or Unicode).

With the UTF-8-SIG codec, it would apply to all operation modes of
the codec, whether stream-based or from strings. Whether or not to
use the codec would be the application's choice.
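
As a rough sketch of what "all operation modes" could look like in practice,
the example below uses the "utf-8-sig" codec from later Python releases both
through a text stream and directly on a byte string. It assumes Python 3 and
an in-memory buffer standing in for a real file.

```python
import io

# Stream mode: the signature is written once, at the start of the stream,
# even though the text arrives in several write() calls.
raw = io.BytesIO()
writer = io.TextIOWrapper(raw, encoding="utf-8-sig")
writer.write("first ")
writer.write("second")
writer.flush()
data = raw.getvalue()

# Stream mode, reading: the signature is stripped before text is returned.
reader = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8-sig")
from_stream = reader.read()

# String mode: the same codec applied to the bytes object directly.
from_bytes = data.decode("utf-8-sig")
```

Because the codec is selected by name in each call, the application decides
per stream or per string whether the signature is handled at all.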

Regards,
Martin