Showing content from https://mail.python.org/pipermail/python-dev/2010-January/097157.html below:

[Python-Dev] Quick sum up about open() + BOM

Victor Stinner victor.stinner at haypocalc.com
Sat Jan 9 13:54:17 CET 2010
On Saturday 09 January 2010 01:47:38, you wrote:
> One concern I have with this implementation encoding="BOM" is that if
> there is no BOM it assumes UTF-8.

If no BOM is found, it falls back to the current heuristic: os.device_encoding()
or the system locale.
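
The current heuristic can be sketched roughly as below. This is only an
approximation of what open() does when no encoding is given, and the helper
name default_text_encoding is hypothetical (it is not part of any API):

```python
import locale
import os


def default_text_encoding(fd):
    """Sketch of the default choice: device encoding, else the locale."""
    # os.device_encoding() returns None when fd is not connected to a
    # terminal, which is the case for regular files.
    encoding = os.device_encoding(fd)
    if encoding is None:
        # Fall back to the preferred encoding of the system locale.
        encoding = locale.getpreferredencoding(False)
    return encoding
```

For a regular file the first step yields nothing, so the result depends
entirely on the locale of the machine reading the file.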

> (...) Hence, it might be that someone would expect a UTF-16LE (or any of 
> the formats that don't require a BOM, rather than UTF-8), but be willing 
> to accept any BOM-discriminated format.
> (...) declare that they will accept
> any BOM-discriminated format, but want to default, in the absence of a
> BOM, to the original national language locale that they historically
> accepted

You mean "if there is a BOM, use it, otherwise fall back to a specific
charset"? How could it be declared? Maybe:

   open("file.txt", check_bom=True, encoding="UTF-16-LE")
   open("file.txt", check_bom=True, encoding="latin1")

Falling back to UTF-8 would be written:

   open("file.txt", check_bom=True, encoding="UTF-8")

As explained before, check_bom=True is only accepted for read-only file modes.
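
Since check_bom is only a proposal, here is a minimal sketch of the "use the
BOM if there is one, otherwise fall back to a given charset" behaviour using
what exists today. The helper names detect_bom_encoding and open_check_bom are
hypothetical, and the sketch assumes a read-only file, as stated above:

```python
import codecs

# The UTF-32 marks are tested before the UTF-16 ones because
# BOM_UTF32_LE starts with the same two bytes as BOM_UTF16_LE.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def detect_bom_encoding(path, fallback):
    """Return a codec name chosen from the BOM, or fallback if none."""
    with open(path, "rb") as f:
        head = f.read(4)  # the longest BOM is 4 bytes
    for bom, encoding in _BOMS:
        if head.startswith(bom):
            # These codecs consume the BOM themselves when decoding.
            return encoding
    return fallback


def open_check_bom(path, fallback):
    """Open path for reading text, honouring a BOM when present."""
    return open(path, "r", encoding=detect_bom_encoding(path, fallback))
```

With this, open_check_bom("file.txt", "latin1") would read a file without a BOM
as latin1, but a file starting with a UTF-8 or UTF-16 BOM using that encoding.
One caveat of the sketch: a BOM-marked UTF-16-LE file whose first character is
U+0000 is indistinguishable from UTF-32-LE by its first four bytes.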

Well, why not. This is a third choice for my point (1) :-) It sits between
Guido's and Antoine's choices, and I like it because we can fall back to UTF-8
instead of the dummy system locale: Windows users will be happy to be able to
use UTF-8 :-) I prefer falling back to a fixed encoding rather than depending
on the system locale.

-- 
Victor Stinner
http://www.haypocalc.com/
