
Fri Jan 8 10:10:23 CET 2010 · http://mail.python.org/pipermail/python-dev/2010-January/097115.html

> The builtin open() function is unable to open a UTF-16/32 file starting with
> a BOM if the encoding is not specified (it raises a Unicode error). For a
> UTF-8 file starting with a BOM, read()/readline() also returns the BOM,
> whereas the BOM should be "ignored".

It depends. If you use the utf-8-sig encoding, it *will* ignore the
UTF-8 signature.
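
A minimal sketch of that difference, using only the standard codecs
module (the sample bytes and variable names are made up for
illustration):

```python
import codecs

# Bytes of a UTF-8 "file" that begins with the UTF-8 signature (BOM).
data = codecs.BOM_UTF8 + "hello".encode("utf-8")

# Plain "utf-8" keeps the signature as the character U+FEFF.
with_bom = data.decode("utf-8")

# "utf-8-sig" strips the signature if it is present.
without_bom = data.decode("utf-8-sig")

print(repr(with_bom))
print(repr(without_bom))
```

The same applies to open(path, encoding="utf-8-sig") when reading a
file in text mode.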

> Since my proposal changes the result of TextIOWrapper.read()/readline() for
> files starting with a BOM, we might introduce an option to open() to enable
> the new behaviour. But is it really necessary to keep backward compatibility?

Absolutely. And there is no need to introduce a new option; instead,
use the existing ones: define an encoding that auto-detects the
encoding from the family of BOMs. You could call it encoding="sniff".
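
A rough sketch of what such detection might look like, written as a
plain helper rather than a registered codec. The names sniff_encoding
and open_sniffed are hypothetical, and the UTF-8 fallback is an
assumption; nothing like encoding="sniff" exists in the standard
library:

```python
import codecs

# UTF-32 is checked before UTF-16 because BOM_UTF32_LE begins with the
# same two bytes as BOM_UTF16_LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_encoding(head, default="utf-8"):
    """Return a codec name based on the BOM at the start of `head`.

    Hypothetical helper: falls back to `default` when no BOM is found.
    The returned codecs ("utf-8-sig", "utf-16", "utf-32") consume the
    BOM themselves while decoding.
    """
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default


def open_sniffed(path, default="utf-8"):
    """Open `path` for reading text, picking the encoding from its BOM."""
    with open(path, "rb") as f:
        head = f.read(4)  # the longest BOM is 4 bytes
    return open(path, "r", encoding=sniff_encoding(head, default))
```

One known ambiguity: a UTF-16-LE file whose first character after the
BOM is U+0000 starts with the same four bytes as a UTF-32-LE BOM, so
this sketch would misdetect it.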

Regards,
Martin



[Python-Dev] Improve open() to support reading file starting with an unicode BOM