Stephen J. Turnbull wrote:

> So there is a standard for the UTF-8 signature, and I know of
> applications which produce it. While I agree with you that Python's
> codecs shouldn't produce it (by default), providing an option to strip
> is a good idea.

I would personally like to see a "utf-8-bom" codec (perhaps better named
"utf-8-sig"), which strips the BOM on reading (if present) and generates
it on writing.

> However, this option should be part of the initialization of an IO
> stream which produces Unicodes, _not_ an operation on arbitrary
> internal strings (whether raw or Unicode).

With the UTF-8-SIG codec, it would apply to all operation modes of the
codec, whether stream-based or from strings. Whether or not to use the
codec would be the application's choice.

Regards,
Martin
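
For illustration, the behaviour described above can be sketched with the
"utf-8-sig" codec that current Python 3 ships in its standard library.
This is a minimal sketch and not part of the original proposal; the sample
string and the variable names are made up for the example.

```python
import codecs
import io

text = "hello"  # arbitrary sample data (assumption for this example)

# Writing: the codec prepends the UTF-8 signature (BOM) to the output.
with_sig = text.encode("utf-8-sig")

# Reading: the signature is stripped if present ...
decoded_with_sig = with_sig.decode("utf-8-sig")
# ... and data without a signature is decoded unchanged.
decoded_without_sig = b"hello".decode("utf-8-sig")

# The plain "utf-8" codec, by contrast, keeps the signature as U+FEFF.
decoded_plain = with_sig.decode("utf-8")

# The same codec also works in stream mode, not only on strings.
reader = codecs.getreader("utf-8-sig")(io.BytesIO(with_sig))
decoded_stream = reader.read()
```

Because the signature handling lives in the codec itself, the application
opts in simply by choosing "utf-8-sig" instead of "utf-8", for strings and
streams alike.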