M.-A. Lemburg writes:
 > The abbreviation BOM is quite common w/r to Unicode.

  Yes: "w/r to Unicode".  In sys, it's out of context and should
receive a more descriptive name.  I think using BOM in unicodec is
good.

 > BOM_BE: '\376\377'
 >   (corresponds to Unicode 0x0000FEFF in UTF-16
 >    == ZERO WIDTH NO-BREAK SPACE)

  I'd also add BOM to be the same as sys.byte_order_mark.  Perhaps
even instead of sys.byte_order_mark (just to localize the areas of
code that are affected).

 > Note that Unicode sees big endian byte order as being "correct".  The

  A lot of us do.  ;-)


  -Fred

--
Fred L. Drake, Jr.           <fdrake@acm.org>
Corporation for National Research Initiatives
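
A minimal sketch of the byte values discussed above, assuming present-day Python 3. The octal string '\376\377' is the byte pair 0xFE 0xFF, which is U+FEFF encoded as big-endian UTF-16. `native_bom` is a hypothetical helper written for illustration; it is not a name proposed in this thread. `sys.byteorder`, `codecs.BOM_BE`, and `codecs.BOM` are the names found in the current standard library and are used here only as a cross-check.

```python
import codecs
import sys

# '\376\377' (octal) is the byte pair 0xFE 0xFF.
BOM_BE = b"\376\377"
BOM_LE = b"\377\376"


def native_bom(byteorder=sys.byteorder):
    """Return the UTF-16 BOM for 'big' or 'little' byte order.

    Hypothetical helper: it plays the role the thread suggests for a
    plain BOM constant that matches the machine's own byte order.
    """
    if byteorder not in ("big", "little"):
        raise ValueError("byteorder must be 'big' or 'little'")
    return BOM_BE if byteorder == "big" else BOM_LE


# U+FEFF (ZERO WIDTH NO-BREAK SPACE) encoded as UTF-16 yields the BOM bytes.
assert "\ufeff".encode("utf-16-be") == BOM_BE
assert "\ufeff".encode("utf-16-le") == BOM_LE

# The current codecs module exposes matching constants.
assert codecs.BOM_BE == BOM_BE
assert codecs.BOM == native_bom()
```

Keeping the native-order value next to BOM_BE and BOM_LE in one module is one way to get the localization the reply asks for: code that cares about byte order marks only has to look in one place.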