-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Guido van Rossum wrote:
> On Thu, Jan 7, 2010 at 10:12 PM, Tres Seaver <tseaver at palladion.com> wrote:
>> The BOM should not be seekable if the file is opened with the proposed
>> "guess encoding from BOM" mode: it isn't properly part of the stream at
>> all in that case.
>
> This feels about right to me. There are still questions though:
> immediately after opening a file with a BOM, what should .tell()
> return? And regardless of that, .seek(0) should put the file in that
> same initial state.

I think the behavior should be something like:

  >>> f = open('/path/to/maybe-BOM-encoded-file', 'r', encoding='BOM')
  >>> f.tell()
  0L
  >>> f.seek(0, 2) # seek to the end of the decoded stream
  >>> f.tell() # count of unicode chars in decoded stream
  45L
  >>> f.seek(0)
  >>> f.read(1) # read first unicode char decoded from stream.
  'A'

In other words, the BOM is not readable / seekable at all: it is
invisible to the consumer of the decoded stream.


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAktHnyIACgkQ+gerLs4ltQ6s3QCgznD+7FbUzfCbe5TS6OcoXjMg
rdgAoJAMEXe2xwLCIwJaZ6XA6rVyTIAi
=oXb3
-----END PGP SIGNATURE-----
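
For comparison, the "BOM is invisible" behavior described in the message can be approximated with the existing 'utf-8-sig' codec. This is only a sketch: encoding='BOM' is the proposal under discussion, not a real codec name, and 'utf-8-sig' stands in for it here for the UTF-8 case only. The helper name demo_bom_invisible and the temporary file are made up for the illustration. Note also that in Python 3 text mode, tell() returns an opaque position cookie rather than a count of decoded characters, so only the value at the very start of the file is checked.

```python
import codecs
import os
import tempfile


def demo_bom_invisible():
    """Show that a UTF-8 BOM is hidden from the reader of the decoded stream.

    Hypothetical helper: 'utf-8-sig' is used as a stand-in for the
    proposed (non-existent) encoding='BOM' mode.
    """
    fd, path = tempfile.mkstemp()
    try:
        # Write the BOM followed by three ASCII characters (6 bytes total).
        with os.fdopen(fd, "wb") as raw:
            raw.write(codecs.BOM_UTF8 + "ABC".encode("utf-8"))

        with open(path, "r", encoding="utf-8-sig") as f:
            start = f.tell()      # position immediately after opening
            first = f.read(1)     # first decoded character, BOM skipped
            f.seek(0)             # back to the initial state
            again = f.read(1)     # BOM is skipped again after seek(0)
            f.seek(0)
            whole = f.read()      # decoded text never contains the BOM
        return start, first, again, whole
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo_bom_invisible())
```

The consumer sees three characters even though the file holds six bytes, and seek(0) restores the same initial state that tell() reported right after opening.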