On Fri, Jan 10, 2014 at 2:03 AM, Joao S. O. Bueno <jsbueno at python.org.br> wrote:
> On 9 January 2014 04:50, Lennart Regebro <regebro at gmail.com> wrote:
>> To be honest, you can define text as "A stream of bytes that are split
>> up in lines separated by a linefeed", and do some basic text
>> processing like that. Just very *basic*, but still. Replacing
>> characters. Extracting certain lines etc.
>
> That is, until you hit a character which has a byte with the same
> value of ASCII newline in the middle of a multi-byte character.
>
> So, this approach is broken to start with.

For a very specific definition of broken, yes, namely that it will
fail with UTF-16 or EBCDIC. Files that with the above definition of
"text files" are not text files. :-)

//Lennart
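
For illustration only, here is a minimal sketch of the point being argued. It is not from the original thread. The sample string and the variable names are made up for the example, and "cp037" is just one EBCDIC code page that Python's standard codecs happen to ship. The sketch splits encoded bytes on the byte 0x0A and compares the result across encodings.

```python
# Sample text chosen for the example: U+010A (LATIN CAPITAL LETTER C WITH
# DOT ABOVE) encodes in UTF-16 as a code unit containing the byte 0x0A.
text = "a\u010ab\nc"

utf8 = text.encode("utf-8")        # 61 C4 8A 62 0A 63
utf16 = text.encode("utf-16-le")   # 61 00 0A 01 62 00 0A 00 63 00
ebcdic = text[2:].encode("cp037")  # "b\nc"; cp037 has no mapping for U+010A

# Naive "text is bytes split on a linefeed" processing.
utf8_lines = utf8.split(b"\n")
utf16_lines = utf16.split(b"\n")
ebcdic_lines = ebcdic.split(b"\n")

# UTF-8 never uses byte values below 0x80 inside a multi-byte sequence,
# so every piece is still valid UTF-8 and matches the real lines.
utf8_ok = [piece.decode("utf-8") for piece in utf8_lines] == text.split("\n")

# UTF-16 gets cut in the middle of U+010A, and the real newline is cut
# in half as well, leaving pieces that are not valid UTF-16.
try:
    [piece.decode("utf-16-le") for piece in utf16_lines]
    utf16_ok = True
except UnicodeDecodeError:
    utf16_ok = False

print(len(utf8_lines), utf8_ok)    # byte-level split on UTF-8
print(len(utf16_lines), utf16_ok)  # byte-level split on UTF-16
print(len(ebcdic_lines))           # EBCDIC newline is not byte 0x0A
```

Under these assumptions the UTF-8 data splits into the same two lines as the decoded text, while the UTF-16 data is split in the wrong places and the EBCDIC data is not split at all, which matches the two failure cases named above.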