I've attached a patch to Issue #42, as a first step towards real streaming of input.
It changes HTMLInputStream.dataStream into a codecs.StreamReader and uses the StreamReader's read() method in HTMLInputStream.char().

I also completely refactored how position() is computed (there's no longer a 'tell' variable, though I probably could have kept it). To be honest, I didn't understand exactly how 'tell' was managed, particularly in charsUntil(). Maybe it has to do with the conversion of \r into \n? In any case, 'tell' is only used internally (html5parser only uses position()).

I said above that this is a first step because you still cannot use a non-seekable stream if you rely on encoding detection, which still uses seek().

Maybe I should make a branch in the repository? Or are you OK with committing the patch to the trunk?

-- Thomas Broyer
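For illustration only, here is a minimal sketch of the approach described above: wrapping a byte stream in a codecs.StreamReader, reading one character at a time, and tracking the position without a 'tell' variable. This is not the attached patch. The class names SketchInputStream and ReadOnlyBytes, the (line, col) return value of position(), and the newline handling are all assumptions made for the example. It assumes the encoding is already known, so seek() is never called.

```python
import codecs
import io


class ReadOnlyBytes:
    """Hypothetical non-seekable byte source: it only offers read()."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, size=-1):
        return self._buf.read(size)


class SketchInputStream:
    """Illustrative stand-in for HTMLInputStream, not html5lib's real code."""

    def __init__(self, raw, encoding="utf-8"):
        # codecs.getreader() returns the codec's StreamReader class;
        # calling it wraps the byte stream in a decoding reader.
        self.dataStream = codecs.getreader(encoding)(raw, errors="replace")
        self.line = 1
        self.col = 0
        self._after_cr = False

    def char(self):
        """Return the next character ('' at EOF), mapping \\r and \\r\\n to \\n."""
        c = self.dataStream.read(1)
        if self._after_cr and c == "\n":
            # Second half of a \r\n pair: the newline was already reported.
            c = self.dataStream.read(1)
        self._after_cr = (c == "\r")
        if c == "\r":
            c = "\n"
        if c == "\n":
            self.line += 1
            self.col = 0
        elif c:
            self.col += 1
        return c

    def position(self):
        """Return (line, column) of the last character read."""
        return (self.line, self.col)


stream = SketchInputStream(ReadOnlyBytes("é\r\nb".encode("utf-8")))
chars = [stream.char() for _ in range(4)]
print(chars, stream.position())
```

Because the position is updated after newline normalisation, a \r\n pair counts as a single line break, which is one possible way to sidestep the 'tell' bookkeeping question raised above.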