This is a vague list of things I plan on doing, as I want to get the tokenizer more or less finished before I move on to tree-building.
- Split out the input stream from the tokenizer. This is more work than it sounds, because I also want to move to properly consuming characters as we tokenize and never unconsuming more than one character; this is primarily so we can also allow $fp = fopen('http://example.com', 'r') to be used as an input stream in the future without much extra work. This is what I'm currently working on.
- Remove cases where we directly call states (as we can now unconsume), to more closely match the spec and to make the next item at least possible.
- Investigate moving the whole state machine into one big switch statement within the parse method. This shouldn't make the codebase messy, IMO, and it avoids the function call overhead, which is currently a non-negligible part of our cost.
- Grab multiple characters at once in more places.

--
Geoffrey Sneddon <http://gsnedders.com/>
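As a rough illustration of the first item, here is a minimal sketch of an input stream that reads from any stream resource and allows at most one character to be unconsumed. The class name InputStream and the methods consume() and unconsume() are hypothetical, chosen for this example only, and are not the names used in the html5lib code. It also reads bytes with fgetc(), so it assumes a single-byte encoding; real input would need proper character decoding.

```php
<?php
// Hypothetical sketch: names are illustrative, not html5lib's actual API.
final class InputStream
{
    /** @var resource */
    private $handle;
    /** @var string|null most recently consumed character (null at EOF) */
    private $last = null;
    /** @var bool true while $last is pushed back and waiting to be re-read */
    private $pending = false;

    /** @param resource $handle any readable stream, e.g. one from fopen() */
    public function __construct($handle)
    {
        $this->handle = $handle;
    }

    /** Return the next character (byte), or null at end of input. */
    public function consume(): ?string
    {
        if ($this->pending) {
            // Hand back the character that was unconsumed.
            $this->pending = false;
            return $this->last;
        }
        $c = fgetc($this->handle);
        $this->last = ($c === false) ? null : $c;
        return $this->last;
    }

    /** Push back the last consumed character; only one level is allowed. */
    public function unconsume(): void
    {
        if ($this->pending) {
            throw new LogicException('Cannot unconsume more than one character');
        }
        $this->pending = true;
    }
}

// An in-memory stream stands in here for a file or HTTP stream.
$fp = fopen('php://memory', 'w+');
fwrite($fp, '<a>');
rewind($fp);
$in = new InputStream($fp);
```

Because only a single character is ever buffered, the stream never needs to seek, which is what would let a non-seekable source such as an HTTP stream be used in the same way.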