Edward Z. Yang wrote:
> Hello all,
>
> While I'm all for using appropriate language features to make
> implementations more readable and efficient, the use of the "yield"
> keyword in the tokenizer makes me pause a little. Why should we not use
> yield?
>
> * It complicates porting the Python implementation to other languages.
> In my opinion, the html5lib implementations are a first step on the way
> to a uber-fast C implementation and other userspace implementations in
> other programming languages. Coroutines are mildly difficult to
> understand; using a more general implementation helps porters! (like me)
>
> * It's not necessary. We can remove the yield statement in two ways:
> replacing the parser's iteration as a callback function, or queuing
> extra tokens on a round of self.state(). Both are simple changes due to
> the minimal amounts of state that need to be preserved; heck, there's
> already a tokenQueue data-structure to push the stream errors to.
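
For anyone following along, here is a rough sketch of what the queue-based option in the second point could look like next to a yield-based one. This is not the html5lib tokenizer: the class names, the `step()` and `next_token()` methods, and the whitespace-splitting "state machine" are all made up for illustration. The only thing it is meant to show is that both styles can hand the same tokens to a consumer.

```python
from collections import deque


class GeneratorTokenizer:
    """Yield-based style (hypothetical): tokens are produced lazily."""

    def __init__(self, text):
        self.text = text

    def __iter__(self):
        # Toy stand-in for a real state machine: one token per word.
        for word in self.text.split():
            yield {"type": "Characters", "data": word}


class QueueTokenizer:
    """Queue-based style (hypothetical): no yield, tokens go onto a queue."""

    def __init__(self, text):
        self.words = deque(text.split())
        self.token_queue = deque()

    def step(self):
        """Run one round of the toy state machine; False means end of input."""
        if not self.words:
            return False
        word = self.words.popleft()
        self.token_queue.append({"type": "Characters", "data": word})
        return True

    def next_token(self):
        """Return the next queued token, or None once input is exhausted."""
        while not self.token_queue:
            if not self.step():
                return None
        return self.token_queue.popleft()


def drain(tokenizer):
    """Collect every token from a QueueTokenizer into a list."""
    tokens = []
    while True:
        token = tokenizer.next_token()
        if token is None:
            return tokens
        tokens.append(token)
```

With these toy classes, `list(GeneratorTokenizer(text))` and `drain(QueueTokenizer(text))` produce the same list. The trade-off is that the queue version has to keep its position and pending tokens in explicit attributes, which the generator keeps implicitly in its suspended frame.
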

I don't think either of these points is very serious. Porting html5lib directly to C is not trivial anyway, and, as you say yourself, replacing the use of yield would be easy. Even so, I don't think we should hobble the Python implementation because of the deficiencies of other languages. Unless there is a real win in readability or performance from changing the current code, I think we should leave it alone.

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "html5lib-discuss" group.
To post to this group, send email to html5lib-discuss@googlegroups.com
To unsubscribe from this group, send email to html5lib-discuss+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/html5lib-discuss?hl=en-GB
-~----------~----~----~----~------~----~------~--~---