On Sep 26, 2007, at 1:30 PM, Jacques Distler wrote: > On Sep 15, 9:18 pm, Shawn <[EMAIL PROTECTED]> wrote: >> The data structure (list) used to keep unread characters has O(n) >> cost >> for insertion to/deletion from the head of the list, so frequent >> unget() (which happens when parsing a long sequence of number >> entities) can result in poor performance. Changing the data >> structure >> to deque, a queue object with O(1) insertion/deletion cost at the >> head, improves the performance in general and makes char()/unget() >> scales better with long input size. > > I imagine there's a similar O(n) issue with the shift/unshift methods > of Ruby's Array class. > > Is there a Ruby Deque class floating around somewhere, which would > allow a similar speedup of inputstream.rb?
I'll look into it. FWIW, there's several other significantly bad things in input_stream.rb related to performance, which I should be tackling soon.* -ryan * soon, because we're starting to use the ruby port of html5lib (in limited ways) in production at Technorati. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "html5lib-discuss" group. To post to this group, send email to html5lib-discuss@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/html5lib-discuss?hl=en-GB -~----------~----~----~----~------~----~------~--~---
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4