On Tue, Jul 17, 2012 at 2:57 PM, John O'Connor <jxo6948 at rit.edu> wrote:
>>
>> The second approach is consistently 10-20% faster than the first one
>> (depending on input) for trunk Python 3.3
>>
>
> I think the difference is that StringIO spends extra time reallocating
> memory during the write loop as it grows, whereas bytes.join computes
> the allocation size first since it already knows the final length.

BytesIO is actually missing an optimisation that is already used in
StringIO: the StringIO C implementation uses a fragment accumulator
internally, and collapses that into a single string object when
getvalue() is called. BytesIO is still using the old
"resize-the-buffer-as-you-go" strategy, and thus ends up repeatedly
reallocating the buffer as the data sequence grows incrementally.

It should be optimised to work the same way StringIO does (which is
effectively the same way that the monkeypatched version works).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
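
P.S. For illustration only, the fragment-accumulator idea described above can be sketched in pure Python. This is a rough sketch under stated assumptions, not the actual C implementation of StringIO and not the monkeypatched version from the thread: the class name FragmentBytesWriter and the two helper functions are hypothetical, and the class only supports sequential write() plus getvalue().

```python
import io


class FragmentBytesWriter:
    """Hypothetical write-only buffer that accumulates fragments.

    Assumption: illustrative only. It mimics the strategy described
    above (collect pieces, join once) and is not CPython's code.
    """

    def __init__(self):
        self._fragments = []

    def write(self, data):
        # Store an immutable copy of each piece; no large buffer is resized here.
        piece = bytes(data)
        self._fragments.append(piece)
        return len(piece)

    def getvalue(self):
        # Collapse all fragments into a single bytes object in one join.
        value = b"".join(self._fragments)
        # Keep the collapsed result so a repeated getvalue() is cheap.
        self._fragments = [value] if value else []
        return value


def build_with_bytesio(chunks):
    # Baseline: write each chunk into an io.BytesIO in a loop.
    buf = io.BytesIO()
    for chunk in chunks:
        buf.write(chunk)
    return buf.getvalue()


def build_with_fragments(chunks):
    # Same loop, using the fragment accumulator sketched above.
    buf = FragmentBytesWriter()
    for chunk in chunks:
        buf.write(chunk)
    return buf.getvalue()
```

Both helpers should return identical bytes for the same input; any speed difference depends on the Python version and the input, so it is worth timing them with timeit on your own data rather than relying on the figures quoted above.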