Tim Peters wrote:
> [M.-A. Lemburg]
>
>>Hmm, you've now made PyUnicode_Join() to work with iterators
>>whereas PyString_Join() only works for sequences.
>
> They have both worked with iterators since the release in which
> iterators were introduced.  Nothing changed now in this respect.
>
>>What are the performance implications of this for PyUnicode_Join() ?
>
> None.
>
>>Since the string and Unicode implementations have to be in sync,
>>we'd also need to convert PyString_Join() to work on iterators.
>
> It already does.  I replied earlier this week on the same topic --
> maybe you didn't see that, or maybe you misunderstand what
> PySequence_Fast does.

Indeed. At the time Fredrik added this API, it was optimized
for lists and tuples and had a fallback mechanism for arbitrary
sequences. Didn't know that it now also works for iterators. Nice !

>>Which brings up the second question:
>>What are the performance implications of this for PyString_Join() ?
>
> None.
>
>>The join operation is a widely used method, so both implementations
>>need to be as fast as possible. It may be worthwhile making the
>>PySequence_Fast() approach a special case in both routines and
>>using the iterator approach as fallback if no sequence is found.
>
> string_join uses PySequence_Fast already; the Unicode join didn't, and
> still doesn't.  In the cases of exact list or tuple arguments,
> PySequence_Fast would be quicker in Unicode join.  But in any cases
> other than those, PySequence_Fast materializes a concrete tuple
> containing the full materialized iteration, so could be more
> memory-consuming.  That's probably a good tradeoff, though.

Indeed. I'd opt for going the PySequence_Fast() way
for Unicode as well.

>>Note that PyString_Join() with iterator support will also
>>have to be careful about not trying to iterate twice,
>
> It already is.  Indeed, the primary reason it uses PySequence_Fast is
> to guarantee that it never iterates over an iterator argument more
> than once.  The Unicode join doesn't have that potential problem.
>
>>so it will have to use a similar logic to the one applied
>>in PyString_Format() where the work already done up to the
>>point where it finds a Unicode string is reused when calling
>>PyUnicode_Format().
>
> >>> def g():
> ...     for piece in 'a', 'b', u'c', 'd': # force Unicode promotion on 3rd yield
> ...         yield piece
> ...
> >>> ' '.join(g())
> u'a b c d'

Nice :-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 27 2004)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
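
The single-iteration guarantee discussed in the thread can be modelled
in pure Python. What follows is only a rough sketch under stated
assumptions, not the C implementation: `sequence_fast`, `join_once` and
`pieces` are hypothetical names invented for illustration, and the
helper merely imitates the behaviour Tim describes for PySequence_Fast
(exact lists and tuples are returned as they are, anything else is
materialized into a tuple). It is written for current Python, so it does
not reproduce the str-to-Unicode promotion shown in the session above.

```python
def sequence_fast(obj):
    """Rough model of the behaviour described for PySequence_Fast.

    Assumption: exact lists and tuples are passed through untouched;
    every other iterable is consumed once and stored in a tuple.
    """
    if type(obj) in (list, tuple):
        return obj           # fast path: no copy, no iteration
    return tuple(obj)        # generic path: iterate exactly once


def join_once(sep, iterable):
    """Hypothetical join that needs two passes over its items.

    Both passes run over the concrete sequence, so an iterator
    argument is never iterated more than once.
    """
    items = sequence_fast(iterable)

    # Pass 1: validate the items and compute the final length.
    total = len(sep) * max(len(items) - 1, 0)
    for item in items:
        if not isinstance(item, str):
            raise TypeError("expected str, got %s" % type(item).__name__)
        total += len(item)

    # Pass 2: build the result from the same concrete sequence.
    chunks = []
    for index, item in enumerate(items):
        if index:
            chunks.append(sep)
        chunks.append(item)
    result = "".join(chunks)
    assert len(result) == total
    return result


def pieces(log):
    """Generator that records every value it yields in `log`."""
    for piece in ("a", "b", "c", "d"):
        log.append(piece)
        yield piece


log = []
joined = join_once(" ", pieces(log))
```

After this runs, `log` holds each piece exactly once even though
`join_once` walks the items twice. The cost is the one mentioned in the
thread: for non-list, non-tuple arguments the whole iteration is held in
memory before the result is built.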