On Friday 07 November 2003 22:03, Fred L. Drake, Jr. wrote:
> Alex Martelli writes:
>  > Very interesting!  To me, this suggests fixing this performance bug --
>  > there is no reason that I can see why the .is* methods should be
>  > _slower_.  Would a performance bugfix (no implementation change, just
>  > a speedup) be OK for 2.3.3, I hope?  That would motivate me to work on
>  > it soonest...
>
> People keep hinting that these methods should be faster, but I see no
> reason to think they would be.  Think about it: using the method
> requires the creation of a bound method object.  No matter how fast
> PyMalloc is, that's still a fair bit of work.

Good point!  So, a first little trick to accelerate this might be to use
getsets (unfortunately this gives a marginally Python-level-observable
alteration for e.g. "print 'x'.isdigit.__name__", so perhaps it's only
suitable for 2.4, not 2.3.3, alas... I dunno...).

I tried a little experiment adding a new test .isabit() that says if a
string is entirely made up of '0' and '1':

static PyGetSetDef string_getsets[] = {
    {"isabit", (getter)string_isabit, 0, 0},
    {0}
};
...
    string_getsets,                             /* tp_getset */

where:

/* Lazily created, shared callables that just return True / False. */
static PyObject * _return_true = 0;
static PyObject * _return_false = 0;

static PyObject * _true_returner(PyObject* ignore_self) {
    Py_RETURN_TRUE;
}
static PyObject * _false_returner(PyObject* ignore_self) {
    Py_RETURN_FALSE;
}

static PyMethodDef _str_bool_returners[] = {
    {"_str_return_false", (PyCFunction)_false_returner, METH_NOARGS},
    {"_str_return_true", (PyCFunction)_true_returner, METH_NOARGS},
    {0}
};

static PyObject *
string_isabit(PyStringObject *s)
{
    char* p = PyString_AS_STRING(s);
    int len = PyString_GET_SIZE(s);
    int i;

    /* The whole test runs at attribute-access time. */
    for(i=0; i<len; ++i) {
        if(p[i]!='0' && p[i]!='1') {
            if(!_return_false) {
                _return_false = PyCFunction_New(
                    _str_bool_returners+0, 0);
            }
            Py_INCREF(_return_false);
            return _return_false;
        }
    }
    if(!_return_true) {
        _return_true = PyCFunction_New(
            _str_bool_returners+1, 0);
    }
    Py_INCREF(_return_true);
    return _return_true;
}

i.e., exploit the peculiarity of strings' .is...() methods -- called on
immutable objects, w/o args, so at construction time they might almost as
well be replaced by the C-coded equivalent of "lambda: False" or
"lambda: True".  Of course, we'd still have to supply str.is... unbound
methods (the tp_getset isn't looked at for class-level access, right...?)
for compatibility with idioms such as filter(str.isdigit, words).
The performance does get some increase this way, though it does not become
quite as good as an 'in' test yet -- about, I'd say, in-between...:

[alex@lancelot src]$ ./python ~/bin/timeit.py -c '"0".isdigit()'
1000000 loops, best of 3: 0.52 usec per loop
[alex@lancelot src]$ ./python ~/bin/timeit.py -c '"0".isabit()'
1000000 loops, best of 3: 0.39 usec per loop
[alex@lancelot src]$ ./python ~/bin/timeit.py -c '"0" in "01"'
1000000 loops, best of 3: 0.25 usec per loop

and about the same for failed tests:

[alex@lancelot src]$ ./python ~/bin/timeit.py -c '"z" in "01"'
1000000 loops, best of 3: 0.25 usec per loop
[alex@lancelot src]$ ./python ~/bin/timeit.py -c '"z".isabit()'
1000000 loops, best of 3: 0.39 usec per loop
[alex@lancelot src]$ ./python ~/bin/timeit.py -c '"z".isdigit()'
1000000 loops, best of 3: 0.55 usec per loop

Even though to fix the 'x'.is....__name__ issue we'd have to keep several
PyCFunctions corresponding to _true_returner and _false_returner
w/different names and docs, maybe this is still worth doing for
2.3.something, not just for 2.4... opinions?


Alex
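The method-versus-'in' comparison timed above can be re-measured on any stock interpreter with the standard `timeit` module. A small sketch follows; the helper `best_usec` and the loop counts are arbitrary choices for illustration, `.isabit()` is left out because it exists only in the patched build, and absolute numbers (possibly even the ordering) may well differ on other Python versions and machines.

```python
import timeit

# Statements taken from the timings in the post; .isabit() is omitted
# because it is only available in the experimental build.
STATEMENTS = [
    '"0".isdigit()',
    '"0" in "01"',
    '"z".isdigit()',
    '"z" in "01"',
]


def best_usec(stmt, number=200_000, repeat=3):
    """Return the best-of-`repeat` time per loop, in microseconds."""
    timer = timeit.Timer(stmt)
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e6


results = {stmt: best_usec(stmt) for stmt in STATEMENTS}
for stmt, usec in results.items():
    print(f"{stmt:<16} {usec:.3f} usec per loop")
```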