There is a 2:1 speed difference between dict.has_key() and dict.__contains__(): >>> setup = """ d = dict.fromkeys(range(1000)) hk = d.has_key cn = d.__contains__ """ >>> import timeit >>> min(timeit.Timer('hk(500)', setup).repeat(5)) 0.55875391810121755 >>> min(timeit.Timer('cn(500)', setup).repeat(5)) 1.0645585745654671 The implementations are almost identical except that has_key() is implemented as a METH_O PyCFunction attached to the method table and __contains__() is a regular C function attached to the sq_contains slot which, AFAICT, gets wrapped in a PyCFunction_Type. Then both has_key() and __contains__() get bundled in a method-wrapper object. I'm at the outer limits of my understanding here, but the above indicates to me that *all* of the slots with a (PyCFunction) signature could halve their access time, if instead of being wrapped in a PyCFunction_Type, they could just be added to the method table with the appropriate name for the slot, a METH_O flag, and a generic docstring. My first thought for an implementation is to append a forwarding function, and append it to the method definition table (manually at first, but using PyType_Ready later to do it automagically): static PyObject * directcontains(PyObject *mp, PyObject *key) { register long ok = mp->ob_type->tp_as_sequence->sq_contains(mp, key); return PyInt_FromLong(ok); /* Note this really should be PyBool */ } PyDoc_STRVAR(directcontains__doc__, "builtin __contains__ method"); static PyMethodDef mapp_methods[] = { {"__contains__", (PyCFunction)directcontains, METH_O, directcontains__doc__}, . . . In theory, this makes the two methods equally fast. In practice, the new function doesn't get run when d.__contains__() is called. Somewhere, I need to tell it that __contains__ refers to the new function instead of the old wrapper. If anyone knows where or has a better idea, please let me know. Raymond Hettinger ---------------------------------------------------------------------- P.S. Here's the source for dict.has_key() and dict.__contains__(): static PyObject * dict_has_key(register dictobject *mp, PyObject *key) { long hash; register long ok; if (!PyString_CheckExact(key) || (hash = ((PyStringObject *) key)->ob_shash) == -1) { hash = PyObject_Hash(key); if (hash == -1) return NULL; } ok = (mp->ma_lookup)(mp, key, hash)->me_value != NULL; return PyBool_FromLong(ok); } static int dict_contains(dictobject *mp, PyObject *key) { long hash; if (!PyString_CheckExact(key) || (hash = ((PyStringObject *) key)->ob_shash) == -1) { hash = PyObject_Hash(key); if (hash == -1) return -1; } return (mp->ma_lookup)(mp, key, hash)->me_value != NULL; }
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4