I'm not looking for point-by-point answers here, I'm just pointing out things that were hard to follow so that they may get addressed in a revision. [Guido] > Inspired by talks by Jeremy and Skip on DevDay, here's a different > idea for speeding up access to globals. It retain semantics but (like > Jeremy's proposal) changes the type of a module's __dict__. > > - Let a cell be a really simple PyObject, containing a PyObject > pointer and a cell pointer. Meaning a pointer to a cell, I bet. Note that in the pseduo-code at the end, the cellptr member of cell objects is never referenced, so it's hard to be sure. > Both pointers may be NULL. (It may have to be called PyGlobalCell > since I believe there's already a PyCell object.) There is a PyCellObject already. > (Maybe it doesn't even have to be an object -- it could just be a tiny > struct.) Would probably make it much harder to use the existing dict code (which maps PyObjects to PyObjects). > - Let a celldict be a mapping that is implemented using a dict of > cells. Presumably this is a mapping *to* cells, and from ...? String objects? > When you use its getitem method, the PyObject * in the cell is > dereferenced, and if a NULL is found, getitem raises KeyError > even if the cell exists. Had a hard time with this: 1. Letting p be "the PyObject* in the cell", are you saying p==NULL or *p==NULL is the KeyError trigger? "dereference" suggests the latter, but the former seems to make more sense. 2. Presumably the first "the cell" in this sentence refers to a different cell than the second "the cell" intends. > Using setitem to add a new value creates a new cell and stores the > value there; Presumably in the PyObject* member of the new cell. To what is the cellptr member of the new cell set? I think NULL. > using setitem to change the value for an existing key stores the > value in the existing cell for that key. There's a separate API to > access the cells. delitem is missing, but presumably straightforward. > - We change the module implementation to use a celldict for its > __dict__. The module's getattr and setattr operations now map to > getitem and setitem on the celldict. I think the type of > <module>.__dict__ and globals() is the only backwards > incompatibility. > > - When a module is initialized, it gets its __builtins__ from the > __builtin__ module, which is itself a celldict. Surely the __builtin__ module isn't a celldict, but rather has a __dict__ that is a celldict. > For each cell in __builtins__, the new module's __dict__ adds a cell > with a NULL PyObject pointer, whose cell pointer points to the > corresponding cell of __builtins__. > > - The compiler generates LOAD_GLOBAL_CELL <i> (and STORE_GLOBAL_CELL > <i> etc.) opcodes for references to globals, where <i> is a small > index with meaning only within one code object like the const index > in LOAD_CONST. The code object has a new tuple, co_globals, giving > the names of the globals referenced by the code indexed by <i>. I > think no new analysis is required to be able to do this. Me too. > - When a function object is created from a code object and a celldict, > the function object creates an array of cell pointers by asking the > celldict for cells corresponding to the names in the code object's > co_globals. If the celldict doesn't already have a cell for a > particular name, it creates and an empty one. This array of cell > pointers is stored on the function object as func_cells. I expect that the more we use these guys (cells), the more valuable to make them PyObjects in their own right (for uniformity, ease of introspection, etc). > When a function object is created from a regular dict instead of a > celldict, func_cells is a NULL pointer. This part is regrettable, since it's Yet Another NULL check at the *top* of code using this stuff (meaning it slows the normal case, assuming that it's unusual not to get a celldict). I'm not clear on how code ends up getting created from a regular dict instead of a celldict -- is this because of stuff like "exec whatever in mydict"? > - When the VM executes a LOAD_GLOBAL_CELL <i> instruction, it gets > cell number <i> from func_cells. It then looks in the cell's > PyObject pointer, and if not NULL, that's the global value. If it > is NULL, it follows the cell's cell pointer to the next cell, if it > is not NULL, and looks in the PyObject pointer in that cell. If > that's also NULL, or if there is no second cell, NameError is > raised. (It could follow the chain of cell pointers until a NULL > cell pointer is found; but I have no use for this.) Similar for > STORE_GLOBAL_CELL <i>, except it doesn't follow the cell pointer > chain -- it always stores in the first cell. If I'm reading this right, then in the normal case of resolving "len" in def mylen(s): return len(s) 1. We test func_cells for NULL and find out it isn't. 2. A pointer to a cell object is read out of func_cells at a fixed (wrt this function) offset. This points to len's cell object in the module's celldict. 3. The cell object's PyObject* pointer is tested and found to be NULL. 4. The cell object's cellptr pointer is tested and found not to be NULL. This points to len's cell object in __builtin__'s celldict. 5. The cell object's cellptr's PyObject* is tested and found not to be NULL. 6. The cell object's cellptr's PyObject* is returned. > - There are fallbacks in the VM for the case where the function's > globals aren't a celldict, and hence func_cells is NULL. In that > case, the code object's co_globals is indexed with <i> to find the > name of the corresponding global and this name is used to index the > function's globals dict. Which may not succeed, so we also need another level to back off to the builtins. I'd like to pursue getting rid of the func_cells==NULL special case, even if it means constructing a celldict out of a regular dict for the duration, and feeding mutations back in to the regular dict afterwards. > I believe that's it. I think this faithfully implements the current > semantics (where a global can shadow a builtin), but without the need > for any dict lookups when accessing globals, except in cases where an > explicit dict is passed to exec or eval(). I think <wink> I agree. Note that a chain of 4 test+branches against NULL in "the usual case" for builtins may not be faster on average than inlining the first few useful lines of lookdict_string twice (the expected path in this routine became fat-free for 2.2): i = hash; ep = &ep0[i]; if (ep->me_key == NULL || ep->me_key == key) return ep; Win or lose, that's usually the end of a dict lookup. That is, I'm certain we're paying significantly more for layers of C-level function call overhead today than for what the dict implementation actually does now (in the usual cases). > Compare this to Jeremy's scheme using dlicts: > > http://www.zope.org/Members/jeremy/CurrentAndFutureProjects/FastGlobals > > - My approach doesn't require global agreement on the numbering of the > globals; each code object has its own numbering. This avoids the > need for more global analysis, Don't really care about that. > and allows adding code to a module using exec that introduces new > globals without having to fall back on a less efficient scheme. That is indeed lovely. > - Jeremy's approach might be a teensy bit faster because it may have > to do less work in the LOAD_GLOBAL; but I'm not convinced. LOAD_GLOBAL is executed much more often than STORE_GLOBAL, so whichever scheme wins for LOAD_GLOBAL will enjoy a multiplier effect when measuring overall performance. [and skipping the code]
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4