[Aaron Watters]
> ...
> I thought it would be good to be able to do the following loop
> with Numeric arrays
>
>     for x in array1:
>         array2[x] = array3[x] + array4[x]
>
> without any memory management being involved.  Right now, I think the
> for loop has to continually dynamically allocate each new x

Actually not, it just binds x to the sequence of PyObject*'s already in
array1, one at a time.  It does bump & drop the refcount on that object a
lot.  Also irksome is that it keeps allocating/deallocating a little
integer on each trip, for the under-the-covers loop index!  Marc-Andre (I
think) had/has a patch to worm around that, but IIRC it didn't make much
difference (wouldn't expect it to, though -- not if the loop body does any
real work).

One thing a smarter Python compiler could do is notice the obvious <snort>:
the *internal* incref/decref operations on the object denoted by x in the
loop above must cancel out, so there's no need to do any of them.
"internal" == those due to the routine actions of the PVM itself, while
pushing and popping the eval stack.  Exploiting that is tedious; e.g.,
inventing a pile of opcode variants that do the same thing as today's
except skip an incref here and a decref there.

> and intermediate sum (and immediate deallocate them)

The intermediate sum is allocated each time, but not deallocated (the
pre-existing object at array2[x] *may* be deallocated, though).

> and that makes the loop piteously slow.

A lot of things conspire to make it slow.  David is certainly right that,
in this particular case,

    array2[array1] = array3[array1] + etc

worms around the worst of them.

> The idea replacing pyobject *'s with a struct [typedescr *, data *]
> was a space/time tradeoff to speed up operations like the above
> by eliminating any need for mallocs or other memory management..

Fleshing out details may make it look less attractive.
For machines where ints are no wider than pointers, the "data *" can be
replaced with the int directly and then there's real potential.  If for a
float the "data*" really *is* a pointer, though, what does it point *at*?
Some dynamically allocated memory to hold the float appears to be the only
answer, and you're right back at the problem you were hoping to avoid.

Make the "data*" field big enough to hold a Python float directly, and the
descriptor likely zooms to 128 bits (assuming float is IEEE double and the
machine requires natural alignment).

Let's say we do that.  Where does the "+" implementation get the 16 bytes
it needs to store its result?  The space presumably already exists in the
slot indexed by array2[x], but the "+" implementation has no way to *know*
that.  Figuring it out requires non-local analysis, which is quite a few
steps beyond what Python's compiler can do today.

Easiest: internal functions all grow a new PyDescriptor* argument into
which they are to write their result's descriptor.  The PVM passes "+" the
address of the slot indexed by array2[x] if it's smart enough; or, if it's
not, the address of the stack slot descriptor into which today's PVM
*would* push the result.  In the latter case the PVM would need to copy
those 16 bytes into the slot indexed by array2[x] later.

Neither of those is as simple as it sounds, though, at least because if
array2[x] holds a descriptor with a real pointer in its variant half, the
thing to which it points needs to get decref'ed iff the add succeeds.  It
can get very messy!

> I really can't say whether it'd be worth it or not without some sort of
> real testing.  Just a thought.

It's a good thought!  Just hard to make real.

but-if-michael-hudson-keeps-hacking-at-bytecodes-and-christian-
    keeps-trying-to-prove-he's-crazier-than-michael-by-2001-
    we'll-be-able-to-generate-optimized-vector-assembler-for-
    it<wink>-ly y'rs  - tim
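
For concreteness, here is a rough Python sketch of the "easiest" scheme
described above: a two-field descriptor, and a "+" that writes its result
into a slot the caller hands it.  Everything here is hypothetical
(Descriptor, Payload, add_into, make_floats, the TAG_* constants), an
integer tag stands in for the "typedescr *", and only floats are handled.
It uses ctypes purely to model the layout; it doesn't remove any allocation
from CPython itself, and it ignores the decref mess for pointer-holding
slots.

```python
import ctypes

# Hypothetical type tags standing in for the "typedescr *" field.
TAG_INT, TAG_FLOAT = 0, 1


class Payload(ctypes.Union):
    # The variant half: wide enough to hold a C double directly.
    _fields_ = [("as_int", ctypes.c_long), ("as_float", ctypes.c_double)]


class Descriptor(ctypes.Structure):
    # Assumed layout: a type tag followed by the inline payload.
    _fields_ = [("type_tag", ctypes.c_int), ("payload", Payload)]


def add_into(result, left, right):
    """Write left + right into the caller-supplied slot 'result'.

    Returns True on success.  On failure 'result' is left untouched, so
    whatever the slot held before is only overwritten if the add succeeds.
    """
    if left.type_tag == TAG_FLOAT and right.type_tag == TAG_FLOAT:
        total = left.payload.as_float + right.payload.as_float
        result.type_tag = TAG_FLOAT
        result.payload.as_float = total
        return True
    return False  # unsupported operand types in this sketch


def make_floats(values):
    """Build a contiguous array of float descriptors."""
    arr = (Descriptor * len(values))()
    for i, value in enumerate(values):
        arr[i].type_tag = TAG_FLOAT
        arr[i].payload.as_float = value
    return arr


array3 = make_floats([1.5, 2.5])
array4 = make_floats([2.0, 4.0])
array2 = make_floats([0.0, 0.0])

# The caller (playing the PVM) passes the destination slot straight to "+".
for i in range(len(array2)):
    add_into(array2[i], array3[i], array4[i])
```

ctypes.sizeof(Descriptor) reports how big the descriptor comes out on
whatever platform runs this.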