There is a complex set of questions around how to handle method resolution in the presence of `__numpy_ufunc__`. Currently in master is an extremely complicated set of rules that isn't documented and that I don't actually understand (see #5748 for the latest set of changes to this), so it's kinda hard to know whether they are correct, but I suspect not. And this is a blocker for 1.10, b/c whatever we release in 1.10 will be set in stone forever.
I strongly feel that we cannot include `__numpy_ufunc__` in a release without at least having a document somewhere describing what the actual dispatch rules are. I hope that doesn't mean we have to defer `__numpy_ufunc__` for another release, but if it does then it does.
AFAICT this is how `a op b` dispatch works for ndarrays, BEFORE `__numpy_ufunc__` (i.e., this is how 1.9 works):
1. First, Python decides whether to call `a.__op__(b)` or `b.__rop__(a)`. So in the case where one of these objects is a proper subclass of the other, that object always gets to do absolutely anything, so that's fine. The interesting cases are the ones where neither is a proper subclass of the other (either because it's like, `matrix + masked array`, or because it's like `ndarray + scipy.sparse`). So without loss of generality, let's focus on the case where Python calls `a.__op__(b)`, and `a` is either an instance of `ndarray` or else an instance of a subclass of `ndarray` which has not overridden `__op__`, i.e. we're getting `ndarray.__op__(a, b)`.
2. `ndarray.__op__` has the following logic (see `PyArray_GenericBinaryFunction` in `number.c`):
   - If `b` is not an `ndarray` at all (even a subclass), and `b` has a higher `__array_priority__` than `a`, then we return `NotImplemented` and let control pass to `b.__rop__(a)` (see the sketch after this list).
   - Otherwise, we call `np.op(a, b)` and let the ufunc machinery take over.
3. `np.op(a, b)` does the following (see `PyUFunc_GenericFunction` and `PyUFunc_GeneralizedFunction` in `ufunc_object.c`, and also `ufunc_generic_call`, which converts `-2` return values from the previous two into `NotImplemented`, so you have to audit their whole call stack):
   - If `b` is not an `ndarray`, and calling `np.array(b)` returns an object array (presumably because coercion failed... though I guess this could also be hit if `b.__array__()` returned an object array or something), AND `b` has a higher `__array_priority__` than `a`, and `b` has an `__rop__` method, then return `NotImplemented`.
   - If the operands involve structured dtypes, then return `NotImplemented`. (This is buried in `get_ufunc_arguments`; search for `return -2`.)
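As an illustration of the `__array_priority__` handoff in rule 2, here is a minimal sketch; `Duck` is a made-up class, and this reflects the 1.9 behaviour described above:

```python
import numpy as np

class Duck:
    # Not an ndarray subclass; relies purely on __array_priority__ to win.
    __array_priority__ = 1000.0

    def __radd__(self, other):
        return "Duck.__radd__ got control"

a = np.arange(3)
d = Duck()

# ndarray.__add__ sees a non-ndarray with higher __array_priority__, returns
# NotImplemented, and Python's operator machinery falls back to d.__radd__:
print(a + d)   # -> "Duck.__radd__ got control"
```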
Now, my suggestion is that the way we would EVENTUALLY like this to look is:
1. Python decides whether to call `a.__op__(b)` or `b.__rop__(a)`. As above, let's assume that it invokes `ndarray.__op__(a, b)`.
2. `ndarray.__op__(a, b)` just calls `np.op(a, b)` (which in turn invokes all the standard ufunc stuff, including `__numpy_ufunc__` resolution); see the sketch after this list.
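Concretely, here is a minimal sketch (illustrative Python only; the real implementation would live in C) of what every `ndarray.__op__` would reduce to under these rules:

```python
import numpy as np

# Illustrative only: under the dream rules each ndarray operator body shrinks
# to a bare ufunc call, and ALL override logic -- including __numpy_ufunc__
# resolution -- lives inside the ufunc machinery itself.
def dream_ndarray_add(a, b):      # plays the role of ndarray.__add__
    return np.add(a, b)

def dream_ndarray_radd(a, b):     # plays the role of ndarray.__radd__
    return np.add(b, a)
```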
I submit that IF we can make this work, then it is obviously the ideal outcome, because it is the simplest possible solution. But is it too simple? To determine this we have to answer two questions: (1) Will it adequately address all the relevant use cases? (2) Can we get there from here?
So let's compare the current rules to my dream rules.
First, we observe that everything that currently happens inside the ufunc machinery looks like it's totally wrong. The first check can only be triggered if `b` is a non-`ndarray` that has a higher `__array_priority__` (among other things), but if we look above, we see that those conditions are sufficient to trigger the check in `ndarray.__op__`, so checking again at the ufunc level is redundant at best. And the second check is just incoherent nonsense AFAICT. The only reason to return `NotImplemented` is b/c you want to pass control to another `__(r)op__` method, and there's no reason arrays containing structured dtypes in particular should somehow magically have different `__(r)op__` methods available than other arrays. So we can just get rid of all the ufunc stuff immediately, great.
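To see why, note that `NotImplemented` is only meaningful to Python's binary-operator machinery; from a plain function call the caller just receives the sentinel as an ordinary value, with no `__rop__` fallback. A contrived illustration (`flaky_add` is hypothetical, not a real ufunc):

```python
def flaky_add(a, b):
    # Mimic the criticized ufunc-level behaviour: bail out with the sentinel
    # instead of computing or raising.
    return NotImplemented

out = flaky_add(1, 2)
print(out)  # prints "NotImplemented" -- Python does NOT try any __radd__ here
```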
That leaves the `__array_priority__` stuff. We have two problems here: we can't just drop this immediately b/c of backcompat issues, and we need to have some way to continue to support all the use cases that this currently supports. The first problem is just a matter of having a deprecation period. For the second, observe that a class which defines a `__numpy_ufunc__` method gets complete control over what any ufunc call does, so it has almost as much power as a class that currently sets `__array_priority__`. The only additional power that `__array_priority__` currently gives you is that it lets you distinguish between e.g. a call to `ndarray.__add__(a, b)` versus a call to `np.add(a, b)`. So the only code that really loses out from my proposed change is code which wants `a + b` and `add(a, b)` to do different things.
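For illustration, here is a minimal sketch of such a class, written against the draft `__numpy_ufunc__` protocol (the signature below follows the draft NEP and is an assumption; no released NumPy implements it, so this will not run against 1.9):

```python
import numpy as np

class UFuncOwner:
    # Hypothetical class relying on the draft __numpy_ufunc__ hook: the same
    # method intercepts both operator syntax (a + x) and direct ufunc calls
    # (np.add(a, x)), which is why it cannot make the two behave differently.
    def __numpy_ufunc__(self, ufunc, method, i, inputs, **kwargs):
        return "{}.{} handled by UFuncOwner".format(ufunc.__name__, method)

    # Reflected methods are still needed so Python's own dispatch can hand
    # over control when ndarray.__op__ returns NotImplemented:
    def __radd__(self, other):
        return np.add(other, self)
```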
AFAIK in the entire history of numpy there is only one situation where this power has been used on purpose: the definition of matrix classes where `a * b` is matmul, but `np.multiply(a, b)` is elmul. And we've all agreed that such classes should be deprecated and eventually phased out (cite: PEP 465).
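For concreteness, `np.matrix` exhibits exactly this divergence:

```python
import numpy as np

m = np.matrix([[1, 2], [3, 4]])
print(m * m)              # matrix product:     [[ 7 10], [15 22]]
print(np.multiply(m, m))  # elementwise product: [[ 1  4], [ 9 16]]
```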
So, I conclude that EVENTUALLY my dream rules should work great. The only problem is that we need some temporary compromises to get us from here to there. Therefore, I propose we use the following dispatch rules in numpy 1.10, with the goal of moving to my "dream rules" in some future version:
1. Python decides whether to call `a.__op__(b)` or `b.__rop__(a)`. As above, let's assume that it invokes `ndarray.__op__(a, b)`.
2. `ndarray.__op__(a, b)` does the following:
   - If `b` **does not define `__numpy_ufunc__` and** is not an `ndarray` at all (even a subclass), and `b` has a higher `__array_priority__` than `a`, then we **issue a deprecation warning and** return `NotImplemented` and let control pass to `b.__rop__(a)`. (Bolded parts are changes compared to the current behaviour.)
   - If `__op__` is `__mul__` and `b->tp_class->tp_name.startswith("scipy.sparse.")`, then return `NotImplemented`. (This rule is necessary in addition to the above, because scipy.sparse has already made a release containing `__numpy_ufunc__` methods, so the exception above doesn't apply.)
   - Otherwise, we call `np.op(a, b)` and let the ufunc machinery take over. (A rough Python rendering of this logic follows below.)
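In rough Python pseudocode (the real logic would live in C in `PyArray_GenericBinaryFunction`; the module-name test below stands in for the C-level `tp_name` check and is only an approximation), `ndarray.__mul__` under these rules would look like:

```python
import warnings
import numpy as np

def proposed_ndarray_mul(a, b):
    # Rule 1: legacy __array_priority__ escape hatch, now deprecated, and
    # skipped entirely when b opts in to __numpy_ufunc__.
    if (not hasattr(type(b), "__numpy_ufunc__")
            and not isinstance(b, np.ndarray)
            and getattr(b, "__array_priority__", 0.0)
                > getattr(a, "__array_priority__", 0.0)):
        warnings.warn("deferring to __rmul__ via __array_priority__ is "
                      "deprecated", DeprecationWarning)
        return NotImplemented
    # Rule 2: scipy.sparse already shipped __numpy_ufunc__, so special-case
    # '*' for it anyway (approximating the tp_name prefix check).
    if type(b).__module__.startswith("scipy.sparse"):
        return NotImplemented
    # Rule 3: otherwise the ufunc machinery (incl. __numpy_ufunc__) takes over.
    return np.multiply(a, b)
```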
I believe that this is adequate to cover all practical use cases for the current dispatch machinery, and gives us a clean path to better dispatch machinery in the future.
The main alternative proposal is Pauli's, which involves a very complicated check (I won't try to summarize it here; see this comment and the following code). The goal of that approach is to continue supporting classes where `a + b` and `add(a, b)` do different things. I don't think that keeping substantial additional complexity around indefinitely is worth it in order to support functionality that no-one has ever found a use for except in one very specific case (overriding `__mul__`), and where we generally agree that that one specific case should be phased out as soon as possible.
I would very much appreciate feedback from scipy.sparse and astropy in particular on whether the above covers all their concerns.
(Partial) History: #4815, #5748
CC: @pv, @cowlicks, @mhvk