On 2011-04-27 23:24 , Guido van Rossum wrote:
> On Wed, Apr 27, 2011 at 9:15 PM, Alexander Belopolsky
> <alexander.belopolsky at gmail.com> wrote:
>> On Wed, Apr 27, 2011 at 2:48 PM, Robert Kern <robert.kern at gmail.com> wrote:
>> ..
>>> I suspect most of us would oppose changing it on general
>>> backwards-compatibility grounds rather than actually *liking* the current
>>> behavior. If the behavior changed with Python floats, we'd have to mull over
>>> whether we try to match that behavior with our scalar types (one of which
>>> subclasses from float) and our arrays. We would be incompatible with either
>>> Python or C, and we'd probably end up choosing to diverge from Python. It
>>> would make a mess, honestly. We already have to explain why equality is
>>> funky for arrays (arr1 == arr2 is a rich comparison that gives an array, not
>>> a bool, so we can't do containment tests for lists of arrays), so NaN is
>>> pretty easy to explain afterward.
>>
>> Most NumPy applications are actually not exposed to NaN problems
>> because it is recommended that NaNs be avoided in computations, and
>> when missing or undefined values are necessary, the recommended
>> solution is to use ma.array, a masked array, which is a drop-in
>> replacement for the numpy array type and carries a boolean "mask"
>> value with every element. This allows undefined elements in arrays
>> of any type: float, integer, or even boolean. Masked values propagate
>> through all computations, including comparisons.
>
> So do new masks get created when the outcome of an elementwise
> operation is a NaN?

No.

> Because that's the only reason why one should have
> NaNs in one's data in the first place -- not to indicate missing
> values!

Yes. I'm not sure that Alexander was being entirely clear. Masked arrays are intended to solve just the missing data problem, not to handle NaNs that arise from computations.
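A small sketch of the two behaviors discussed above (the array values are illustrative; this assumes a standard NumPy install):

```python
import numpy as np

# Equality on arrays is a rich comparison: it returns a boolean
# *array*, not a single bool, so containment tests on lists of
# arrays break.
a = np.array([1.0, 2.0])
b = np.array([1.0, 2.0])
print(a == b)           # [ True  True] -- an array, not True
# `a in [b]` would raise ValueError: the truth value of an array
# with more than one element is ambiguous.

# Masked arrays carry missing-ness explicitly, and masks propagate
# through elementwise operations:
m = np.ma.array([1.0, 2.0, 3.0], mask=[False, True, False])
print((m + 10).mask)    # [False  True False] -- element 1 stays masked

# But a NaN produced *by* a computation does not become masked;
# NaN and "missing" are separate concepts:
x = np.ma.array([np.inf, 1.0], mask=[False, False])
with np.errstate(invalid='ignore'):
    y = x - np.inf      # inf - inf -> nan in element 0
print(np.isnan(y[0]), y[0] is np.ma.masked)  # True False
```

Note that the NaN in the last result is an ordinary floating-point value sitting in an unmasked slot, which is exactly the distinction being drawn: masks mark missing data, NaNs mark invalid computations.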
There is still a persistent part of the community that really does like to use NaNs for missing data, though. I don't think that's entirely relevant to this discussion[1].

I wouldn't say that numpy applications aren't exposed to NaN problems. They are just as exposed to computational NaNs as you would expect any application that does that many flops to be.

[1] Okay, that's a lie. I'm sure that persistent minority would *love* to have NaN == NaN, because that would make their (ab)use of NaNs easier to work with.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco