Armin Rigo writes:
> Is it really what we want? It seems to me that the precision of a
> float gives roughly the magnitude of "uncertainty" of the value that
> it represents. If a float is large enough, then the same float can
> represent several long integer values. It should then probably be
> considered equal to any of these integers, instead of just equal to
> the one arbitrarily returned by casting it to long.

I think that what you describe is NOT the behavior that we want, and
I'll try to give an explanation of why. But please bear with me... I'm
still a beginner at understanding some of the theory behind numerical
computing, so I may make some mistakes. Hopefully those who understand
it better will correct me.

It's a fact that only certain numbers can be represented exactly when
using numbers of finite precision. For instance, floating point numbers
where the exponent is larger than the number of digits of precision can
represent certain integers but not others. The same situation arises in
the use of floating (or fixed) point decimals -- only certain fractions
can be expressed exactly.

There are (at least) two different mental models we can use to deal
with this limitation of finite precision. One (which you are espousing
above) is to treat each limited-precision number as if it had an
inherent uncertainty, representing all those values which are closer to
that representable number than to any other. The other mental model is
to suppose that each limited-precision number represents just one
specific value, and that the other values simply cannot be represented.

I always understand things better when I have concrete examples, so let
me use numbers in base 10 with a precision of 3 to illustrate these two
viewpoints. With a precision of 3 we can write out the values 2.34x10^4
for the number 23,400 and 2.35x10^4 for the number 23,500. But 23,401,
23,402, and all the others in between can't be represented. So by the
"Error bar" mental model, we think of "2.34x10^4" as shorthand for
"some number from 23,350 to 23,450". By the "Exact values" mental model
we think of "2.34x10^4" as meaning "the exact value 23,400".

Now that we know there are two different mental models, we have to
wonder whether one is inherently better than the other (and whether it
makes any difference). I maintain that it DOES make a difference, and
that the "Exact values" mental model is BETTER than the "Error bar"
model. The reason why it is better doesn't come into play until you
start performing operations on the numbers, so let's examine that.

Suppose that you are adding two numbers under the "Exact values" mental
model. Adding 1.23x10^4 and 3.55x10^4 is quite easy... the exact sum of
12,300 and 35,500 is 47,800, or 4.78x10^4. But if we add 1.23x10^4 and
8.55x10^2, then the "real" sum is 12,300 + 855 = 13,155. That CAN'T be
represented with a precision of 3, so we have to ROUND IT OFF and store
1.32x10^4 instead. In general, the VALUES are quite clear in the "Exact
values" model, but OPERATIONS may require rounding to express the
result.
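(You can watch this model in action with today's decimal module -- the
descendant of Facundo's PEP 327 work mentioned below. The context
precision of 3 is my own assumption, chosen to match the example:)

    from decimal import Decimal, getcontext

    # "Exact values" model: each Decimal IS its exact value; rounding
    # happens only when an operation's result won't fit the precision.
    getcontext().prec = 3            # three significant digits

    a = Decimal("1.23E+4")           # exactly 12,300
    b = Decimal("3.55E+4")           # exactly 35,500
    c = Decimal("8.55E+2")           # exactly 855

    print(a + b)                     # 4.78E+4  (exact, no rounding needed)
    print(a + c)                     # 1.32E+4  (13,155 rounded to 3 digits)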
Proponents of the "Error bar" mental model are now thinking "yes, well
our model trades complexity of VALUES for simplicity in OPERATIONS".
But unfortunately that's just not true. Again, let us try to add
1.23x10^4 and 3.55x10^4. What that REALLY means (in the "Error bar"
mental model) is that we are adding SOME number in the range
12,250..12,350 to SOME number in the range 35,450..35,550. The result
must be some number in the range 47,700..47,900. That is LIKELY to
belong to the range 47,750..47,850 (i.e. 4.78x10^4), but it MIGHT also
come out to something in the 4.77x10^4 range or the 4.79x10^4 range. We
intend to return a result of 4.78x10^4, so operations in the "Error
bar" mental model aren't simple either, but contain some sort of
"rounding off" as well -- only the exact rules for how THIS "rounding
off" works are a bit more confusing; I'm not sure I could express them
very well.

(Side Note: You might think to fix this problem by saying that instead
of perfect ranges, the numbers represent probability curves centered
around the given number. By using gaussian curves, you might even be
able to make addition "simple" (at the cost of a very complex idea of
"values"). But whatever distribution worked for addition would NOT work
for multiplication or other operations, so this trick is doomed to
failure.)

So it turns out that WHATEVER you do, operations on finite-precision
numbers are going to require some sort of a "rounding off" step to keep
the results within the domain. That being so, the "Exact values" mental
model is superior. Not only does it allow a very simple interpretation
of the "values", but it *also* allows the definition of "rounding off"
to be part of the definition of the operations and not be linked to the
values themselves. (For instance, I prefer a simple "round-to-nearest"
rule, but someone else may have a different rule, and that only affects
how they perform operations.) It is for this reason that (I think) most
serious specifications for the behavior of finite-precision numbers
prefer to use the "Exact values" mental model.

(Another side note: You will notice that the argument here is
strikingly similar to the behavior of PEP 327's Decimal type. That's
because I learned just about everything I'm saying here from following
the discussions (and doing the recommended reading) about Facundo's
work there.)

Okay, in my long-winded fashion, I may have shown that the "Exact
values" mental model is superior to the "Error bars" model. But does it
really make any difference to anyone other than writers of numerical
arithmetic specifications? It does, and one of the best examples is the
one that started this conversation... comparing instances of different
numerical types. I maintain that when comparing a long with a float
where the exponent is larger than the precision, the float should be
treated as if it were EXACTLY EQUAL TO <coefficient>*2^<exponent>,
instead of trying to treat it as some sort of a range.
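(This is in fact the rule modern CPython adopted: int/float comparison
is done exactly, never by casting. A sketch -- 1e17 is my arbitrary
choice of a float whose exponent exceeds its 53 bits of precision, so
consecutive doubles in that neighborhood are 16 apart:)

    import math

    x = 1e17                          # exactly 10**17 (it happens to fit)
    print(x == 10**17)                # True
    print(x == 10**17 + 1)            # False: x is one exact value, not a range
    print(float(10**17 + 1) == x)     # True: CASTING that integer loses the +1

    # The one value the float is equal to: <coefficient> * 2**<exponent>
    m, e = math.frexp(x)              # x == m * 2**e, with 0.5 <= m < 1
    print(int(m * 2**53) * 2**(e - 53) == 10**17)   # True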
Similarly, if we implemented a Rational type, I would suggest that
instances of float (or of Facundo's Decimal) where <exponent> is LESS
than the digits of <coefficient> should be treated as if they were
EXACTLY EQUAL TO the corresponding fraction-over-a-power-of-two.
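(A sketch of what I mean, using the fractions module from today's
standard library, which adopts exactly this rule:)

    from fractions import Fraction

    # The float 0.1 is EXACTLY EQUAL TO one fraction over a power of
    # two -- and it is NOT equal to the decimal fraction 1/10.
    print(Fraction(0.1) == Fraction(1, 10))      # False
    num, den = (0.1).as_integer_ratio()          # the exact fraction
    print(den == 2**55)                          # True: a power of two
    print(Fraction(0.1) == Fraction(num, den))   # True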
All right, those who ACTUALLY understand this stuff (rather than having
learned it from this newsgroup) can go ahead and correct me now.

Once-a-teacher,-always-trying-to-give-a-lecture-lly yours,

-- Michael Chermside