[Tim]
>> If they're surprised by this, they indeed don't understand the
>> arithmetic at all!  This is an argument for using a different form of
>> arithmetic, not for lying about reality.

> This is not lying!

Yes, I overstated that.  It's not lying, but I defy anyone to explain the
full truth of it in a way even Guido could understand <0.9 wink>.  "Shortest
conversion" is a subtle concept, requiring knowledge not only of the
mathematical value, but of details of the HW representation.  Plain old
"correct rounding" is HW-independent, so is much easier to *fully*
understand.  And in things floating-point, what you don't fully understand
will eventually burn you.

Note that in a machine with 2-bit floating point, the "shortest conversion"
for 0.75 is the string "0.8":  this should suggest the sense in which
"shortest conversion" can be actively misleading too.

> If you type in "3.1416" and Python says "3.1416", then indeed it is the
> case that "3.1416" is a correct way to type in the floating-point number
> being expressed.  So "3.1415999999999999" is not any more truthful than
> "3.1416" -- it's just more annoying.

Yes, shortest conversion is *defensible*.  But Python has no code to
implement that now, so it's not an option today.

> I just tried this in Python 1.5.2+:
>
>     >>> .1
>     0.10000000000000001
>     >>> .2
>     0.20000000000000001
>     >>> .3
>     0.29999999999999999
>     >>> .4
>     0.40000000000000002
>     >>> .5
>     0.5
>     >>> .6
>     0.59999999999999998
>     >>> .7
>     0.69999999999999996
>     >>> .8
>     0.80000000000000004
>     >>> .9
>     0.90000000000000002
>
> Ouch.

As shown in my reply to Christian, shortest conversion is not a cure for
this "gosh, it printed so much more than I expected it to"; it only appears
to "fix it" in the simplest examples.  So long as you want
eval(what's_displayed) == what's_typed, this is unavoidable.  The only ways
to avoid that are to use a different arithmetic, or stop using repr() at the
prompt.

>> As above.
>> repr() shouldn't be used at the interactive prompt
>> anyway (but note that I did not say str() should be).

> What, then?  Introduce a third conversion routine and further
> complicate the issue?  I don't see why it's necessary.

Because I almost never want current repr() or str() at the prompt, and even
you <wink> don't want 3.1416-3.141 to display 0.0005999999999999339 (which
is the least you can print and have eval return the true answer).

>>> What should really happen is that floats intelligently print in
>>> the shortest and simplest manner possible

>> This can be done, but only if Python does all fp I/O conversions
>> entirely on its own -- 754-conforming libc routines are inadequate
>> for this purpose

> Not "all fp I/O conversions", right?  Only repr(float) needs to
> be implemented for this particular purpose.  Other conversions
> like "%f" and "%g" can be left to libc, as they are now.

No, all, else you risk %f and %g producing results that are inconsistent
with repr(), which creates yet another set of incomprehensible surprises.
This is not an area that rewards half-assed hacks!  I'm intimately familiar
with just about every half-assed hack that's been tried here over the last
20 years -- they never work in the end.  The only approach that ever bore
fruit was 754's "there is *a* mathematically correct answer, and *that's*
the one you return".  Unfortunately, they dropped the ball here on
float<->string conversions (and very publicly regret that today).

> I suppose for convenience's sake it may be nice to add another
> format spec so that one can ask for this behaviour from the "%"
> operator as well, but that's a separate issue (perhaps "%r" to
> insert the repr() of an argument of any type?).

%r is cool!  I like that.

>>> def smartrepr(x):
>>>     p = 17
>>>     while eval('%%.%df' % (p - 1) % x) == x: p = p - 1
>>>     return '%%.%df' % p % x

>> This merely exposes accidents in the libc on the specific
>> platform you run it.
>> That is, after
>>
>>     print smartrepr(x)
>>
>> on IEEE-754 platform A, reading that back in on IEEE-754
>> platform B may not yield the same number platform A started with.

> That is not repr()'s job.  Once again:
>
>     repr() is not for the machine.

And once again, I didn't and don't agree with that, and, to save the next
seven msgs, never will <wink>.

> It is not part of repr()'s contract to ensure the kind of
> platform-independent conversion you're talking about.  It
> prints out the number in a way that upholds the eval(repr(x)) == x
> contract for the system you are currently interacting with, and
> that's good enough.

It's not good enough for Java and Scheme, and *shouldn't* be good enough for
Python.  The 1.6 repr(float) is already platform-independent across IEEE-754
machines (it's not correctly rounded on most platforms, but *does* print
enough that 754 guarantees bit-for-bit reproducibility) -- and virtually all
Python platforms are IEEE-754 (I don't know of an exception -- perhaps
Python is running on some ancient VAX?).  The std has been around for 15+
years, virtually all platforms support it fully now, and it's about time
languages caught up.

BTW, the 1.5.2 text-mode pickle was *not* sufficient for reproducing floats
either, even on a single machine.  It is now -- but thanks to the change in
repr.

> If you wanted platform-independent serialization, you would
> use something else.

There is nothing else.  In 1.5.2 and before, people mucked around with
binary dumps hoping they didn't screw up endianness.

> As long as the language reference says
>
>     "These represent machine-level double precision floating
>     point numbers.  You are at the mercy of the underlying
>     machine architecture and C implementation for the accepted
>     range and handling of overflow."
>
> and until Python specifies the exact sizes and behaviours of
> its floating-point numbers, you can't expect these kinds of
> cross-platform guarantees anyway.
There's nothing wrong with exceeding expectations <wink>.  Despite what the
reference manual says, virtually all machines use identical fp
representations today (this wasn't true when the text above was written).

> str()'s contract:
>   - if x is a string, str(x) == x
>   - otherwise, str(x) is a reasonable string coercion from x

The last is so vague as to say nothing.  My counterpart -- at least equally
vague -- is

    - otherwise, str(x) is a string that's easy to read and contains
      a compact summary indicating x's nature and value in general
      terms

> repr()'s contract:
>   - if repr(x) is syntactically valid, eval(repr(x)) == x
>   - repr(x) displays x in a safe and readable way

I would say instead:

    - every character c in repr(x) has ord(c) in range(32, 128)
    - repr(x) should strive to be easily readable by humans

>   - for objects composed of basic types, repr(x) reflects
>     what the user would have to say to produce x

Given your first point, does this say something other than "for basic
types, repr(x) is syntactically valid"?  Also unclear what "basic types"
means.

> pickle's contract:
>   - pickle.dumps(x) is a platform-independent serialization
>     of the value and state of object x

Since pickle can't handle all objects, this exaggerates the difference
between it and repr.  Give a fuller description, like

    - If pickle.dumps(x) is defined,
      pickle.loads(pickle.dumps(x)) == x

and it's the same as the first line of your repr() contract, modulo

    s/syntactically valid/is defined/
    s/eval/pickle.loads/
    s/repr/pickle.dumps/

The differences among all these guys remain fuzzy to me.

but-not-surprising-when-talking-about-what-people-like-to-look-at-ly
    y'rs  - tim
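
The two display policies argued over above (17 significant digits versus
"shortest conversion") can be compared with a rough sketch in present-day
Python.  The helper name shortest_roundtrip is hypothetical, and the sketch
assumes IEEE-754 doubles and correctly rounded string<->float conversions on
the platform running it; it is an illustration of the idea, not the
algorithm any Python release uses for repr().

```python
def shortest_roundtrip(x, max_digits=17):
    """Return x formatted with the fewest significant digits p such that
    the p-digit '%g' rendering reads back as exactly x.

    Hypothetical helper for illustration only.  Assumes x is a finite
    IEEE-754 double and that float() and '%g' are correctly rounded.
    """
    for p in range(1, max_digits + 1):
        s = '%.*g' % (p, x)        # '*' takes the precision from the tuple
        if float(s) == x:          # does this rendering round-trip?
            return s
    # 17 significant digits are always enough to identify a double.
    return '%.*g' % (max_digits, x)


print('%.17g' % 0.1)                        # 0.10000000000000001
print(shortest_roundtrip(0.1))              # 0.1
print(shortest_roundtrip(3.1416 - 3.141))   # still a long string of digits
```

The last line is the case raised earlier in the message: once a value comes
out of arithmetic rather than straight from a literal, the shortest string
that round-trips is usually not short at all.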