[Jack Jansen]
> Well, it turns out that disabling fused-add-mul indeed fixes the
> problem. The CodeWarrior manual warns that results may be slightly
> different with and without fused instructions, but the example they
> give is with operations apparently done in higher precision with the
> fused instructions. No word about nonstandard behaviour for +0.0 and
> -0.0.
>
> As this seems to be a PowerPC issue, not a MacOS issue, it is
> something that other PowerPC porters may want to look out for too
> (does AIX still exist?).

The PowerPC architecture's fused instructions are wonderful for experts,
because in a*b+c (assuming IEEE doubles w/ 53 bits of precision) they
compute the a*b part to 106 bits of precision internally, and the add of c
gets to see all of them.  This is great if you *know* c is pretty much the
negation of the high-order 53 bits of the product, because it lets you get
at the *lower* 53 bits too; e.g.,

    hipart = a*b;
    lopart = a*b - hipart;  /* assuming fused mul-sub is generated */

gives a pair of doubles (hipart, lopart) whose mathematical (not f.p.) sum
hipart + lopart is exactly equal to the mathematical (not f.p.) product a*b.
In the hands of an expert, this can, e.g., be used to write ultra-fast
high-precision math libraries: it gives a very cheap way to get the effect
of computing with about twice the native precision.

So that's the kind of thing they're warning you about: without the fused
mul-sub, "lopart" above is always computed to be exactly 0.0, and so is
useless.  Contrarily, some fp algorithms *depend* on cancelling out oodles
of leading bits in intermediate results, and in the presence of fused
mul-add deliver totally bogus results.

However, screwing up 0's sign bit has nothing to do with any of that, and if
the HW is producing -0 for a fused (+anything)*(+0)-(+0), it can't be called
anything other than a HW bug (assuming it's not in the to-minus-infinity
rounding mode).

When a given compiler generates fused instructions (when available) is a
x-compiler crap-shoot, and the compiler you're using *could* have generated
them before with the same end result.  There's really nothing portable we
can do in the source code to convince a compiler never to generate them.  So
looks like you're stuck with a compiler switch here.

not-the-outcome-i-was-hoping-for-but-i'll-take-it<wink>-ly y'rs  - tim