[Jack, skip to the end please]

[David Abrahams, on Include/objimpl.h:275:
    double dummy;  /* force worst-case alignment */
]
> As I read the code, it affects all types (doesn't this header begin every
> object, regardless of its GC flags?)

Nope, only objects that go through _PyObject_GC_Malloc().  It could be a
nightmare if, e.g., every string and int object consumed another (at least)
12 bytes.

> and I think that's a very unhappy circumstance for your numeric
> community. Remember, the type that raised the alarm here was just a
> long double.

The *Python* numeric community is far more likely to embed a float than a
long double, and in any case seems unlikely to build a container type mixing
long double with PyObject* members (i.e., one that ought to participate in
cyclic gc).

I expect we have a blind spot towards long double in general since Python
doesn't expose or use such a thing, all the developers run on platforms
where (as far as they know <wink>) it's the same as a double, and "long
double" was introduced after K&R (so some old-timers likely aren't even
aware C89 introduced it).

But I'll change the code here to use long double instead -- it's harmless,
as it doesn't make a lick of difference on any platform that matters <0.7
wink>.

>> Only the objimpl.h trick might benefit from maximal alignment.

> I'm not actually after maximal alignment; I look for a minimally-
> sized/aligned type whose alignment is a multiple of the target
> type's alignment. In any case, I was just using the assumption that
> double was maximally aligned since I was linking with Python code
> and the EDG front-end was too slow to handle the metaprogram -- I
> figured that if the assumption was good enough for Python

Well, nobody has complained yet, but the core never needs alignment stricter
than double, and -- as above -- an extension type that both did and needed to
participate in GC is unlikely.

> and my clients were depending on it anyway, it was good enough for
> my code (not!).

One of the secrets to Python's success is that we tell unreasonable users to
go away and bother the C++ committee instead.

[128-byte alignment needed for KSR's _subpage type]
> I was aware that this was a theoretical possibility, but not that it
> was a practical one. What's KSR?

Kendall Square Research, my (and Tani's, Tamah's and Steve Breit's) employer
before Dragon.  The address space was carved into 128-byte "subpages", and
the hardware supported Python-style (non-owned non-reentrant) locks directly
on a per-subpage basis (Python's lock.acquire() and lock.release() were one
machine instruction each!).  Subpages were also the unit for cache coherency
across processors.  So use of _subpage in our system code, and in
speed-obsessed app code, was ubiquitous.

I guess the main thing KSR proved was that you can't stay in business
designing custom hardware to execute Python's semantics directly <wink>.

> ...
> Seriously, though, I think it would be reasonable to stick to aligning
> the standard builtin types, in which case you can do the test without
> calling malloc,

FWIW.  I checked this in:

    long double dummy;  /* force worst-case alignment */

[Guido, on
    #ifdef USE_CACHE_ALIGNED
        long   aligner;
    #endif
]
> The malloc 8-byte align argument doesn't apply, since this struct is
> used in an array. I was composing email while asleep <wink>.

Gotcha.

> ...
> This was added by Jack Jansen ages ago -- I think he did measure a
> speedup on an old Mac compiler, or he wouldn't have added it, and I
> bet there was a #define USE_CACHE_ALIGNED in his config.h then.
>
> But that's all history; I agree it should be deleted.

Jack, do you still want this?

fighting-code-rot-ly y'rs  - tim