Hello Tim,

Let me quote from stringobject.c the "heartbreaking code such as
string_repeat's":

    nbytes = size * sizeof(char);
    if (nbytes / sizeof(char) != (size_t)size ||
        nbytes + sizeof(PyStringObject) <= nbytes)
            <OverflowError>
    op = (PyStringObject *)
         PyObject_MALLOC(sizeof(PyStringObject) + nbytes);
    if (op == NULL)
            <MemoryError>

With the proper macro, the whole code above can be written as:

    nbytes = Py_MUL_ADD(size, sizeof(char), sizeof(PyStringObject));
    op = (PyStringObject *) PyObject_MALLOC(nbytes);
    if (op == NULL)
            <MemoryError>

which seems much cleaner to me. For this reason I do not believe it is a
good idea here to define a macro a la Zope C:

    DO_MULADD_OR(nbytes, size, sizeof(char), sizeof(PyStringObject),
                 goto Overflow);

for two reasons: there is no real need for a special "Overflow" case, as
long as the returned 'nbytes' is guaranteed to be un-malloc-able; and no
existing macro (e.g. _PyObject_VAR_SIZE) can grow an extra "goto Overflow"
argument without breaking the C code that already depends on it.

There is, however, a difference between the two code fragments above: the
former raises an OverflowError for "really too large values" and a
MemoryError for "a reasonable value, but still too large for me to
allocate right now". Whether this distinction is correct behavior is
debatable. I would argue that raising MemoryError in both cases is fine:
after all, "123"*9999999999999999 could be perfectly fine in another
Python implementation with lazy strings. It is only a memory limit that
makes it fail in the current implementation.

Note that if we define MAX_SIZE as ((size_t)-1), the overflow check in
the present stringobject.c could be written more efficiently as:

    if (size > (MAX_SIZE - sizeof(PyStringObject)) / sizeof(char))
            <OverflowError>

where everything to the right of '>' reduces to a compile-time constant.
The Py_MUL_ADD macro can be implemented with this idea in mind, which
works best if the 2nd and 3rd arguments are constants.

> note that anything/non_constant can be incredibly expensive!

Yes. Couldn't a better overflow-checking algorithm be designed for these
cases? For example, to multiply the non-constants 'a' and 'b', we can be
sure that there is no overflow if both 'a' and 'b' are small enough,
which is the common case anyway. I believe I can even think of a single
Py_MUL_ADD macro that is as efficient as the above when given constant
arguments, and otherwise falls back to this trick.

About using 64-bit computations everywhere (as suggested by Christian):
if the length of objects (e.g. lists) is still constrained to 31 bits,
then it's fine; otherwise, the problem is the same. Do all supported
64-bit platforms have a 32-bit 'int', and thus a 31-bit limit on object
lengths?

By the way, this suggests a good, efficient overflow-detection method for
platforms whose "long long" type is wider than size_t:

    unsigned long long bytes = ((unsigned long long)a) *
                               ((unsigned long long)b) + c;
    if (bytes > ((size_t)-1))
            <overflow>;

This compiles to very good assembly code on x86.

In summary, I would recommend that we define a common "multiply-and-add"
macro, returning a size_t or OVERFLOW_VALUE in case of overflow, which is
implemented in several different ways on different platforms, as selected
by #if's. The same macro can be used for just "multiply" or just "add" by
relying on the compiler's strength reduction.

Armin
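
P.S.: To make the recommendation concrete, here is a minimal sketch of
how such a helper could be put together. The names py_mul_add and
OVERFLOW_VALUE are placeholders of my own, the sketch assumes a C99
compiler (<stdint.h> for SIZE_MAX, <limits.h> for ULLONG_MAX), and it is
written as a function rather than a macro for readability only:

    #include <stddef.h>
    #include <stdint.h>
    #include <limits.h>

    #define OVERFLOW_VALUE ((size_t)-1)  /* un-malloc-able by construction */

    static size_t
    py_mul_add(size_t a, size_t b, size_t c)
    {
    #if defined(ULLONG_MAX) && (ULLONG_MAX / SIZE_MAX > SIZE_MAX)
        /* 'unsigned long long' is wide enough to hold any a*b+c exactly:
           compute in the wider type and compare once at the end. */
        unsigned long long bytes = (unsigned long long)a *
                                   (unsigned long long)b +
                                   (unsigned long long)c;
        if (bytes >= (unsigned long long)OVERFLOW_VALUE)
            return OVERFLOW_VALUE;
        return (size_t)bytes;
    #else
        /* Fast path: if a, b and c all fit in half the bits of size_t,
           a*b+c cannot reach OVERFLOW_VALUE, so no division is needed.
           With constant b and c, this folds to one comparison on 'a'. */
        unsigned int half = (unsigned int)(sizeof(size_t) * CHAR_BIT / 2);
        if (((a | b | c) >> half) == 0)
            return a * b + c;
        /* Slow path: one division decides whether a*b still fits below
           OVERFLOW_VALUE once c is added. */
        if (a != 0 && b > (OVERFLOW_VALUE - c) / a)
            return OVERFLOW_VALUE;
        return a * b + c;
    #endif
    }

With this, string_repeat's check would collapse to:

    nbytes = py_mul_add(size, sizeof(char), sizeof(PyStringObject));
    op = (PyStringObject *) PyObject_MALLOC(nbytes);
    if (op == NULL)
            <MemoryError>

(A sum that lands exactly on OVERFLOW_VALUE is indistinguishable from a
genuine overflow, which is harmless: a request for (size_t)-1 bytes can
never be satisfied anyway.)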