Let me define two different Python interpreter systems to help the discussion.

- System A: Compiled with MSVC on a 64-bit Intel chip (i.e. LLP64 data
  model, 'long' is 32 bits).
- System B: Compiled with gcc on a 64-bit Intel chip (i.e. LP64 data
  model, 'long' is 64 bits).

Same hardware, just a different compiler (and possibly a different OS).

First, a couple of responses:

[Greg Stein]:
> In any case where Python needs to cast a pointer back/forth with an
> "integer", there are two new routines in Python 1.5.2. From longobject.h:
>
>     extern DL_IMPORT(PyObject *) PyLong_FromVoidPtr Py_PROTO((void *));
>     extern DL_IMPORT(void *) PyLong_AsVoidPtr Py_PROTO((PyObject *));
>
> I supplied the patch for these while I was also adding the 'P' format code
> for the "struct" module.
>
> The functions return a PyIntObject or a PyLongObject depending on whether
> the size of a void pointer matches the size of a C long value. If a
> pointer fits in a long, you get an Integer. Otherwise, you get a Long.
>
> > > "Python/bltinmodule.c::899":
> > >
> > >     static PyObject *
> > >     builtin_id(self, args)
> > >         PyObject *self;
> > >         PyObject *args;
> > >     {
> > >         PyObject *v;
> > >
> > >         if (!PyArg_ParseTuple(args, "O:id", &v))
> > >             return NULL;
> > >         return PyInt_FromLong((long)v);
> > >     }
>
> Assuming that we can say that id() is allowed to return a PyLongObject,
> then this should just use PyLong_FromVoidPtr. On most platforms, it will
> still return an Integer. For Win64 (and some other platforms), it will
> return a Long.

This means that my System A and System B (above) get different result
types from id() just because their Python interpreters were built with
compilers that use different data models. That sounds dangerous. Are there
pickling portability issues or external interface issues? I know that no
one should really need to pass converted pointer results between
platforms, but... shouldn't two Python interpreters running on identical
hardware behave identically? That seems to me the only (or at least the
safest) way to guarantee portability.

[Trent Mick]:
> > > If so, then the representation of the Python integer type will have
> > > to change (i.e. the use of 'long' cannot be relied upon). One should
> > > then carry through and change (or obsolete) the *_AsLong(),
> > > *_FromLong() Python/C API functions to become something like
> > > *_AsLargestNativeInt(), *_FromLargestNativeInt() (or some less
> > > bulky name).
> > >
> > > Alternatively, if the Python integer type continues to use the C
> > > 'long' type on 64-bit systems, then the following ugly thing happens:
> > > - A Python integer on a 64-bit Intel chip compiled with MSVC is
> > >   32 bits wide.
> > > - A Python integer on a 64-bit Intel chip compiled with gcc is
> > >   64 bits wide.
> > > That cannot be good.

[Greg Stein]:
> The problem is already solved (it's so much fun to borrow Guido's time
> machine!). Some of the C code just needs to catch up and use the new
> functionality, though.

How so? Do you mean with PyLong_{As|From}VoidPtr()?
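For concreteness, here is roughly the behaviour Greg describes, as I
understand it. This is a sketch only: the SIZEOF_* macro names and the
LONG_LONG fallback are my assumptions about how it is implemented, not a
quote of the 1.5.2 source.

    #include "Python.h"

    /* Return the pointer's value as a PyInt if it fits in a C long,
     * otherwise as a PyLong (e.g. on Win64/LLP64). */
    static PyObject *
    pointer_to_python(void *p)
    {
    #if SIZEOF_VOID_P <= SIZEOF_LONG
        return PyInt_FromLong((long)p);
    #else
        return PyLong_FromLongLong((LONG_LONG)p);
    #endif
    }

With something like that available as PyLong_FromVoidPtr, builtin_id()
above shrinks to a one-liner:

        return PyLong_FromVoidPtr((void *)v);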
I want to make a couple of suggestions about this 64-bit compatibility
stuff. I will probably sound like I am on glue, but please bear with me
and let me try to convince you that I am not.

* * *

PyInt was tied to C's 'long' on the (reasonable) assumption that 'long'
would be the largest native integral type on whatever platform Python was
running on (or, at least, I think that PyInt *should* be the largest
native int, otherwise it is arbitrarily limited). Hence, things like
holding a pointer and printing its value came for free. However, with the
LLP64 data model (which Microsoft has adopted for Win64) this intention is
bastardized: sizeof(long)==4 and sizeof(void*)==8.

Taking it as a given that Python should be made to run on the various
64-bit platforms, there are two ways to deal with this:

1. Continue to base PyInt on 'long' and bolt on things like LONG_LONG and
   PyLong_FromVoidPtr, with return types of either PyInt or PyLong as
   necessary; or

2. Add a level of typedef abstraction to decouple Python from the
   no-longer-really-valid wish that 'long' is the largest native integer,
   and couple PyInt to the actual largest native integral type.

3. (I know I said there were only two. Spanish Inqui... :) Andrew Kuchling
   has this all under control (as Tim intimated might be the case), or I
   am really missing something.

I would like to argue for option number 2.

C programmers use the various integral types for different reasons.

Simple uses:
- Use 'int' when you just want a typical integer.
- Use 'long' when you need the range.
- Use 'short' when you know the range is limited to 64k and you need to
  save space.

More specific uses:
- Use 'long' to store a pointer, or cast a pointer to 'long' to print its
  decimal value with printf().

These uses all make assumptions that can bite you when the data model
(i.e. the type sizes) changes. What is needed is a level of abstraction
away from the fundamental C types. ANSI has defined some of this already
(but maybe not enough):

- If you want to store a pointer, use 'intptr_t' (or 'uintptr_t').
- If you know the range is limited to 64k, use 'int16_t'.
- If you want the largest native integral type, use something like
  'intlongest_t'.
- If you know the range is limited to 64k, but you don't want to take the
  time hit for sign extension that 'int16_t' may imply, use 'int16fast_t'.
  'int16fast_t' and its kin (the ugly name is mine) would be guaranteed to
  be at least as wide as the name implies (i.e. 16 bits here), but could
  be larger if that would be faster on the current system.

It is these meanings that I think C programmers are really trying to
express when they use 'short', 'int', and 'long'.

On the Python/C API side, use things like:

- PyInt would be tied to intlongest_t
- extern DL_IMPORT(PyObject *) PyInt_FromLongest Py_PROTO((intlongest_t));

"What?!" you say. "Trent, are you nuts? Why not just use 'int' instead of
this ugly 'int16fast_t'?" Well, just using 'int' carries the implicit
assumption that 'int' is at least 16 bits wide. I know that it *is* on any
reasonable system that Python is going to run on, but: (1) the explicit
specification of the range documents the author's intentions; and (2) the
same argument applies to int*fast_t of other sizes, where the size
assumption about 'int' may not be so cut and dried.

This opens up a can of worms. Your first impression is to throw up your
hands and say that everything from printf formatters, to libc functions,
to external libraries, to PyArg_Parse() and Py_BuildValue() is based on
the fundamental C types, so it is not possible to slip in a level of data
type abstraction. I suppose I could be proven wrong, but I think it is
possible. The printf formatters can be manhandled to use the formatter you
want. The libc functions, on a quick perusal, painfully try to do
something like what I am suggesting anyway, so they map fairly well.
PyArg_Parse(), etc. *could* be changed if that were necessary (*now* I
*know* Guido thinks I am nuts).
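To make the suggestion concrete, here is a rough sketch, assuming a
C9X-style <stdint.h>/<inttypes.h> is available. The Py_* typedef names and
the PyInt_FromLongest()/PyInt_AsLongest() prototypes are purely
illustrative -- my strawman, not existing API.

    #include <stdint.h>     /* intptr_t, intmax_t, int_fast16_t */
    #include <inttypes.h>   /* PRIdPTR and friends */
    #include <stdio.h>

    /* The abstraction layer: say what you mean, not which C type you
     * hope happens to be big enough. */
    typedef intmax_t     Py_longest;  /* "intlongest_t": largest native integral type */
    typedef intptr_t     Py_intptr;   /* wide enough to round-trip a pointer */
    typedef int_fast16_t Py_fast16;   /* "int16fast_t": >= 16 bits, wider if faster */

    /* The Python/C API would then grow entry points along these lines:
     *   extern DL_IMPORT(PyObject *) PyInt_FromLongest Py_PROTO((Py_longest));
     *   extern DL_IMPORT(Py_longest) PyInt_AsLongest Py_PROTO((PyObject *));
     */

    int
    main(void)
    {
        int dummy = 0;
        void *p = &dummy;

        /* Print a pointer's integer value without assuming
         * sizeof(long) == sizeof(void *): */
        printf("pointer as integer: %" PRIdPTR "\n", (intptr_t)p);

        /* The data-model difference that started all of this: */
        printf("sizeof(long)=%u  sizeof(void *)=%u  sizeof(intmax_t)=%u\n",
               (unsigned)sizeof(long), (unsigned)sizeof(void *),
               (unsigned)sizeof(intmax_t));
        return 0;
    }

On System A (LLP64) the second line would report 4/8/8; on System B (LP64)
it would report 8/8/8 -- but the code behaves identically on both, which is
the whole point.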
* * *

This, I think, is the right idea for general data model portability.
However, because (1) it would require a lot of little patches and (2) it
may introduce some backward incompatibilities, I realize that it would
never be considered until at least Python 2.0.

If you are skeptical because it sounds like I am just talking and asking
for a volunteer to make these changes, it might help to know that I am
volunteering to work on this. (Yes, Tim, ActiveState *is* paying me to
look at this stuff.) I just want to see what the general reaction to this
is: You are going about this the wrong way? Go for it? Yes, but...?

> or-if-activestate-solves-this-for-perl-first-we'll-just-rewrite-
> python-in-that<wink>-ly y'rs - tim

not-on-your-life-ly y'rs - Trent

Trent
trentm@ActiveState.com