Python currently distinguishes between two kinds of integers (ints): regular or short ints, limited by the size of a C long (typically 32 or 64 bits), and long ints, which are limited only by available memory. When operations on short ints yield results that don’t fit in a C long, they raise an error. There are some other distinctions too. This PEP proposes to do away with most of the differences in semantics, unifying the two types from the perspective of the Python user.
Rationale
Many programs find a need to deal with larger numbers after the fact, and changing the algorithms later is bothersome. It can hinder performance in the normal case, when all arithmetic is performed using long ints whether or not they are needed.
Having the machine word size exposed to the language hinders portability. For example, Python source files and .pyc files are not portable between 32-bit and 64-bit machines because of this.
There is also the general desire to hide unnecessary details from the Python user when they are irrelevant for most applications. An example is memory allocation, which is explicit in C but automatic in Python, giving us the convenience of unlimited sizes on strings, lists, etc. It makes sense to extend this convenience to numbers.
It will give new Python programmers (whether they are new to programming in general or not) one less thing to learn before they can start using the language.
Implementation
Initially, two alternative implementations were proposed (one by each author):
1. The PyInt type’s slot for a C long will be turned into a:

       union {
           long i;
           struct {
               unsigned long length;
               digit digits[1];
           } bignum;
       };

   Only the n-1 lower bits of the long have any meaning; the top bit is always set. This distinguishes the union. All PyInt functions will check this bit before deciding which types of operations to use.
2. The existing short and long int types remain, but operations return a long int instead of raising OverflowError when a result cannot be represented as a short int. A new type, integer, may be introduced that is an abstract base type of which both the int and long implementation types are subclasses. This is useful so that programs can check integer-ness with a single test:

       if isinstance(i, integer): ...

After some consideration, the second implementation plan was selected, since it is far easier to implement, is backwards compatible at the C API level, and in addition can be implemented partially as a transitional measure.
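A minimal sketch of the selected plan, written in present-day Python rather than C, may help make the idea concrete. It assumes a 32-bit C long, and the names Integer, ShortInt, LongInt, make_integer and add_short are hypothetical stand-ins for the proposed integer base type, the two implementation types, and the C-level arithmetic; none of them is part of any real API.

```python
MAXINT = 2**31 - 1   # assumption: a 32-bit C long
MININT = -MAXINT - 1


class Integer:
    """Hypothetical abstract base, standing in for the proposed `integer`."""

    def __init__(self, value):
        self.value = value


class ShortInt(Integer):
    """Stand-in for the short int type (value fits in a C long)."""


class LongInt(Integer):
    """Stand-in for the long int type (limited only by memory)."""


def make_integer(value):
    """Pick the implementation type from the value alone."""
    if MININT <= value <= MAXINT:
        return ShortInt(value)
    return LongInt(value)


def add_short(a, b):
    """Add two ShortInts; promote to LongInt instead of raising OverflowError."""
    return make_integer(a.value + b.value)


total = add_short(make_integer(MAXINT), make_integer(1))
print(type(total).__name__, total.value)
```

Both results pass a single isinstance(total, Integer) test, which is the convenience the abstract base type is meant to provide.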
Incompatibilities
The following operations have (usually subtly) different semantics for short and for long integers, and one or the other will have to be changed somehow. This is intended to be an exhaustive list. If you know of any other operation that differs in outcome depending on whether a short or a long int with the same value is passed, please write to the second author.
- Currently, all arithmetic operators on short ints except << raise OverflowError if the result cannot be represented as a short int. This will be changed to return a long int instead. The following operators can currently raise OverflowError: x+y, x-y, x*y, x**y, divmod(x, y), x/y, x%y, and -x. (The last four can only overflow when the value -sys.maxint-1 is involved.)
- Currently, x<<n can lose bits for short ints. This will be changed to return a long int containing all the shifted-out bits, if returning a short int would lose bits (where changing sign is considered a special case of losing bits).
- Currently, hex and oct literals for short ints may specify negative values; for example, 0xffffffff == -1 on a 32-bit machine. This will be changed to equal 0xffffffffL (2**32-1).
- Currently, the %u, %x, %X and %o string formatting operators and the hex() and oct() built-in functions behave differently for negative numbers: negative short ints are formatted as unsigned C long, while negative long ints are formatted with a minus sign. This will be changed to use the long int semantics in all cases (but without the trailing L that currently distinguishes the output of hex() and oct() for long ints). Note that this means that %u becomes an alias for %d. It will eventually be removed.
- Currently, repr() of a long int returns a string ending in L while repr() of a short int doesn’t. The L will be dropped, but not before Python 3.0.
- The expression type(x).__name__ depends on whether x is a short or a long int. Since implementation alternative 2 is chosen, this difference will remain. (In Python 3.0, we may be able to deploy a trick to hide the difference, because it is annoying to reveal the difference to user code, and more so as the difference between the two types is less visible.)
- Long and short ints are handled differently by the marshal module, and by the pickle and cPickle modules. This difference will remain (at least until Python 3.0).
- Short ints with small values are interned, and long ints with the same values are not. (Code may exist that uses is for comparisons of short ints and happens to work because of this interning. Such code may fail if used with long ints.)

A trailing L at the end of an integer literal will stop having any meaning, and will eventually become illegal. The compiler will choose the appropriate type solely based on the value. (Until Python 3.0, it will force the literal to be a long; but literals without a trailing L may also be long, if they are not representable as short ints.)
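Three of the differences listed above (hex literals, %x formatting of negative numbers, and left shifts) can be made concrete with a small simulation in present-day Python, where the long int semantics are already the only semantics. This is a sketch under the assumption of a 32-bit C long; BITS, wrap_to_short, old_hex_literal, old_format_x and old_lshift are hypothetical helper names that only model the old short int behavior, and literals larger than 2*MAXINT+1 are outside the sketch.

```python
BITS = 32                     # assumption: a 32-bit C long
MAXINT = 2 ** (BITS - 1) - 1
MASK = (1 << BITS) - 1


def wrap_to_short(value):
    """Reinterpret the low BITS bits of value as a signed C long."""
    value &= MASK
    return value - (1 << BITS) if value > MAXINT else value


def old_hex_literal(text):
    """Old rule: hex literals in [MAXINT+1, 2*MAXINT+1] denote negative ints."""
    value = int(text, 16)
    if MAXINT < value <= 2 * MAXINT + 1:
        return value - (1 << BITS)
    return value


def old_format_x(value):
    """Old '%x' on a short int: negative values print as an unsigned C long."""
    return "%x" % (value & MASK)


def old_lshift(x, n):
    """Old x<<n on short ints: bits shifted past the top bit are lost."""
    return wrap_to_short(x << n)


# Old short int semantics (simulated) next to the long int semantics.
print(old_hex_literal("0xffffffff"), int("0xffffffff", 16))
print(old_format_x(-1), "%x" % -1)
print(old_lshift(1, 31), 1 << 31)
```

In each printed pair the first value models the old short int outcome and the second is the long int outcome that the PEP makes universal.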
Built-in Functions
The function int() will return a short or a long int depending on the argument value. In Python 3.0, the function long() will call the function int(); before then, it will continue to force the result to be a long int, but otherwise work the same way as int(). The built-in name long will remain in the language to represent the long implementation type (unless it is completely eradicated in Python 3.0), but using the int() function is still recommended, since it will automatically return a long when needed.
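As a small illustration of the end state, the following runs on present-day Python 3, where only one integer type remains and int() returns whatever size the argument needs. The variable names are arbitrary.

```python
import sys

small = int("42")
big = int("1" + "0" * 30)     # 10**30, beyond the range of a 64-bit C long

# Both values have the same type; the size is an implementation detail.
print(type(small) is type(big))
print(big > sys.maxsize)
```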
The C API remains unchanged; C code will still need to be aware of the difference between short and long ints. (The Python 3.0 C API will probably be completely incompatible.)
The PyArg_Parse*() APIs already accept long ints, as long as they are within the range representable by C ints or longs, so that functions taking C int or long arguments won’t have to worry about dealing with Python longs.
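The range check described above can be modeled in pure Python. This is only a sketch of the behavior, not the actual PyArg_Parse*() implementation; as_c_long is a hypothetical helper, and the bounds are taken from the platform's C long via ctypes.

```python
import ctypes

C_LONG_BITS = ctypes.sizeof(ctypes.c_long) * 8
C_LONG_MAX = 2 ** (C_LONG_BITS - 1) - 1
C_LONG_MIN = -C_LONG_MAX - 1


def as_c_long(value):
    """Accept any Python integer whose value fits in a C long."""
    if not C_LONG_MIN <= value <= C_LONG_MAX:
        raise OverflowError("value does not fit in a C long")
    return value


print(as_c_long(42))
try:
    as_c_long(C_LONG_MAX + 1)
except OverflowError as exc:
    print("rejected:", exc)
```

The point of the design is that the check depends on the value, not on which Python type happens to hold it.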
There are three major phases to the transition:
1. Short int operations that currently raise OverflowError return a long int value instead. This is the only change in this phase. Literals will still distinguish between short and long ints. The other semantic differences listed above (including the behavior of <<) will remain. Because this phase only changes situations that currently raise OverflowError, it is assumed that this won’t break existing code. (Code that depends on this exception would have to be too convoluted to be concerned about it.) For those concerned about extreme backwards compatibility, a command line option (or a call to the warnings module) will allow a warning or an error to be issued at this point, but this is off by default.
2. The remaining semantic differences are addressed. In all cases the long int semantics will prevail. The trailing L will continue to be used for longs as input and by repr().
   A. Warnings are enabled about operations that will change their numeric outcome in stage 2B, in particular hex() and oct(), %u, %x, %X and %o, hex and oct literals in the (inclusive) range [sys.maxint+1, sys.maxint*2+1], and left shifts losing bits.
   B. The new semantics for these operations are implemented.
3. The trailing L is dropped from repr(), and made illegal on input. (If possible, the long type completely disappears.) The trailing L is also dropped from hex() and oct().

Phase 1 will be implemented in Python 2.2.
Phase 2 will be implemented gradually, with 2A in Python 2.3 and 2B in Python 2.4.
Phase 3 will be implemented in Python 3.0 (at least two years after Python 2.4 is released).
OverflowWarning
Here are the rules that guide warnings generated in situations that currently raise OverflowError. This applies to transition phase 1. Historical note: although phase 1 was completed in Python 2.2, and phase 2A in Python 2.3, nobody noticed that OverflowWarning was still generated in Python 2.3. It was finally disabled in Python 2.4. The Python builtin OverflowWarning, and the corresponding C API PyExc_OverflowWarning, are no longer generated or used in Python 2.4, but will remain for the (unlikely) case of user code until Python 2.5.
- A new warning category is introduced, OverflowWarning. This is a built-in name.
- If an int result overflows, an OverflowWarning warning is issued, with a message argument indicating the operation, e.g. “integer addition”. This may or may not cause a warning message to be displayed on sys.stderr, or may cause an exception to be raised, all under control of the -W command line option and the warnings module.
- The OverflowWarning warning is ignored by default.
- The OverflowWarning warning can be controlled like all warnings, via the -W command line option or via the warnings.filterwarnings() call. For example:

      python -Wdefault::OverflowWarning

  causes the OverflowWarning to be displayed the first time it occurs at a particular source line, and:

      python -Werror::OverflowWarning

  causes the OverflowWarning to be turned into an exception whenever it happens. The following code enables the warning from inside the program:

      import warnings
      # "default" shows the warning once per source line
      warnings.filterwarnings("default", "", OverflowWarning)

  See the python man page for the -W option and the warnings module documentation for filterwarnings().
- If the OverflowWarning warning is turned into an error, OverflowError is substituted. This is needed for backwards compatibility.
- Unless the warning is turned into an exception, the result of the operation (e.g., x+y) is recomputed after converting the arguments to long ints.

If you pass a long int to a C function or built-in operation that takes an integer, it will be treated the same as a short int as long as the value fits (by virtue of how PyArg_ParseTuple() is implemented). If the long value doesn’t fit, it will still raise an OverflowError. For example:

    def fact(n):
        if n <= 1:
            return 1
        return n*fact(n-1)

    A = "ABCDEFGHIJKLMNOPQ"
    n = input("Gimme an int: ")
    print A[fact(n)%17]

For n >= 13, this currently raises OverflowError (unless the user enters a trailing L as part of their input), even though the calculated index would always be in range(17). With the new approach this code will do the right thing: the index will be calculated as a long int, but its value will be in range.
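The OverflowWarning rules and the factorial example above can be combined in a sketch that runs on present-day Python. Because the built-in OverflowWarning no longer exists, the code defines its own category; OverflowWarningSketch and checked_mul are hypothetical names, a 32-bit C long is assumed, and the code only models the phase 1 rules rather than reproducing the CPython implementation.

```python
import warnings

MAXINT = 2**31 - 1   # assumption: a 32-bit C long
MININT = -MAXINT - 1


class OverflowWarningSketch(Warning):
    """Hypothetical stand-in for the historical built-in OverflowWarning."""


# Rule: the warning is ignored by default.
warnings.simplefilter("ignore", OverflowWarningSketch)


def checked_mul(x, y):
    """Multiply the way a short int would in phase 1 (a model only)."""
    result = x * y                      # exact: modern ints never overflow
    if MININT <= result <= MAXINT:
        return result                   # fits in a short int, no warning
    try:
        # Rule: the message names the operation.
        warnings.warn("integer multiplication", OverflowWarningSketch,
                      stacklevel=2)
    except OverflowWarningSketch:
        # Rule: a warning turned into an error becomes OverflowError.
        raise OverflowError("integer multiplication") from None
    return result                       # otherwise the long result is used


def fact(n):
    if n <= 1:
        return 1
    return checked_mul(n, fact(n - 1))


A = "ABCDEFGHIJKLMNOPQ"
print(A[fact(13) % 17])
```

With the default "ignore" filter the lookup simply succeeds; installing an "error" filter for the category (for example with warnings.simplefilter) restores the old OverflowError for n >= 13.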
These issues, previously open, have been resolved.
- hex() and oct() applied to longs will continue to produce a trailing L until Python 3000. The original text above wasn’t clear about this, but since it didn’t happen in Python 2.4 it was thought better to leave it alone. BDFL pronouncement here: https://mail.python.org/pipermail/python-dev/2006-June/065918.html
- What to do about sys.maxint? Leave it in, since it is still relevant whenever the distinction between short and long ints is still relevant (e.g. when inspecting the type of a value).
- Should we remove %u completely? Remove it.
- Should we warn about << not truncating integers? Yes.

The implementation work for the Python 2.x line is completed; phase 1 was released with Python 2.2, phase 2A with Python 2.3, and phase 2B will be released with Python 2.4 (and is already in CVS).
Copyright
This document has been placed in the public domain.