[Trent] > > > Changes the 'b', 'h', and 'i' formatters in PyArg_ParseTuple to raise an > > > Overflow exception if they overflow (previously they just silently > > > overflowed). [Guido] > > There's one issue with this: I believe the 'b' format is mostly used > > with unsigned character arguments in practice. > > However on systems > > with default signed characters, CHAR_MAX is 127 and values 128-255 are > > rejected. I'll change the overflow test to: > > > > else if (ival > CHAR_MAX && ival >= 256) { > > > > if that's okay with you. [Trent] > Okay, I guess. Two things: > > 1. In a way this defeats the main purpose of the checks. Now a silent overflow > could happen for a signed byte value over CHAR_MAX. The only way to > automatically do the bounds checking is if the exact type is known, i.e. > different formatters for signed and unsigned integral values. I don't know if > this is desired (is it?). The obvious choice of 'u' prefixes to specify > unsigned is obviously not an option. The struct module uses upper case for unsigned. I think this is overkill here, and would add a lot of code (if applied systematically) that would rarely be used. > Another option might be to document 'b' as for unsigned chars and 'h', 'i', > 'l' as signed integral values and then set the bounds checks ([0, UCHAR_MAX] > for 'b') appropriately. Can we clamp these formatters so? I.e. we would be > limiting the user to unsigned or signed depending on the formatter. (Which > again, means that it would be nice to have different formatters for signed > and unsigned.) I think that the bounds checking is false security unless > these restrictions are made. I like this: 'b' is unsigned, the others are signed. > 2. The above aside, I would be more inclined to change the line in question to: > > else if (ival > UCHAR_MAX) { > > as this is more explicit about what is being done. Agreed. > > Another issue however is that there are probably cases where an 'i' > > format is used (which can't overflow on 32-bit architectures) but > > where the int value is then copied into a short field without an > > additional check... I'm not sure how to fix this except by a complete > > inspection of all code... Not clear if it's worth it. > > Yes, a complete code inspection seems to be the only way. That is some of > what I am doing. Again, I have two questions: > > 1. There are a fairly large number of downcasting cases in the Python code > (not necessarily tied to PyArg_ParseTuple results). I was wondering if you > think a generalized check on each such downcast would be advisable. This > would take the form of some macro that would do a bounds check before doing > the cast. For example (a common one is the cast of strlen's size_t return > value to int, because Python strings use int for their length, this is a > downcast on 64-bit systems): > > size_t len = strlen(s); > obj = PyString_FromStringAndSize(s, len); > > would become > > size_t len = strlen(s); > obj = PyString_FromStringAndSize(s, CAST_TO_INT(len)); > > CAST_TO_INT would ensure that 'len'did not overflow and would raise an > exception otherwise. > > Pros: > > - should never have to worry about overflows again > - easy to find (given MSC warnings) and easy to code in (staightforward) > > Cons: > > - more code, more time to execute > - looks ugly > - have to check PyErr_Occurred every time a cast is done How would the CAST_TO_INT macro signal an erro? C doesn't have exceptions. If we have to add checks, I'd prefer to write size_t len = strlen(s); if (INT_OVERFLOW(len)) return NULL; /* Or whatever is appropriate in this context */ obj = PyString_FromStringAndSize(s, len); > I would like other people's opinion on this kind of change. There are three > possible answers: > > +1 this is a bad change idea because...<reason> > -1 this is a good idea, go for it > +0 (mostly likely) This is probably a good idea for some case where the > overflow *could* happen, however the strlen example that you gave is > *not* such a situation. As Tim Peters said: 2GB limit on string lengths > is a good assumption/limitation. -0 > 2. Microsofts compiler gives good warnings for casts where information loss > is possible. However, I cannot find a way to get similar warnings from gcc. > Does anyone know if that is possible. I.e. > > int i = 123456; > short s = i; // should warn about possible loss of information > > should give a compiler warning. Beats me :-( --Guido van Rossum (home page: http://www.python.org/~guido/)
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4