On 2/1/2014 8:06 PM, Steven D'Aprano wrote:
> Hi all,
>
> Over on the Python-ideas list, there's a thread about the new statistics
> module, and as the author of that module, I'm looking for a bit of
> guidance regarding backwards compatibility. Specifically two issues:
>
>
> (1) With numeric code, what happens if the module becomes more[1]
> accurate in the future? Does that count as breaking backwards
> compatibility?
>
> E.g. Currently I use a particular algorithm for calculating variance.
> Suppose that for a particular data set, this algorithm is accurate to
> (say) seven decimal places:
>
> # Python 3.4
> variance(some_data) == 1.23456700001
>
> Later, I find a better algorithm, which improves the accuracy of the
> result:
>
> # Python 3.5 or 3.6
> variance(some_data) == 1.23456789001
>
>
> Would this count as breaking backwards compatibility? If so, how should
> I handle this? I don't claim that the current implementation of the
> statistics module is optimal, as far as precision and accuracy are
> concerned. It may improve in the future.
>
> Or would that count as a bug-fix? "Variance function was inaccurate, now
> less wrong", perhaps.

That is my inclination.

> I suppose the math module has the same issue, except that it just wraps
> the C libraries, which are mature and stable and unlikely to change.

Because C libraries differ, math results can differ between platforms
even within the same Python version, so they can certainly change
(hopefully improve) in future versions. I think the better analogy is
cmath, which I believe is more than just a wrapper.

> The random module has a similar issue:
>
> http://docs.python.org/3/library/random.html#notes-on-reproducibility
>
>
> (2) Mappings[2] are iterable. That means that functions which expect
> sequences or iterators may also operate on mappings by accident.

I think 'accident' is the key. (Working with sets is not an accident.)
Anyone who really wants the mean of keys should be explicit: mean(d.keys())

> For example, sum({1: 100, 2: 200}) returns 3. If one wanted to reserve the
> opportunity to handle mappings specifically in the future, without being
> locked in by backwards-compatibility, how should one handle it?
>
> a) document that behaviour with mappings is unsupported and may
> change in the future;

I think the doc should in any case specify the proper domain. In this
case, I think it should exclude mappings: 'non-empty non-mapping
iterable of numbers', or 'an iterable of numbers that is neither empty
nor a mapping'. That makes the behavior at best undefined and subject
to change. There should also be a caveat about mixing types, especially
Decimals, if there is not one already. Perhaps rewrite the above as
'an iterable that is neither empty nor a mapping of numbers that are
mutually summable'.

> b) raise a warning when passed a mapping, but still iterate over it;
>
> c) raise an exception and refuse to iterate over the mapping;

This, if possible. An empty iterable will raise at '/ 0'. Most anything
that is not an iterable of numbers will eventually raise at '/ n'.
Testing both that an exception is raised and that it is the one we want
is why unittest has assertRaises.

> Question (2) is of course a specific example of a more general
> question, to what degree is the library author responsible for keeping
> backwards compatibility under circumstances which are not part of the
> intended API, but just work by accident?

> [1] Or, for the sake of the argument, less accurate.
>
> [2] And sets.

-- 
Terry Jan Reedy
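
As a rough illustration of option (c) and of the assertRaises point, here
is a minimal sketch. The name strict_mean, the choice of TypeError and
ValueError, and the error messages are all assumptions made for this
example; none of it is the statistics module's actual code or API. It
rejects mappings up front, accepts an explicit d.keys(), and raises
deliberately on empty input instead of failing later at the division.

```python
import unittest
from collections.abc import Mapping


def strict_mean(data):
    """Hypothetical mean() that refuses mappings (option c above).

    Illustration only; this is not how statistics.mean is implemented.
    """
    # A dict iterates over its keys, so sum() would "work" by accident.
    # Refuse it and make the caller say which view they mean.
    if isinstance(data, Mapping):
        raise TypeError(
            "mean of a mapping is ambiguous; pass d.keys() or d.values()"
        )
    values = list(data)  # allow one-shot iterators
    if not values:
        # Raise on purpose rather than relying on ZeroDivisionError.
        raise ValueError("strict_mean requires at least one data point")
    return sum(values) / len(values)


class StrictMeanTests(unittest.TestCase):
    def test_mapping_is_rejected(self):
        # Checks both that something is raised and that it is the
        # exception we intended.
        with self.assertRaises(TypeError):
            strict_mean({1: 100, 2: 200})

    def test_explicit_keys_are_accepted(self):
        # A keys view is not a Mapping, so being explicit still works.
        self.assertEqual(strict_mean({1: 100, 2: 200}.keys()), 1.5)

    def test_empty_iterable_is_rejected(self):
        with self.assertRaises(ValueError):
            strict_mean([])
```

Saved as a module, the tests can be run with python -m unittest. Sets pass
through unchanged, in line with the remark that working with sets is not
an accident.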