> Raymond Hettinger wrote:
>> * It will assist pypy style projects and other python implementations
>> when they have to build equivalents to CPython.
>>
>> * Will eliminate confusion about what functions were exactly intended to
>> do.
>>
>> * Will confer benefits similar to test driven development where the
>> documentation and pure python version are developed first and doctests
>> gotten to pass, then the C version is created to match.
>
> I haven't seen anyone comment about this assertion of "equivalence".
> Doesn't it strike you as difficult to maintain *two* versions of every
> function, and ensure they match *exactly*?

Glad you brought this up. My idea is to present rough equivalence
in unoptimized Python that is simple and clear. The goal is to provide
better documentation where code is more precise than English prose.
That being said, some subset of the existing tests should be runnable
against the rough equivalent, and the Python code should incorporate
doctests. Running both sets of tests should suffice to maintain
the rough equivalence.

The notion of exact equivalence should be left to PyPy folks who can
attest that the code can get convoluted when you try to simulate
exactly when error checking is performed, reproduce read-only behavior
for attributes, and make the stack traces look the same when there
are errors. In contrast, my goal is an approximation that is
executable but highly readable and expository.

My thought is to do this only with tools where it really does enhance
the documentation. The exercise is worthwhile in and of itself. For
example, I'm working on a pure Python version of str.split() and
quickly determined that the docs are *still* in error even after many
revisions over the years (the whitespace version does not, in fact,
start by stripping whitespace from both ends).
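The point about stripping can be checked at the interpreter. The short sketch below uses only the built-in str.split(); the sample string and the variable names are illustrative assumptions, not taken from the docs or the test suite. It compares a limited whitespace split against what the result would be if both ends were stripped first:

```python
# Whitespace splitting with a maxsplit limit: leading whitespace is skipped,
# but the text after the last split is kept as-is, trailing spaces included.
s = '  a  b  c  '

limited = s.split(None, 1)
print(limited)

# What the result would look like if both ends were stripped before splitting.
stripped_first = s.strip().split(None, 1)
print(stripped_first)

# The two differ only in the trailing whitespace of the last element.
print(limited == stripped_first)
```

The last element of the limited split keeps its trailing spaces, so a description that begins with "strip both ends" cannot be right once maxsplit cuts the scan short.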
Here's what I have so far:

    def split(s, sep=None, maxsplit=-1):
        """split(S, [sep [,maxsplit]]) -> list of strings

        Return a list of the words in the string S, using sep as the
        delimiter string.  If maxsplit is given, at most maxsplit
        splits are done. If sep is not specified or is None, any
        whitespace string is a separator and empty strings are removed
        from the result.

        >>> from collections import namedtuple
        >>> from itertools import product
        >>> s = ' 11 2 333 4 '
        >>> split(s, None)
        ['11', '2', '333', '4']
        >>> n = 8
        >>> for s in product('ab ', repeat=n):
        ...     for maxsplit in range(-2, len(s)+2):
        ...         s = ''.join(s)
        ...         assert s.split(None, maxsplit) == split(s, None, maxsplit), namedtuple('Err', 'str maxsplit result target')(repr(s), maxsplit, split(s,None,maxsplit), s.split(None, maxsplit))
        """
        result = []
        spmode = True       # True while scanning a run of whitespace
        start = 0
        if maxsplit != 0:
            for i, c in enumerate(s):
                if spmode:
                    if not c.isspace():
                        start = i
                        spmode = False
                elif c.isspace():
                    result.append(s[start:i])
                    start = i
                    spmode = True
                    if len(result) == maxsplit:
                        break
        rest = s[start:].lstrip()
        return (result + [rest]) if rest else result

Once I have the cleanest possible, self-explanatory code that passes
tests, I'll improve the variable names and make a more sensible
docstring with readable examples.

Surprisingly, it hasn't been a trivial exercise to come up with an
equivalent that corresponds more closely to the way we think instead
of corresponding to the C code -- I want to show *what* it does more
than *how* it does it.

Raymond
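One way the "run the same tests against both versions" idea might look in practice is sketched below. The split() body is copied from the post so the block runs on its own; find_mismatches() and its parameters are hypothetical names introduced here for illustration, and n=5 is an arbitrary choice that keeps the exhaustive search quick.

```python
from itertools import product


def split(s, sep=None, maxsplit=-1):
    # Whitespace-only rough equivalent, copied from the post; sep is ignored.
    result = []
    spmode = True       # True while scanning a run of whitespace
    start = 0
    if maxsplit != 0:
        for i, c in enumerate(s):
            if spmode:
                if not c.isspace():
                    start = i
                    spmode = False
            elif c.isspace():
                result.append(s[start:i])
                start = i
                spmode = True
                if len(result) == maxsplit:
                    break
    rest = s[start:].lstrip()
    return (result + [rest]) if rest else result


def find_mismatches(alphabet='ab ', n=5):
    # Hypothetical helper: list every (string, maxsplit) pair on which the
    # pure Python version disagrees with the built-in str.split().
    mismatches = []
    for chars in product(alphabet, repeat=n):
        s = ''.join(chars)
        for maxsplit in range(-2, len(s) + 2):
            if split(s, None, maxsplit) != s.split(None, maxsplit):
                mismatches.append((s, maxsplit))
    return mismatches


mismatches = find_mismatches()
print(len(mismatches), 'mismatches')
```

Returning the list of disagreements, rather than asserting inside the loop, means a failing case shows up as data that can be inspected, and the same helper could be pointed at any other pure Python equivalent.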