On 7/3/2018 5:37 PM, Serhiy Storchaka wrote:

> I like programming languages in which all are expressions (including
> function declarations, branching and loops) and you can use an
> assignment at any point, but Python is built on other ways, and I like
> Python too. PEP 572 looks violating several Python design principles.
> Python looks simple language, and this is its strong side. I believe
> most Python users are not professional programmers -- they are
> sysadmins, scientists, hobbyists and kids -- but Python is suitable for
> them because its clear syntax and encouraging good style of programming.
> In particularly mutating and non-mutating operations are separated. The
> assignment expression breaks this. There should be very good reasons for
> doing this. But it looks to me that all examples for PEP 572 can be
> written better without using the walrus operator.

I appreciate you showing alternatives I can use now.  Even once
implemented, one cannot use assignment expressions until one no longer
cares about 3.7 compatibility.  Then there will still be a choice.

>> results = [(x, y, x/y) for x in input_data if (y := f(x)) > 0]
>
> results = [(x, y, x/y) for x in input_data for y in [f(x)] if y > 0]

Would (f(x),) be faster?

import timeit as ti
print(ti.timeit('for y in {x}: pass', 'x=1'))
print(ti.timeit('for y in [x]: pass', 'x=1'))
print(ti.timeit('for y in (x,): pass', 'x=1'))
# Seconds per 1_000_000 loops = microseconds each:
# 0.13765254499999996
# 0.10321274000000003
# 0.09492473300000004

Yes, slightly, but not by enough to pay for adding the ',', and
sometimes forgetting it.

>> stuff = [[y := f(x), x/y] for x in range(5)]
>
> stuff = [[y, x/y] for x in range(5) for y in [f(x)]]

Creating a leaky name binding appears to be about 5x faster than
iterating a temporary singleton.

print(ti.timeit('y=x', 'x=1'))
print(ti.timeit('y=x; del y', 'x=1'))
# 0.017357778999999907
# 0.021115051000000107

If one adds 'del y' to make the two equivalent, the number of
characters typed is about the same.
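As a check on the claimed equivalence, here is a minimal runnable sketch
(f and input_data are hypothetical stand-ins, not from the thread)
showing that the "for y in [f(x)]" binding idiom produces the same
result as an explicit loop:

```python
# Hypothetical f and input_data; any function and iterable would do.
def f(x):
    return x - 2

input_data = range(5)

# Bind y once per x by iterating a one-element list.
results = [(x, y, x/y) for x in input_data for y in [f(x)] if y > 0]

# The equivalent explicit loop.
expected = []
for x in input_data:
    y = f(x)
    if y > 0:
        expected.append((x, y, x/y))

print(results == expected)  # True
```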
To me, the choice amounts to subject reference.  Even with y := x
available, I would write the expansion as

res = []
for x in range(5):
    y = f(x)
    res.append((y, x/y))

rather than use the assignment expression in the tuple.  It creates a
'hitch' in thought.

> This idiom looks unusual for you? But this is a legal Python syntax, and
> it is not more unusual than the new walrus operator. This idiom is not
> commonly used because there is very little need of using above examples
> in real code. And I'm sure that the walrus operator in comprehension
> will be very rare unless PEP 572 will encourage writing complicated
> comprehensions. Most users prefer to write an explicit loop.
>
> I want to remember that PEP 572 started from the discussion on
> Python-ideas which proposed a syntax for writing the following code as a
> comprehension:
>
> smooth_signal = []
> average = initial_value
> for xt in signal:
>     average = (1-decay)*average + decay*xt
>     smooth_signal.append(average)
>
> Using the "for in []" idiom this can be written (if you prefer
> comprehensions) as:
>
> smooth_signal = [average
>                  for average in [initial_value]
>                  for x in signal
>                  for average in [(1-decay)*average + decay*x]]
>
> Try now to write this using PEP 572. The walrus operator turned to be
> less suitable for solving the original problem because it doesn't help
> to initialize the initial value.
>
> Examples from PEP 572:
>
>> # Loop-and-a-half
>> while (command := input("> ")) != "quit":
>>     print("You entered:", command)
>
> The straightforward way:
>
> while True:
>     command = input("> ")
>     if command == "quit": break
>     print("You entered:", command)
>
> The clever way:
>
> for command in iter(lambda: input("> "), "quit"):
>     print("You entered:", command)

The 2-argument form of iter is under-remembered and under-used.  The
length difference is 8.
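Since the 2-argument form of iter is under-remembered, a small
self-contained illustration may help (the names are made up, and a list
iterator stands in for input()):

```python
# iter(callable, sentinel) calls callable repeatedly and stops as soon
# as the return value equals the sentinel; the sentinel is not yielded.
commands = iter(["ls", "pwd", "quit", "never reached"])

seen = []
for command in iter(lambda: next(commands), "quit"):
    seen.append(command)

print(seen)  # ['ls', 'pwd']
```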
while (command := input("> ")) != "quit":
for command in iter(lambda: input("> "), "quit"):

I like the iter version, but the for-loop machinery and extra function
call make a minimal loop half a millisecond slower.

import timeit as ti

def s():
    it = iter(10000*'0' + '1')

def w():
    it = iter(10000*'0' + '1')
    while True:
        command = next(it)
        if command == '1': break

def f():
    it = iter(10000*'0' + '1')
    for command in iter(lambda: next(it), '1'):
        pass

print(ti.timeit('s()', 'from __main__ import s', number=1000))
print(ti.timeit('w()', 'from __main__ import w', number=1000))
print(ti.timeit('f()', 'from __main__ import f', number=1000))
# 0.0009702129999999975
# 0.9365254250000001
# 1.5913117949999998

Of course, with added processing of 'command' the time difference
disappears.  Printing (in IDLE) is an extreme case.

def wp():
    it = iter(100*'0' + '1')
    while True:
        command = next(it)
        if command == '1': break
        print('w', command)

def fp():
    it = iter(100*'0' + '1')
    for command in iter(lambda: next(it), '1'):
        print('f', command)

print(ti.timeit('wp()', 'from __main__ import wp', number=1))
print(ti.timeit('fp()', 'from __main__ import fp', number=1))
# 0.48  0.47

>> # Capturing regular expression match objects
>> # See, for instance, Lib/pydoc.py, which uses a multiline spelling
>> # of this effect
>> if match := re.search(pat, text):
>>     print("Found:", match.group(0))
>> # The same syntax chains nicely into 'elif' statements, unlike the
>> # equivalent using assignment statements.
>> elif match := re.search(otherpat, text):
>>     print("Alternate found:", match.group(0))
>> elif match := re.search(third, text):
>>     print("Fallback found:", match.group(0))
>
> It may be more efficient to use a single regular expression which
> consists of multiple or-ed patterns

My attempt resulted in a slowdown.  Duplicating the dominance of pat
over otherpat over third requires, I believe, negative lookahead
assertions.
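To see why plain or-ing is not enough (with illustrative throwaway
patterns, not from the thread): re.search returns the leftmost match in
the text, not the match for the first alternative listed, so pattern
priority has to be encoded with negative lookaheads.

```python
import re

text = '3xx21'

# Plain alternation: the leftmost match in the text wins, so '3' is
# found even though '1' (the highest-priority alternative) occurs later.
plain = re.search('1|2|3', text)
print(plain.group())  # '3'

# Negative lookaheads restore the priority 1 > 2 > 3: '2' matches only
# if no '1' follows it, '3' only if neither '1' nor '2' follows it.
prior = re.search('1|2(?!.*1)|3(?!.*(1|2))', text)
print(prior.group())  # '1'
```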
---

import re
import timeit as ti

##print(ti.timeit('for y in {x}: pass', 'x=1'))
##print(ti.timeit('for y in [x]: pass', 'x=1'))
##print(ti.timeit('for y in (x,): pass', 'x=1'))
##
##print(ti.timeit('y=x', 'x=1'))
##print(ti.timeit('y=x; del y', 'x=1'))

pat1 = re.compile('1')
pat2 = re.compile('2')
pat3 = re.compile('3')
pat123 = re.compile('1|2(?!.*1)|3(?!.*(1|2))')
# I think most people would prefer to use the 3 simple patterns.

def ifel(text):
    match = re.search(pat1, text)
    if match: return match.group()
    match = re.search(pat2, text)
    if match: return match.group()
    match = re.search(pat3, text)
    if match: return match.group()

def mach(text):
    match = re.search(pat123, text)
    return match.group()

print([ifel('321'), ifel('32x'), ifel('3xx')] == ['1', '2', '3'])
print([mach('321'), mach('32x'), mach('3xx')] == ['1', '2', '3'])
# True, True

text = '0'*10000 + '321'
print(ti.timeit('ifel(text)', "from __main__ import ifel, text",
                number=100000))
print(ti.timeit('mach(text)', "from __main__ import mach, text",
                number=100000))
# 0.77, 7.22

> marked as different groups.

When I put parens around 1, 2, 3 in pat123, the 2nd timeit continued
until I restarted Shell.  Maybe you can do better.

> For example see the cute regex-based tokenizer in gettext.py:
>
>> _token_pattern = re.compile(r"""
>>         (?P<WHITESPACES>[ \t]+)                    | # spaces and horizontal tabs
>>         (?P<NUMBER>[0-9]+\b)                       | # decimal integer
>>         (?P<NAME>n\b)                              | # only n is allowed
>>         (?P<PARENTHESIS>[()])                      |
>>         (?P<OPERATOR>[-*/%+?:]|[><!]=?|==|&&|\|\|) | # !, *, /, %, +, -,
>>                                                      # <, >, <=, >=, ==,
>>                                                      # !=, &&, ||, ? :
>>                                                      # unary and bitwise
>>                                                      # ops not allowed
>>         (?P<INVALID>\w+|.)
>>                                                      # invalid token
>>     """, re.VERBOSE|re.DOTALL)
>>
>> def _tokenize(plural):
>>     for mo in re.finditer(_token_pattern, plural):
>>         kind = mo.lastgroup
>>         if kind == 'WHITESPACES':
>>             continue
>>         value = mo.group(kind)
>>         if kind == 'INVALID':
>>             raise ValueError('invalid token in plural form: %s' % value)
>>         yield value
>>     yield ''
>
> I have not found any code similar to the PEP 572 example in pydoc.py. It
> has different code:
>
>> pattern = re.compile(r'\b((http|ftp)://\S+[\w/]|'
>>                      r'RFC[- ]?(\d+)|'
>>                      r'PEP[- ]?(\d+)|'
>>                      r'(self\.)?(\w+))')
> ...
>> start, end = match.span()
>> results.append(escape(text[here:start]))
>>
>> all, scheme, rfc, pep, selfdot, name = match.groups()
>> if scheme:
>>     url = escape(all).replace('"', '&quot;')
>>     results.append('<a href="%s">%s</a>' % (url, url))
>> elif rfc:
>>     url = 'http://www.rfc-editor.org/rfc/rfc%d.txt' % int(rfc)
>>     results.append('<a href="%s">%s</a>' % (url, escape(all)))
>> elif pep:
> ...
>
> It doesn't look as a sequence of re.search() calls. It is more clear and
> efficient, and using the assignment expression will not make it better.
>
>> # Reading socket data until an empty string is returned
>> while data := sock.recv():
>>     print("Received data:", data)
>
> for data in iter(sock.recv, b''):
>     print("Received data:", data)
>
>> if pid := os.fork():
>>     # Parent code
>> else:
>>     # Child code
>
> pid = os.fork()
> if pid:
>     # Parent code
> else:
>     # Child code
>
> It looks to me that there is no use case for PEP 572. It just makes
> Python worse.

-- 
Terry Jan Reedy