Here's a first draft of a PEP on the subject we recently discussed on python-dev. I intend to do a few more rounds here and then post the PEP for final discussion to python-list and python-dev. Barry, could you assign a PEP number ? Thanks. -- PEP: 02XX Title: Switching on Multiple Values Version: $Revision: 1.0 $ Author: mal@lemburg.com (Marc-Andre Lemburg) Status: Draft Type: Standards Track Python-Version: 2.3 Created: 10-Nov-2001 Post-History:=20 Abstract This PEP proposes strategies to enhance Python's performance with respect to handling switching on a single variable having one of multiple possible values. Problem Up to Python 2.1, the typical way of writing multi-value switches=20 has been to use long switch constructs of the following type: if x =3D=3D 'first state': ... elif x =3D=3D 'second state': ... elif x =3D=3D 'third state': ... elif x =3D=3D 'fourth state': ... else: # default handling ... This works fine for short switch constructs, since the overhead of repeated loading of a local (the variable x in this case) and comparing it to some constant is low (it has a complexity of O(n) on average). However, when using such a construct to write a state machine such as is needed for writing parsers the number of possible states can easily reach 10 or more cases. The current solution to this problem lies in using a dispatch table to find the case implementing method to execute depending on the value of the switch variable (this can be tuned to have a complexity of O(1) on average, e.g. by using perfect hash tables). This works well for state machines which require complex and lengthy processing in the different case methods. It does perform well for ones which only process one or two instructions per case, e.g. def handle_data(self, data): self.stack.append(data) =20 A nice example of this is the state machine implemented in pickle.py which is used to serialize Python objects. Other prominent cases include XML SAX parsers and Internet protocol handlers. Proposed Solutions This PEP proposes two different but not necessarily conflicting solutions: 1. Adding an optimization to the Python compiler and VM which detects the above if-elif-else construct and generates special opcodes for it which use an read-only dictionary for storing jump offsets. 2. Adding new syntax to Python which mimics the C style switch statement. The first solution has the benefit of not relying on new keywords to the language, while the second looks cleaner. Both involve some run-time overhead to assure that the switching variable is immutable and hashable. Solution 1: Optimizing if-elif-else XXX This section currently only sketches the design. Issues: The new optimization should not change the current Python semantics (by reducing the number of __cmp__ calls and adding __hash__ calls in if-elif-else constructs which are affected by the optimiztation). To assure this, switching can only safely be implemented either if a "from __future__" style flag is used, or the switching variable is one of the builtin immutable types: int, float, string, unicode, etc. To prevent post-modifications of the jump-table dictionary (which could be used to reach protected code), the jump-table will have to be a read-only type (e.g. a read-only dictionary). The optimization should only be used for if-elif-else constructs which have a minimum number of n cases (where n is a number which has yet to be defined depending on performance tests). Implementation: It should be possible for the compiler to detect an if-elif-else construct which has the following signature: if x =3D=3D 'first':... elif x =3D=3D 'second':... else:... (ie. the left hand side alwys references the same variable, the right hand side some hashable immutable builtin type) The compiler could then setup a read-only (perfect) hash table, store it in the constants and add an opcode SWITCH which triggers the following run-time behaviour: At runtime, SWITCH would check x for being one of the well-known immutable types (strings, unicode, numbers) and use the hash table for finding the right opcode snippet. Solutions 2: Adding a switch statement to Python XXX This section currently only sketches the design. Syntax: switch EXPR: case CONSTANT: [suite] case CONSTANT: [suite] ... else: [suite] (modulo indentation variations) Implementation: The compiler would have to generate code similar to this: def whatis(x): switch(x): case 'one':=20 print '1' case 'two':=20 print '2' case 'three':=20 print '3' else:=20 print "D'oh!" into (ommitting POP_TOP's and SET_LINENO's): 6 LOAD_FAST 0 (x) 9 LOAD_CONST 1 (switch-table-1) 12 SWITCH 26 (to 38) 14 LOAD_CONST 2 ('1') 17 PRINT_ITEM 18 PRINT_NEWLINE 19 JUMP 43 22 LOAD_CONST 3 ('2') 25 PRINT_ITEM 26 PRINT_NEWLINE 27 JUMP 43 30 LOAD_CONST 4 ('3') 33 PRINT_ITEM 34 PRINT_NEWLINE 35 JUMP 43 38 LOAD_CONST 5 ("D'oh!") 41 PRINT_ITEM 42 PRINT_NEWLINE >>43 LOAD_CONST 0 (None) 46 RETURN_VALUE =20 Where the 'SWITCH' opcode would jump to 14, 22, 30 or 38 depending on 'x'. Issues: The switch statement should not implement fall-through behaviour (as does the switch statement in C). Each case defines a complete and independent suite; much like in a if-elif-else statement. This also enables using break in switch statments inside loops. There have been other proposals for the syntax which reuse existing keywords and avoid adding two new ones ("switch" and "case"). Others have argued that the keywords should use new terms to avoid confusion with the C keywords of the same name but slightly different semantics (e.g. fall-through without break). Some of the proposed variants: case EXPR: of CONSTANT: [suite] of CONSTANT: [suite] else: [suite] case EXPR: if CONSTANT: [suite] if CONSTANT: [suite] else: [suite] when EXPR: in CONSTANT_TUPLE: [suite] in CONSTANT_TUPLE: [suite] ... else: [suite] =20 The switch statement could be extended to allow tuples of values for one section (e.g. case 'a', 'b', 'c': ...). Another proposed extension would allow ranges of values (e.g. case 10..14: ...). These should probably be post-poned, but already kept in mind when designing and implementing a first version. Scope XXX Explain "from __future__ import switch" Credits Martin von L=F6wis (issues with the optimization) Thomas Wouters (switch statement + byte code compiler example) Skip Montanaro (dispatching ideas) Donald Beaudry (switch syntax) Greg Ewing (switch syntax) Copyright This document has been placed in the public domain. =0C Local Variables: mode: indented-text indent-tabs-mode: nil End: --=20 Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4