> > The fruit is a bit high.  Doing a full module analysis means
> > deferring the optimization for a second pass after all the code
> > has already been generated.  It's doable, but much harder.
>
> You're stuck in a one-pass compiler mindset.  We build a parse tree
> for the entire module before we start generating bytecode.  We already
> have tools to do namespace analysis for the entire tree (Jeremy added
> these to implement nested scopes).

. . .

> > The task is much simpler if it can be known in advance that
> > the substitution is allowed (i.e. a module level switch like:
> > __fastbuiltins__ = True).
>
> -1000.

Having ruled out a module-level switch, the -O flag, and the -OO flag,
that leaves either namespace analysis of the entire tree or an approach
that doesn't change the bytecode.

Taking the second approach, I've uploaded a small patch for caching
lookups into the __builtins__ namespace:

    www.python.org/sf/711722

It's not as fast as using LOAD_CONST, but it is safe in all but one
extreme case: calling the function, having an intervening poke into the
__builtins__ module, and then calling the function again.

I put the cache lookup in the safest possible place.  It could be made
twice as fast by putting it before the func_globals lookup.  That works
in all cases except one: calling the function, having an intervening
shadowing global assignment, and then calling the function again.  This
doesn't come up anywhere in the test suite, my own apps, or apps I've
downloaded.  Note, regular shadowing (before the first function call)
continues to work fine.

The bad news is that I've made many timings and found only modest
speed-ups in real code.  It turns out that access time for builtins is
less significant than the time to call and execute those builtins.  But
every little bit helps.


Raymond Hettinger
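As a rough illustration (this is not the patch at www.python.org/sf/711722, just a hedged sketch of the idea), the same win can be had manually by binding a builtin to a local name at function-definition time, and the hazard Raymond describes is exactly that a cached lookup can miss a later shadowing assignment. The `demo_shadowing` helper below is hypothetical, written only to show that hazard:

```python
# Manual caching of a builtin via a default argument: the name lookup
# happens once when the function is defined, not on every call, which
# is the same access-time saving the patch targets automatically.
def length_cached(items, _len=len):
    return _len(items)

def length_plain(items):
    return len(items)  # LOAD_GLOBAL: checks globals, then builtins, per call

# The unsafe corner case: an intervening shadowing assignment between
# two calls.  A cache filled on the first lookup would return the stale
# builtin; correct lookup must notice the new global.
def demo_shadowing():
    ns = {}
    exec("result = len('abc')", ns)   # first call: finds the builtin -> 3
    first = ns["result"]
    ns["len"] = lambda s: 99          # intervening shadowing global assignment
    exec("result = len('abc')", ns)   # second call must see the shadow -> 99
    second = ns["result"]
    return first, second

print(demo_shadowing())  # (3, 99)
```

Regular shadowing (assigning the global before the first call) is easy to get right; it is only the assignment *between* calls that invalidates a naive cache.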