"Eric S. Raymond" wrote:
>
> M.-A. Lemburg <mal@lemburg.com>:
> >       LOAD_FAST(124) : 19323126 ================================
> >      SET_LINENO(127) : 15055591 ========================
> >      LOAD_CONST(100) :  9254683 ===============
> >       LOAD_NAME(101) :  8218954 =============
> >     LOAD_GLOBAL(116) :  7174876 ===========
> >      STORE_FAST(125) :  5927769 =========
> >         POP_TOP(  1) :  5587424 =========
> >   CALL_FUNCTION(131) :  5404709 ========
> >   JUMP_IF_FALSE(111) :  5289262 ========
> >      COMPARE_OP(106) :  4495179 =======
> >       LOAD_ATTR(105) :  3481878 =====
> >      BINARY_ADD( 23) :  3420811 =====
> >    RETURN_VALUE( 83) :  2221212 ===
> >      STORE_NAME( 90) :  2176228 ===
> >      STORE_ATTR( 95) :  2085338 ===
> >   BINARY_SUBSCR( 25) :  1834612 ===
> >   JUMP_ABSOLUTE(113) :  1648327 ==
> >    STORE_SUBSCR( 60) :  1446307 ==
> >    JUMP_FORWARD(110) :  1014821 =
> > BINARY_SUBTRACT( 24) :   910085 =
> >       POP_BLOCK( 87) :   806160 =
> >    STORE_GLOBAL( 97) :   779880 =
> >        FOR_LOOP(114) :   735245 =
> >      SETUP_LOOP(120) :   657432 =
> >   BINARY_MODULO( 22) :   610121 =
> >              32( 32) :   530811
> >              31( 31) :   530657
> > BINARY_MULTIPLY( 20) :   392274
> >    SETUP_EXCEPT(121) :   285523
>
> Some thoughts:
>
> 1. That looks as close to a Poisson distribution as makes no difference.
>    I wonder what that means?

I'd say that there are good chances of applying optimizations to the
Python byte code -- someone with enough VC should look into this
on a serious basis ;-)

I think that highly optimized Python byte code compilers/interpreters
would make nice commercial products which complement the targeted
Python+Batteries distros.

> 2. Microtuning in the implementations of the top 3 opcodes looks indicated,
>    as they seem to constitute more than 50% of all calls.

Separating out LOAD_FAST from the switch shows a nice effect.
SET_LINENO is removed by -OO anyway, so there's really no
use in optimizing this one.

In my hacked-up version I've also moved the signal handler into
the second switch (along with SET_LINENO). The downside of this
is that your program will only "see" signals if it happens to
execute one of the less common opcodes; on the plus side, you get
an additional boost in performance -- if your app doesn't rely
on signals to work, this is also a great way to squeeze out a
little more performance.

> 3. On the other hand, what do you get when you weight these by average
>    time per opcode?

Haven't tested this, but even by simply reordering the cases
according to the above stats you get a positive response from
pybench and pystone.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/
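
For anyone who wants to check point 2 against the numbers, here is a
minimal sketch that computes the cumulative share of the most frequent
opcodes. It is written for a current Python, not the interpreter version
discussed in this thread. The counts are copied from the histogram quoted
above; the function names are made up for illustration, and the shares
are relative to the listed opcodes only, since the full run may have
included rarer opcodes that are not shown.

```python
# Opcode execution counts copied from the quoted histogram (name -> count).
# "32" and "31" are the two entries that appear without a symbolic name.
COUNTS = {
    "LOAD_FAST": 19323126, "SET_LINENO": 15055591, "LOAD_CONST": 9254683,
    "LOAD_NAME": 8218954, "LOAD_GLOBAL": 7174876, "STORE_FAST": 5927769,
    "POP_TOP": 5587424, "CALL_FUNCTION": 5404709, "JUMP_IF_FALSE": 5289262,
    "COMPARE_OP": 4495179, "LOAD_ATTR": 3481878, "BINARY_ADD": 3420811,
    "RETURN_VALUE": 2221212, "STORE_NAME": 2176228, "STORE_ATTR": 2085338,
    "BINARY_SUBSCR": 1834612, "JUMP_ABSOLUTE": 1648327,
    "STORE_SUBSCR": 1446307, "JUMP_FORWARD": 1014821,
    "BINARY_SUBTRACT": 910085, "POP_BLOCK": 806160, "STORE_GLOBAL": 779880,
    "FOR_LOOP": 735245, "SETUP_LOOP": 657432, "BINARY_MODULO": 610121,
    "32": 530811, "31": 530657, "BINARY_MULTIPLY": 392274,
    "SETUP_EXCEPT": 285523,
}


def cumulative_share(counts, top_n):
    """Fraction of all listed executions covered by the top_n opcodes."""
    ordered = sorted(counts.values(), reverse=True)
    return sum(ordered[:top_n]) / sum(ordered)


def opcodes_needed(counts, target):
    """Smallest number of top opcodes whose combined share reaches target."""
    total = sum(counts.values())
    running = 0
    for n, count in enumerate(sorted(counts.values(), reverse=True), start=1):
        running += count
        if running / total >= target:
            return n
    return len(counts)


if __name__ == "__main__":
    for n in (1, 3, 5, 10):
        print(f"top {n:2d} opcodes: {cumulative_share(COUNTS, n):.1%}")
    print("opcodes needed for 50%:", opcodes_needed(COUNTS, 0.5))
```

The same table could be weighted by a per-opcode cost to address point 3,
but no timing data is given in this thread, so that part is left out.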