> Modified Files:
> 	ceval.c
> Log Message:
> A 2% speed improvement with gcc on little-endian machines.  My guess is
> that this new pattern for NEXTARG() is detected and optimized as a
> single (*short) load.

It is possible to verify that guess by looking at the generated
assembler.

There are other possible reasons.  One is that negative array offsets
do not compile well into a native base+offset*wordsize addressing mode;
I have seen, and proven, that this is the case in other parts of the
code base.  Another possible reason for the speedup is that
pre-incrementing the pointer prevented the lookup from being done in
parallel (i.e. it introduced a sequential dependency).

If the latter is the true cause, then part of the checkin is
counter-productive.  The change to PREDICTED_WITH_ARG introduces a
pre-increment in addition to the post-increment.  Please run another
timing with and without the change to PREDICTED_WITH_ARG; I suspect the
old way ran faster.  Also, the old way will always be faster on
big-endian machines, and would be faster on machines with less
sophisticated compilers (and possibly slower with MSVC++ if it doesn't
automatically generate a short load).  Another consideration is that
loading a short may perform quite differently on other architectures,
because even alignment occurs only half of the time.

Summary:  +1 on the changes to NEXTARG and EXTENDED_ARG; -1 on the
change to PREDICTED_WITH_ARG.


Raymond Hettinger


>   #define PREDICTED(op)           PRED_##op: next_instr++
> ! #define PREDICTED_WITH_ARG(op)  PRED_##op: oparg = (next_instr[2]<<8) + \
> !                                 next_instr[1]; next_instr += 3
>
>   /* Stack manipulation macros */
> --- 660,664 ----
>
>   #define PREDICTED(op)           PRED_##op: next_instr++
> ! #define PREDICTED_WITH_ARG(op)  PRED_##op: next_instr++; oparg = OPARG(); next_instr += OPARG_SIZE