> What version of CPython did you try that with? The latest py3k branch?

I had a quick look at 3.2, 2.5 and 2.7 and got the impression that the savings are larger when the interpreter loop is faster: the fewer instructions there are per loop iteration, the bigger the difference that saving 3 instructions makes.

The NEXTARG macro is the same in all three versions:

    #define NEXTARG() (next_instr += 2, (next_instr[-1]<<8) + next_instr[-2])

and the compiler compiles this to two separate fetches. I found out that my compiler (gcc) generates better code if we use a short instead. If I force it to, it produces a "movswl" instruction that does both fetches at the same time. That saves two instructions already.

This would imply that on little-endian machines we could already save a few percent by changing just one line of code in ceval.c:

    #define NEXTARG() (next_instr += 2, *(short *)&next_instr[-2])

- Jurjen
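The two ways of fetching the argument can be compared outside the interpreter with a small sketch. This is not code from ceval.c: the function names are hypothetical, and memcpy stands in for the pointer cast so that the sketch does not depend on alignment or aliasing behaviour. It also includes an unsigned variant, on the assumption that the argument is meant to be an unsigned 16-bit value. A signed short sign-extends (which is what "movswl" does), so the two fetches would disagree for arguments of 0x8000 and above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte-wise fetch, same arithmetic as the existing NEXTARG macro:
   low byte at p[0], high byte at p[1]. */
static int fetch_bytewise(const unsigned char *p)
{
    return (p[1] << 8) + p[0];
}

/* Single 16-bit load, signed, mirroring the (short *) cast in the post.
   The result is sign-extended when widened to int. */
static int fetch_signed16(const unsigned char *p)
{
    int16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Single 16-bit load, unsigned (an assumption, not from the post).
   The result is zero-extended when widened to int. */
static int fetch_unsigned16(const unsigned char *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Returns 1 when the host stores the low byte first. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Checks that the variants agree where they should, and differ
   where sign extension kicks in. Only meaningful on little-endian hosts. */
static void self_check(void)
{
    const unsigned char small_arg[] = {0x34, 0x12};  /* 0x1234 */
    const unsigned char large_arg[] = {0x00, 0x80};  /* 0x8000 */

    if (!host_is_little_endian())
        return;

    assert(fetch_bytewise(small_arg) == fetch_signed16(small_arg));
    assert(fetch_bytewise(large_arg) == fetch_unsigned16(large_arg));
    assert(fetch_bytewise(large_arg) != fetch_signed16(large_arg));
}
```

Calling self_check() from a test program exercises all three variants. On a big-endian host the 16-bit loads would return the bytes swapped, which is why the one-line change only applies to little-endian machines.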