A.M. Kuchling <amk <at> amk.ca> writes:
>
> threaded code: A technique for implementing virtual machine
> interpreters, introduced by J.R. Bell in 1973, where each op-code in
> the virtual machine instruction set is the address of some (lower
> level) code to perform the required operation. This kind of virtual
> machine can be implemented efficiently in machine code on most
> processors by simply performing an indirect jump to the address which
> is the next instruction.

Is this kind of optimization still that useful on modern CPUs? It removes
a memory access to the switch/case lookup table, which should shave off
the roughly 3 cycles of latency of a modern L1 data cache, but it won't
remove the branch misprediction penalty of the indirect jump itself,
which is more on the order of 10-20 cycles depending on pipeline depth.

In 1973, CPUs were not pipelined and suffered no penalty for indirect
jumps, while table lookups could be slow, especially if they couldn't run
in parallel with other processing in the pipeline.

Thanks

Antoine.