> >>c) ordering cases in the switch statements by usage frequency > >> (using average opcode usage frequencs obtained by > >> instrumenting the interpreter) > > > > I might try a little simulated annealing to generate layouts with high > > frequency opcodes near the front and coorcurring opcodes near each > > other. > > I did that by hand, sort of :-) The problem is that the > scoring phases takes rather long, so you better start with > a good guess. Im wondering what good scoring scheme would look like. I tried a scoring scheme in which layouts were scored thusly: for (i = 0; i < MAXOP; i++) for (j = 0; j < MAXOP; j++) score += pairfreq[layout[i]][layout[j]] * (i < j ? j-i : i-j) This works fine, but Im thinking that a simpler scoring scheme which looks only at the frequencies of adjacent ops might be sufficient, and would certainly be faster. for (i = 1; i < MAXOP; i++) score += pairfreq[layout[i-1]][layout[i]] The idea is that while caches favour locality of reference, because a cache line is finite in size and relatively small (16 or 64 bytes), there arent any long-range effects. In other words, caches favour adjacency of reference rather than locality of reference.
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4