Each conditional branch instruction contains a 16-bit field that records the direction of the last 16 jumps:
uint16_t val = *branches;
val = (val << 1) | direction;
*branches = val;
This gives the direction taken on each of the last 16 executions of the branch.
When recording traces for the JIT, 16 branches is not a lot of information to guide region selection.
Instead, we could record the count for each of the two directions, using a pair of saturating 8-bit counters:
branches[direction] += (branches[direction] != 255);
This costs only one extra ALU instruction with most C compilers, and no extra memory accesses.
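A minimal sketch of the saturating update in isolation, assuming `branches` is a two-element `uint8_t` array indexed by direction (0 or 1). The names `record_branch` and `record_many` are hypothetical, used only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: branches[0] counts one direction, branches[1] the other. */
static void record_branch(uint8_t branches[2], int direction)
{
    assert(direction == 0 || direction == 1);
    /* The comparison yields 1 until the counter reaches 255 and 0 afterwards,
       so the counter sticks at 255 instead of wrapping around to 0. */
    branches[direction] += (branches[direction] != 255);
}

/* Hypothetical driver: record the same direction n times. */
static void record_many(uint8_t branches[2], int direction, int n)
{
    for (int i = 0; i < n; i++) {
        record_branch(branches, direction);
    }
}
```

Starting from `{0, 0}`, recording direction 1 for 300 executions and direction 0 for 7 leaves `branches[1]` saturated at 255 and `branches[0]` at 7.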
With a JIT warmup of up to 500, we can get precise numbers of the branches taken.
With a warmup of 1000, the saturating counters can still distinguish between branches that are rarely taken and those that are more balanced.
For example, if a branch switches direction every 20 executions, the current 16-bit history might show a perfectly biased branch, but the saturating counters will show that the branch is roughly balanced.
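The difference can be seen in a small simulation. This is a sketch under stated assumptions, not the interpreter's actual code: `BranchStats` and `simulate_alternating` are hypothetical names, and the branch is modelled as flipping direction after every `run_length` executions.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint16_t history;   /* current scheme: last 16 directions, newest in bit 0 */
    uint8_t counts[2];  /* proposed scheme: one saturating counter per direction */
} BranchStats;

/* Hypothetical model: the branch goes one way for run_length executions,
   then the other way for run_length executions, and so on. */
static BranchStats simulate_alternating(int executions, int run_length)
{
    assert(run_length > 0);
    BranchStats s = {0, {0, 0}};
    for (int i = 0; i < executions; i++) {
        int direction = (i / run_length) % 2;
        /* Current scheme: shift the newest direction into the history. */
        s.history = (uint16_t)((s.history << 1) | direction);
        /* Proposed scheme: bump the matching counter, saturating at 255. */
        s.counts[direction] += (s.counts[direction] != 255);
    }
    return s;
}
```

With `simulate_alternating(400, 20)`, the last 16 executions all fall within one run, so `history` is `0xFFFF` and the branch looks perfectly biased, while `counts` is `{200, 200}`, which shows it is balanced.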