Jit64: divwux - Prefer three-operand IMUL

Three-operand IMUL takes the magic number as an immediate, which
eliminates the MOV that previously loaded it into a register. This is a
small code size win. However, because IMUL sign-extends its 32-bit
immediate to 64 bits, we can only apply this when the magic number's
most significant bit is zero.
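
The sign-extension constraint can be illustrated with a minimal sketch. It models only the arithmetic and is not Jit64 code; the helper names `SignExtendImm32` and `FitsThreeOperandImul` are hypothetical.

```cpp
#include <cstdint>

// Models how IMUL r64, r/m64, imm32 treats its immediate: the 32-bit value
// is sign-extended to 64 bits before the multiply. (Hypothetical helper.)
constexpr uint64_t SignExtendImm32(uint32_t imm)
{
  return (imm & 0x80000000u) ? (0xFFFFFFFF00000000ull | imm) : uint64_t{imm};
}

// The immediate form is only usable when sign extension leaves the
// unsigned magic number unchanged, i.e. when bit 31 is clear.
constexpr bool FitsThreeOperandImul(uint32_t magic)
{
  return SignExtendImm32(magic) == uint64_t{magic};
}

static_assert(FitsThreeOperandImul(0x38E38E39u), "bit 31 clear: safe");
static_assert(!FitsThreeOperandImul(0xE38E38E4u), "bit 31 set: value changes");
```

With bit 31 set, the multiplier would become 0xFFFFFFFFE38E38E4 instead of 0x00000000E38E38E4, so the MOV plus two-operand IMUL sequence is still needed in that case.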

To make that case more likely, we also minimize the magic number: while
it is even and the shift amount is nonzero, halve it and decrement the
shift amount.
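
The minimization step can be sketched in isolation as follows. The `MagicShift` struct and the `Minimize` function are hypothetical names for illustration; `shift` is assumed to be the amount added on top of the fixed 32-bit shift, as in the diff.

```cpp
#include <cstdint>

// Hypothetical pair of multiplier and extra shift (on top of the fixed 32).
struct MagicShift
{
  uint32_t magic;
  int shift;
};

// While the multiplier is even and there is still shift to give up, halve
// the multiplier and shift one bit less. Since x * magic == 2 * (x * (magic / 2))
// for even magic, the quotient is unchanged.
constexpr MagicShift Minimize(uint32_t magic, int shift)
{
  while ((magic & 1) == 0 && shift > 0)
  {
    magic >>= 1;
    shift--;
  }
  return {magic, shift};
}

// Division by 18: 0xE38E38E4 with shift 4 reduces to 0x38E38E39 with shift 2.
static_assert(Minimize(0xE38E38E4u, 4).magic == 0x38E38E39u, "magic halved twice");
static_assert(Minimize(0xE38E38E4u, 4).shift == 2, "shift reduced by two");
```

Each halving clears one more high bit, which is what lets a multiplier that originally had bit 31 set end up eligible for the immediate form.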

Example (Unsigned division by 18)
Before:
41 BE E4 38 8E E3    mov         r14d,0E38E38E4h
4D 0F AF F5          imul        r14,r13
49 C1 EE 24          shr         r14,24h

After:
4D 69 F5 39 8E E3 38 imul        r14,r13,38E38E39h
49 C1 EE 22          shr         r14,22h
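
As a sanity check on the example above, the two sequences can be modeled as plain 64-bit arithmetic. This is only a sketch of the math, not of the emitter; the function names are hypothetical and the constants are copied from the disassembly.

```cpp
#include <cstdint>

// Before: mov r14d, 0E38E38E4h; imul r14, r13; shr r14, 24h
constexpr uint32_t DivBy18Before(uint32_t x)
{
  return static_cast<uint32_t>((uint64_t{x} * 0xE38E38E4ull) >> 0x24);
}

// After: imul r14, r13, 38E38E39h; shr r14, 22h
constexpr uint32_t DivBy18After(uint32_t x)
{
  return static_cast<uint32_t>((uint64_t{x} * 0x38E38E39ull) >> 0x22);
}

static_assert(DivBy18Before(36) == 2 && DivBy18After(36) == 2, "36 / 18");
static_assert(DivBy18Before(35) == 1 && DivBy18After(35) == 1, "35 / 18");
static_assert(DivBy18After(0xFFFFFFFFu) == 0xFFFFFFFFu / 18, "largest input");
```

Going by the listed encodings, the old sequence takes 14 bytes (6 + 4 + 4) and the new one 11 bytes (7 + 4).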
Sintendo 2021-05-05 22:54:56 +02:00
parent 4b827f3ae9
commit 2cafa0a960

@@ -1271,14 +1271,29 @@ void Jit64::divwux(UGeckoInstruction inst)
RCX64Reg Rd = gpr.Bind(d, RCMode::Write);
RegCache::Realize(Ra, Rd);
-if (d == a)
+magic++;
+// Use smallest magic number and shift amount possible
+while ((magic & 1) == 0 && shift > 0)
 {
-  MOV(32, R(RSCRATCH), Imm32(magic + 1));
+  magic >>= 1;
+  shift--;
+}
+// Three-operand IMUL sign extends the immediate to 64 bits, so we may only
+// use it when the magic number has its most significant bit set to 0
+if ((magic & 0x80000000) == 0)
+{
+  IMUL(64, Rd, Ra, Imm32(magic));
+}
+else if (d == a)
+{
+  MOV(32, R(RSCRATCH), Imm32(magic));
   IMUL(64, Rd, R(RSCRATCH));
 }
 else
 {
-  MOV(32, Rd, Imm32(magic + 1));
+  MOV(32, Rd, Imm32(magic));
   IMUL(64, Rd, Ra);
 }
SHR(64, Rd, Imm8(shift + 32));