OR allows for a more compact representation for constants that can be
represented by a signed 8-bit integer, while MOV does not. By letting
MOV handle the larger constants we can occasionally save a byte.
Before:
45 8B F5 mov r14d,r13d
41 81 CE 00 80 01 00 or r14d,18000h
After:
41 BE 00 80 01 00 mov r14d,18000h
45 0B F5 or r14d,r13d
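The byte counts above can be checked with a rough size model. This is only a sketch, not the JIT's actual code: the function names are hypothetical, and the sizes assume 32-bit register operands in register-direct form, with one REX prefix byte whenever an extended register (r8d-r15d) is involved.

```python
def fits_imm8(value):
    """True if a 32-bit constant equals its own sign-extended low byte."""
    value &= 0xFFFFFFFF
    low = value & 0xFF
    sign_extended = (low | 0xFFFFFF00) if (low & 0x80) else low
    return sign_extended == value


def size_mov_then_op_imm(imm, dst_ext, src_ext):
    """Bytes for: mov dst,src ; op dst,imm (imm8 form when it fits)."""
    mov_rr = 2 + (1 if (dst_ext or src_ext) else 0)
    op_imm = (3 if fits_imm8(imm) else 6) + (1 if dst_ext else 0)
    return mov_rr + op_imm


def size_mov_imm_then_op(dst_ext, src_ext):
    """Bytes for: mov dst,imm32 ; op dst,src."""
    mov_imm = 5 + (1 if dst_ext else 0)
    op_rr = 2 + (1 if (dst_ext or src_ext) else 0)
    return mov_imm + op_rr


# The example above: dst = r14d, src = r13d, constant 18000h.
before = size_mov_then_op_imm(0x18000, dst_ext=True, src_ext=True)
after = size_mov_imm_then_op(dst_ext=True, src_ext=True)
```

Under this model the swap only pays off when the constant does not fit in a signed 8-bit immediate. For small constants the original order is shorter.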
A bitwise OR with zero is just a fancy MOV, really.
- Example 1
Before:
41 BA 00 00 00 00 mov r10d,0
45 0B D1 or r10d,r9d
After:
45 8B D1 mov r10d,r9d
- Example 2
Before:
41 83 CA 00 or r10d,0
After:
Nothing!
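Both examples follow from the same rule. The sketch below illustrates it with a hypothetical lowering helper that returns mnemonic strings. It is not the emitter's real interface, and the immediate formatting is only for display.

```python
def lower_or_imm(dst, src, imm):
    """Return x86-style mnemonics for dst = src | imm (32-bit operands)."""
    imm &= 0xFFFFFFFF
    code = []
    if dst != src:
        # The source value has to reach the destination register first.
        code.append(f"mov {dst},{src}")
    if imm != 0:
        # OR with zero changes nothing, so it is only emitted otherwise.
        code.append(f"or {dst},{imm:X}h")
    return code


example_1 = lower_or_imm("r10d", "r9d", 0)    # only the MOV remains
example_2 = lower_or_imm("r10d", "r10d", 0)   # nothing is emitted
```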
AND allows for a more compact representation for constants that can be
represented by a signed 8-bit integer, while MOV does not. By letting
MOV handle the larger constants we can occasionally save a byte.
Before:
41 8B FE mov edi,r14d
81 E7 FF FE FF FF and edi,0FFFFFEFFh
After:
BF FF FE FF FF mov edi,0FFFFFEFFh
41 23 FE and edi,r14d
XOR allows for a more compact representation for constants that can be
represented by a signed 8-bit integer, while MOV does not. By letting
MOV handle the larger constants we can occasionally save a byte.
Before:
44 89 F7 mov edi,r14d
81 F7 A0 52 57 01 xor edi,15752A0h
After:
BF A0 52 57 01 mov edi,15752A0h
41 33 FE xor edi,r14d
In the case of eqvx, the final complement can always be baked directly
into the immediate value.
Before:
45 8B EF mov r13d,r15d
41 F7 D5 not r13d
41 83 F5 04 xor r13d,4
After:
45 8B EF mov r13d,r15d
41 83 F5 FB xor r13d,0FFFFFFFBh
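The fold relies on the identity ~(a ^ c) == a ^ ~c for fixed-width values. A quick check in Python, with values masked to 32 bits (the function names are illustrative only):

```python
MASK32 = 0xFFFFFFFF


def eqv_with_not(a, imm):
    """Models the 'Before' sequence: NOT the register, then XOR the immediate."""
    return ((~a & MASK32) ^ imm) & MASK32


def eqv_folded(a, imm):
    """Models the 'After' sequence: XOR with the complemented immediate."""
    return (a ^ (~imm & MASK32)) & MASK32


samples = [0x00000000, 0x00000004, 0x12345678, 0x80000000, 0xFFFFFFFF]
all_equal = all(eqv_with_not(a, 4) == eqv_folded(a, 4) for a in samples)
folded_imm = ~4 & MASK32  # the immediate used in the 'After' listing
```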
PowerPC instructions andcx and orcx complement the value of register b
before performing their respective bitwise operation. If this register
happens to contain a known value, we can precompute the complement,
allowing us to generate simpler code.
- andcx
Before:
BF 00 01 00 00 mov edi,100h
F7 D7 not edi
41 23 FE and edi,r14d
After:
41 8B FE mov edi,r14d
81 E7 FF FE FF FF and edi,0FFFFFEFFh
- orcx
Before:
41 BE 04 00 00 00 mov r14d,4
41 F7 D6 not r14d
45 0B F5 or r14d,r13d
After:
45 8B F5 mov r14d,r13d
41 83 CE FB or r14d,0FFFFFFFBh
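The precomputed immediates in both listings can be reproduced as follows. This is a minimal sketch of the instruction semantics on 32-bit values, not the JIT's code, and the helper names are made up for illustration.

```python
MASK32 = 0xFFFFFFFF


def andc(a, b):
    """andcx semantics: a AND (NOT b), on 32-bit values."""
    return a & ~b & MASK32


def orc(a, b):
    """orcx semantics: a OR (NOT b), on 32-bit values."""
    return (a | ~b) & MASK32


def complement32(b):
    """Complement of a known register value, computed at JIT time."""
    return ~b & MASK32


andc_imm = complement32(0x100)  # immediate for the andcx example
orc_imm = complement32(4)       # immediate for the orcx example
```

Here the orcx constant fits in a signed 8-bit immediate after complementing, while the andcx constant still needs the 32-bit form.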
For certain occurrences of nandx/norx, we declared a ReadWrite constraint
on the destination register, even though the previous value of the
destination register is irrelevant. This false dependency forced the
RegCache to generate a redundant MOV whenever the destination register
wasn't already assigned to a host register.
Example 1:
BF 00 00 00 00 mov edi,0
8B FE mov edi,esi
F7 D7 not edi
Example 2:
8B 7D 80 mov edi,dword ptr [rbp-80h]
8B FE mov edi,esi
F7 D7 not edi
FinalizeCarryOverflow didn't maintain XER[OV/SO] properly due to an
oversight. Here's the code it would generate:
0: 9c pushf
1: 80 65 3b fe and BYTE PTR [rbp+0x3b],0xfe
5: 71 04 jno b <jno>
7: c6 45 3b 03 mov BYTE PTR [rbp+0x3b],0x3
000000000000000b <jno>:
b: 9d popf
At first glance it seems reasonable. The host flags are carefully
preserved with PUSHF. The AND instruction clears XER[OV]. Next, a
conditional branch checks the host's overflow flag and, if it is clear,
skips over the MOV that sets XER[OV/SO]. Finally, host flags are restored
with POPF.
However, the AND instruction also clears the host's overflow flag. As a
result, the branch that follows it is always taken and the MOV is always
skipped. The end result is that XER[OV] is always cleared while XER[SO]
is left unchanged.
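The effect can be reproduced with a tiny model of the sequence. This is a sketch under stated assumptions: only the host overflow flag and the one XER byte are modelled, and the bit layout (bit 0 = OV, bit 1 = SO) is inferred from the 0xfe mask and the 0x3 value in the listing.

```python
def finalize_buggy(host_of, xer_byte):
    """Models the generated code as listed above."""
    saved_of = host_of       # pushf
    xer_byte &= 0xFE         # and: clears XER[OV] ...
    host_of = False          # ... but AND also clears the host OF
    if host_of:              # jno: MOV only runs when OF is set
        xer_byte = 0x03      # mov: sets XER[OV] and XER[SO]
    host_of = saved_of       # popf (comes too late to help)
    return xer_byte


def finalize_popf_early(host_of, xer_byte):
    """Same sequence with POPF moved to right after the AND."""
    saved_of = host_of       # pushf
    xer_byte &= 0xFE         # and: clears XER[OV], clobbers host OF
    host_of = saved_of       # popf: host OF is restored before the branch
    if host_of:              # jno now sees the real overflow flag
        xer_byte = 0x03
    return xer_byte


# Host overflow set, XER byte initially clear.
buggy_result = finalize_buggy(True, 0x00)
fixed_result = finalize_popf_early(True, 0x00)
```

In the buggy model the result never depends on the host overflow flag, which matches the behaviour described above.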
Putting POPF immediately after the AND would fix this, but we already
have GenerateOverflow doing it correctly (and without the PUSHF/POPF
shenanigans too). So let's just use that instead.