This adds about a frame of latency, and since most games don't change
VI registers during scanout, we can get away with outputting the XFB at
the start of scanout. WWE Crush Hour is the (only currently known)
exception, which has flickering problems when doing it this way.
This adds a path to perform the output at the end of scanout, and gates
it behind an option which defaults to using the latency-reducing
pre-scanout path.
Originally, 1479 (for example) would disassemble as `lsr $ACC0, #-7`. At some point (likely the conversion to fmt), this regressed to `lsr $ACC0, #4294967289`. Now, it disassembles as `lsr $ACC0, #7`.
The CPU-side AX library enables it by default and uses hardcoded parameters.
CMD_COMPRESSOR_TABLE_ADDR (0x0A) was incorrect. It's always a nop on the
GameCube and was probably confused with the Wii version.
It was believed that this only mattered when the rounding mode was
set to round to infinity, which games generally don't do, but it
can also affect the sign of the output when the inputs are all zero.
So it turns out you have to pass XMM0 as the clobber register
to HandleNaNs, because HandleNaNs uses BLENDVPD and BLENDVPD
implicitly uses XMM0, and nobody noticed when I broke this in
2c38d64 because nobody plays the one game that needs accurate NaNs.
This was added because YAGCD's info on MAXANISO (near TX_SETMODE0 in Section 5.11.1) claims it's the case, but Extrems says it does work. I haven't tested anything myself, and dolphin still does not actually implement anisotropic filtering based on this field.
This reverts commit 66b992cfe4.
A new (additional) correctness issue was revealed in the old
AArch64 code when applying it on top of modern JitArm64:
LSR was being used when LSRV was intended. This commit uses LSRV.
The workaround added in 30f9f31 caused a regression where Dolphin
incorrectly replaced runs of one byte with runs of another byte
when writing WIA and RVZ files. ReuseID::operator< was always
returning false unless the ReuseIDs being compared had different
partition keys, which caused std::map<ReuseID, GroupEntry>
to treat all ReuseIDs with the same partition key as equal.