2552 Commits

Author SHA1 Message Date
37b218b344 PowerPC: Fix bad copypaste in LookupTLBPageAddress
Fixes https://bugs.dolphin-emu.org/issues/12611.
2021-08-06 18:03:54 +02:00
125af42e4b Jit: Re-add dcbx masking
When making 92d1d60, I checked whether the ~0x1f masking in dcbx
actually was necessary. I came to the conclusion that it wasn't,
so I removed it. However, I hadn't checked the second half of
InvalidateICache closely enough - the masking is actually needed.

This commit re-adds the masking, but this time in C++ code instead
of in jitted code in order to save icache. Though I suppose the
difference doesn't matter all that much, since this is in farcode
and all...

Hopefully fixes https://bugs.dolphin-emu.org/issues/12612.
2021-08-06 14:55:07 +02:00
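The `~0x1f` masking discussed in 125af42e4b clears the low five bits of an address, which rounds it down to the start of its 32-byte cache line. A minimal sketch of that arithmetic follows; the helper name `AlignToCacheLine` is hypothetical and is not taken from Dolphin's source.

```cpp
#include <cstdint>

// Hypothetical helper (not Dolphin's actual code): round an effective
// address down to the start of its 32-byte cache line by clearing the
// low five bits, i.e. applying the ~0x1f mask.
constexpr std::uint32_t AlignToCacheLine(std::uint32_t address)
{
  return address & ~std::uint32_t{0x1f};
}

// Any address inside a line maps to the same line start.
static_assert(AlignToCacheLine(0x80003127) == 0x80003120, "rounds down");
static_assert(AlignToCacheLine(0x80003120) == 0x80003120, "already aligned");
```

Without the mask, an unaligned address passed to a routine that expects a line-aligned one could be treated as belonging to the wrong range, which is consistent with the masking turning out to be needed.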
e753455abb JitArm64: Fix W8 slowmem store
Regression from 12629be.
2021-08-05 10:57:41 +02:00
942545b7fc Merge pull request #9964 from JosJuice/uncached-unaligned-writes
PowerPC: Implement broken masking for uncached unaligned writes
2021-08-04 22:23:07 +01:00
f333c0949f PowerPC: Implement PI interrupt for uncached unaligned writes
2021-08-04 23:09:43 +02:00
543ed8a97c PowerPC: Implement broken masking for uncached unaligned writes
This implements the behavior described in
https://bugs.dolphin-emu.org/issues/12565.

Thank you to eigenform, delroth, phire, marcan, segher, and Extrems
for all helping in one way or another with the efforts to reverse
engineer this behavior, and to Rylie for reporting the issue.
2021-08-04 23:04:02 +02:00
12629beff8 JitArm64: Call swap variants of memory write functions
Write_U16_Swap leaves the upper 32 bits alone. Reimplementing this
correctly in the JIT would require more than one instruction,
so let's just call Write_U16_Swap instead, like Jit64 does.
2021-08-04 23:04:02 +02:00
ecbce0a204 PowerPC: Pass on full 32-bit register contents for 8/16-bit writes
2021-08-04 23:04:02 +02:00
c56526d5f8 PowerPC: Keep track of write-through/cache-inhibited
One of the following commits will add emulation of a quirk
that only happens when writing to memory which is mapped as
write-through or cache-inhibited, so let's keep track of
which memory is mapped in this way.
2021-08-04 23:04:02 +02:00
0e76dabbbb Jit64: Always pass effective address to InvalidateICache() in dcbx.
2021-08-04 22:17:42 +02:00
5bd188d40d Jit64: Fix BATAddressLookup bit test
BT sets the carry flag, not the zero flag.
2021-08-03 17:32:04 +02:00
627832355e Merge pull request #9973 from JosJuice/jit-fma-negation-order
Jit: Use accurate negation order for FMA instructions
2021-07-31 20:18:49 -04:00
7c365349ee Merge pull request #9977 from JosJuice/jitarm64-mtfsfx
JitArm64: Implement mtfsfx
2021-07-31 17:54:48 -04:00
a90b0a1c93 JitArm64: Implement mtfsfx
The sixth and final part of implementing the FPSCR system register
instructions.
2021-07-31 23:50:20 +02:00
5c5de35568 Jit: Use ibat_table for dcbf/dcbi/dcbst address check
Minor mistake in 92d1d60. We should be using ibat_table instead
of dbat_table, since we're dealing with invalidating icache.
2021-07-31 14:30:03 +02:00
bef1fdb4cb Merge pull request #9974 from JosJuice/jitarm64-mtfsfix
JitArm64: Implement mtfsfix
2021-07-31 03:56:51 -04:00
a208ff5aab Merge pull request #9957 from JosJuice/dcbx-faster
Jit: Perform BAT lookup in dcbf/dcbi/dcbst
2021-07-31 03:27:24 +02:00
0e62dac4bb JitArm64: Implement mtfsfix
Part 5 of 6 of implementing the FPSCR system register instructions.
2021-07-30 10:24:41 +02:00
08b358a829 Jit64: Fix minor fmaddXX inefficiencies
2021-07-29 23:34:20 +02:00
93e636abc3 Jit: Use accurate negation order for FMA instructions
It was believed that this only mattered when the rounding mode was
set to round to infinity, which games generally don't do, but it
can also affect the sign of the output when the inputs are all zero.
2021-07-29 23:33:35 +02:00
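The zero-input case mentioned in 93e636abc3 can be reproduced with plain IEEE 754 arithmetic. The sketch below compares two ways of computing a negated fused multiply-add, `-(a*c + b)`: negating the result after the FMA, and negating the inputs before it. Both function names are hypothetical, and the "negate before" variant is only one possible inaccurate ordering; the commit does not state which ordering the JITs used previously. The default round-to-nearest mode is assumed.

```cpp
#include <cmath>

// Negate the result of the fused multiply-add: -(a*c + b).
inline double NegateAfter(double a, double c, double b)
{
  return -std::fma(a, c, b);
}

// Hypothetical alternative: fold the negation into the inputs,
// computing (-a)*c + (-b). Equal to the above for most inputs,
// but not always equal in the sign of a zero result.
inline double NegateBefore(double a, double c, double b)
{
  return std::fma(-a, c, -b);
}
```

For `a = 0.0`, `c = 0.0`, `b = -0.0`, the first form computes `(+0) + (-0) = +0` and then negates it to `-0`, while the second computes `(-0) + (+0) = +0`. The two results differ only in sign, which `std::signbit` can detect.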
c86c02e46b Merge pull request #9960 from JosJuice/jitarm64-mtfsb1x
JitArm64: Implement mtfsb1x
2021-07-28 20:46:09 -04:00
3bb4a4e344 Jit64: Fix fmaddXX with accurate NaNs
So it turns out you have to pass XMM0 as the clobber register
to HandleNaNs, because HandleNaNs uses BLENDVPD and BLENDVPD
implicitly uses XMM0, and nobody noticed when I broke this in
2c38d64 because nobody plays the one game that needs accurate NaNs.
2021-07-28 23:03:03 +02:00
ca55d599e8 Jit: Mark ValidBlockBitSet::Test as const
2021-07-27 11:11:30 +02:00
c9a4021537 JitArm64: Implement mtfsb1x
Part 4 of implementing the FPSCR system register instructions.
2021-07-25 19:18:43 +02:00
92d1d60ff1 Jit: Perform BAT lookup in dcbf/dcbi/dcbst
When 66b992c fixed https://bugs.dolphin-emu.org/issues/12133,
it did so by removing the broken address calculation entirely and
always using the slow path. This caused a performance regression,
https://bugs.dolphin-emu.org/issues/12477.

This commit instead replaces the broken address calculation with
a BAT lookup. If the BAT lookup succeeds, we can use the old fast
path. Otherwise we use the slow path.

Intends to improve https://bugs.dolphin-emu.org/issues/12477.
2021-07-25 15:15:15 +02:00
b84a0704cd Revert "Jit: Fix correctness issue in dcbf/dcbi/dcbst"
This reverts commit 66b992cfe4.

A new (additional) correctness issue was revealed in the old
AArch64 code when applying it on top of modern JitArm64:
LSR was being used when LSRV was intended. This commit uses LSRV.
2021-07-25 15:13:57 +02:00
f380c23fda Merge pull request #9890 from JosJuice/jitarm64-mtfsb0x
JitArm64: Implement mtfsb0x
2021-07-22 21:41:01 -04:00
5af5656383 Merge pull request #9932 from JosJuice/jitarm64-dcbz-backpatch
JitArm64: Fix dcbz backpatch
2021-07-23 01:58:59 +02:00
fdcea8566d JitArm64: Improve Arm64FPRCache::GetCallerSavedUsed
If we're only using the lower 64 bits of a callee-saved
register, GetCallerSavedUsed can return false for it.
2021-07-22 10:42:44 +02:00
1df3456267 JitArm64: Remove a comment in dcbz implementation
This implementation is pretty efficient in my opinion. And "As
long as we aren't falling back to interpreter we're winning a lot"
applies to basically every instruction to some degree anyway.
2021-07-21 19:24:41 +02:00
d91d6fcdc5 JitArm64: Fix dcbz backpatch
The dcbz instruction needs to lock W30 so that the slowmem code will
push and pop it when calling into C++. Also, the slowmem code expects
that the address is present in W0, so replace the use of W0 as a scratch
register in the fastmem code with the now locked W30.
2021-07-21 19:19:52 +02:00
302b47f5e6 JitArm64: Add temp reg parameter to Arm64RegCache::Flush
We currently have a bug when calling Arm64GPRCache::Flush with
FlushMode::MaintainState, zero free host registers, and at least
one guest register containing an immediate. We end up grabbing
a temporary register from the register cache in order to be
able to write the immediate to memory, but grabbing a temporary
register when there are zero free registers causes the least
recently used register to be flushed in a way which does not
maintain the state of the register cache.

To get around this, require callers to pass in a temporary
register in the GPR MaintainState case. In other cases,
passing in a temporary register is not required but can help
avoid spilling a register (if the caller already had a
temporary register at hand anyway, which in particular will
be the case in my upcoming memcheck pull request).
2021-07-21 16:28:19 +02:00
c991904e04 PowerPC: Add reservation monitor to save state
2021-07-21 12:14:07 +02:00
d763d693e8 PowerPC: Move lwarx/stwcxd. reservation into PowerPCState
2021-07-21 12:12:19 +02:00
b2d87c49b6 JitArm64: Implement mtfsb0x
Part 3 of implementing the FPSCR system register instructions.
2021-07-21 09:22:13 +02:00
8bddd8c675 remove SetRoundMode
we only care about SSE rounding mode, and set
that manually in SetSIMDMode
2021-07-17 19:29:22 -07:00
197075293d make FPSCR.RN an enum
2021-07-17 18:55:06 -07:00
3af21d3d22 JitArm64: Optimize FloatCompare's CR value emitting
Setting bit 32 is only needed in the case where EQ and GT are set
but SO and LT are not, which is not a possible outcome of a compare.
2021-07-12 22:54:37 +02:00
8af5095ff4 JitArm64: Stop using hand-encoded logical immediates
2021-07-12 22:25:49 +02:00
88fd9fd577 Merge pull request #9869 from JosJuice/jitarm64-constexpr-isimmlogical
JitArm64: Encode logical immediates at compile-time where possible
2021-07-11 12:55:48 +02:00
f903853cf7 JitArm64: Fix ps_cmpXX
Passing a width of 64 and registers encoded as double to
DUP resulted in an invalid instruction. The registers should
be encoded as quads in this situation.

Fixes https://bugs.dolphin-emu.org/issues/12575.
2021-07-11 11:43:19 +02:00
0f3b9a8874 JitArm64: Minor mcrfs optimization
2021-07-10 20:44:22 +02:00
9e80db123f JitArm64: Encode logical immediates at compile-time where possible
Manually encoding and decoding logical immediates is error-prone.
Using ORRI2R and friends lets us avoid doing the work manually,
but in exchange, there is a runtime performance penalty. It's
probably rather small, but still, it would be nice if we could
let the compiler do the work at compile-time. And that's exactly
what this commit does, so now I have no excuse for trying to
manually write logical immediates anymore.
2021-07-10 20:43:59 +02:00
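The general idea in 9e80db123f of letting the compiler do the logical-immediate work can be illustrated with a `constexpr` function. The sketch below is not Dolphin's implementation and does not produce the actual N/immr/imms encoding; it only checks, at compile time, whether a 64-bit value is representable as an AArch64 logical immediate (a repeating element of 2 to 64 bits that is a rotated run of ones). The name `IsLogicalImm64` is hypothetical. C++14 or later is assumed.

```cpp
#include <cstdint>

// Hypothetical checker (not Dolphin's code): true if `value` is a
// repeating 2/4/8/16/32/64-bit element consisting of a rotated,
// contiguous run of ones. All-zeros and all-ones are not encodable.
constexpr bool IsLogicalImm64(std::uint64_t value)
{
  for (unsigned size = 2; size <= 64; size *= 2)
  {
    const std::uint64_t mask =
        size == 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << size) - 1;
    const std::uint64_t element = value & mask;

    // The element must repeat across the whole 64-bit value.
    bool repeats = true;
    for (unsigned shift = size; shift < 64; shift += size)
    {
      if (((value >> shift) & mask) != element)
        repeats = false;
    }
    if (!repeats || element == 0 || element == mask)
      continue;

    // Some rotation of the element must have the form 2^k - 1.
    for (unsigned rot = 0; rot < size; ++rot)
    {
      const std::uint64_t rotated =
          rot == 0 ? element
                   : (((element >> rot) | (element << (size - rot))) & mask);
      if ((rotated & (rotated + 1)) == 0)
        return true;
    }
  }
  return false;
}

// Evaluated by the compiler, so there is no runtime cost.
static_assert(IsLogicalImm64(0xFF), "single run of ones");
static_assert(!IsLogicalImm64(0x1234), "more than one run of ones");
```

Because the function is `constexpr`, a call with a constant argument can be folded away entirely, and a `static_assert` turns an unencodable constant into a build error instead of a runtime failure.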
f6ca70d094 Merge pull request #9822 from JosJuice/jitarm64-ps-cmpxx
JitArm64: Implement ps_cmpXX
2021-07-10 19:20:48 +02:00
adbf6d55da JitArm64: Implement ps_cmpXX
2021-07-10 19:08:55 +02:00
4ba4d7cc7d Merge pull request #9878 from JosJuice/jitarm64-addmex
JitArm64: Implement addmex/subfmex
2021-07-10 10:11:20 +02:00
fc60e62622 JitArm64: Implement addmex/subfmex
2021-07-09 16:44:33 +02:00
c9e4489e17 Core/MMU: Fix inverted condition in HostIsInstructionRAMAddress().
2021-07-09 05:48:17 +02:00
cfcc994f6c Merge pull request #9840 from JosJuice/jitarm64-mffsx
JitArm64: Implement mffsx
2021-07-08 14:15:24 +02:00
a390d3f327 Merge pull request #9820 from JosJuice/jitarm64-simplify-addex
JitArm64: Simplify addex/subfex
2021-07-08 13:46:48 +02:00