Fundamentally, all this does is enforce the invariant that we always
translate effective addresses based on the current BAT registers and
page table before we do anything else with them.
This change can be logically divided into three parts. The first part is
creating a table to represent the current BAT state, and keeping it up to
date (PowerPC::IBATUpdated, PowerPC::DBATUpdated, etc.). This does
nothing by itself, but it's necessary for the other parts.
The second part (mostly in MMU.cpp) is simply removing all the hardcoded
checks for specific untranslated addresses, and consistently translating
addresses using the current BAT configuration. Very straightforward, but a
lot of code changes because we hardcoded assumptions all over the place.
The third part (mostly in Memmap.cpp) is making the fastmem arena reflect
the current BAT configuration. We do this by redoing the mapping (calling
memmap()) based on the BAT table whenever it changes.
One additional minor change is that translation can fail in two ways:
either the segment is a direct store segment, or page table lookup failed.
The difference doesn't usually matter, but the difference affects cache
instructions, like dcbz.
The gather pipe optimization postpones checking the FIFO until the end
of the current block (or 32 bytes have been written). This is usually
safe, but is not correct across EIEIO instructions.
This is inferred from a block in NBA2K11 which synchronizes the FIFO
by writing a byte to it, executing eieio, and checking if PI_FIFO_WPTR
has changed. This is not currently an issue, but will become an issue
if the gather pipe optimization is applied to more stores.
Specifically, don't make any assumptions about what effective addresses
are used for code, and correctly handle changes to MSR.DR/MSR.IR.
(Split off from dynamic-bat.)
Also fold the check in both functionss into 'slowmem' rather than having
a separate test. (jo.alwaysUseMemFuncs implies jo.memcheck anyway, as
makes sense.)
Specifically, don't make any assumptions about what effective addresses
are used for code, and correctly handle changes to MSR.DR/MSR.IR.
(Split off from dynamic-bat.)
Simplification/reduction of duplicated code. Detect other constant GQR values and inline loads (5-10% speedup) and do direct dispatch to AOT methods for stores.
Fix Frame Advance and FifoPlayer pause/unpause/stop.
CPU::EnableStepping is not atomic but is called from multiple threads
which races and leaves the system in a random state; also instruction
stepping was unstable, m_StepEvent had an almost random value because
of the dual purpose it served which could cause races where CPU::Run
would SingleStep when it was supposed to be sleeping.
FifoPlayer never FinishStateMove()d which was causing it to deadlock.
Rather than partially reimplementing CPU::Run, just use CPUCoreBase
and then call CPU::Run(). More DRY and less likely to have weird bugs
specific to the player (i.e the previous freezing on pause/stop).
Refactor PowerPC::state into CPU since it manages the state of the
CPU Thread which is controlled by CPU, not PowerPC. This simplifies
the architecture somewhat and eliminates races that can be caused by
calling PowerPC state functions directly instead of using CPU's
(because they bypassed the EnableStepping lock).
We compile the blocks as they are executed, so it's common
to link them continuously. We end with calling JMP after every
block, but often just with a distance of 0.
So just emitting NOPs instead also "calls" the next block, but
easier for the CPU.