The only code which touches xfmem is code which writes directly into
uid_data.
All the rest now read their parameters out of uid_data.
I also simplified the lighting code so it always generated seperate
codepaths for alpha and color channels instead of trying to combine
them on the off-chance that the same equation works for all 4 channels.
As modern (post 2008) GPUs generally don't calcualte all 4 channels
in a single vector, this optimisation is pointless. The shader compiler
will undo it during the GLSL/HLSL to IR step.
Bug Fix: The about optimisation was also broken, applying the color light
equation to the alpha light channel instead of the alpha light
euqation. But doesn't look like anything trigged this bug.
- Fixes remaining lighting issues (Mario Tennis, etc)
- Apply same fixes to Software Renderer
- Corrected zero length light direction vector to resolve with normal direction (essentially becomes LIGHTDIF_NONE which was what I was after)
For whatever reason, the hardware doesn't do a full divide by 255, but
instead uses an approximation with shifting, similar to the way it is done
in TEV.