JIT: more optimizing of float ops based on known input characteristics

If both inputs are single-precision floats and the top half of each register
is known to be identical to the bottom half, we can use packed arithmetic
instead of scalar and skip the movddup.
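
The equivalence behind this can be sketched with a small portable model. This is not the JIT's emitter code: `Xmm`, `AddScalar`, `DupLow`, `AddPacked` and `AddSingles` are hypothetical names, and the lane check is done at run time here, whereas the JIT would rely on what it already knows about the registers when compiling the block.

```cpp
#include <array>

// Hypothetical model of a 128-bit register holding two doubles
// (lane 0 = bottom half, lane 1 = top half).
using Xmm = std::array<double, 2>;

// Scalar add, in the spirit of ADDSD: only lane 0 is computed,
// lane 1 is carried over from the first operand.
inline Xmm AddScalar(const Xmm& a, const Xmm& b) { return {a[0] + b[0], a[1]}; }

// Duplicate lane 0 into lane 1, in the spirit of MOVDDUP.
inline Xmm DupLow(const Xmm& a) { return {a[0], a[0]}; }

// Packed add, in the spirit of ADDPD: both lanes are computed.
inline Xmm AddPacked(const Xmm& a, const Xmm& b) { return {a[0] + b[0], a[1] + b[1]}; }

inline bool LanesIdentical(const Xmm& a) { return a[0] == a[1]; }

// If both inputs already have identical halves, one packed op yields the
// same register as scalar op + duplicate. Otherwise fall back to the
// two-step sequence. packed_is_fast stands in for a CPU check (assumption).
inline Xmm AddSingles(const Xmm& a, const Xmm& b, bool packed_is_fast)
{
    if (packed_is_fast && LanesIdentical(a) && LanesIdentical(b))
        return AddPacked(a, b);
    return DupLow(AddScalar(a, b));
}
```

Both paths produce a register whose halves match, so the packed form only saves an instruction; it never changes the result.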

Packed arithmetic is slower on a few rather old CPUs, as well as on Atom and
Silvermont, so detect Atom and disable the optimization in that case.
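
One way such a check could look, as a rough sketch: decode the family and model from the EAX value returned by CPUID leaf 1 and compare against known Atom model numbers. `DecodeSignature` and `LooksLikeAtom` are hypothetical helpers, and the model list is illustrative and an assumption, not necessarily the list this commit uses.

```cpp
#include <cstdint>

struct CpuSignature
{
    std::uint32_t family;
    std::uint32_t model;
};

// Decode the display family/model from CPUID leaf 1 EAX.
// The extended model applies to families 6 and 15; the extended
// family is added only for family 15.
inline CpuSignature DecodeSignature(std::uint32_t eax)
{
    std::uint32_t family = (eax >> 8) & 0xF;
    std::uint32_t model = (eax >> 4) & 0xF;
    const std::uint32_t ext_model = (eax >> 16) & 0xF;
    const std::uint32_t ext_family = (eax >> 20) & 0xFF;

    if (family == 0x6 || family == 0xF)
        model |= ext_model << 4;
    if (family == 0xF)
        family += ext_family;
    return {family, model};
}

// Illustrative (assumed) list of family-6 Atom models, covering the
// Bonnell/Saltwell and Silvermont generations.
inline bool LooksLikeAtom(const CpuSignature& sig)
{
    if (sig.family != 6)
        return false;
    switch (sig.model)
    {
    case 0x1C: case 0x26: case 0x27: case 0x35: case 0x36:  // Bonnell/Saltwell
    case 0x37: case 0x4A: case 0x4D: case 0x5A: case 0x5D:  // Silvermont
        return true;
    default:
        return false;
    }
}
```

The result would be stored once at startup in a flag such as `bAtom`, so the JIT can consult it cheaply when choosing between the packed and scalar forms.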

Also avoid PPC_FP on stores if we know that the output came from a float op.
Fiora
2014-10-11 14:22:44 -07:00
parent 4e0591cdf1
commit 72c96c20d3
9 changed files with 154 additions and 39 deletions

@@ -50,10 +50,10 @@ struct CPUInfo
bool bMOVBE;
// This flag indicates that the hardware supports some mode
// in which denormal inputs _and_ outputs are automatically set to (signed) zero.
// TODO: ARM
bool bFlushToZero;
bool bLAHFSAHF64;
bool bLongMode;
bool bAtom;
// ARM specific CPUInfo
bool bSwp;