JIT: more optimizing of float ops based on known input characteristics

If both inputs are single-precision floats and the top half of each register is known to be identical to the bottom half, we can use packed arithmetic instead of scalar and skip the movddup. Packed arithmetic is slower on a few rather old CPUs, as well as on Atom and Silvermont, so detect Atom and disable the optimization in that case. Also avoid PPC_FP on stores if we know that the output came from a float op.
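
The decision described above can be sketched in portable C++. This is only an illustrative model, not the JIT code from this commit: `Pair`, `CpuInfoSketch`, `AddKind`, `ChooseAdd`, `AddScalarThenDup` and `AddPacked` are hypothetical names, and only the `bAtom` flag mirrors a field that the diff actually adds. `Pair` stands in for a 128-bit register holding two doubles.

```cpp
// Hypothetical model of a 128-bit register holding two doubles.
struct Pair { double lo, hi; };

// Hypothetical stand-in for CPUInfo; only bAtom appears in the diff.
struct CpuInfoSketch { bool bAtom; };

enum class AddKind { ScalarThenDup, Packed };

// Pick the packed form only when both inputs are known to have
// identical halves and the CPU is not one where packed ops are slow.
inline AddKind ChooseAdd(bool a_dup, bool b_dup, const CpuInfoSketch& cpu)
{
	if (a_dup && b_dup && !cpu.bAtom)
		return AddKind::Packed;
	return AddKind::ScalarThenDup;
}

// Scalar path: add the low halves, then copy the result to the
// high half (the copy models the movddup).
inline Pair AddScalarThenDup(Pair a, Pair b)
{
	double r = a.lo + b.lo;
	return Pair{r, r};
}

// Packed path: add both halves at once; no duplication step needed.
inline Pair AddPacked(Pair a, Pair b)
{
	return Pair{a.lo + b.lo, a.hi + b.hi};
}
```

When both inputs already have identical halves, the two paths produce the same register contents, which is why the duplication step can be dropped in that case.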
2025-09-13 06:52:58 -06:00 · 2014-10-11 14:22:44 -07:00
parent 4e0591cdf1
commit 72c96c20d3
9 changed files with 154 additions and 39 deletions
--- a/Source/Core/Common/CPUDetect.h
+++ b/Source/Core/Common/CPUDetect.h
@@ -50,10 +50,10 @@ struct CPUInfo
 	bool bMOVBE;
 	// This flag indicates that the hardware supports some mode
 	// in which denormal inputs _and_ outputs are automatically set to (signed) zero.
-	// TODO: ARM
 	bool bFlushToZero;
 	bool bLAHFSAHF64;
 	bool bLongMode;
+	bool bAtom;

 	// ARM specific CPUInfo
 	bool bSwp;