Commit Graph

70 Commits

Author SHA1 Message Date
degasus
304e601ad3 JitArm64: Reimplement aarch64 cycle counters.
CNTVCT_EL0 is force-enabled on all linux plattforms.
Windows is untested, but as this is the best way to get *any* low
overhead performance counters, they likely use it as well.
2017-09-02 13:24:37 +02:00
degasus
b00c60618b JitArm64: Fix rlwinmx.
Seems like I was wrong that ANDI2R doesn't require a temporary register here.
There is *one* case when the mask won't fit in the ARM AND instruction:
mask = 0xFFFFFFFF
But let's just use MOV instead of AND here for this case...
2017-08-22 08:47:43 +02:00
Markus Wick
d78009877b JitArm64: Fix LSL/LSR/ROR/ASR wrappers.
The other method has a latency of 2 cycles. This also improves the
throughput a lot.
2017-08-12 00:00:41 +02:00
Tillmann Karras
c54c49714d Arm64Emitter: add FRECPE 2017-05-03 08:02:35 +01:00
Michael Maltese
3d7bace9da Arm64Emitter: extract lambda to AddImmediate()
Fixes warning:

```
Source/Core/Common/Arm64Emitter.cpp:4108:31: error: declaration shadows a local variable [-Werror,-Wshadow]
    auto addi = [this](ARM64Reg Rd, ARM64Reg Rn, u64 imm, bool shift, bool negative, bool flags) {
                                ^
  /var/lib/buildbot/slave/pr-android/build/Source/Core/Common/Arm64Emitter.cpp:4105:46: note: previous declaration is here
  void ARM64XEmitter::ADDI2R_internal(ARM64Reg Rd, ARM64Reg Rn, u64 imm, bool negative, bool flags,
                                               ^
```
2017-03-25 14:21:19 -07:00
Michael Maltese
c58ba93503 Arm64: Use PRIi64/PRIx64 for printf 2017-03-25 14:20:44 -07:00
degasus
6aa54a029e JitArm64: Optimize GPR register push/pop. 2017-02-11 00:59:12 +01:00
BhaaL
23d99f2f2c specify custom brace style to fix unions
BreakBeforeBraces: Allman apparently includes all styles,
except for AfterUnion (which is false) when using clang-format -dump-config
2017-01-05 12:55:13 +01:00
Léo Lam
31ccfffd38 Common: Add alignment header
Gets rid of duplicated alignment code.
2016-12-06 20:33:53 +01:00
degasus
8ad98d0046 ArmEmitter: Merge AddI2R helpers. 2016-10-27 19:19:06 +02:00
degasus
694e9b4132 JitArm64: ADDI2R optimizations 2. 2016-10-27 19:19:06 +02:00
degasus
1df694626d JitArm64: Optimize addi2r & subi2r. 2016-10-26 21:54:13 +02:00
degasus
df250b84cc JitArm64: Avoid MOVI2R is possible.
Just use all kind of ADDI2R, SUBI2R, ...
They have some optimizations internally.
2016-10-26 21:54:09 +02:00
degasus
7c9bba2213 Arm64Emitter: Fix std::array initializer. 2016-09-26 22:17:25 +02:00
Bernhard Urban
976da3707a arm64: add comment about data cache flushing 2016-09-10 08:05:16 +02:00
Bernhard Urban
fff8221b63 arm64: fixes around icache flushing 2016-09-10 02:31:07 +02:00
Pierre Bourdon
3570c7f03a Reformat all the things. Have fun with merge conflicts. 2016-06-24 10:43:46 +02:00
degasus
9ed465f4ac JitArm64: Implement mulhwx 2016-03-04 22:51:46 +01:00
mathieui
f15ffda5a7 Correct ampersands as well 2016-01-21 21:27:56 +01:00
mathieui
3e283ea9f1 More asterisks 2016-01-21 21:16:51 +01:00
mathieui
78aa398e7c Common: asterisks go against the type name
not the variable name
2016-01-21 20:46:25 +01:00
Lioncash
2630752ffe Arm64Emitter: Get rid of a pointer cast 2015-10-22 15:32:11 -04:00
Lioncash
018c85c248 Arm64Emitter: Mark trivial functions as constexpr 2015-10-22 15:22:38 -04:00
Lioncash
19ac565e0d Common: Move asserts to their own header 2015-09-26 18:51:27 -04:00
Ryan Houdek
2ad26ab3e9 [AArch64] Fix Test&Branch to relative location instructions.
Wasn't masking by the size of the offset encoding so negative values were killing the instruction
Missed commiting this in my integer gatherpipe PR.
Fixes crashing on AArch64.
2015-09-07 13:38:58 -05:00
Ryan Houdek
d003934b8a Merge pull request #2929 from Sonicadvance1/aarch64_optimize_gpr_flush
Aarch64 optimize gpr flush
2015-08-31 10:55:45 -05:00
Ryan Houdek
f2c17436ab [AArch64] Fix issue in emitter.
Loadstore pairs support only signed offsets, not unsigned.
2015-08-30 23:05:59 -05:00
Ryan Houdek
b907576510 [AArch64] Support profiling by cycle counters if they are available to EL0 2015-08-30 10:25:16 -05:00
Ryan Houdek
4fa23abbe1 [AArch64] Implement MOVI and ORR(imm) in the NEON emitter. 2015-08-23 15:34:53 -05:00
degasus
9bfff0d461 JitArm64: Fix jit clearing
We have to reset m_lastCacheFlushEnd on clearing.
2015-08-15 11:41:01 +02:00
Lioncash
144ea9f4aa Arm64Emitter: Fix encoding of '2-reg misc' variant of FCMEQ 2015-08-10 19:48:36 -04:00
Ryan Houdek
922d476dab [AArch64] Fix FCMGE instruction encoding.
Fixes a crash when ps_sel is used (PSO 1&2 intro movies).
2015-08-09 14:54:55 -05:00
degasus
b8dd68beef JitArm64: Far Code Cache 2015-07-12 09:41:32 +02:00
Lioncash
d09d59007a Arm64Emitter: Add a missing const specifier for an array table 2015-07-02 11:09:44 -04:00
Ryan Houdek
afc3d30f5c [AArch64] Implement BFI & UBFIZ in the emitter.
Also fixes a bug in the UBFX instruction emitter. Naughty Naughty PPSSPP, not testing emitter functions you add.
2015-06-29 19:00:22 -05:00
Ryan Houdek
5dc148159f [AArch64] Implement {U, S}QXTN{,2}
Also split out XTN to XTN and XTN2.
2015-06-13 23:16:17 -05:00
Lioncash
74b359e390 Arm64Emitter: Remove unused variable from EncodeLoadStoreRegisterOffset 2015-06-13 14:27:15 -04:00
Ryan Houdek
3d2b116323 [AArch64] Implement a couple instructions in the emitter.
Implements LD2R.
Implements LD1R/LD2R with post-indexing support.
Implements vector min/max instructions.
2015-06-09 18:10:56 -05:00
Ryan Houdek
8ae12d8005 [AArch64] Add ASIMD LDR/STR with register offset 2015-06-07 19:53:05 -05:00
Ryan Houdek
05b72c5d31 [AArch64] Upstream PPSSPP's emitter changes.
Requires a minor change to in the JIT to make sure everything still works.
2015-06-07 19:50:21 -05:00
Tillmann Karras
30ebb2459e Set copyright year to when a file was created 2015-05-25 13:22:31 +02:00
Tillmann Karras
cefcb0ace9 Update license headers to GPLv2+ 2015-05-25 13:22:31 +02:00
Ryan Houdek
f6511c3ba5 [AArch64] Add an assert to SMOV in the emitter.
SMOV doesn't have an encoding for moving a 32bit element to a 32bit GPR.
One should use UMOV if they want that.
2015-03-08 12:29:45 -05:00
Ryan Houdek
fbdee7b15f [AArch64] Handle FPR island registers in a less dumb way. 2015-03-03 00:30:05 -06:00
Ryan Houdek
f1a9db9bdc [AArch64] Stop violating the AAPCS64 so much. 2015-03-02 11:21:15 -06:00
Ryan Houdek
fad46729b0 [AArch64] Implemented paired pushing/popping for the VFP.
A bit more efficient if we are only pushing two VFP registers.
We can probably be a bit more efficient in the future by mixing paired loadstores in to the other paths as well.
2015-03-02 06:27:47 -06:00
Ryan Houdek
39e357d62d [AArch64] Implement VFP loadstore paired in the emitter. 2015-03-02 06:27:17 -06:00
Ryan Houdek
8b8310d28c [AArch64] Optimize FPR pushing and popping.
Previously on FPR pushing and popping we would do a single STR/LDR per quad FPR we wanted to push/pop.
In most of our cases when we are pushing and popping VFP registers they will be consecutive registers that will save more efficiently using the NEON
loadstores that can do up to four quad registers.
So this can potentially cutting instructions down to ~1/4th the amount of instructions if the registers are all consecutive.

On the Cortex-A57 this is basically just an icache improvement, but on the Nvidia Denver this may be optimized to be more efficient. Either way it's a
win.
2015-03-02 06:27:13 -06:00
Ryan Houdek
120df4c688 [AArch64] Implement loadstore unscaled. 2015-02-16 22:00:43 -06:00
Ryan Houdek
814aaaf538 [AArch64] Implement a couple of emitter instructions.
These will be used with the vertex loader JIT recompiler.
2015-02-13 12:16:06 -06:00