Commit Graph

6280 Commits

Author SHA1 Message Date
a09b9bef8d Merge pull request #2952 from lioncash/constexpr
CommonFuncs: Replace ArraySize define with constexpr equivalent
2015-09-03 22:56:25 -07:00
3f1b488a12 CommonFuncs: Replace ArraySize define with constexpr equivalent 2015-09-03 23:47:14 -04:00
8dd80b8e97 Merge pull request #2943 from booto/vi-enb
VI: Respect DisplayControlRegister ENB bit
2015-09-04 03:50:39 +02:00
4fd060ba11 Core: Use constexpr for default pad and attachment radius 2015-09-03 19:44:42 -04:00
aa7208e270 [windows] Update projects to vs2015. 2015-09-03 04:23:01 -07:00
a1538a30ef Merge pull request #2941 from lioncash/gp
GPFifo: Remove pointer casts
2015-09-03 13:47:26 +12:00
2d224bd3b1 ActionReplay: Remove an alloca call 2015-09-02 17:41:19 -04:00
5797111ef0 JitArm64: Optimize fpr.R() 2015-09-02 22:46:14 +02:00
dfd44730c8 JitArm64: simplify fpr call 2015-09-02 22:46:14 +02:00
28d788ba2c VI: Respect DisplayControlRegister ENB bit
When ENB is set to 0 (default), VI should not
generate clocks, and so shouldn't generate
output.
2015-09-03 04:13:32 +08:00
f32b79e612 GPFifo: Get rid of pointer casts 2015-09-02 15:24:33 -04:00
db98efdc98 GPFifo: Adjust parameter names 2015-09-02 15:20:02 -04:00
ecbb83fa0f Merge pull request #2686 from booto/field-timing
VI: derive field timing from VI registers
2015-09-03 01:09:43 +12:00
3b134497dd Merge pull request #2774 from AdmiralCurtiss/wiimote-extension-reconnect-on-button-press
Wiimote: Extend emulated Wiimote reconnect-on-button-press to attachments.
2015-09-01 18:31:39 +02:00
f6e4a8e680 FifoPlayer: Use VI derived timing, not hardcoded 60Hz 2015-09-01 20:24:42 +08:00
8d6c39a89d VI: Adjust forced-progressive hack per magumagu's suggestion 2015-09-01 20:24:41 +08:00
acc9a74174 VI: Restore forced-progressive hack with option
Bugfix: TargetRefreshRate uses rounded result
NTSC's 59.94 was becoming 59 with integer division.
2015-09-01 20:24:40 +08:00
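The rounding problem described in acc9a74174 above can be shown with a small sketch. The function names and the 60000/1001 representation of NTSC's 59.94 Hz rate are assumptions for illustration, not taken from Dolphin's source.

```python
def refresh_rate_truncated(numerator: int, denominator: int) -> int:
    """Plain integer division: drops the fractional part."""
    return numerator // denominator


def refresh_rate_rounded(numerator: int, denominator: int) -> int:
    """Round to nearest by adding half the denominator before dividing."""
    return (numerator + denominator // 2) // denominator


# NTSC field rate expressed as a ratio (assumed representation): ~59.94 Hz
NTSC_NUM, NTSC_DEN = 60000, 1001

truncated = refresh_rate_truncated(NTSC_NUM, NTSC_DEN)
rounded = refresh_rate_rounded(NTSC_NUM, NTSC_DEN)
```

With these inputs the truncating version yields 59 and the rounding version yields 60, which matches the behaviour the commit message describes.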
480dbb22f2 VI: derive field timing from VI registers 2015-09-01 20:24:40 +08:00
ae0a06a018 [AArch64] Implement dcbz instruction 2015-08-31 15:39:47 -05:00
0f54aa48b4 Merge pull request #2928 from Sonicadvance1/aarch64_improved_singles
[AArch64] Improve floating point single instructions.
2015-08-31 12:00:08 -05:00
bcde1aa8ff [AArch64] Improve floating point single instructions.
Instead of emitting an "INS" instruction after every single-precision instruction to duplicate the bottom 64 bits into the top 64 bits of the register,
create a new FPR register cache type that tracks when a register's lower 64 bits are supposed to be duplicated into the high 64 bits,
without necessarily having the lower bits actually duplicated in the host-side register. This removes inefficient INS instructions from sequential single
float instructions.
In particular, one very single-heavy block in Animal Crossing went from 712 instructions down to 520 instructions (~27% fewer instructions!)
2015-08-31 11:09:17 -05:00
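The idea in bcde1aa8ff above, deferring the duplication until something actually reads the high half, might be modelled roughly as follows. This is a toy instruction counter; the operation kinds `"single"` and `"needs_high"` and the function name are hypothetical, not JitArm64's.

```python
def count_instructions(ops, lazy):
    """Count emitted host instructions for a list of (kind, reg) guest ops.

    kind == "single":     a single-precision op writing `reg`
    kind == "needs_high": an op that reads the high 64 bits of `reg`
    """
    pending_dup = set()  # registers whose high half is only logically duplicated
    emitted = 0
    for kind, reg in ops:
        if kind == "single":
            emitted += 1  # the arithmetic instruction itself
            if lazy:
                pending_dup.add(reg)  # remember the duplication, emit nothing
            else:
                emitted += 1  # eager INS after every single op
        elif kind == "needs_high":
            if reg in pending_dup:
                emitted += 1  # materialise the duplicate with one INS
                pending_dup.discard(reg)
            emitted += 1  # the consuming instruction
    return emitted


# Three back-to-back single ops on one register, then one consumer of the high half.
sequence = [("single", 1), ("single", 1), ("single", 1), ("needs_high", 1)]
eager_count = count_instructions(sequence, lazy=False)
lazy_count = count_instructions(sequence, lazy=True)
```

In this toy sequence the eager scheme emits 7 instructions and the lazy scheme 5, because only one INS is ever needed.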
d003934b8a Merge pull request #2929 from Sonicadvance1/aarch64_optimize_gpr_flush
Aarch64 optimize gpr flush
2015-08-31 10:55:45 -05:00
8bf332cf08 [AArch64] Optimize GPR cache flushing.
If we are flushing multiple sequential guest GPRs then we can store two in a single STP instruction.
Ikaruga does this quite a bit in their blocks where they do an lmw at the very end and then we have to flush them all.
Typically cuts 16 STR instructions down to 8 STP instructions there.
2015-08-30 23:07:12 -05:00
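The pairing described in 8bf332cf08 above can be sketched as a small planner that groups consecutive guest register numbers into pairs. The function name and the tuple format are hypothetical, and the sketch ignores which registers are actually dirty or how they are addressed.

```python
def plan_gpr_flush(guest_regs):
    """Return a list of ("STP", a, b) or ("STR", a) store operations.

    Two guest registers are paired only when their numbers are consecutive.
    """
    regs = sorted(set(guest_regs))
    plan = []
    i = 0
    while i < len(regs):
        if i + 1 < len(regs) and regs[i + 1] == regs[i] + 1:
            plan.append(("STP", regs[i], regs[i + 1]))  # one store, two registers
            i += 2
        else:
            plan.append(("STR", regs[i]))  # no neighbour to pair with
            i += 1
    return plan


# Flushing r16..r31 after an lmw-style block: 16 registers.
full_flush = plan_gpr_flush(range(16, 32))
```

For sixteen sequential registers this plan contains eight STP entries, in line with the "16 STR down to 8 STP" figure in the commit message.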
368867dba0 Merge pull request #2922 from aserna3/SDBlock
Implemented ability to block writes to the SD card
2015-08-31 04:51:50 +12:00
b907576510 [AArch64] Support profiling by cycle counters if they are available to EL0 2015-08-30 10:25:16 -05:00
5110574c1f Merge pull request #2921 from Sonicadvance1/aarch64_optimize_lmw
[AArch64] Optimize lmw.
2015-08-30 10:23:57 -05:00
df19f11cb9 Jit_Util: Add missing override specifiers 2015-08-29 00:30:18 -04:00
db7fe9507e Implemented ability to block writes to the SD card
Renamed variable to be more accurate
2015-08-28 17:32:29 -07:00
8d61706440 [AArch64] Optimize lmw.
This instruction is fairly heavily used by Ikaruga to load a bunch of registers from the stack.
In particular at the start of the second stage is a block that takes up ~20% CPU time that includes a usage of lmw to load half of the guest
registers.

The basic optimization here is changing from a single 32bit LDR to potentially a single 128bit load.
A single 32bit LDR is fairly slow, so we can optimize in a few ways.
If we have four or more registers to load, do a 64bit LDP into two host registers, byteswap, and then move the high 32bits of the host registers
into the correct mapped guest register locations.
If we have two registers to load, then do a 32bit LDP, which will load two guest registers in a single instruction.
Then, if we have only one register left to load, load it as before.

This saves quite a few cycles, since the Cortex-A57 and A72's LDR instruction takes a few cycles.

Each 32bit LDR takes 4 cycles latency, plus 1 cycle for post-index (which typically happens in parallel).
Both the 32bit and 64bit LDP take the same amount of latency.

So we are improving latencies and reducing code bloat here.
2015-08-28 14:40:30 -05:00
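The load-splitting strategy from 8d61706440 above can be sketched as a planner over the registers an lmw has to fill (rD through r31). The operation labels and function name are hypothetical. Handling three leftover registers as a 32bit LDP plus a single LDR is an assumption, since the commit message only spells out the four-or-more, two, and one cases.

```python
def plan_lmw(first_reg, last_reg=31):
    """Return (op, starting_guest_reg, guest_regs_loaded) tuples for lmw."""
    plan = []
    reg = first_reg
    remaining = last_reg - first_reg + 1
    while remaining > 0:
        if remaining >= 4:
            op, count = "LDP64", 4  # two 64bit host regs = four 32bit guest regs
        elif remaining >= 2:
            op, count = "LDP32", 2  # two 32bit host regs = two guest regs
        else:
            op, count = "LDR32", 1  # last odd register, loaded as before
        plan.append((op, reg, count))
        reg += count
        remaining -= count
    return plan


half_the_registers = plan_lmw(16)  # r16..r31
odd_tail = plan_lmw(25)            # r25..r31, seven registers
```

Loading r16 through r31 takes four wide loads in this model instead of sixteen individual 32bit LDRs.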
2c3fa8da28 [AArch64] Fix a bug in the register caches.
This is a bug that crops up if BindToRegister() is called multiple times in a row without an R() function call between them.
How to reproduce the bug:
1) Have a completely filled cache with no host register remaining
2) Call BindToRegister() with different guest registers
3) Don't call R() between the BindToRegister() calls.

This issue typically wouldn't be seen for a couple of reasons. Typically we have /plenty/ of registers in the cache, and in most cases we only call
BindToRegister() once per instruction. In the off chance that it is called multiple times, it wouldn't update the last used counts and would flush the
same register as the previous call to it.
2015-08-28 14:36:14 -05:00
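The reproduction steps in 2c3fa8da28 above can be played through on a toy register cache. This is a simplified model under stated assumptions: the class, the age-counter scheme, and the `reset_on_bind` switch are hypothetical and only illustrate one plausible way a stale "last used" count makes two consecutive binds evict the same host register.

```python
class ToyRegCache:
    def __init__(self, num_host_regs, num_guest_regs, reset_on_bind):
        self.free = list(range(num_host_regs))
        self.host_of = {}                  # guest reg -> host reg
        self.age = [0] * num_guest_regs    # larger = used longer ago
        self.reset_on_bind = reset_on_bind

    def _touch(self, guest):
        self.age = [a + 1 for a in self.age]
        self.age[guest] = 0

    def _evict_most_stale(self):
        victim = max(self.host_of, key=lambda g: self.age[g])
        return self.host_of.pop(victim)

    def bind_to_register(self, guest):
        if guest not in self.host_of:
            host = self.free.pop() if self.free else self._evict_most_stale()
            self.host_of[guest] = host
        if self.reset_on_bind:             # the behaviour the fix is assumed to add
            self._touch(guest)
        return self.host_of[guest]

    def r(self, guest):
        host = self.bind_to_register(guest)
        self._touch(guest)
        return host


def bind_twice(reset_on_bind):
    cache = ToyRegCache(num_host_regs=2, num_guest_regs=4, reset_on_bind=reset_on_bind)
    cache.r(0)
    cache.r(1)                             # step 1: cache is now full
    first = cache.bind_to_register(2)      # step 2: bind different guest regs...
    second = cache.bind_to_register(3)     # step 3: ...with no R() in between
    return first, second, 2 in cache.host_of
```

Without the reset, both binds hand back the same host register and guest register 2 silently loses its binding. With the reset, the second bind evicts a different register.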
d86d5fae9f Merge pull request #2909 from aserna3/DollsAndElves
Implemented .elf and .dol support in gamelist
2015-08-28 14:28:09 -04:00
faedf1bc5c Implemented .elf and .dol support in gamelist
Fixed a TON of structuring, formatting.

removed README.txt files from themes at MaJoR's request

Added platform icon for ELFs/DOLs
2015-08-28 11:10:03 -07:00
e516d4ef59 JitArm64: Implement rlwnmx 2015-08-26 21:59:10 +02:00
99e88a7af7 Merge pull request #2887 from Tilka/swap
Jit64: some byte-swapping changes
2015-08-26 16:43:45 +02:00
eb6ac641be Merge pull request #2906 from Tilka/fpscr
Jit64: fix bugs in the FPSCR instructions
2015-08-26 16:43:28 +02:00
6ec4bdf862 CoreTiming: remove unused functions 2015-08-26 15:40:15 +02:00
0f4861cac2 CoreTiming: make loops easier to read 2015-08-26 14:53:58 +02:00
ca51f1a4f6 [AArch64] Optimize paired registers being used in double operations.
In particular this optimizes the case where a 32bit float is loaded via lfs, and then used in double operations.
This happens very often in Gekko based code because the best way to load a 32bit value as a double is lfs, since it automatically turns into a double value.

There are a few other implications of this in practice as well. For example, if both of the paired registers are loaded via psq_l and then used in double
operations, that case is improved too.
Also, if we implement a double register we've got to be careful to make sure we understand whether it is in the "lower" register or the full 128bit register.
2015-08-26 05:50:04 -05:00
5716d18d10 Merge pull request #2910 from Sonicadvance1/aarch64_regcache_fix
[AArch64] Fix a bug in the register cache.
2015-08-26 08:31:24 +02:00
4f5f29a0fb [AArch64] Fix a bug in the register cache.
If the register was only a lower pair and the full register was needed, then we need to load the high 64bits,
which we weren't doing before.
2015-08-26 01:21:43 -05:00
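The case fixed in 4f5f29a0fb above can be sketched as a small state transition: which 64bit halves still have to be loaded when a cached register is requested. The state names and function are hypothetical, not the identifiers used in the register cache.

```python
from enum import Enum


class FprState(Enum):
    NOT_LOADED = 0
    LOWER = 1   # only the low 64 bits are valid in the host register
    FULL = 2    # all 128 bits are valid


def halves_to_load(state, need_full):
    """Return (halves to load from guest state, resulting cache state)."""
    if state is FprState.FULL:
        return [], FprState.FULL
    if state is FprState.LOWER:
        if need_full:
            # The case the commit describes: the high half must be fetched.
            return ["high"], FprState.FULL
        return [], FprState.LOWER
    # NOT_LOADED
    if need_full:
        return ["low", "high"], FprState.FULL
    return ["low"], FprState.LOWER


upgrade = halves_to_load(FprState.LOWER, need_full=True)
```

A request for the full register while only the lower pair is cached has to come back with the high half in its load list. Returning an empty list there would correspond to the bug.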
43d17cb360 Merge pull request #2904 from Sonicadvance1/aarch64_more_inst
[AArch64] Implement fdivx/fdivsx/mfcr/mtcrf.
2015-08-26 07:48:24 +02:00
ee4a12ffe2 Jit64: some byte-swapping changes 2015-08-26 05:41:18 +02:00
6729a36d8d [AArch64] Set BindToRegister's to_load correctly for double FP ops. 2015-08-25 21:29:27 -05:00
db4f692482 GCMemcard: Clean up memcard logging messages. 2015-08-25 21:55:52 -04:00
ee50a2ef28 Jit64: fix bugs in the FPSCR instructions 2015-08-25 23:48:14 +02:00
bd08c1b01a Merge pull request #2901 from Sonicadvance1/aarch64_stfiwx
[AArch64] Implement stfiwx
2015-08-25 22:47:39 +02:00
24cb650078 Merge pull request #2663 from degasus/dcbx
Jit64: dcbf + dcbi
2015-08-25 12:16:56 +02:00
0666c0750b [AArch64] Implement fdivx/fdivsx/mfcr/mtcrf.
Gets the povray bench to better times than the Wii.
2015-08-24 15:32:19 -05:00
d96be9250c Merge pull request #2899 from Sonicadvance1/aarch64_fctiwzx
[AArch64] Implement fctiwzx
2015-08-24 13:22:27 -05:00
0d92c8fb89 Jit64: Optimize dcbx 2015-08-24 18:33:23 +02:00