Commit Graph

28406 Commits

Author SHA1 Message Date
f25611f388 JitArm64: MultiplyImmediate - Handle 1
Multiplication by one is also trivial. Depending on the registers
involved, either a single MOV or no instructions will be generated.

Before:
0x52800038   mov    w24, #0x1
0x1b1a7f1b   mul    w27, w24, w26

After:
0x2a1a03fb   mov    w27, w26

Before:
0x52800039   mov    w25, #0x1
0x1b1a7f3a   mul    w26, w25, w26

After:
Nothing!
2022-11-01 21:13:45 +01:00
51cb918aa5 JitArm64: MultiplyImmediate - Handle 0
Multiplication by zero always gives zero.

Before:
0x52800019   mov    w25, #0x0
0x1b197f5b   mul    w27, w26, w25

After:
Nothing!
2022-11-01 21:13:38 +01:00
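The two special cases above (multiplication by one and by zero) can be sketched roughly as follows. This is an illustrative model only, not Dolphin's code: `MulResult`, `MultiplyImmediateSketch`, and the textual "emitted" instructions are hypothetical, and the sketch assumes the caller can record a known-constant result without emitting any instruction.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical result type; none of these names come from Dolphin.
struct MulResult
{
  bool handled = false;                   // true if no MUL is required
  std::optional<std::uint32_t> constant;  // set when the result is a known constant
  std::vector<std::string> emitted;       // textual stand-in for emitted instructions
};

// Models "dest = src * imm" for 32-bit registers identified by number.
MulResult MultiplyImmediateSketch(std::uint32_t imm, int dest, int src)
{
  assert(dest >= 0 && src >= 0);
  MulResult r;
  if (imm == 0)
  {
    // x * 0 == 0: assume the constant can be tracked without emitting code.
    r.handled = true;
    r.constant = 0;
  }
  else if (imm == 1)
  {
    // x * 1 == x: a MOV is only needed when source and destination differ.
    r.handled = true;
    if (dest != src)
      r.emitted.push_back("mov w" + std::to_string(dest) + ", w" + std::to_string(src));
  }
  // Any other immediate is left to the general MUL path (not modelled here).
  return r;
}
```

Calling `MultiplyImmediateSketch(1, 27, 26)` records a single `mov w27, w26`, while `MultiplyImmediateSketch(1, 26, 26)` records nothing, matching the two "After" listings for the multiply-by-one commit.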
080513284c JitArm64: mullwx - Use MultiplyImmediate 2022-11-01 19:05:33 +01:00
53a8cd1563 JitArm64: mulli - Use MultiplyImmediate 2022-11-01 19:04:50 +01:00
4aa0c0133a JitArm64: Introduce MultiplyImmediate
Add a new function that will handle all the special cases regarding
multiplication. It does nothing for now, but will be expanded in
follow-up commits.
2022-11-01 19:01:38 +01:00
d0de68c41b JitArm64: cmp - Optimize general case
We can merge an SXTW with the SUB, eliminating one instruction. In
addition, it is no longer necessary to allocate a temporary register,
reducing register pressure.

Before:
0x93407f59   sxtw   x25, w26
0x93407ebb   sxtw   x27, w21
0xcb1b033b   sub    x27, x25, x27

After:
0x93407f5b   sxtw   x27, w26
0xcb35c37b   sub    x27, x27, w21, sxtw
2022-11-01 12:21:24 +01:00
ae6ce1df48 Arm64Emitter: Add ArithOption with ExtendSpecifier
ARM64 can perform various types of sign and zero extension on a
register value before using it. The Arm64Emitter already had support for
this, but it was kinda hidden away.

This commit exposes the functionality by making the ExtendSpecifier enum
available everywhere and adding a new ArithOption constructor.
2022-11-01 12:15:56 +01:00
82f22cdfa1 JitArm64: cmp - Optimize a == -1 case
By explicitly handling this, we can avoid materializing -1 in a
register and generate more efficient code by taking advantage of -x ==
~x + 1.

Before:
0x12800015   mov    w21, #-0x1
0x93407eb9   sxtw   x25, w21
0x93407ef8   sxtw   x24, w23
0xcb180338   sub    x24, x25, x24

After:
0x2a3703f8   mvn    w24, w23
0x93407f18   sxtw   x24, w24
2022-11-01 12:00:32 +01:00
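The identity used in the a == -1 case can be checked in isolation. Rearranging -x == ~x + 1 gives -1 - x == ~x, so the subtraction collapses to a bitwise NOT followed by a sign extension. The sketch below is a plain C++ model of the two instruction sequences; the function names are hypothetical and only the arithmetic is taken from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// "Before": sign-extend both operands to 64 bits, then subtract, with a == -1.
constexpr std::int64_t CmpMinusOneReference(std::int32_t b)
{
  return std::int64_t{-1} - std::int64_t{b};
}

// "After": bitwise NOT in 32 bits (MVN), then sign-extend to 64 bits (SXTW).
constexpr std::int64_t CmpMinusOneOptimized(std::int32_t b)
{
  return std::int64_t{static_cast<std::int32_t>(~b)};
}

// The two forms agree at the boundaries of the 32-bit range.
static_assert(CmpMinusOneReference(0) == CmpMinusOneOptimized(0));
static_assert(CmpMinusOneReference(INT32_MIN) == CmpMinusOneOptimized(INT32_MIN));
static_assert(CmpMinusOneReference(INT32_MAX) == CmpMinusOneOptimized(INT32_MAX));

// Runtime check for an arbitrary operand.
inline void CheckIdentity(std::int32_t b)
{
  assert(CmpMinusOneReference(b) == CmpMinusOneOptimized(b));
}
```

Because ~b always fits in 32 bits, the sign extension after the NOT cannot lose information, which is why the two sequences agree for every 32-bit operand.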
592ba31e22 JitArm64: cmp - Optimize a == 0 case
By explicitly handling this, we can avoid materializing zero in a
register and generate more efficient code altogether.

Before:
0x52800016   mov    w22, #0x0
0xb94093b5   ldr    w21, [x29, #0x90]
0x93407ed7   sxtw   x23, w22
0x93407eb9   sxtw   x25, w21
0xcb1902f9   sub    x25, x23, x25

After:
0xb94093b7   ldr    w23, [x29, #0x90]
0x4b1703f9   neg    w25, w23
0x93407f39   sxtw   x25, w25
2022-11-01 11:52:00 +01:00
f5e7e70cc5 JitArm64: cmp - Refactor 2022-11-01 11:47:17 +01:00
dbb8f588c7 JitArm64: cmpl - Optimize a == 0 case
By explicitly handling this, we can avoid materializing zero in a
register.

Before:
0x52800019   mov    w25, #0x0
0xb94087b6   ldr    w22, [x29, #0x84]
0xcb16033b   sub    x27, x25, x22

After:
0xb94087b9   ldr    w25, [x29, #0x84]
0xcb1903fb   neg    x27, x25
2022-11-01 11:27:45 +01:00
7cd08fde75 Updater: Add/clarify error messages 2022-10-31 23:36:07 -07:00
2808db7f2f FileUtil: Return success bool from CopyDir 2022-10-31 23:33:02 -07:00
111e965c7e Revert "MacUpdater: test that os version check is working" 2022-10-31 18:53:22 -07:00
b182abe0ae Merge pull request #11234 from shuffle2/updater
MacUpdater: test that os version check is working
2022-11-01 01:28:20 +00:00
22eb7e6645 OGL: use already known object label lengths
Passing -1 means the driver has to call strlen().
2022-11-01 01:10:03 +00:00
4b8fe959d4 OGL: fix compute shader labels
This fixes GL_INVALID_VALUE errors when using GPU texture decoding.
2022-11-01 01:04:46 +00:00
f5fecaf964 VideoBackends:Vulkan: Fix 0 size descriptor pools
[ VUID-VkDescriptorPoolCreateInfo-maxSets-00301 ] Object 0:
handle = 0x7f1b8d3cde70, type = VK_OBJECT_TYPE_DEVICE; |
MessageID = 0xa170e236 | vkCreateDescriptorPool():
pCreateInfo->maxSets is not greater than 0.
The Vulkan spec states: maxSets must be greater than 0
2022-10-31 22:41:16 +01:00
7cc8e37aee MacUpdater: test that os version check is working
Adds a key to Info.plist with default value to test
Updater - this commit is intended to be reverted
2022-10-30 13:19:43 -07:00
969309c457 Merge pull request #11220 from shuffle2/macversion
MacUpdater: check os version
2022-10-30 15:19:55 -04:00
089886a6f8 MacUpdater: check os version 2022-10-30 12:04:57 -07:00
f277a921a9 Merge pull request #11231 from shuffle2/updater
windows: Rename: use std::filesystem::rename for posix behavior
2022-10-30 13:32:10 -04:00
950e1f94dc Merge pull request #11185 from TryTwo/PR_MemoryWidget_Address_Input_History
MemoryWidget: Make search address a combobox that holds address history.
2022-10-30 04:21:14 -04:00
053320b7cf MemoryWidget: Make search address a combobox that holds address history.
Always update the combobox when a new target address is sent.
2022-10-29 22:41:30 -07:00
68875dc06b MacUpdater: add version info to Updater.app too 2022-10-29 20:32:59 -07:00
836bc74b2d windows: Rename: use std::filesystem::rename for posix behavior 2022-10-29 19:57:26 -07:00
fe559f3ed3 VideoCommon/Statistics: Require semicolons after statistics macros
This is clearer and reduces IntelliSense problems.
2022-10-29 15:39:41 -07:00
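One common way to make a macro require a trailing semicolon is the `do { ... } while (false)` idiom. The log does not say which technique this commit uses, so the following is only a sketch of the general idea; `StatsSketch`, `INCSTAT_SKETCH`, and `CountDraws` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical counter; the real statistics structure is not shown in the log.
struct StatsSketch
{
  int draws = 0;
};

// Wrapping the body in do { ... } while (false) turns the macro into a single
// statement that is only complete once the caller adds a semicolon.
#define INCSTAT_SKETCH(field) \
  do                          \
  {                           \
    ++(field);                \
  } while (false)

inline int CountDraws(int n)
{
  assert(n >= 0);
  StatsSketch stats;
  for (int i = 0; i < n; ++i)
    INCSTAT_SKETCH(stats.draws);  // omitting this semicolon is a compile error
  return stats.draws;
}
```

With this shape, a macro call reads like an ordinary function call, which tends to be easier for both readers and tooling to parse.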
0628794cb6 Merge pull request #11226 from K0bin/d3d12-fix
VideoBackends:D3D12: Fix hang in Twilight Princess
2022-10-30 00:24:16 +02:00
a07ee729e5 VideoBackends:D3D12: Defer binding framebuffer in SetAndDiscardFramebuffer
BindFramebuffer depends on the pipeline which might not be set yet.
That's why the framebuffer dirty flag exists in the first place.
I assume BindFramebuffer was called directly here, in order to handle
the texture state transitions necessary for DiscardResource.
The state is tracked anyway, so we can just issue those transitions there
too and defer binding the actual framebuffer.

Fixes an issue in Zelda Twilight Princess with EFB depth peeks.
Dolphin would bind a framebuffer which doesn't have an integer format
descriptor for the color target before binding the new pipeline.
So it would accidentally use the 0 descriptor.

Debug layer error:
D3D12 ERROR: ID3D12CommandList::OMSetRenderTargets:
Specified CPU descriptor handle ptr=0x0000000000000000 does not refer to
a location in a descriptor heap. pRenderTargetDescriptors[0] is the issue.
[ EXECUTION ERROR #646: INVALID_DESCRIPTOR_HANDLE]
2022-10-29 23:41:32 +02:00
a6aa651291 VideoBackends:D3D12: Use COMMON as initial state for default heap buffer
Fixes the following error in the D3D12 debug layer:

D3D12 WARNING: ID3D12Device::CreateCommittedResource:
Ignoring InitialState D3D12_RESOURCE_STATE_UNORDERED_ACCESS.
Buffers are effectively created in state D3D12_RESOURCE_STATE_COMMON.
[ STATE_CREATION WARNING #1328: CREATERESOURCE_STATE_IGNORED]
2022-10-29 23:39:32 +02:00
22fecb41fc VideoBackends:D3D12: Don't query GPU descriptor handle for non-shader visible heap
Fixes the following error in the D3D12 debug layer:

D3D12 ERROR: ID3D12DescriptorHeap::GetGPUDescriptorHandleForHeapStart:
GetGPUDescriptorHandleForHeapStart is invalid to call on a descriptor
heap that does not have DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE set.
If the heap is not supposed to be shader visible, then
GetCPUDescriptorHandleForHeapStart would be the appropriate method
to call. That call is valid both for shader visible and non shader
visible descriptor heaps.
[ STATE_GETTING ERROR #1315: DESCRIPTOR_HEAP_NOT_SHADER_VISIBLE]
2022-10-29 23:39:27 +02:00
cacdd18ca0 VolumeVerifier: fix bogus "serial/version missing" error
When searching for a disc where the revision doesn't match any disc in
the datfile, the loop would never get to the part where serials_exist is
set to true, leading to a bogus error message.
2022-10-29 21:32:57 +01:00
6dcf8a6fc9 Merge pull request #11201 from JoshuaMKW/fix-instruction-patches
MemoryPatches: Fix instruction patches
2022-10-29 12:34:00 -07:00
431ee1c48a JitArm64: Improve register handling for MMIO loadstores
Because of the previous commit, `regs_in_use` must not include `dest_reg`
when calling MMIOLoadToReg. There are also some other registers we can
skip including in regs_in_use just for efficiency's sake.

The `addr_reg_set = false` statements that I've added in this commit are
technically redundant – if `mmio_address` is non-zero then `addr_reg_set`
is already false – but it's just a coincidence that that's the case.
2022-10-29 14:16:53 +02:00
0660f12da4 JitArm64: Move MMIO handler result before popping stack
Otherwise we might throw the result away.
Fixes https://bugs.dolphin-emu.org/issues/13083.
2022-10-29 14:16:43 +02:00
ea3e133200 VideoCommon: call texture load graphics mod hook when Dolphin loads a texture 2022-10-28 19:24:43 -05:00
0e1ffe009a VideoBackends: fix d3d12 subresource calculation 2022-10-28 19:07:08 -05:00
8efd7833e5 Merge pull request #11150 from jordan-woyak/all-devices-less-confusing
DolphinQt: Make "All Devices" mapping hopefully less confusing.
2022-10-29 00:53:19 +02:00
8001535d12 Merge pull request #11211 from jordan-woyak/fix-focus-resume-after-manual-pause
DolphinQt: Fix window focus from unpausing after a manual pause.
2022-10-29 00:35:50 +02:00
e2f4400f49 Make SetPatch responsible for overwriting old patches 2022-10-26 22:46:49 -05:00
2f3805e1b4 GraphicsSettings: Remove unused FreelookControlType enum forward declaration 2022-10-26 16:23:13 -07:00
4fc05dd025 DolphinQt: Fix window focus from unpausing after a manual pause. 2022-10-25 19:39:41 -05:00
581a575042 VertexLoader: Remove "too many initializer values" workaround functions
I originally added these in 2b1d1038a6, for both the TPipelineFunction and the size. The size was moved into the header in fdcd2b7d00 (making the size functions obsolete), but it seems that the functions themselves are no longer needed now.

I think I didn't use this approach before because it would have required ComponentFormatTable and ComponentCountRow to be templated, which would end up resulting in lines that were too long and thus wrapped in awkward places. (I *think* they didn't get inferred properly.) Now that we only need TPipelineFunction, the templating is not needed, and this ends up being a more readable version of the version with the wrapper functions.
2022-10-25 15:29:09 -07:00
027e10460a Merge pull request #10977 from tellowkrinkle/FixBackendMultithreading
VideoBackends:Vulkan: Improve backend multithreading
2022-10-25 04:14:01 -04:00
9ef7a3b44c Merge pull request #11207 from Pokechu22/invalid-normal-count
VideoCommon: Treat invalid normal count as NormalTangentBinormal
2022-10-25 03:17:19 -04:00
574939b683 VideoCommon: Treat invalid normal count as NormalTangentBinormal
See https://bugs.dolphin-emu.org/issues/13070.
2022-10-24 22:36:43 -07:00
b66793194e Merge pull request #11028 from tellowkrinkle/MetalFixes
Various Metal renderer improvements
2022-10-24 15:22:37 -04:00
4787b25a7f Merge pull request #10741 from Pokechu22/audio-dma-one-block-at-a-time
DSP: Copy audio dma samples one block at a time
2022-10-24 01:43:22 -04:00
2594447c25 Have UnsetPatch only unset the argument address 2022-10-23 18:42:34 -05:00
e10b3308c2 Fix patch corruption using find_if instead of remove_if 2022-10-23 18:41:15 -05:00
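For context on the last two entries: `std::remove_if` only shifts the kept elements forward and leaves the tail of the container in a valid but unspecified state, so using it without erasing that tail can leave stale entries behind. The log does not show the original code, so the sketch below only illustrates the `std::find_if` approach for removing a single entry by address; `PatchSketch` and `UnsetPatchSketch` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical patch record; the real type is not shown in the log.
struct PatchSketch
{
  std::uint32_t address;
  std::uint32_t value;
};

// Erases the first patch at `address`, if any. Returns true when one was removed.
bool UnsetPatchSketch(std::vector<PatchSketch>& patches, std::uint32_t address)
{
  const auto it = std::find_if(patches.begin(), patches.end(),
                               [address](const PatchSketch& p) { return p.address == address; });
  if (it == patches.end())
    return false;  // nothing to remove; the container is left untouched

  assert(!patches.empty());
  patches.erase(it);  // removes exactly one element and keeps the others in order
  return true;
}
```

Removing one address from a three-entry vector leaves the other two entries intact and in their original order, and asking for an address that is not present changes nothing.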