dolphin

mirror of https://github.com/dolphin-emu/dolphin.git synced 2025-09-23 11:52:42 -06:00

Author	SHA1	Message	Date
degasus	c375111076	Options: merge SCoreStartupParameter into SConfig	2015-06-12 19:07:45 +02:00
Tillmann Karras	30ebb2459e	Set copyright year to when a file was created	2015-05-25 13:22:31 +02:00
Tillmann Karras	cefcb0ace9	Update license headers to GPLv2+	2015-05-25 13:22:31 +02:00
Lioncash	a11bbe6fea	PPCTables: Remove FL_OUT_S. This is unused, and since it had the same value as FL_OUT_D, it was unnecessarily setting the rS register as an output, even on instructions that only have FL_OUT_D set.	2015-03-03 16:23:28 -05:00
Lioncash	7c244766dc	Interpreter: Use correct destination for psq_l, psq_lx, psq_lu, and psq_lux. Gekko manual defines these as storing to rD, not rS. Also removed FL_OUT_FLOAT_S, since nothing uses it now.	2015-02-21 21:20:41 -05:00
magumagu	985087c7e2	Make all unknown opcodes behave consistently. Consistently fall back to the interpreter for unknown instructions, and make sure GetOpInfo() always returns a non-null pointer.	2015-02-11 22:18:33 -08:00
magumagu	e136c8a066	PowerPC: misc cleanup.	2015-02-11 13:56:36 -08:00
magumagu	ac54c6a4e2	Make address translation respect the CPU translation mode. The PowerPC CPU has bits in MSR (DR and IR) which control whether addresses are translated. We should respect these instead of mixing physical addresses and translated addresses into the same address space. This is mostly mass-renaming calls to memory accesses APIs from places which expect address translation to use a different version from those which do not expect address translation. This does very little on its own, but it's the first step to a correct BAT implementation.	2015-02-11 13:56:22 -08:00
magumagu	0f96a0104e	Merge pull request #1752 from Buddybenj/clean-up Clean Up	2015-02-10 11:39:14 -08:00
Lioncash	e07679114b	Use emplace_* functions where in-place construction is preferable	2015-02-04 11:39:08 -05:00
Benjamin Przybocki	4f324ad742	Clean Up	2015-01-24 17:10:21 -06:00
CarlKenner	0ab1517134	Skip zeroes that sometimes pad function to 16 byte boundary (eg. Donkey Kong Country Returns). This fixes function detection in the debugger, and prevents functions showing up as four bytes inside another function.	2015-01-22 02:00:35 +10:30
Fiora	e8cfcd3aeb	JIT: make instruction merging generic. Now it should be easier to merge more than 2-instruction-long sequences. Also correct some minor inconsistencies in behavior between instruction merging cases.	2015-01-11 09:11:18 -08:00
Fiora	8237004448	JIT: optimize for the common case of unquantized psq_l/st. Optimistically assume used GQRs are 0 in blocks that only use one GQR, and bail at the start of the block and recompile if that assumption fails. Many games use almost entirely unquantized stores (e.g. Rebel Strike, Sonic Colors), so this will likely be a big performance improvement across the board for games with heavy use of paired singles.	2015-01-10 14:14:43 -08:00
Dolphin Bot	89b7f1057f	Merge pull request #1804 from FioraAeterna/fastermmu2_master MMU: various improvements, bugfixes, optimizations	2015-01-07 00:49:58 +01:00
Fiora	8f7c799794	JIT: catch illegal instruction errors. Still crash, but at least give a message informing the world that it happened.	2015-01-06 11:06:49 -08:00
Fiora	e85f0ff179	MMU: fix problems with blocks that cross vmem page boundaries. In rare cases, this can result in a violation of the JIT block cache constraint that blocks must end in the same place. This can cause instability and lockups, due to blocks not being invalidated properly.	2015-01-05 10:46:04 -08:00
Fiora	c2ed29fe0d	MemmapFunctions: various MMU optimizations. Small TLB lookup optimizations: this is the hot path for MMU code, so try to make it better. Template the TLB lookup functions based on the lookup type (opcode, data, no exception). Clean up the Read/Write functions and make them more consistent. Add an early-exit path for MMU accesses to ReadFromHardware/WriteToHardware.	2015-01-05 10:34:55 -08:00
Fiora	e7a49ae5f3	Eliminate some spammy log messages in MMU mode. dcbz: just don't use GetPointer, that can't be right anyways. ppcanalyst: don't print "instruction hex 0" messages in MMU mode, where ISIs are expected.	2014-12-21 12:41:44 -08:00
skidau	693f413364	Updated C bit on TLB cache hits. Added TLB state to the save state file.	2014-12-05 14:29:13 +11:00
Ryan Houdek	08660c89ad	Fix register usage detection in PPCAnalyst. lmw/stmw weren't properly setting input and output registers since they use multiple registers. dcbz was just missing a flag in the instruction tables.	2014-12-02 16:12:33 -06:00
Fiora	72c96c20d3	JIT: more optimizing of float ops based on known input characteristics. If the inputs are both float singles, and the top half is known to be identical to the bottom half, we can use packed arithmetic instead of scalar to skip the movddup. This is slower on a few rather old CPUs, plus the Atom+Silvermont, so detect Atom and disable it in that case. Also avoid PPC_FP on stores if we know that the output came from a float op.	2014-11-29 11:33:11 -08:00
Fiora	7df50b0710	JIT: skip weird fmul rounding if the input is known to be single precision	2014-11-29 11:30:51 -08:00
Fiora	97fba41860	JIT: merge fcmpx and cror. Almost all uses of boolean condition-register ops in real code seem to be the combination fcmpx + cror (e.g. for <= or >=). This merges the two.	2014-10-29 00:30:27 -07:00
comex	b6a7438053	Add BitSet and, as a test, convert some JitRegCache stuff to it. This is a higher level, more concise wrapper for bitsets which supports efficiently counting and iterating over set bits. It's similar to std::bitset, but the latter does not support efficient iteration (and at least in libc++, the count algorithm is subpar, not that it really matters). The converted uses include both bitsets and, notably, considerably less efficient regular arrays (for in/out registers in PPCAnalyst). Unfortunately, this may slightly pessimize unoptimized builds.	2014-10-25 16:56:51 -04:00
Fiora	7ba9a8537b	JIT: add basic register allocation heuristics. Should be at least a bit better than the previous LRU approach. Currently has two basic components: whether a register is dirty (dirty registers need to be stored, so clobbering them hurts more) and how many other registers will be used between now and the next time a register gets used. Also don't pre-load values that don't need to be in registers.	2014-10-09 20:09:14 -07:00
Fiora	8fe730194b	JIT: load registers if they're going to be used later in the block	2014-10-03 11:58:04 -07:00
skidau	4b37fdfa45	Added a CompileExceptionCheck function to the JitInterface and re-routed the existing code to utilise the interface.	2014-09-27 20:16:26 +10:00
Fiora	bfab5f1e91	JIT: generic branch merging. Why merge just cmps and rlwinm when we can merge ALL the branches?	2014-09-24 12:34:18 -07:00
Fiora	f103234e2b	JIT: flush a register if it won't be used for the rest of the block. This should dramatically reduce code size in the case of blocks with lots of branches, and certainly doesn't hurt elsewhere either. This can probably be improved a good bit through smarter tracking of register usage, e.g. discarding registers that are going to be overwritten, but this is a good start and should help reduce code size and register pressure. Unlike that sort of change, this is a "safe" patch; it only flushes registers, which can't affect correctness, unlike actually discarding data. As part of this, refactor PPCAnalyst to support distinguishing between float and integer registers (to properly handle instructions that access both, like floating-point loads and stores). Also update every instruction in the interpreter flags table I could find that didn't have all the correct flags.	2014-09-22 16:00:25 -07:00
Fiora	08ac10d00a	PPCAnalyst/JIT: add ability to easily toggle branch and carry merging	2014-09-13 13:48:24 -07:00
Fiora	54129a8ca5	PPCAnalyst: refactor, add carry op reordering and non-cmp reordering. Tries as hard as possible to push carry-using operations (like addc and adde) next to each other. Refactor the instruction reordering to be more flexible and allow multiple passes. 353 -> 192 x86 instructions on a carry-heavy code block in Pokemon Puzzle. 12% faster overall in Pokemon Puzzle; probably less in typical games (Virtual Console games seem to be carry-heavy for some reason; maybe a different compiler?)	2014-09-13 13:48:23 -07:00
Fiora	45d84605a9	JIT64: optimize carry calculations further. Keep carry flags in the x86 flags register if used in the next instruction.	2014-09-13 13:48:20 -07:00
Fiora	bea2504a51	JIT64: optimize carry calculations. Omit carry calculations that get overwritten later in the block before they're used. Very common in the case of srawix and friends.	2014-09-13 13:47:43 -07:00
Ryan Houdek	b8d4834cb1	Fix the return value of PPCAnalyst. In situations where conditional continue isn't supported and a JIT doesn't implement an instruction that has the FL_ENDBLOCK flag, this would cause an infinite loop. In reality all the JITs should implement every FL_ENDBLOCK instruction regardless, but JITIL doesn't implement tw/twi, which are FL_ENDBLOCK instructions.	2014-09-10 21:33:17 -05:00
Rachel Bryk	f93aa7087c	Kill Core::g_CoreStartupParameter.	2014-09-09 00:24:49 -04:00
Fiora	07e0c917c6	Revert "JIT64: optimize CA calculations"	2014-09-05 10:26:30 -07:00
Fiora	3aa40dab00	JIT64: optimize carry calculations. Omit carry calculations that get overwritten later in the block before they're used. Very common in the case of srawix and friends.	2014-09-01 20:41:48 -07:00
Lioncash	beb95b75ca	PPCAnalyst: Use std::swap instead of making a temporary variable	2014-08-30 18:32:09 -04:00
Lioncash	eb535be874	Core: Clean up brace placements	2014-08-30 18:06:49 -04:00
Fiora	7dbc623dc0	JIT: Initial FPRF support. Doesn't support all the FPSCR flags, just the FPRF ones. Add PPCAnalyzer support to remove unnecessary FPRF calculations. POV-ray benchmark with enableFPRF forced on for an extreme comparison: before: 1500s; after, fmul/fmadd only: 728s; after, all float: 753s. In real games that use FPRF, like F-Zero GX, FPRF previously cost a few percent of total runtime. Since FPRF is so much faster now, if enableFPRF is set, just do it for every float instruction, not just fmul/fmadd like before. I don't know if this will fix any games, but there's little good reason not to.	2014-08-26 10:57:03 -07:00
Fiora	5c0145f71b	PPCAnalyzer: move num_instructions initialization to correct place. Much of the PPC Analyzer code (e.g. instruction reordering for merging branches) wasn't actually being run.	2014-08-21 11:19:23 -07:00
Lioncash	4759510f70	Get rid of instances of "using namespace std;" in the project	2014-08-17 02:05:33 -04:00
Lioncash	a46a500b94	Core: Fix warnings on Linux related to the JIT	2014-08-02 16:15:20 -04:00
degasus	22e1aa5bb4	mark all local functions as static	2014-07-11 16:07:23 +02:00
Ryan Houdek	5147e721ae	Revert "PPCAnalyst now detects internal branches better". This reverts commit `31ec57ab81`.	2014-06-10 04:58:56 -05:00
Ryan Houdek	db08f7bf4a	Merge pull request #371 from quarnster/patch-1 PPCAnalyst now detects internal branches better	2014-06-06 02:45:24 -05:00
Fredrik Ehnbom	31ec57ab81	PPCAnalyst now detects internal branches better For example: ``` addr opcode disasm 80026584 48000054 b ->0x800265D8 ```	2014-05-15 16:36:44 +02:00
Ryan Houdek	cdec575bef	Fixes games that use the MMU to page in code (Rogue Leader). The issue was that on a memory exception we wouldn't call into PPCAnalyst, and our code_block would retain the previous block's information. This would cause us to compile the previous block's instructions in prior to the exception exit.	2014-05-09 09:10:45 -05:00
Ryan Houdek	8e1dfef14c	Remove the old PPCAnalyst::Flatten function that is no longer in use.	2014-04-30 10:49:39 -05:00


57 Commits