Our defines were never clear between what meant 64bit or x86_64
This makes a clear cut between bitness and architecture.
This commit also has the side effect of bringing up aarch64 compiling support.
- remove unused variables
- reduce the scope where it makes sense
- correct limits (did you know that strcat()'s last parameter does not
include the \0 that is always added?)
- set some free()'d pointers to NULL
inline halfxb
So we know which is the first pixel by masking.
inline xl
inline xb a bit
inline yl
inline uv1.x shift
remove likely wrong guessed ternary operator
add pixel layout comment
inline xel
optimize the shifts a bit
inline xb
optimize shifts in a second step
extract xb
rename all variables
calculate cache line by position.x
Revert 5115b459f40d53044cd7a858f52e6e876e1211b4 "optimize the shifts a bit"
It seems I was wrong, the other way is the more natural.
use x_virtual_position instead of uv1.x for x_offset_in_block
This looks more natural and the offset should be masked anyway.
substitude factor with cache_lines
move 32bit logic in a conditional block
This option was known to break every second game and only boost a bit.
It also seems to be broken because of streaming into pinned memory and buffer storage buffers.
v2: also remove dlc_desc
Older Qualcomm drivers rotated the framebuffer 90 degrees and this fix didn't work.
Now for some obscene reason it rotates a full 180 degrees.
This can at least be worked around by flipping around the image on our end.
On Windows, nvidia don't give us their driver version, so we can't workaround any issues.
As buffer_storage is broken on some drivers, we wanted to disble it for them.
So we can't.
Luckyly only "some" released driver versions are affected as this extension is only available since some months. Let's hope that nobody have to use one of this driver version, else they will get a black screen ...
OSX has their own driver, so performance issues aren't shared with the nvidia driver (unlike the closed source linux and windows nvidia driver). So now they'll also use the MapAndSync backend like all other osx drivers.
fixes issue 6596
I've also cleaned up the if/else block selecting the best backend a bit.
The new chapter title in Paper Mario TTYD had a small graphical bug due to the new code because it read one extra pixel, this fixes it.
I hope this gets everything, I though I had checked most bugs and yet here I am, commit-spamming...
Fixes some remaining bbox related bugs in Mickey's Magical Mirror and a slight graphical glitch in Paper Mario: TTYD when flipping and Vivian as your companion (I've been scratching my head for days to find this one).
Instead of being vertex-based, it is now primitive (point, line or dissected triangle) based, with proper clipping.
Also, screen position is now calculated based on viewport values, instead of "guesstimating".
This fixes many graphical glitches in Paper Mario: TTYD and Super Paper Mario.
Also, the new code allows Mickey's Magical Mirror and Disney's Hide & Sneak to work (mostly) bug-free. I changed their inis to use bbox.
These changes have a slight cost in performance when bbox is being used (rare), mostly due to the new clipping algorithm.
Please check for any regressions or crashes.
This branch is the final step of fully supporting both OpenGL and OpenGL ES in the same binary.
This of course only applies to EGL and won't work for GLX/AGL/WGL since they don't really support GL ES.
The changes here actually aren't too terrible, basically change every #ifdef USE_GLES to a runtime check.
This adds a DetectMode() function to the EGL context backend.
EGL will iterate through each of the configs and check for GL, GLES3_KHR, and GLES2 bits
After that it'll change the mode from _DETECT to whichever one is the best supported.
After that point we'll just create a context with the mode that was detected
As we do lots of writes to *Iptr, the compiler isn't allowed to cache any shared variable (neither index nor Iptr itself).
This commit inlines Iptr + index into the index generator functions, so the compiler know that they are const.
We are used to render them out of order as long as everything else matches, but rendering order does matter, so we have to flush on primitive switch. This commit implements this flush.
Also as we flush on primitive switch, we don't have to create three different index buffers. All indices are now stored in one buffer.
This will slow down games which switch often primitive types (eg ztp), but it should be more accurate.
This "u32 components" is a list of flags which attributes of the vertex loader are present.
We are used to append this variable to lots of vertex generation functions, but some of them don't need it at all.
The usual way to handle this kind of request is to rise a flag which the gpu thread polls.
The gpu thread itself either generates the result or just write zeros if disabled.
After this, it rise another flag which says that this work is done.
So if disabled, we still have the cpu-gpu round trip time. This commit just returns 0 on the cpu thread
instead of playing ping pong...
Some information on this bug since this isn't quite true.
Seemingly with the v53 driver, Qualcomm has actually fixed this bug. So we can dynamically access UBO array members.
The issue that is cropping up is actually converting our attribute 'fposmtx' to an integer.
int posmtx = int(fpostmtx);
This line causes some seemingly garbage values to enter in to the posmtx variable.
Not sure exactly why it is failing, probably them just not actually converting the float to an integer and just handling the float directly as a integer.
So the bug is going to stay active with Qualcomm devices until we convert this vertex attribute from a float to a integer.