* nothing works yet
* don't double buffer 3D framebuffers for the GL Renderer
looks like leftovers from when 3D+2D composition was done in the frontend
* oops
* it works!
* implement display capture for compute renderer
it's actually just all stolen from the regular OpenGL renderer
* fix bad indirect call
* handle cleanup properly
* add hires rendering to the compute shader renderer
* fix UB
also misc changes to use more unsigned multiplication
also fix framebuffer resize
* correct edge filling behaviour when AA is disabled
* fix full color textures
* fix edge marking (polygon id is 6-bit not 5)
also make the code a bit nicer
* take all edge cases into account for XMin/XMax calculation
* use hires coordinate again
* stop using fixed size buffers based on scale factor in shaders
this makes shader compile times tolerable on Wintel
- beginning of the shader cache
- increase size of tile idx in workdesc to 20 bits
* apparently & is not defined on bvec4
why does this even compile on Intel and Nvidia?
* put the texture cache into it's own file
* add compute shader renderer properly to the GUI
also add option to toggle using high resolution vertex coordinates
* unbind sampler object in compute shader renderer
* fix GetRangedBitMask for 64 bit aligned 64 bits
pretty embarassing
* convert NonStupidBitfield.h back to LF only new lines
* actually adapt to latest changes
* fix stupid merge
* actually make compute shader renderer work with newest changes
* show progress on shader compilation
* remove merge leftover
* First crack at ensuring the render thread doesn't touch GPU state while it's being serialized
* Get rid of the semaphore wait
* Add some extra fields into GPU3D's serialization
* Oops, TempVertexBuffer is already serialized
* Move vertex serialization into its own method
* Lock the GPU3D state when rendering on the render thread or serializing it
* Revert "Lock the GPU3D state when rendering on the render thread or serializing it"
This reverts commit 2f49a551c1.
* Add comments that describe the synchronization within GPU3D_Soft
- I need to understand it before I can solve my actual problem
- Now I do
* Revert "Revert "Lock the GPU3D state when rendering on the render thread or serializing it""
This reverts commit 1977566a6d.
* Let's try locking the GPU3D state throughout NDS::RunFrame
- Just to see what happens
* Slim down the lock's scope
* Narrow the lock's scope some more
* Remove the lock entirely
* Try protecting the GPU3D state with just a mutex
- I'll clean this up once I know it works
* Remove a duplicate method definition
* Add a missing `noexcept` specifier
* Remove an unused function
* Cut some non-hardware state from `GPU3D`'s savestate
* Assume that the next frame after loading a savestate won't be identical
* Actually, it _is_ worth it
* Don't serialize the clip matrix
- It's recalculated anyway
* Serialize `RenderPolygonRAM` as an array of indexes
* Clean up some comments
- I liked the dialogue style, but oh well
* Try restarting the render thread instead of using the lock
- Let's see what happens
* Put the lock back
* Fix some polygon and vertex indexes being saved incorrectly
- Taking the difference between two pointers results in the number of elements, not the number of bytes
* Remove `SoftRenderer::StateBusy` since it turns out we don't need it
- The real synchronization was the friends we made along the way
* Allow 3D renderers to be created without passing `GPU` to the constructor
* Make the initial 3D renderer configurable via `NDSArgs`
* Fix a compiler error
* Sprinkle `const` around where appropriate
- This will make it easier to use `NDS` objects in `const` contexts (e.g. `const` parameters or methods)
* Remove the `const` qualifier on `DSi_DSP::DSPRead16`
- MMIO reads can be non-pure, so this may not be `const` in the future
* Give `GPU2D::Unit` a virtual destructor
- Undefined behavior avoided!
* Add an `array2d` alias
* Move various parts of `GPU2D::SoftRenderer` to `constexpr`
- `SoftRenderer::MosaicTable` is now initialized at compile-time
- Most of the `SoftRenderer::Color*` functions are now `constexpr`
- The aforementioned functions are used with a constant value in at least one place, so they'll be at least partially computed at compile-time
* Generalize `GLRenderer::PrepareCaptureFrame`
- Declare it in the base `Renderer3D` class, but make it empty
* Remove unneeded `virtual` specifiers
* Store `Framebuffer`'s memory in `unique_ptr`s
- Reduce the risk of leaks this way
* Clean up how `GLCompositor` is initialized
- Return it as an `std::optional` instead of a `std::unique_ptr`
- Make `GLCompositor` movable
- Replace `GLCompositor`'s plain arrays with `std::array` to simplify moving
* Pass `GPU` to `GLCompositor`'s important functions instead of passing it to the constructor
* Move `GLCompositor` to be a field within `GLRenderer`
- Some methods were moved up and made `virtual`
* Fix some linker errors
* Set the renderer in the frontend
* Remove unneeded `virtual` specifiers
* Remove `RenderSettings` in favor of just exposing the relevant member variables
* Update the frontend to accommodate the core changes
* Add `constexpr` and `const` to places in the interpolator
* Qualify references to `size_t`
* Construct the `optional` directly instead of using `make_optional`
- It makes the Linux build choke
- I think it's because `GLCompositor`'s constructor is `private`
* Reorganize namespaces
- Most types are now moved into the `melonDS` namespace
- Only good chance to do this for a while, since a big refactor is next
* Fix the build
* Refactor GPU3D to be an object
- Who has two thumbs and is the sworn enemy of global state? This guy!
* Refactor GPU itself to be an object
- Wow, it's used in a lot of places
- Also introduce a new `Melon` namespace for a few classes
- I expect other classes will be moved into `Melon` over time
* Change signature of Renderer3D::SetRenderSettings
- Make it noexcept, and its argument const
* Remove some stray whitespace
* Make cleanup a little more robust to mitigate undefined behavior
- Add some null checks before cleaning up the GPU3D renderer
- Make sure that all deleted objects are null
- Move cleanup logic out of an assert call
- Note that deleting a null pointer is a no-op, so there's no need to check for null beforehand
- Use RAII for GLCompositor instead of Init/DeInit methods
* Replace a DeInit call that I missed
* Make ARMJIT_Memory less likely to generate errors
- Set FastMem7/9Start to nullptr at the end
- Only close and unmap the file if it's initialized
* Make Renderer3D manage its resources with RAII
* Don't try to deallocate frontend resources that aren't loaded
* Make ARMJIT_Memory::DeInit more robust on the Switch
* Reset MemoryFile on Windows to INVALID_HANDLE_VALUE, not nullptr
- There is a difference
* Don't explicitly store a Valid state in GLCompositor or the 3D renderers
- Instead, create them with static methods while making the actual constructors private
* Make initialization of OpenGL resources fail if OpenGL isn't loaded
* assert that OpenGL is loaded instead of returning failure
* fix edge fill rules for swapped polygons
also fixes translucent polygons not being always edge filled.
* fix right edge fill rule
* fix right edge fill rule for realsies
* fix a few more glitchy polygons
specifically quads similar to: (-67,40) (64, 160) (192, 160), (8, 111)
* fix one edge case pixel
i hate this so much
* fix "flat bottom" edge fill
* fix regression + apply changes to shadow masks
fix a regression with certain line polygons not rendering; there seems to be an exception made by the ds' gpu in order for these polygons to render properly.
also apply these changes to shadow masks because i forgot to
* forgot to remove a line
---------
Co-authored-by: Arisotura <thetotalworm@gmail.com>
* fix polygons being swapped incorrectly
"borrowed" this from noods
needs verification that the >= and <= signs aren't actually supposed to be > and <
* proper rules for moving vertical right slopes left
* nvm most of that was actually pointless
that's on me for not checking
* fix aa being upside down on swapped y-major slopes
* further improvements to swapped aa
in addition to fixing swapped y-major slope aa, now fixes:
swapped x-major slope aa
swapped vertical slope aa
* use templates instead + style/comment tweaks
should force the compiler to precompile if statements like i want it to do, instead of just hoping it does so on its own
* Anti-Alias All Edges
Changing a bunch of 0x3s to 0xF since I figure if they're checking the left and right edge they wanna be checking the top and bottom too now that they're gonna be aa'd. also copy that if statement over since otherwise there won't be anything to blend with.
* small optimization
its probably a tiny bit faster?
idk id need actual benchmarking tools.
doesn't break anything at least.
* Draft GPU3D renderer modularization
* Update sources C++ standard to C++17
The top-level `CMakeLists.txt` is already using the C++17 standard.
* Move GLCompositor into class type
Some other misc fixes to push towards better modularity
* Make renderer-implementation types move-only
These types are going to be holding onto handles
of GPU-side resources and shouldn't ever be copied around.
* Fix OSX: Remove 'register' storage class specifier
`register` has been removed in C++17...
But this keyword hasn't done anything in years anyways.
OSX builds consider this "warning" an error and it
stops the whole build.
* Add RestartFrame to Renderer3D interface
* Move Accelerated property to Renderer3D interface
There are points in the code base where we do:
`renderer != 0` to know if we are feeding
an openGL renderer. Rather than that we can instead just have this be
a property of the renderer itself.
With this pattern a renderer can just say how it wants its data to come
in rather than have everyone know that they're talking to an OpenGL
renderer.
* Remove Accelerated flag from GPU
* Move 2D_Soft interface in separate header
Also make the current 2D engine an "owned" unique_ptr.
* Update alignment attribute to standard alignas
Uses standardized `alignas` rather than compiler-specific
attributes.
https://en.cppreference.com/w/cpp/language/alignas
* Fix Clang: alignas specifier
Alignment must be specified before the array to align the entire array.
https://en.cppreference.com/w/cpp/language/alignas
* Converted Renderer3D Accelerated to variable
This flag is checked a lot during scanline rasterization. So rather
than having an expensive vtable-lookup call during mainline rendering
code, it is now a public constant bool type that is written to only once
during Renderer3D initialization.
Squashed commit of the following:
commit b463a05d4b909372f0cd1ad91caa0c77a25e5901
Author: RSDuck <rsduck@users.noreply.github.com>
Date: Mon Nov 30 01:55:35 2020 +0100
minor fix
commit ce73cebbdf5da243d7ebade82d8799ded9cd6b28
Author: RSDuck <rsduck@users.noreply.github.com>
Date: Mon Nov 30 00:43:08 2020 +0100
fix dirty flags of BG/OBJ mappings not being reset
commit fc5d73a6178e3adc444398bdd23de8314b5ca8f8
Author: RSDuck <rsduck@users.noreply.github.com>
Date: Mon Nov 30 00:11:13 2020 +0100
use flat vram for gpu2d everywhere
commit 34ee9fe2bf04fcfa2a5a1c8d78d70007e606f1a2
Author: RSDuck <rsduck@users.noreply.github.com>
Date: Sat Nov 28 19:10:34 2020 +0100
mark VRAM dirty for display capture
commit e8778fa2f429c6df0eece19d6a5ee83ae23a0cf4
Author: RSDuck <rsduck@users.noreply.github.com>
Date: Sat Nov 28 18:59:31 2020 +0100
use flat VRAM for textures and texpals
also skip rendering if nothing changed and a bunch of fixes
commit 53f2041e2e1a28b35702a2ed51de885c36689f71
Author: RSDuck <rsduck@users.noreply.github.com>
Date: Fri Nov 27 18:29:56 2020 +0100
use vram dirty tracking for extpals
also preparations to take this further
commit 4cdfa329e95aed26d3b21319c8fd86a04abf20f7
Author: RSDuck <rsduck@users.noreply.github.com>
Date: Mon Nov 16 23:32:22 2020 +0100
VRAM dirty tracking
* clip against Z then Y then X. apparently, fixes#310. I had also observed hints that the hardware does it this way.
* truncate W to 24 bits before viewport transform.
* mark any polygons that have a W=0 at that point as degenerate. do not render.
this, by a fucking shitshow of butterfly effect, ends up fixing #234. technically, the rasterizer was going out of bounds, which, under certain circumstances, caused interpolation to shit itself and generate Z values that were out of range (but still ended up in the zbuffer). sometimes those values ended up negative, which caused these glitches when polygons had to be drawn over those.
about fucking time.