This will generate one shader per copy format. For now, it is the same
shader with the colmat hard coded. So it should already improve the GPU
performance a bit, but a rewrite of the shader generator is suggested.
Half of the patch is done by linkmauve1:
VideoCommon: Reorganise the shader writes.
These rely on instance state, or are used within instance-based class
member functions, so they should belong to the instance itself instead
of being file statics.
Improve bookkeeping around formats. Hopefully make code less confusing.
- Rename TlutFormat -> TLUTFormat to follow conventions.
- Use enum classes to prevent using a Texture format where an EFB Copy format
is expected or vice-versa.
- Use common EFBCopyFormat names regardless of depth and YUV configurations.
The loop was allocating one-too-many levels, as well as incorrect sizes
for each level. Probably not an issue as mipmapped render targets aren't
used, but the logic should be correct anyway.
The spec says that vendors can set the max texture size to be 65KB and we want 1MB.
Check the maximum supported and drop to the max if it is less than 1MB
Instead of having special case code for efb2tex that ignores hashes,
the only diffence between efb2tex and efb2ram now is that efb2tex
writes zeros to the memory instead of actual texture data.
Though keep in mind, all efb2tex copies will have hashes of zero as
their hash.
Addded a few duplicated depth copy texture formats to the enum
in TextureDecoder.h. These texture formats were already implemented
in TextureCacheBase and the ogl/dx11 texture cache implementations.
We are used to use the texture parameter for all util draw calls,
but AMD seems to have a bug where they use the sampler parameter
of stage 0 if no sampler is bound to the used stage.
So as workaround (and a bit as nicer code), we now use sampler
objects everywhere.