Introducing the compute renderer
To make things short, I've been working on porting melonDS's software rasteriser to run on the GPU via compute shaders. So how is this different from melonDS's existing OpenGL renderer? The OpenGL renderer uses the built-in functionality of your GPU to draw triangles. This is of course fast, since it uses hardware made specifically for this, but it has the downside that some things can't be controlled by us, so the behaviour of the DS can't be replicated completely faithfully. The compute renderer, on the other hand, only utilises the programmable parts of the GPU (which means we have full control over them), so it's like the software rasteriser, only it utilises the parallel computing power of GPUs. Ideally it should be able to be just as accurate as the software rasteriser is.
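
To give a rough idea of what "rasterising in ordinary code" means, here is a minimal Python sketch. It is not melonDS code and it does not follow the DS's actual rasterisation rules; the function names and the simple edge-function coverage test are assumptions picked purely for illustration. The point is only that every pixel can be decided independently, which is the kind of work a compute shader can spread over many GPU threads.

```python
# Illustrative sketch only: hypothetical names, not melonDS's rasteriser
# and not the DS's real coverage rules.

def edge(ax, ay, bx, by, px, py):
    """Signed area term: positive when P lies to the left of the edge A->B."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def covers(tri, px, py):
    """True if the point (px, py) lies inside or on the triangle, for either winding."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    w0 = edge(x1, y1, x2, y2, px, py)
    w1 = edge(x2, y2, x0, y0, px, py)
    w2 = edge(x0, y0, x1, y1, px, py)
    all_non_negative = w0 >= 0 and w1 >= 0 and w2 >= 0
    all_non_positive = w0 <= 0 and w1 <= 0 and w2 <= 0
    return all_non_negative or all_non_positive


def rasterise(tri, width, height):
    """Build a coverage mask by sampling each pixel centre.

    No pixel depends on any other, so on a GPU each (x, y) could be
    handled by its own compute shader invocation.
    """
    return [[covers(tri, x + 0.5, y + 0.5) for x in range(width)]
            for y in range(height)]


if __name__ == "__main__":
    mask = rasterise(((0, 0), (4, 0), (0, 4)), 4, 4)
    for row in mask:
        print("".join("#" if covered else "." for covered in row))
```

Because the coverage rule is just code, it can be changed to whatever the emulated hardware does, which is exactly the control the fixed-function triangle hardware doesn't give us.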

Why are we doing this in the first place?
  • Enhancements such as higher resolution rendering at reasonable speeds compared to, say, a software rasteriser, but with fewer problems than the OpenGL renderer (though problems can never be fully excluded when running games differently than they were intended).

  • Full-speed emulation of 3D games on Switch and potentially other devices which fit this weird niche where they have slow processors but pretty competent GPUs and good software-side support for them.

You might have already heard of parallel-rdp from Themaister which provides a very accurate emulation of the RDP (i.e. that part of the N64 which in the end draws the triangles) running on the GPU. It has been a great inspiration for this project (which means where possible it's basically a clone). So thanks to Themaister for all the ideas and also for answering my questions!

Currently the main part of the work is done (it's already somewhat playable with a lot of games), so it's easier to list what's still missing:

  • Blending

  • Shadows

  • Equal depth testing

  • Antialiasing

  • Highlighting/Toon shading

  • Fog

  • Edgemarking

  • Rearimages

I plan on detailing some technical aspects later. Also I have not forgotten my A tour through melonDS's JIT recompiler "series", so expect to see some more posts by me here sooner or later.
Sorer says:
May 1st 2021
Very nice job!
Is this gonna use Vulkan too, like parallel-rdp? Or is it gonna be OpenGL only? It's not clear from the post.
Generic aka RSDuck says:
May 1st 2021
I'm currently developing it directly for Switch with deko3d which is the homebrew platform specific API there.

When porting it over to the PC it will probably use OpenGL because that's what I'm familiar with and what's already integrated into melonDS. For workloads like this the API is relatively irrelevant anyway.
Hashibee says:
May 1st 2021
I love the dedication man. Keep up the awesome work :)
NICOEMU says:
May 1st 2021
Nice job!
I hope they achieve their goal! :D
poudink says:
May 1st 2021
Would this be slower than the current OpenGL renderer?
Sorer says:
May 1st 2021
I remember parallel-rdp being slower on my laptop while other renderers being faster for n64 emulators.
But this is the DS we are talking about here lol.
Generic aka RSDuck says:
May 1st 2021
especially at higher resolutions it will definitely be slower than the OpenGL renderer. It also requires "newer" hw for compute shaders (that would be at least OpenGL 4.3).
Pents says:
May 2nd 2021
will it allow for drawing quads?
poudink says:
May 2nd 2021
Yeah.
Comlud2 says:
May 2nd 2021
This is great and all, but I'm just saying, why aren't you guys focusing on networking? Local multiplayer in particular. That's what most of us are waiting for.
Generic aka RSDuck says:
May 2nd 2021
how are you so sure, did you poll the entire melonDS userbase? We'll get there eventually.
Guest says:
May 2nd 2021
Interestingly, Nouveau is seeing OpenGL compute shader bringup on NV50 lately, so if compute shaders would be possible with OpenGL ES 3.1, we could, at least in theory, see compute shader support on 8-series GeForce cards, at least on Linux.
Generic aka RSDuck says:
May 2nd 2021
whether it will run well is of course another question :D

Though Switch for example is a low powered Maxwell chip and it runs it pretty well, so who knows how it will run on older, but higher clocked chips.
ari32 says:
May 2nd 2021
Flip, I don't think I even care that much about the DS anymore. I just come here to see the amazing byproduct of pure dedication from a small team.
I think one day many years from now there will be kids emulating melonDS on quantum laptops, or something like that.
Marv says:
May 5th 2021
https://dl.acm.org/doi/10.1111/j.1467-8659.2010.01725.x


For handling blending, using linked lists might be of interest to you ...