melonDS aims at providing fast and accurate Nintendo DS emulation. While it is still a work in progress, it has a pretty solid set of features:

• Nearly complete core (CPU, video, audio, ...)
• JIT recompiler for fast emulation
• OpenGL renderer, 3D upscaling
• RTC, microphone, lid close/open
• Joystick support
• Savestates
• Various display position/sizing/rotation modes
• (WIP) Wifi: local multiplayer, online connectivity
• (WIP) DSi emulation
• DLDI
• (WIP) GBA slot add-ons
• and more are planned!

Download melonDS

If you're running into trouble: Howto/FAQ
Little status update
So yeah, it's been a while since the new OpenGL renderer was merged in...

Haven't been very active with melonDS since then. I've mostly been taking a well-deserved break from all this intensive coding.

Real life is catching up, too. Mental health stuff. Things coming up that I have to take care of.

Add a minor Hytale addiction to the mix, and... yeah.


I'm going to post a few notes about the future of melonDS at large.


First, I've been toying with a Golden Sun hack for the OpenGL renderer.



Flicker-free hi-res. When I had first attempted it, there was some flickering, which was caused by color conversion issues that have since been fixed, so I figured I could give it another try.

It is a gross hack, but a nice proof of concept regardless. It doesn't address the performance issue (which will need a separate fix), but it fixes up the upscaling issue by replicating what Golden Sun does in a high-level manner.

blackmagic3 merged!
After much testing and bugfixing, the blackmagic3 branch has finally been merged. This means that the new OpenGL renderer is now available in the nightly builds.

Our plan is to let it cool down and potentially fix more issues before actually releasing melonDS 1.2.

So we encourage you to test out this new renderer and report any issues you might observe with it.

There are issues we're already aware of, and for which I'm working on a fix. For example, FMVs that rely on VRAM streaming, or Golden Sun. There may be tearing and/or poor performance for now. I have ideas to help alleviate this, but it'll take some work.

There may also be issues that are inherent to the classic OpenGL 3D renderer, so if you think you've encountered a bug, please verify against melonDS 1.1.

If you run into any problem, we encourage you to open an issue on GitHub, or post on the forums. The comment section on this blog isn't a great place for reporting bugs.


Either way, have fun!
Golden Sun: Dark Destruction
It's no secret that every game console to be emulated will have its set of particularly difficult games. Much like Dolphin with the Disney Trio of Destruction, we also have our own gems. Be it games that push the hardware to its limits in smart and unique ways, or that just like to do things in atypical ways, or just games that are so poorly coded that they run by sheer luck.

Most notably: Golden Sun: Dark Dawn.

This game is infamous for giving DS emulators a lot of trouble. It does things such as running its own threading system (which maxes out the ARM9), abusing surround mode to mix in sound effects, and so on.

I bring this up because it's been reported that my new OpenGL renderer struggles with this game. Most notably, it runs really slow and the screens flicker. Since the blackmagic3 branch is mostly finished, and I'm letting it cool down before merging it, I thought, hey, why not write a post about this? It's likely that I will try to fix Golden Sun at least enough to make it playable, but that will be after blackmagic3 is merged.


So, what does Golden Sun do that is so hard?

It renders 3D graphics to both screens. Except it does so in an atypical way.

If you run Golden Sun in melonDS 1.1, you find out that upscaling just "doesn't work". The screens are always low-res. Interestingly, in this scenario, DeSmuME suffers from melonDS syndrome: the screens constantly flicker between the upscaled graphics and their low-res versions. NO$GBA also absolutely hates what this game is doing.

So what is going on there?

Normally, when you do dual-screen 3D on the DS, you need to reserve VRAM banks C and D for display capture. You need those banks specifically because they're the only banks that can be mapped to the sub 2D engine and that are big enough to hold a 256x192 direct color bitmap. You need to alternate them because you cannot use a VRAM bank for BG layers or sprites and capture to it at the same time, due to the way VRAM mapping works.
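
To put numbers on the size constraint, here is a rough check. It assumes 16-bit direct color pixels and a 128K size for banks C and D; neither figure is stated above, and the constant and function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Size in bytes of one full-screen direct color capture
// (assumption: 16 bits per pixel).
constexpr std::size_t kScreenWidth = 256;
constexpr std::size_t kScreenHeight = 192;
constexpr std::size_t kBytesPerPixel = 2;
constexpr std::size_t kCaptureBytes = kScreenWidth * kScreenHeight * kBytesPerPixel;

// Assumed size of VRAM banks C and D (128 KB each).
constexpr std::size_t kBankCDBytes = 128 * 1024;

// Returns true if a bank of the given size can hold one full capture.
constexpr bool FitsCapture(std::size_t bankBytes)
{
    return bankBytes >= kCaptureBytes;
}

// The capture target alternates every frame: while one bank is being
// captured to, the other one is displayed. Returns 'C' or 'D'.
inline char CaptureBankForFrame(unsigned frame)
{
    return (frame & 1) ? 'D' : 'C';
}
```

Under those assumptions a capture takes 96 KB, so it fits in a 128K bank but not in a smaller one.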

Becoming a master mosaicist
So basically, mosaic is the last "big" feature that needs to be added to the OpenGL renderer...

Ah, mosaic.

I wrote about it here, back then. But basically, while BG mosaic is mostly well-behaved, sprite mosaic will happily produce a fair amount of oddities.

Actual example, from hardware:



I had already noticed this back then, as I was trying to perfect sprite mosaic, but didn't quite get there. I ended up putting it on the backburner.


Fast forward to, well, today. As I said, I need to add mosaic support to the OpenGL renderer, but before doing so, I want to be sure to get the implementation right in the software renderer.

So I decided to tackle this once and for all. I developed a basic framework for graphical test ROMs, which can set up a given scene, capture it to VRAM, and compare that against a reference capture acquired from hardware. While the top screen shows the scene itself, the bottom screen shows either the reference capture, or a simple black/white compare where any different pixels immediately stand out.
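
The compare step could look roughly like the sketch below. This is not the actual test ROM code: the type and function names are invented, and it assumes 15-bit colors with bit 15 ignored, and white as the color that marks a mismatch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Result of comparing a captured frame against a hardware reference.
struct CompareResult
{
    std::vector<std::uint16_t> diffImage; // white where pixels differ, black elsewhere
    std::size_t mismatches = 0;
};

// Compares two color buffers of identical size, pixel by pixel.
// Assumption: only the low 15 color bits matter, bit 15 is ignored.
inline CompareResult CompareCapture(const std::vector<std::uint16_t>& captured,
                                    const std::vector<std::uint16_t>& reference)
{
    assert(captured.size() == reference.size());

    constexpr std::uint16_t kBlack = 0x0000;
    constexpr std::uint16_t kWhite = 0x7FFF;

    CompareResult result;
    result.diffImage.resize(captured.size(), kBlack);
    for (std::size_t i = 0; i < captured.size(); i++)
    {
        if ((captured[i] & 0x7FFF) != (reference[i] & 0x7FFF))
        {
            result.diffImage[i] = kWhite;
            result.mismatches++;
        }
    }
    return result;
}
```

A mismatch count of zero means the scene matches hardware; anything else shows up as white pixels in the diff image.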

I built a few test ROMs out of this. One tests various cases, ranging from one sprite to several, and even things like changing REG_MOSAIC midframe. One tests sprite priority orders specifically. One is a fuzzing test, which just places a bunch of sprites with random coordinates and priority orders.

blackmagic3: refactor finished!
The blackmagic3 branch, a perfect example of a branch that has gone way past its original scope. The original goal was to "simply" implement hi-res display capture, and here we are, refactoring the entire thing.

Speaking of the refactor, it's mostly finished now. For a while, it made a mess of the codebase: everything had to be moved around, which left the code in a state where it wouldn't even compile, and it started to feel overwhelming. But now things are good again!


As I said in the previous post, the point of the refactor was to introduce a global renderer object that keeps the 2D and 3D renderers together. The benefits are twofold.

First, this is less prone to errors. Before, the frontend had to separately create 2D and 3D renderers for the melonDS core, with the possibility of having mismatched renderers. Having a unique renderer object avoids this entirely, and is overall easier to deal with for the frontend.

Second, this is more accurate to hardware. Namely, the code structure more closely adheres to the way the DS hardware is structured. This makes it easier to maintain and expand, and more accurate in a natural kind of way. For example, implementing the POWCNT1 enable bits is easier. The previous post explains this more in detail, so I won't go over it again.
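
To illustrate the first point, here is a minimal sketch of what a single renderer object could look like. The class and member names are hypothetical and do not reflect the actual melonDS API; the only idea taken from the text is that both halves get created together.

```cpp
#include <cassert>
#include <memory>

// Hypothetical backend selector.
enum class Backend { Software, OpenGL };

// Placeholder 2D and 3D renderer types.
struct Renderer2D { Backend backend; };
struct Renderer3D { Backend backend; };

// Hypothetical top-level renderer: it owns both halves and builds them
// together, so a frontend cannot pair mismatched 2D and 3D renderers.
class Renderer
{
public:
    explicit Renderer(Backend backend)
        : rend2D(std::make_unique<Renderer2D>(Renderer2D{backend})),
          rend3D(std::make_unique<Renderer3D>(Renderer3D{backend}))
    {
        assert(rend2D->backend == rend3D->backend);
    }

    Backend GetBackend() const { return rend2D->backend; }
    const Renderer2D& Get2D() const { return *rend2D; }
    const Renderer3D& Get3D() const { return *rend3D; }

private:
    std::unique_ptr<Renderer2D> rend2D;
    std::unique_ptr<Renderer3D> rend3D;
};
```

With this shape, the frontend makes one choice (the backend) and creates one object.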


I've also been making changes to the way 2D video registers are updated, and to the state machine that handles video timing, with the same aim of more closely reflecting the actual hardware operation. This will most likely not result in tangible improvements for the casual gamer, but if we can get more accurate with no performance penalty, that's a win. The reason is simple: in emulation, the more closely you follow the original hardware's logic and operation, the less likely you are to run into issues. But accuracy is also a tradeoff. I could write a more detailed post about this.

If there are any tangible improvements, they will be about the mosaic feature, especially sprite mosaic, which melonDS still doesn't handle correctly. Not much of a big deal, mosaic seems seldom used in DS games...


Since we're talking about accuracy, it brings me to this issue: 1-byte writes to DISPSTAT don't work.

A simple case: 8-bit writes to DISPSTAT are just not handled. I addressed it in blackmagic3, as I was reworking that part of the code. But it's making me think about better ways to handle this.
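
One generic way to handle this is to fold narrow writes into the full-width write path. The sketch below is not what blackmagic3 does; the struct is hypothetical, and the writable-bit mask is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical register model: a 16-bit DISPSTAT value plus a mask of the
// bits software may change. The mask value is an assumption.
struct DispStatReg
{
    std::uint16_t value = 0;
    static constexpr std::uint16_t kWritableMask = 0xFFB8;

    // Full 16-bit write: only writable bits change, status bits are preserved.
    void Write16(std::uint16_t val)
    {
        value = static_cast<std::uint16_t>((value & ~kWritableMask) | (val & kWritableMask));
    }

    // 8-bit write: merge the byte into the current value, then reuse Write16.
    // byteIndex 0 is the low byte, 1 is the high byte.
    void Write8(unsigned byteIndex, std::uint8_t val)
    {
        assert(byteIndex < 2);
        const unsigned shift = byteIndex * 8;
        const std::uint16_t merged = static_cast<std::uint16_t>(
            (value & ~(0xFFu << shift)) | (static_cast<unsigned>(val) << shift));
        Write16(merged);
    }
};
```

The point of the design is that the masking logic lives in one place, so 8-bit and 16-bit writes cannot drift apart.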

OpenGL renderer: status update, and "fun" stuff
First of all, we wish you a happy new year. May 2026 be filled with success.

Next, well... DS emulation, oh what fun.

It all started as I was contemplating what was left to be done with the OpenGL renderer. At this point, I've mostly managed to turn this pile of hacks into an actual, proper renderer. While there's still a lot of stuff to verify, clean up, optimize, the core structure is there. It may not support every edge case, but I believe it will do a pretty decent job.

The main remaining things to do are: add mosaic, add the "forced blank" and POWCNT1 2D engine enable flags, and restructure the 2D rendering code. As it turns out, those things are somewhat interconnected.


Mosaic was discussed here. Basically, it applies a pixelation effect to 2D graphics. BG mosaic shouldn't be difficult to add to the OpenGL renderer, but sprite mosaic is going to be tricky, because the way it works makes sense if you're processing pixels from left to right, but that's not how GPU shaders work. It will probably require some shitty code to get it right.
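
For the well-behaved case, a simplified model of horizontal mosaic on one scanline might look like this. It is only a sketch: the function name is made up, `blockWidth` is the block size in pixels rather than a raw register value, and it does not model any of the sprite mosaic oddities.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Applies horizontal mosaic to one scanline: every pixel in a block of
// `blockWidth` pixels takes the value of the leftmost pixel of that block.
inline std::vector<std::uint16_t> ApplyHorizontalMosaic(
    const std::vector<std::uint16_t>& line, std::size_t blockWidth)
{
    assert(blockWidth >= 1);

    std::vector<std::uint16_t> out(line.size());
    for (std::size_t x = 0; x < line.size(); x++)
    {
        // Snap x to the start of its mosaic block. Each output pixel only
        // depends on its own coordinate, which maps well to a shader.
        const std::size_t srcX = x - (x % blockWidth);
        out[x] = line[srcX];
    }
    return out;
}
```

This form works on a GPU because no pixel needs state carried over from its left neighbor, which is exactly what sprite mosaic breaks.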

"Forced blank" is a control bit in DISPCNT. What it does is force the 2D engine to render a blank picture, which allows the CPU to access VRAM faster.

POWCNT1 made an appearance here. This register has enable bits for the two 2D engines, which have a somewhat similar effect to the aforementioned "forced blank" bit. However, it's not quite the same.

While testing all those features to make sure I understood them correctly before implementing them, I found some shortcomings in melonDS's software renderer.

Take affine BG layers. The way they work is that you have reference point registers (BGnX and BGnY) determining which point of the layer will be in the screen's top-left corner, then you have a 2x2 matrix (BGnPA to BGnPD) which transforms screen coordinates to layer coordinates. This is how arbitrary rotation and scaling are achieved. It is similar to mode 7 on the SNES.
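
The transform can be sketched as follows. The struct and function names are hypothetical, and the sketch assumes that the reference point and the matrix entries are fixed-point values with 8 fractional bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical parameter block for one affine BG layer.
// Assumption: all values are fixed-point with 8 fractional bits.
struct AffineParams
{
    std::int32_t refX, refY;      // BGnX, BGnY
    std::int16_t pa, pb, pc, pd;  // BGnPA..BGnPD
};

struct LayerCoord
{
    std::int32_t x, y;
};

// Computes the layer coordinate sampled by each pixel of one scanline.
// `line` is the scanline index counted from the top of the screen.
inline std::vector<LayerCoord> AffineScanline(const AffineParams& p, int line, int width)
{
    assert(line >= 0 && width >= 0);

    // Start of the scanline: reference point plus `line` vertical steps.
    std::int32_t fx = p.refX + p.pb * line;
    std::int32_t fy = p.refY + p.pd * line;

    std::vector<LayerCoord> coords;
    coords.reserve(static_cast<std::size_t>(width));
    for (int x = 0; x < width; x++)
    {
        // Drop the fractional bits to get integer layer coordinates
        // (relies on arithmetic right shift for negative values).
        coords.push_back({fx >> 8, fy >> 8});
        fx += p.pa; // one step to the right on screen
        fy += p.pc;
    }
    return coords;
}
```

With PA = PD = 256 and PB = PC = 0 this is the identity; halving PA and PD zooms in by 2x, since each screen pixel then advances only half a layer pixel.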

Merry Christmas from melonDS
I know, I'm one day late.

There won't be anything very fancy for now, though. I haven't worked much on melonDS lately. Over the last week, my training period has been intense, and the long commute didn't help either. However, that went quite well, so I'm hopeful.

Another thing is, there isn't much more to show at this point.

What remains to be done is a bunch of cleanup, optimization work, and adding certain missing features, like proper chunked rendering if certain registers are changed mid-frame. Much like what the blargSNES hardware renderer does. Basically, turning this whole experimental setup into a finished product. It's always nice, but it doesn't really make for interesting screenshots.

As far as the 3D renderers are concerned, my work is mostly finished. I have ideas to improve the regular OpenGL renderer, but that's beyond the scope of the blackmagic3 branch for now. I could mention some, though: for example, attempting to add proper interpolation, instead of the "improved polygon splitting" hack. There would be some other old issues to fix in order to bring both OpenGL renderers closer to the software renderer, too.

With all this, I hope to be able to release melonDS 1.2 by early 2026. We'll see how this goes.

Either way, merry Christmas, happy new year, and all!
Why make it simple when it can be complicated?
First, a bit of a status update. I'm about to begin a training period of sorts, that will (hopefully!) lead to an actual job next year. It looks like a pretty nice job for me, too!

So that part is sorted out, for now.

Next, the OpenGL renderer is still in the works, although I've been taking a bit of a break from it.

But hey, when you happen to be stuck on a train for way longer than planned, what can you do? That's right, add features to your renderer! The latest fun feature I added is also the reason behind this post title.

The DS's 3D renderer has registers that set the clear color and depth, and a bunch of other clear attributes - basically what the color/depth/attribute buffers will be initialized to before drawing polygons. Much like glClearColor() and glClearDepth().

However, you can also use a clear bitmap. In this case, two 128K "slots" of texture VRAM (out of a total of 4) will be used to initialize the color and depth buffers: slot 2 for the color buffer, slot 3 for the depth buffer. The bitmaps are 256x256, and they can be scrolled. It isn't a widely used feature, but there are games that use it. As far as melonDS is concerned, the software renderer supports it, but it had always been missing from the OpenGL renderers.
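
As a rough illustration of the addressing involved, here is a sketch that maps a screen pixel to an offset in the clear bitmaps. The function names are made up, and it assumes 16-bit entries, zero-based slot numbering, and scrolling that wraps around at 256.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kSlotSize = 128 * 1024;  // one texture VRAM slot
constexpr std::size_t kClearBitmapSize = 256;  // bitmap is 256x256 entries

// Byte offset, within a 128K slot, of the clear bitmap entry used for the
// screen pixel (x, y). Assumes 16-bit entries and wrap-around scrolling.
inline std::size_t ClearBitmapOffset(unsigned x, unsigned y,
                                     unsigned scrollX, unsigned scrollY)
{
    const std::size_t bx = (x + scrollX) % kClearBitmapSize;
    const std::size_t by = (y + scrollY) % kClearBitmapSize;
    const std::size_t offset = (by * kClearBitmapSize + bx) * 2;
    assert(offset < kSlotSize);
    return offset;
}

// Slot 2 holds the color bitmap and slot 3 the depth bitmap, so the same
// offset can be used to index both (slots counted from 0 here).
inline std::size_t ClearColorAddress(std::size_t offset) { return 2 * kSlotSize + offset; }
inline std::size_t ClearDepthAddress(std::size_t offset) { return 3 * kSlotSize + offset; }
```

Under these assumptions a 256x256 bitmap of 16-bit entries is exactly 128K, which is why each bitmap takes up a whole slot.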

Since I was busy adding features to the compute shader renderer, I thought, hey, why not add this? Shouldn't be very difficult. And indeed, it wasn't. But the game I used to test it threw me for a loop.



This is Rayman Raving Rabbids 2. In particular, this is the screen that shows up after you've played one of the minigames. The bottom screen uses the clear bitmap feature - without it, the orange frame layer won't show up at all.

Hardware renderer progress
Hey hey, little status update! I've been having fun lately.



The hardware renderer has been progressing nicely lately. It's always exciting when you're able to assemble the various parts you've been working on into something coherent, and it suddenly starts looking a lot like a finished product. It's no longer just something I'm looking at in RenderDoc.

Those screenshots were taken with 4x upscaling. The last one demonstrates hi-res rotscale in action. Not bad, I dare say.

It's not done yet, though, so I'll go over what remains to be done.


Mosaic

Shouldn't be very difficult to add.

Except for, you know, sprite mosaic. I don't really know yet how I'll handle that one. The way it works is intuitive if you're processing pixels left-to-right within a scanline, but this isn't how a modern GPU works.


Display capture

Hardware rendering, the fun
This whole thing I'm working on gives me flashbacks to blargSNES. The goal and constraints are different, though. We weren't doing upscaling on the 3DS, but also, we had no fragment shaders, so we were much more limited in what we could do.

Anyway, these days, I'm waist-deep in OpenGL. I'm determined to go further than my original approach to upscaling, and it's a lot of fun too.

I might as well talk more about that approach, and what its limitations are.

First, let's talk about how 2D layers are composited on the DS.

There are 6 basic layers: BG0, BG1, BG2, BG3, sprites (OBJ) and backdrop. Sprites are pre-rendered and treated as a flat layer (which means you can't blend a sprite with another sprite). Backdrop is a fixed color (entry 0 of the standard palette), which basically fills any space not occupied by another layer.

For each pixel, the PPU keeps track of the two topmost layers, based on priority orders.

Then, you have the BLDCNT register, which lets you choose a color effect to be applied (blending or fade effects), and the target layers it may apply to. For blending, the "1st target" is the topmost pixel, and the "2nd target" is the pixel underneath. If the layers both pixels belong to are adequately selected in BLDCNT, they will be blended together, using the coefficients in the BLDALPHA register. Fade effects work in a similar fashion, except since they only apply to the topmost pixel, there's no "2nd target".
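
The color math can be sketched per channel as below. This assumes 5-bit color channels and coefficients expressed in sixteenths (0 to 16, larger values clamped); the fade coefficient `evy` comes from a separate register not named above, and the function names are made up.

```cpp
#include <algorithm>
#include <cassert>

// Blends one 5-bit color channel of the 1st and 2nd target pixels.
// eva/evb are the BLDALPHA coefficients, assumed to be in sixteenths.
inline int BlendChannel(int first, int second, int eva, int evb)
{
    assert(first >= 0 && first <= 31 && second >= 0 && second <= 31);
    eva = std::min(eva, 16);
    evb = std::min(evb, 16);
    return std::min(31, (first * eva + second * evb) >> 4);
}

// Fade effects only involve the topmost pixel, so there is no 2nd target.
// evy is the fade coefficient, also assumed to be in sixteenths.
inline int FadeUpChannel(int first, int evy)
{
    evy = std::min(evy, 16);
    return first + (((31 - first) * evy) >> 4); // towards white
}

inline int FadeDownChannel(int first, int evy)
{
    evy = std::min(evy, 16);
    return first - ((first * evy) >> 4); // towards black
}
```

Note that the blend result is clamped rather than normalized: with both coefficients at 16, two bright pixels simply saturate.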

Then you also have the window feature, which can not only exclude individual layers from a given region, but also disable color effects. There are also a few special cases: semi-transparent sprites, bitmap sprites, and the 3D layer. Those all ignore the color effect and 1st target selections in BLDCNT, as well as the window settings.

In melonDS, the 2D renderer renders all layers according to their priority order, and keeps track of the last two values for each pixel: when writing a pixel, the previous value is pushed down to a secondary buffer. This way, at the end, the two buffers can be composited together to form the final video frame.
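
A minimal sketch of that two-buffer scheme, for one scanline, could look like this. It is not the actual melonDS code: the names are invented, and it assumes layers are drawn from lowest to highest priority and that both buffers start out filled with the backdrop.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLineWidth = 256;

// Hypothetical per-scanline state: the topmost pixel and the one underneath.
struct LineBuffers
{
    std::array<std::uint32_t, kLineWidth> top{};
    std::array<std::uint32_t, kLineWidth> below{};

    // Fills both buffers with the backdrop before any layer is drawn.
    void Clear(std::uint32_t backdrop)
    {
        top.fill(backdrop);
        below.fill(backdrop);
    }

    // Writing a pixel pushes the previous topmost value down to the
    // secondary buffer, so the last two values written are always kept.
    void WritePixel(std::size_t x, std::uint32_t value)
    {
        assert(x < kLineWidth);
        below[x] = top[x];
        top[x] = value;
    }
};
```

Once all layers are drawn, `top` and `below` hold the candidates for the 1st and 2nd blending targets at each pixel.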
