melonDS aims at providing fast and accurate Nintendo DS emulation. While it is still a work in progress, it has a pretty solid set of features:

• Nearly complete core (CPU, video, audio, ...)
• JIT recompiler for fast emulation
• OpenGL renderer, 3D upscaling
• RTC, microphone, lid close/open
• Joystick support
• Savestates
• Various display position/sizing/rotation modes
• (WIP) Wifi: local multiplayer, online connectivity
• (WIP) DSi emulation
• DLDI
• (WIP) GBA slot add-ons
• and more are planned!







Becoming a master mosaicist
So basically, mosaic is the last "big" feature that needs to be added to the OpenGL renderer...

Ah, mosaic.

I wrote about it in an earlier post. But basically, while BG mosaic is mostly well-behaved, sprite mosaic will happily produce a fair amount of oddities.

Actual example, from hardware:



I had already noticed this back then, as I was trying to perfect sprite mosaic, but didn't quite get there. I ended up putting it on the backburner.


Fast forward to, well, today. As I said, I need to add mosaic support to the OpenGL renderer, but before doing so, I want to be sure to get the implementation right in the software renderer.

So I decided to tackle this once and for all. I developed a basic framework for graphical test ROMs, which can set up a given scene, capture it to VRAM, and compare that against a reference capture acquired from hardware. While the top screen shows the scene itself, the bottom screen shows either the reference capture, or a simple black/white compare where any differing pixels immediately stand out.
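To give an idea of what the black/white compare boils down to, here is a minimal sketch. It is not the test ROMs' actual code: the function name, the choice of white for differing pixels, and ignoring bit 15 of each captured pixel are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not taken from the actual test ROMs.
constexpr uint16_t kBlack = 0x0000;
constexpr uint16_t kWhite = 0x7FFF;

// Builds a black/white image from a capture and a hardware reference.
// Assumption: differing pixels are shown in white, and only the 15 color
// bits are compared (bit 15 is ignored).
std::vector<uint16_t> CompareCaptures(const std::vector<uint16_t>& capture,
                                      const std::vector<uint16_t>& reference,
                                      std::size_t* numDifferent)
{
    assert(capture.size() == reference.size());

    std::vector<uint16_t> out(capture.size(), kBlack);
    std::size_t count = 0;

    for (std::size_t i = 0; i < capture.size(); i++)
    {
        if ((capture[i] & 0x7FFF) != (reference[i] & 0x7FFF))
        {
            out[i] = kWhite;
            count++;
        }
    }

    if (numDifferent) *numDifferent = count;
    return out;
}
```

A test passes when the difference count is zero, and any stray white pixel points straight at the spot where the renderer disagrees with hardware.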

I built a few test ROMs out of this. One tests various cases, ranging from one sprite to several, and even things like changing REG_MOSAIC midframe. One tests sprite priority orders specifically. One is a fuzzing test, which just places a bunch of sprites with random coordinates and priority orders.

blackmagic3: refactor finished!
The blackmagic3 branch, a perfect example of a branch that has gone way past its original scope. The original goal was to "simply" implement hi-res display capture, and here we are, refactoring the entire thing.

Speaking of refactor, it's mostly finished now. For a while, it made a mess of the codebase: everything had to be moved around, which left the code in a state where it wouldn't even compile. It even started to feel overwhelming. But now things are good again!


As I said in the previous post, the point of the refactor was to introduce a global renderer object that keeps the 2D and 3D renderers together. The benefits are twofold.

First, this is less prone to errors. Before, the frontend had to separately create 2D and 3D renderers for the melonDS core, with the possibility of having mismatched renderers. Having a unique renderer object avoids this entirely, and is overall easier to deal with for the frontend.
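As a rough illustration of the idea, a single owning object can make a mismatched pair impossible to construct. Every name below (Backend, Renderer2D, Renderer3D, Renderer, Create) is hypothetical and chosen for this sketch; it does not reflect the actual class layout in the blackmagic3 branch.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// All names here are hypothetical; this is not melonDS's actual class layout.
enum class Backend { Software, OpenGL };

struct Renderer2D
{
    explicit Renderer2D(Backend b) : backend(b) {}
    Backend backend;
};

struct Renderer3D
{
    explicit Renderer3D(Backend b) : backend(b) {}
    Backend backend;
};

// One object owns both halves, so the frontend cannot pair a software 2D
// renderer with an OpenGL 3D renderer by accident.
class Renderer
{
public:
    static Renderer Create(Backend b)
    {
        return Renderer(std::make_unique<Renderer2D>(b),
                        std::make_unique<Renderer3D>(b));
    }

    const Renderer2D& Get2D() const { return *rend2D; }
    const Renderer3D& Get3D() const { return *rend3D; }

private:
    Renderer(std::unique_ptr<Renderer2D> r2, std::unique_ptr<Renderer3D> r3)
        : rend2D(std::move(r2)), rend3D(std::move(r3))
    {
        assert(rend2D && rend3D);
        assert(rend2D->backend == rend3D->backend);
    }

    std::unique_ptr<Renderer2D> rend2D;
    std::unique_ptr<Renderer3D> rend3D;
};
```

With something like this, the frontend only ever asks for one backend, and the pairing is decided in a single place.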

Second, this is more accurate to hardware. Namely, the code structure more closely adheres to the way the DS hardware is structured. This makes it easier to maintain and expand, and more accurate in a natural kind of way. For example, implementing the POWCNT1 enable bits is easier. The previous post explains this in more detail, so I won't go over it again.


I've also been making changes to the way 2D video registers are updated, and to the state machine that handles video timing, with the same aim of more closely reflecting the actual hardware operation. This will most likely not result in tangible improvements for the casual gamer, but if we can get more accurate with no performance penalty, that's a win. The reason is simple: in emulation, the more closely you follow the original hardware's logic and operation, the less likely you are to run into issues. But accuracy is also a tradeoff. I could write a more detailed post about this.

If there are any tangible improvements, they will be about the mosaic feature, especially sprite mosaic, which melonDS still doesn't handle correctly. Not much of a big deal, mosaic seems seldom used in DS games...


Since we're talking about accuracy, it brings me to this issue: 1-byte writes to DISPSTAT don't work.

A simple case: 8-bit writes to DISPSTAT are just not handled. I addressed it in blackmagic3, as I was reworking that part of the code. But it's making me think about better ways to handle this.
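One generic way to handle this kind of thing is to fold the byte into the existing 16-bit value and push the result through the regular 16-bit write path. The sketch below shows that approach; it is not the fix that went into blackmagic3. The struct and its method names are hypothetical, and the write mask (bits 0-2 being read-only status flags) is an assumption based on public DS hardware documentation.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: bits 0-2 are read-only status flags, bit 6 is unused, and
// the rest (IRQ enables, VCount compare value) is writable.
constexpr uint16_t kDispStatWriteMask = 0xFFB8;

// Hypothetical register model, not melonDS's actual code.
struct DispStatReg
{
    uint16_t value = 0;

    void Write16(uint16_t val)
    {
        uint16_t kept = static_cast<uint16_t>(value & ~kDispStatWriteMask);
        uint16_t written = static_cast<uint16_t>(val & kDispStatWriteMask);
        value = static_cast<uint16_t>(kept | written);
    }

    // An 8-bit write replaces one half of the register; the other half is
    // taken from the current value, then the 16-bit path does the rest.
    void Write8(uint32_t addr, uint8_t val)
    {
        assert((addr & ~1u) == 0x04000004); // DISPSTAT address
        if (addr & 1)
            Write16(static_cast<uint16_t>((value & 0x00FF) | (val << 8)));
        else
            Write16(static_cast<uint16_t>((value & 0xFF00) | val));
    }
};
```

The upside of routing everything through Write16 is that any side effect of a DISPSTAT write only has to be implemented once.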

OpenGL renderer: status update, and "fun" stuff
First of all, we wish you a happy new year. May 2026 be filled with success.

Next, well... DS emulation, oh what fun.

It all started as I was contemplating what was left to be done with the OpenGL renderer. At this point, I've mostly managed to turn this pile of hacks into an actual, proper renderer. While there's still a lot of stuff to verify, clean up, optimize, the core structure is there. It may not support every edge case, but I believe it will do a pretty decent job.

The main remaining things to do are: add mosaic, add the "forced blank" and POWCNT1 2D engine enable flags, and restructure the 2D rendering code. As it turns out, those things are somewhat interconnected.


Mosaic was discussed in an earlier post. Basically, it applies a pixelation effect to 2D graphics. BG mosaic shouldn't be difficult to add to the OpenGL renderer, but sprite mosaic is going to be tricky, because the way it works makes sense if you're processing pixels from left to right, but that's not how GPU shaders work. It will probably require some shitty code to get it right.
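For the well-behaved BG case, the pixelation can be sketched as each pixel taking its color from the first pixel of its block. The function below is an illustration only, with a hypothetical name, covering horizontal mosaic on one scanline. It deliberately says nothing about sprite mosaic, which does not reduce to this simple form.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: horizontal mosaic over one scanline of BG pixels.
// blockWidth is the mosaic block width in pixels (1 means no mosaic).
std::vector<uint16_t> ApplyHorizontalMosaic(const std::vector<uint16_t>& line,
                                            std::size_t blockWidth)
{
    assert(blockWidth >= 1);

    std::vector<uint16_t> out(line.size());
    for (std::size_t x = 0; x < line.size(); x++)
    {
        // Each output pixel depends only on its own X coordinate, which
        // is why this form maps nicely to a fragment shader.
        std::size_t src = x - (x % blockWidth);
        out[x] = line[src];
    }
    return out;
}
```

Since every output pixel is computed independently, nothing here relies on left-to-right processing, which is exactly the property sprite mosaic lacks.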

"Forced blank" is a control bit in DISPCNT. What it does is force the 2D engine to render a blank picture, which allows the CPU to access VRAM faster.

POWCNT1 made an appearance in an earlier post. This register has enable bits for the two 2D engines, which have a bit of a similar effect to the aforementioned "forced blank" bit. However, it's not quite the same.

I was testing all those features to make sure I understood them correctly before implementing them, and that highlighted some shortcomings in melonDS's software renderer.

Take affine BG layers. The way they work is that you have reference point registers (BGnX and BGnY) determining which point of the layer will be in the screen's top-left corner, then you have a 2x2 matrix (BGnPA to BGnPD) which transforms screen coordinates to layer coordinates. This is how arbitrary rotation and scaling are achieved. It is similar to mode 7 on the SNES.
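The transform itself can be sketched in a few lines. This assumes all six values are fixed-point with 8 fractional bits, and that nothing is rewritten mid-frame, so the position can be computed directly from the screen coordinates. The struct and function names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration. Assumption: all values are
// fixed-point with 8 fractional bits.
struct AffineParams
{
    int32_t refX, refY;      // BGnX / BGnY reference point
    int16_t pa, pb, pc, pd;  // BGnPA..BGnPD, the 2x2 matrix
};

struct LayerCoord
{
    int32_t x, y;
};

// Maps a screen pixel to layer coordinates. Assumes the registers stay
// constant for the whole frame.
LayerCoord ScreenToLayer(const AffineParams& p, int sx, int sy)
{
    assert(sx >= 0 && sx < 256 && sy >= 0 && sy < 192);

    int32_t fx = p.refX + p.pa * sx + p.pb * sy;
    int32_t fy = p.refY + p.pc * sx + p.pd * sy;

    // Drop the fractional bits. This relies on arithmetic right shift
    // for negative values, which common compilers provide.
    return { fx >> 8, fy >> 8 };
}
```

With PA = PD = 0x100 and PB = PC = 0, this is the identity mapping. Halving PA and PD to 0x80 makes each layer pixel cover two screen pixels, which is a 2x zoom.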

Merry Christmas from melonDS
I know, I'm one day late.

There won't be anything very fancy for now, though. I haven't worked much on melonDS lately. Over the last week, my training period has been intense, and the long commute didn't help either. However, that went quite well, so I'm hopeful.

Another thing is, there isn't much more to show at this point.

What remains to be done is a bunch of cleanup, optimization work, and adding certain missing features, like proper chunked rendering if certain registers are changed mid-frame. Much like what the blargSNES hardware renderer does. Basically, turning this whole experimental setup into a finished product. It's always nice, but it doesn't really make for interesting screenshots.

As far as the 3D renderers are concerned, my work is mostly finished. I have ideas to improve the regular OpenGL renderer, but that's beyond the scope of the blackmagic3 branch for now. I could mention some, though: for example, attempting to add proper interpolation, instead of the "improved polygon splitting" hack. There would be some other old issues to fix in order to bring both OpenGL renderers closer to the software renderer, too.

With all this, I hope to be able to release melonDS 1.2 by early 2026. We'll see how this goes.

Either way, merry Christmas, happy new year, and all!
Why make it simple when it can be complicated?
First, a bit of a status update. I'm about to begin a training period of sorts, that will (hopefully!) lead to an actual job next year. It looks like a pretty nice job for me, too!

So that part is sorted out, for now.

Next, the OpenGL renderer is still in the works, although I've been taking a bit of a break from it.

But hey, when you happen to be stuck on a train for way longer than planned, what can you do? That's right, add features to your renderer! The latest fun feature I added is also the reason behind this post title.

The DS's 3D renderer has registers that set the clear color and depth, and a bunch of other clear attributes - basically what the color/depth/attribute buffers will be initialized to before drawing polygons. Much like glClearColor() and glClearDepth().

However, you can also use a clear bitmap. In this case, two 128K "slots" of texture VRAM (out of a total of 4) will be used to initialize the color and depth buffers: slot 2 for the color buffer, slot 3 for the depth buffer. The bitmaps are 256x256, and they can be scrolled. It isn't a widely used feature, but there are games that use it. As far as melonDS is concerned, the software renderer supports it, but it had always been missing from the OpenGL renderers.
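To picture how a scrolled 256x256 bitmap feeds a 256x192 screen, here is a small sketch of the addressing. The function names are hypothetical, and two things are assumptions rather than statements from this post: that the scroll wraps around at 256 in both directions, and that each bitmap entry is 16 bits wide (which is what 256x256 entries in a 128K slot works out to).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Texture VRAM slots are 128K each; slot 2 holds the clear color bitmap
// and slot 3 holds the clear depth bitmap.
constexpr std::size_t kSlotSize = 128 * 1024;
constexpr std::size_t kColorSlotBase = 2 * kSlotSize;
constexpr std::size_t kDepthSlotBase = 3 * kSlotSize;

// Hypothetical helper: index of the bitmap entry used for screen pixel
// (x, y). Assumption: scrolling wraps around at 256 in both directions.
std::size_t ClearBitmapIndex(int x, int y, uint8_t scrollX, uint8_t scrollY)
{
    assert(x >= 0 && x < 256 && y >= 0 && y < 192);

    std::size_t bx = static_cast<std::size_t>(x + scrollX) & 0xFF;
    std::size_t by = static_cast<std::size_t>(y + scrollY) & 0xFF;
    return by * 256 + bx;
}

// Byte offset into texture VRAM, assuming 16-bit entries.
std::size_t ClearColorOffset(int x, int y, uint8_t scrollX, uint8_t scrollY)
{
    return kColorSlotBase + ClearBitmapIndex(x, y, scrollX, scrollY) * 2;
}
```

The depth bitmap would be addressed the same way, starting from kDepthSlotBase instead.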

Since I was busy adding features to the compute shader renderer, I thought, hey, why not add this? Shouldn't be very difficult. And indeed, it wasn't. But the game I used to test it threw me for a loop.



This is Rayman Raving Rabbids 2. In particular, this is the screen that shows up after you've played one of the minigames. The bottom screen uses the clear bitmap feature - without it, the orange frame layer won't show up at all.

Hardware renderer progress
Hey hey, little status update! I've been having fun lately.



The hardware renderer has been progressing nicely lately. It's always exciting when you're able to assemble the various parts you've been working on into something coherent, and it suddenly starts looking a lot like a finished product. It's no longer just something I'm looking at in RenderDoc.

Those screenshots were taken with 4x upscaling (click them for full-size versions). The last one demonstrates hi-res rotscale in action. Not bad, I dare say.

It's not done yet, though, so I'll go over what remains to be done.


Mosaic

Shouldn't be very difficult to add.

Except for, you know, sprite mosaic. I don't really know yet how I'll handle that one. The way it works is intuitive if you're processing pixels left-to-right within a scanline, but this isn't how a modern GPU works.


Display capture

Hardware rendering, the fun
This whole thing I'm working on gives me flashbacks from blargSNES. The goal and constraints are different, though. We weren't doing upscaling on the 3DS, but also, we had no fragment shaders, so we were much more limited in what we could do.

Anyway, these days, I'm waist-deep into OpenGL. I'm determined to go further than my original approach to upscaling, and it's a lot of fun too.

I might as well talk more about that approach, and what its limitations are.

First, let's talk about how 2D layers are composited on the DS.

There are 6 basic layers: BG0, BG1, BG2, BG3, sprites (OBJ) and backdrop. Sprites are pre-rendered and treated as a flat layer (which means you can't blend a sprite with another sprite). Backdrop is a fixed color (entry 0 of the standard palette), which basically fills any space not occupied by another layer.

For each pixel, the PPU keeps track of the two topmost layers, based on priority orders.

Then, you have the BLDCNT register, which lets you choose a color effect to be applied (blending or fade effects), and the target layers it may apply to. For blending, the "1st target" is the topmost pixel, and the "2nd target" is the pixel underneath. If the layers both pixels belong to are appropriately selected in BLDCNT, they will be blended together, using the coefficients in the BLDALPHA register. Fade effects work in a similar fashion, except since they only apply to the topmost pixel, there's no "2nd target".

Then you also have the window feature, which can not only exclude individual layers from a given region, but also disable color effects. There are also a few special cases: semi-transparent sprites, bitmap sprites, and the 3D layer. Those all ignore the color effect and 1st target selections in BLDCNT, as well as the window settings.

In melonDS, the 2D renderer renders all layers according to their priority order, and keeps track of the last two values for each pixel: when writing a pixel, the previous value is pushed down to a secondary buffer. This way, at the end, the two buffers can be composited together to form the final video frame.
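The push-down scheme and the blending step can be sketched like this. It is a simplified illustration, not melonDS's actual renderer: the type and function names are hypothetical, the target masks are passed in directly instead of being decoded from BLDCNT, and the blend formula (coefficients in sixteenths, clamped to 16, result saturated at 31 per channel) is an assumption based on public hardware documentation. Windows, fade effects and the special cases mentioned above are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; colors are 15-bit, 5 bits per channel.
enum Layer : uint8_t { BG0, BG1, BG2, BG3, OBJ, Backdrop };

struct PixelPair
{
    uint16_t top = 0;
    Layer topLayer = Backdrop;
    uint16_t below = 0;
    Layer belowLayer = Backdrop;
};

// Layers are drawn back to front. Writing a pixel pushes the previous
// value down to the secondary slot.
void PushPixel(PixelPair& p, uint16_t color, Layer layer)
{
    p.below = p.top;
    p.belowLayer = p.topLayer;
    p.top = color;
    p.topLayer = layer;
}

// Assumed blend formula: per channel, (a*eva + b*evb) / 16, capped at 31.
uint16_t BlendColors(uint16_t a, uint16_t b, int eva, int evb)
{
    assert(eva >= 0 && evb >= 0);
    eva = std::min(eva, 16);
    evb = std::min(evb, 16);

    uint16_t out = 0;
    for (int shift = 0; shift < 15; shift += 5)
    {
        int ca = (a >> shift) & 0x1F;
        int cb = (b >> shift) & 0x1F;
        int c = std::min(31, (ca * eva + cb * evb) >> 4);
        out = static_cast<uint16_t>(out | (c << shift));
    }
    return out;
}

// Blends only if the top pixel's layer is a 1st target and the pixel
// underneath belongs to a 2nd target layer.
uint16_t Composite(const PixelPair& p, uint8_t firstTargets,
                   uint8_t secondTargets, int eva, int evb)
{
    bool blend = (firstTargets & (1 << p.topLayer)) &&
                 (secondTargets & (1 << p.belowLayer));
    return blend ? BlendColors(p.top, p.below, eva, evb) : p.top;
}
```

Keeping only two values per pixel is enough because blending never involves more than the topmost pixel and the one directly underneath it.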

melonDS 1.1 is out!
As promised, here is the new release: melonDS 1.1.

So, what's new in this release?

EDIT - there was an issue with the release builds that had been posted, so if your JIT option is greyed out and you're not using an x64 Mac, please redownload the release.


DSP HLE

This is going to be a big change for DSi gamers out there.

If you've been playing DSi titles in melonDS, you may have noticed that sometimes they run very slow. Single-digit framerates. Wouldn't be a big deal if melonDS was always this slow, but obviously, it generally performs much better, so this sticks out like a sore thumb.

This is because those titles use the DSi's DSP. What is the DSP, you ask? A specific-purpose (read: weird) processor that doesn't actually do much besides being very annoying and resource-intensive to emulate. They use it for such tasks as downscaling pictures or playing a camera shutter sound when you take a picture.

With help from CasualPokePlayer, we were able to figure out the 3 main classes of DSP ucodes those games use, determine their functionality, and implement HLE equivalents in melonDS. Thus, those wonderful DSP features can be emulated without utterly wrecking performance.

DSP HLE is a setting, which you will find in the emulation settings dialog, DSi-mode tab. It is enabled by default.

Hi-res display capture: we're getting there!
Sneak peek of the blackmagic3 branch:


(click them for full-res versions)

Those are both dual-screen 3D scenes, but notice how both screens are nice and smooth and hi-res.

Now, how far along are we actually with this?

As I said in the previous post, this is an improved version of the old renderer, which was based on a simple but limited approach. At the time, it was easy enough to hack that on top of the existing 2D engine. But now, we're reaching the limits of what is possible with this approach. So, consider this a first step. The second step will be to build a proper OpenGL-powered 2D engine, which will open up more crazy possibilities as far as graphical enhancements go.

I don't know if this first step will make it in melonDS 1.1, or if it will be for 1.2. Turns out, this is already a big undertaking.

I added code to keep track of which VRAM blocks are used for display captures. It's not quite finished, it's missing some details, like freeing capture buffers that are no longer in use, or syncing them with emulated VRAM if the CPU tries to access VRAM.

It also needs extensive testing and optimization. For this first iteration, for once, I tried to actually build something that works, rather than spend too much time trying to imagine the perfect design. So, basically, it works, but it's inefficient... Of course, the sheer complexity of VRAM mapping on the DS doesn't help at all. Do you remember? You can make the VRAM banks overlap!

So, yeah. Even if we end up making a new renderer, all this effort won't go to waste: we will have the required apparatus for hi-res display capture.

Display capture: oh, the fun!
This is going to be a juicy technical post related to what I'm working on at the moment.

Basically, if you've used 3D upscaling in melonDS, you know that there are games where it doesn't work. For example, games that display 3D graphics on both screens: they will flicker between the high-resolution picture and a low-resolution version. Or in other instances, it might just not work at all, and all you get is low-res graphics.

It's linked to the way the DS video hardware works. There are two 2D tile engines, but there is only one 3D engine. The output from that 3D engine is sent to the main 2D engine, where it is treated as BG0. You can also change which screen each 2D engine is connected to. But you can only render 3D graphics on one screen.

So how do those games render 3D graphics on both screens?

This is where display capture comes in. The principle is as follows: you render a 3D scene to the top screen, all while capturing that frame to, say, VRAM bank B. On the next frame, you switch screens and render a 3D scene to the bottom screen. Meanwhile, the top screen will display the previously captured frame from VRAM bank B, and the frame you're rendering will get captured to VRAM bank C. On the next frame, you render 3D graphics to the top screen again, and the bottom screen displays the capture from VRAM bank C. And so on.

This way, you're effectively rendering 3D graphics to both screens, albeit at 30 FPS. This is a typical use case for display capture, but not the only possibility.
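The alternation described above can be written down as a tiny per-frame plan. This is only a model of the scheme for the bank B / bank C example; the types and function names are hypothetical, and a real game would of course do this by programming the video registers.

```cpp
#include <cassert>

// Hypothetical model of the dual-screen 3D scheme, for illustration only.
enum class Screen { Top, Bottom };
enum class Bank { B, C };

struct FramePlan
{
    Screen render3D;      // screen receiving live 3D this frame
    Bank captureTo;       // bank the live frame is captured to
    Screen showsCapture;  // screen displaying an older capture
    Bank displayFrom;     // bank that older capture is read from
};

// Even frames render to the top screen, odd frames to the bottom one.
// On the very first frame, bank C has not been captured to yet.
FramePlan PlanFrame(unsigned frame)
{
    if (frame % 2 == 0)
        return { Screen::Top, Bank::B, Screen::Bottom, Bank::C };
    return { Screen::Bottom, Bank::C, Screen::Top, Bank::B };
}

// Sanity check: the screen showing a capture is never the one being
// rendered to, and a bank is never read and written in the same frame.
void CheckPlan(const FramePlan& p)
{
    assert(p.render3D != p.showsCapture);
    assert(p.captureTo != p.displayFrom);
}
```

Each screen only gets a fresh 3D frame every other frame, which is where the 30 FPS figure comes from.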

Display capture can receive input from two sources: source A, which is either the main 2D engine output or the raw 3D engine output, and source B, which is either a VRAM bank or the main memory display FIFO. Then you can either select source A, or source B, or blend the two together. The result from this will be written to the selected output VRAM bank. You can also choose to capture the entire screen or a region of it (128x128, 256x64 or 256x128).

All in all, quite an interesting feature. You can use it to do motion blur effects in hardware, or even to render graphics to a texture. Some games even do software processing on captured frames to apply custom effects. It is also used by the aging cart to verify the video hardware: it renders a scene and checksums the captured output.

For example, here's a demo of render-to-texture I just put together, based on the libnds examples:
