melonDS 0.6
I'm lazy, and there are few visual changes, so I will reuse those screenshots.

So what's new in melonDS 0.6? Little emulation wise, a bit more UI wise.

First of all, I want to thank the artists who have been (and are still) drawing all sorts of rad icons for melonDS. For 0.7, I will pick the one I like the best (and it won't be easy, heh). Thing is, I want to put the icon in the melonDS windows, and I will need to add support for that to libui. Which also means embedding the icons in some portable format, because each OS does its own thing when it comes to window icons.

Emulation wise, the big thing is the sound fix I talked about in a previous blog post. I already went in detail over this, but, long story short, surround works now. And sound emulation is more accurate, that can only be good.

There. The rest is meaningless shenanigans.

UI wise, you get fancy display modes now. Those were also discussed in a previous blog post, so no big surprise there.

The only thing that was added is a toggle for linear filtering, for those who like pixels.

The rest is, well, little bug fixes. Under Windows, you can now load ROMs with non-ASCII characters in their paths. As a side note, under Linux and OSX, fopen() can take UTF8 paths, but Windows requires a separate codepath because herpderp. There, fopen() interprets paths in the system's ANSI code page rather than UTF8, so for anything outside of that you need to use the Windows-only _wfopen() which takes wide-char strings. In the end, the code is a bit ugly, but it works.

That's about it for this release. But stay tuned, 0.7 should bring in some Christmas fun.

Windows 64-bit
Linux 64-bit

melonDS Patreon
Another of those long-due fixes
The sound core is fixed, or mostly.

If you've been playing with melonDS, you may have noticed that effects like surround generally didn't work, and that whenever they worked, they resulted in crappy sound. Resetting or loading different games would also give different results.

So I've been addressing all that junk.

The randomness was easily fixed -- the write position for capture units wasn't being reset properly, so whenever the capture unit was started, it would start writing wherever it was previously left. Not really a good thing.
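
As a rough illustration of that fix, here is a minimal Python sketch of a capture unit whose write position is reset on start. The class and field names are hypothetical and not taken from melonDS; the only point is the reset in `start()`.

```python
class CaptureUnit:
    """Toy capture unit writing samples into a looping buffer (hypothetical)."""

    def __init__(self, dest, length):
        self.dest = dest        # start index of the capture buffer
        self.length = length    # buffer length, in samples
        self.write_pos = 0
        self.running = False

    def start(self):
        self.running = True
        # The fix: always restart from the beginning of the buffer.
        # Without this line, a restart would continue from wherever
        # the previous run left off.
        self.write_pos = 0

    def capture(self, memory, sample):
        if not self.running:
            return
        memory[self.dest + self.write_pos] = sample
        self.write_pos = (self.write_pos + 1) % self.length


memory = [0] * 4
unit = CaptureUnit(dest=0, length=4)
unit.start()
unit.capture(memory, 1)
unit.capture(memory, 2)
unit.start()                # restart: write position goes back to 0
unit.capture(memory, 9)
print(memory)               # [9, 2, 0, 0]
```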

The rest was trickier though.

The DS sound controller provides 16 channels that can play a few types of audio, and 2 capture units that can capture sound output and write it to memory.

Several games provide different audio modes, typically headphones/stereo/surround. In those games, the audio output isn't directly sent to the speakers, but it is sent to the capture units. The game then alters it as it wishes, and two special channels (1 and 3) are used to send the final output to the speakers.

The mechanism is generally used for the aforementioned audio modes, but some games take it further -- Golden Sun: Dark Dawn goes as far as using it to play its sound effects.

While the mechanism (which we will name "surround setup") may sound simple, getting it to work is actually tricky.

It has been known for a while why it didn't work in melonDS: capture and channel 1/3 playback were done in the wrong order. However, putting them in the right order resulted in distorted, tinny sound. So I just kept it in the wrong order and ignored it.

Until now.

When you start a sound channel on the DS, it doesn't start playing immediately, there is a delay of a few samples. However, capture units start immediately.

The surround setup starts the capture units and their associated output channels at the same time, and gives them the same memory address. The idea is that, since channels are processed before capture units, the output channel reads sound data before the capture unit writes new data, and the distance is the largest possible, so as to give the game enough time to apply effects to it.

However, the channel starts with a delay, which means it's lagging behind a little. In the case of melonDS, as the mixer processed 16 samples per run, the output channels would get bits of freshly captured data mixed in with the older data as a result of this. You can guess how this goes.

Reducing the number of samples per run to 2 or 1 got rid of the audio distortion, but surround didn't work at all since the channels would only get freshly captured data.

Removing the channel start delay fixed the problem, but obviously it's not proper emulation.

The killer detail is that when the sound controller needs to access memory, it doesn't do so directly, but uses a buffering mechanism (FIFO) for each channel and capture unit. GBAtek mentions their existence, but... that's all.

The FIFOs need to be emulated for surround to work properly.

Which means that I had to do hardware tests to figure out how it works. There are probably still missing details.

Channel FIFOs can hold 8 words (32 bytes) of data. They are filled in chunks of 4 words when needed. An interesting detail is that when starting a channel, two chunks are buffered. Then it seems that a new chunk is buffered every time the FIFO runs less than half full.

Capture unit FIFOs are simple write buffers that hold 4 words. They are flushed to memory either when they're full or when the end of the capture length is reached (regardless of whether the capture loops).
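
The FIFO behavior described above can be sketched in Python as follows. This is only a model under stated assumptions, not melonDS code: memory is a plain list of words, the class names are made up, the refill threshold is taken literally as "fewer than 4 words left", and the capture position is assumed to wrap back to the start when the capture length is reached.

```python
from collections import deque

CHUNK_WORDS = 4      # channel FIFOs are refilled 4 words at a time
FIFO_WORDS = 8       # channel FIFO capacity, in words
CAPTURE_WORDS = 4    # capture FIFO capacity, in words


class ChannelFifo:
    """Read-side buffer between memory and one sound channel (sketch)."""

    def __init__(self, memory, start):
        self.memory = memory
        self.read_pos = start
        self.words = deque()
        # Starting a channel buffers two chunks, which fills the FIFO.
        self._fetch_chunk()
        self._fetch_chunk()

    def _fetch_chunk(self):
        for _ in range(CHUNK_WORDS):
            self.words.append(self.memory[self.read_pos % len(self.memory)])
            self.read_pos += 1

    def pop_word(self):
        word = self.words.popleft()
        # Assumed rule: refill as soon as the FIFO is less than half full.
        if len(self.words) < FIFO_WORDS // 2:
            self._fetch_chunk()
        return word


class CaptureFifo:
    """Write-side buffer between one capture unit and memory (sketch)."""

    def __init__(self, memory, dest, length):
        self.memory = memory
        self.dest = dest
        self.length = length    # capture length, in words
        self.flushed = 0        # words already written in this pass
        self.pending = []

    def push_word(self, word):
        self.pending.append(word)
        end_reached = self.flushed + len(self.pending) == self.length
        # Flush when full, or when the end of the capture length is reached.
        if len(self.pending) == CAPTURE_WORDS or end_reached:
            for pending_word in self.pending:
                self.memory[self.dest + self.flushed] = pending_word
                self.flushed += 1
            self.pending.clear()
            if end_reached:
                self.flushed = 0    # assumed wrap for the next pass
```

The relevant consequence for surround is that a channel reads memory up to 8 words ahead of what it is playing, while captured data reaches memory up to 4 words late, so the two never touch the same word at the same time even though they share an address.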

So after implementing the whole FIFO thing, we can finally say that a bunch of sound issues are fixed! Surround works, sound effects in Golden Sun: Dark Dawn work too, some crackling that appeared in Rayman RR2 is gone, etc...
How far we have gone, and where we are going next
melonDS has been going for one year now.

Well, the repo was opened in May 2016, but the real work started in November 2016.

I didn't really know what I was in for, but hey, I wanted to have fun making something new. There wasn't much else in terms of goals, other than the famous goal of successfully emulating local wifi.

Of course, getting an emulator project going is initially a gruelling task, as it takes quite a bit of work before you can get interesting results. In the case of the DS, as I went with the goal of emulating the BIOS boot procedure, there was quite a lot to be done before the firmware would try to display things. In the case of melonDS, it took about 2.5 months to get there, but I don't consider it a good idea to look at durations. I was unemployed at the time (or rather, waiting to start that job), I'm a lazy fuck, and a couple vicious CPU bugs got in the way.

The first release of melonDS was a mere curiosity. It was inferior to the existing emulators. But that didn't matter, it was out there for the curious, and I was having fun with that project. The next releases would deliver quality.

The 3D renderer was my first attempt at a polygon rasterizer. I initially fumbled getting things like perspective correction working, but gradually gained more understanding of the process. Conducted tests to try understanding how the actual GPU did things and which shortcuts it took. And it has been an interesting ride.

I guess it turned out okay. This renderer has become the most accurate among all DS emulators. I thought, since the 2D renderer is going to be pixel-perfect, why not make 3D pixel-perfect too? I'm not there yet though, but the renderer is more than good enough for most games. Not good enough for the aging cart tests (which demand perfection to the pixel), but I put that on hold for a while.

On the other hand, the renderer code evolved as I discovered things, and is probably not as clean as it could be. In the long term, it could benefit from a rewrite.

And this shit has been going on for one year now. Time flies, it's crazy.

melonDS became something serious. Not only does it have a pretty solid emulator core, it even does things no other emulator does. Be it very niche things like properly emulating the mainmem display FIFO, or emulating enough of the wifi system for local multiplayer to be possible.

Of course, it's not perfect.

Local wifi suffers from data loss every now and then, and some games just refuse to connect at all.

CPU timings are still grossly inaccurate. It's a shitty zone, especially on the ARM9, where the caches can affect performance radically. There are basically two unappealing choices: emulating the whole MPU/cache system (you can figure the impact on emulator performance), or guessing the timings in a grossly inaccurate fashion.

Sound has its lot of issues too, and I have yet to figure those out. Surround modes don't work well, when they work at all (generally they don't work). There are a couple issues with sound quality, one coming from bad emulation and one coming from the fact that the frontend doesn't do any resampling -- it picks the output frequency closest to the core's sampling rate, and hopes things will go right. And, in the DS, sound mixing and sampling is driven by the system clock, which results in a non-integer output frequency.

And, most notably, the UI is still pretty lacking in functionality. I intend to keep melonDS lightweight and straightforward, but regardless, some features can't hurt. This has been worked on since 0.5, but at a slow-ish pace.

Real life has caught up to me. I had a job, but it ended two months ago, and I need to find something else. There are also other projects I've been rather busy with, lowering the amount of time I spend on the internet.

And... things I recently discovered about myself. Actions I want to take in that regard, but nothing is going to be easy.

So, even if the pace has slowed down, melonDS isn't dying.

Here's a glimpse of what's being worked on:

Rotation, for that occasional game that asks you to hold your DS sideways.

But there are also fancy screen dispositions:

• Natural: screens are arranged as they are on a DS.
• Vertical: screens are stacked on top of each other (regardless of rotation).
• Horizontal: screens are placed next to each other, top screen first.

Different sizing modes are provided too:

• Even: the screens are given the same size.
• Emphasize top: the bottom screen is kept at native resolution, the top screen is given all the remaining space.
• Emphasize bottom: vice-versa.
• Auto: emphasizes the screen which receives 3D graphics (typically the most important one), works well in NSMB for example. If 3D is displayed on both screens, behaves the same as Even.
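
To make the sizing modes concrete, here is a small Python sketch for the Vertical layout only. It is not how melonDS computes its layout: the function names and mode strings are made up, the window is assumed to be at least two native screens tall, and falling back to Even when neither screen shows 3D is an assumption.

```python
NATIVE_HEIGHT = 192   # height of one DS screen, in pixels


def resolve_auto(top_has_3d, bottom_has_3d):
    """Pick the concrete mode that Auto stands for."""
    if top_has_3d and not bottom_has_3d:
        return "emphasize_top"
    if bottom_has_3d and not top_has_3d:
        return "emphasize_bottom"
    # 3D on both screens behaves like Even; neither is assumed to as well.
    return "even"


def split_heights(window_height, mode, top_has_3d=False, bottom_has_3d=False):
    """Return (top_height, bottom_height) for vertically stacked screens."""
    if mode == "auto":
        mode = resolve_auto(top_has_3d, bottom_has_3d)
    if mode == "even":
        half = window_height // 2
        return half, window_height - half
    if mode == "emphasize_top":
        # Bottom screen stays at native size, top takes the rest.
        return window_height - NATIVE_HEIGHT, NATIVE_HEIGHT
    if mode == "emphasize_bottom":
        return NATIVE_HEIGHT, window_height - NATIVE_HEIGHT
    raise ValueError(f"unknown sizing mode: {mode}")


print(split_heights(768, "even"))                     # (384, 384)
print(split_heights(768, "emphasize_top"))            # (576, 192)
print(split_heights(768, "auto", bottom_has_3d=True)) # (192, 576)
```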

There are still details to iron out, but so far those display settings are working rather well, and I like how they turned out.

Later on (probably after 0.6 at this rate), I want to work on non-local wifi, and some other things. For example, emulating the DS browser in melonDS and getting it to browse this site would be a cool feat. Can't tell how well it would do these days though, that browser was already subpar when it was released :P

There is also that Pokémon cart, which interests me. There are not a whole lot of players waiting to play that game, I guess, but it's interesting from a technical standpoint. The cart contains a Broadcom controller that seems to be quite capable, at least enough to replace typical backup memory entirely. I just received a DS cart slot, so I'll be able to mess with it.

I am also very interested in that GBA-slot camera addon, but it's elusive as fuck. There is one game that comes with it (supposedly another game is compatible, but when I ordered that game, I received it without a camera), but when I search for it, all I can find is the DSi version. I need the DS version, the DSi version uses the DSi's builtin camera and thus doesn't come with the camera addon.

So, if you happen to have that camera addon, or know where I can get one... I am very interested.

Upscaling would be another thing to work on, according to popular demand.

This will need a new renderer, likely using OpenGL or a separate well-optimized software renderer. This is a bit of an issue.

On one hand, the current 3D renderer is meant to be accurate, and by that, I mean replicating the hardware's low-level behavior. The GPU is pretty much tailored for the DS, so I'm not positive that such a renderer can be made to render at higher resolutions and give good results. Not to mention the speed impact from that: rendering at 2x IR requires filling 4 times more pixels, and so on. The growth of performance requirement is quadratic.
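
The quadratic growth is easy to check with a few lines of Python. The only inputs are the native 256x192 screen size and an integer scale factor; the function name is made up for the example.

```python
NATIVE_WIDTH, NATIVE_HEIGHT = 256, 192


def pixels_per_frame(scale):
    """Number of pixels to fill for one screen at a given integer scale."""
    return (NATIVE_WIDTH * scale) * (NATIVE_HEIGHT * scale)


for scale in (1, 2, 3, 4):
    ratio = pixels_per_frame(scale) // pixels_per_frame(1)
    print(f"{scale}x: {pixels_per_frame(scale)} pixels ({ratio}x native)")
# 1x: 49152 pixels (1x native)
# 2x: 196608 pixels (4x native)
# 3x: 442368 pixels (9x native)
# 4x: 786432 pixels (16x native)
```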

Besides, people who want upscaling will generally want improved graphics. Texture filtering and maybe texture upscaling, for example. Or actual antialiasing (as opposed to the faked edge antialiasing the DS does). Which the current renderer will not support, because the DS doesn't do that.

It would make sense to implement these improvements in a hardware renderer. And also be less costly.

However, don't forget that the DS GPU is quirky. Some things would be difficult, or even impossible, to emulate correctly with OpenGL. One of them is how the DS keeps track of polygon IDs per pixel, but does so separately for opaque and translucent pixels. Those affect rendering of translucent and shadow polygons, and edge marking. Replicating the whole behavior in OpenGL would require a 16-bit stencil buffer, which is way atypical. Or maybe some cool trickery.

Which reminds me of blargSNES and its hardware renderer. While all the trickery we had pulled to render SNES graphics on the 3DS GPU was definitely cool, it took its toll on performance.

Sure, the average computer GPU is way more powerful than the 3DS, but... yeah.

There is also the bit where the DS happily mixes 2D and 3D.

Old consoles featured 2D tile engines. Newer consoles feature 3D GPUs. The DS is in between, and... features both. A unique situation.

Situation which gives us the following problem: if we upscale 3D graphics, what should we do to 2D graphics?

And the answer pretty much depends on what the game draws. Something like NSMB could use an xBR filter, for example. But these filters don't work well on more complex graphics.

There is also the whole issue that the 2D renderer in melonDS is scanline-based. I have a good idea to implement upscaling in a non-intrusive way, but it will be an issue as far as filters are concerned. The 2D layers are added to the final framebuffer as they're rendered (rendering them separately would be slower). So, while the 3D layer is rendered separately and can have shit done to it, this doesn't apply to the 2D layers.

So most likely the 2D layers would just not be filtered at all.

Which reminds me of display capture, which allows writing rendered graphics to VRAM. Captured images have to be at the internal resolution. Those can also be modified by software before being displayed.

Which is less than ideal. For example, display capture can be used to display 3D graphics on both screens. While one screen will be rendering a 3D frame, the other will be rendering a captured version of its previous frame. The captured 3D frame is already degraded from 18-bit color to 15-bit due to how the hardware works. Throw upscaling in the mix and the captured 3D frame ends up being a downscaled version of the upscaled 3D frame, which would be re-upscaled with no filtering as it's displayed by the 2D engine. The quality loss would likely be visible and result in some bad flickering.
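
Both sources of quality loss can be shown on toy data in Python. This is only a sketch: dropping the lowest bit of each 6-bit channel and keeping one pixel out of every two are assumptions made for the example, not confirmed hardware or melonDS behavior, and the pixel values are made up.

```python
def rgb666_to_rgb555(r, g, b):
    """Reduce 6-bit channels to 5 bits by dropping the lowest bit (assumed)."""
    return r >> 1, g >> 1, b >> 1


def downscale_row(row, factor):
    """Keep one pixel out of every `factor` (assumed capture downscale)."""
    return row[::factor]


def upscale_row(row, factor):
    """Unfiltered upscale: repeat each pixel `factor` times."""
    return [pixel for pixel in row for _ in range(factor)]


# Two distinct 18-bit colors collapse into the same 15-bit color.
print(rgb666_to_rgb555(63, 32, 1))   # (31, 16, 0)
print(rgb666_to_rgb555(62, 33, 0))   # (31, 16, 0)

# One row of a frame rendered at 2x, then captured and shown again.
rendered_2x = [10, 12, 20, 22]
captured = downscale_row(rendered_2x, 2)
redisplayed = upscale_row(captured, 2)
print(captured)      # [10, 20]
print(redisplayed)   # [10, 10, 20, 20]
```

A screen that alternates between `rendered_2x` on one frame and `redisplayed` on the next would show exactly the kind of flickering mentioned above.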

There would be ways to get around it, but it's neither trivial nor error-proof (see above-- the captured frame can very well be modified by software. There are also several ways a captured frame can be displayed, some games use large sprites).

The best solution in this kind of situation would be to disable upscaling entirely.

Oh well. Time will tell where this goes.
melonDS 0.5, finally!
Yup, there it is.

It's a recurrent theme in my projects that UI shenanigans are something I have trouble with. It's not that hard to make a UI, but making it cross-platform is another deal entirely. I want to avoid ending up with a different, separate UI per platform.

In the end, I went with libui, which is small and lightweight. I modified it to fit my needs.

The decision took a while though, and is one of the reasons why the release took so long to happen. Other reasons being, well, real life. My previous job is over, and hello job hunting again.

So this release features a new UI. It's not too new, but it removes the unneeded windows. Menus and video output are in the same window now, and the console window (the black window that spews nonsense) will be absent from release builds.

You can also drag-drop ROMs onto the window to run them. It is also possible to run ROMs via command line, but ATM when doing so melonDS will attempt to look for BIOS/firmware/INI in the directory the ROM is in.

If you play with your keyboard, you will need to remap your input, as different key codes are used (raw scancodes vs SDL scancodes).

Other than that, there are not a whole lot of changes emulation-wise. A few fixes, and the addition of antialiasing, as mentioned in the previous post.

Regardless, have fun.

Windows 64-bit
Linux 64-bit

The new UI library is incompatible with Windows XP or Vista, so there will be no such builds.

This is indeed what has been worked on lately, so congrats to those who guessed right :P

I also found out that my edge slope functions weren't perfect. I tried, with moderate success, to make them more accurate, but they're still not perfect. So for now, I need to let it cool down. I decided I would make antialiasing 'good enough', then start working on the UI, considering that there are other areas of the GPU that aren't perfect either, like polygon clipping.

So here's a couple screenshots:

I picked cases where the result is quite visible. Antialiasing generally makes things look better, but whether it is that visible depends on what's being rendered.

Antialiasing may look like one of those minor details that are unimportant. But I consider it important to emulate, past the effect of making things look nicer: the way it works on the DS is very different from common antialiasing techniques. If you're into DS homebrew, you can't just turn on antialiasing to magically make things look better.

To begin with, antialiasing is applied to polygon edges based on their slopes, in order to make the edges look smoother. There is no antialiasing applied inside polygons or at intersections between two polygons.

Antialiasing also doesn't apply to translucent polygons. But actually, it's more complex than that -- the polygon fill rules depend on whether the individual pixel being rendered is translucent. So if a polygon has opaque and translucent pixels at its edges, the opaque parts will be antialiased and the translucent parts won't be.

The effect was shown in the following dasShiny screenshot. Note that dasShiny only has partial antialiasing support, it is only applied to Y-major edges.

More importantly, antialiasing interferes with wireframe polygons, lines and points. Wireframes are drawn as if they were filled polygons, ie. only the outer side gets antialiased. Lines are similarly antialiased on one side only. Points are said to disappear entirely.

Edge marking is also affected in that marked edges are made translucent when antialiasing is enabled. No idea whether it was intended, but it needs to be taken into account.

The real gold is how the hardware is designed to ensure antialiased edges are always rendered properly. The design is quite atypical and inspired by how the 2D GPU does blending.

If antialiased edges were immediately blended against the pixels underneath, it could cause visible glitches if another polygon is later rendered behind an antialiased polygon. So instead, per-pixel edge coverages are put aside for later: antialiasing is a separate rendering pass. It is the final pass, applied after edge marking and fog.

That final pass needs to know what the colors under the polygon edges are. Thus, the GPU keeps track of the last two colors rendered. When rendering an opaque polygon, the existing color is pushed down, and the new polygon color is written at the top. This is the same design that is used by the 2D GPU for its blending.

But if you render a polygon behind another, that polygon's pixels can be inserted behind the existing polygon, effectively replacing the existing bottom-most pixels.

The limitation is that this only works for the topmost edge pixels. If you draw two identical polygons A and B at the same position, with A on the top, A's edges are blended against B, and B isn't antialiased.
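
The two-color scheme for opaque polygons can be sketched in Python like this. It is a simplified model, not the melonDS renderer: colors are a single integer channel, coverage is expressed out of 32, smaller depth means closer, and requiring an inserted pixel to be in front of the existing bottom pixel is an assumption. All names are hypothetical.

```python
from dataclasses import dataclass

FAR = 1 << 24          # "nothing drawn yet" depth (24-bit depth range)
FULL_COVERAGE = 32     # hypothetical coverage scale: 32 means fully covered


@dataclass
class Pixel:
    top_color: int = 0
    bottom_color: int = 0
    top_depth: int = FAR
    bottom_depth: int = FAR
    coverage: int = FULL_COVERAGE   # edge coverage of the topmost pixel


def draw_opaque(px, color, depth, coverage=FULL_COVERAGE):
    if depth < px.top_depth:
        # In front of everything: push the existing color down.
        px.bottom_color, px.bottom_depth = px.top_color, px.top_depth
        px.top_color, px.top_depth, px.coverage = color, depth, coverage
    elif depth < px.bottom_depth:
        # Behind the top pixel: insert underneath, replacing the bottom.
        px.bottom_color, px.bottom_depth = color, depth


def resolve(px):
    """Final antialiasing pass: blend top against bottom by coverage."""
    weighted = px.top_color * px.coverage
    weighted += px.bottom_color * (FULL_COVERAGE - px.coverage)
    return weighted // FULL_COVERAGE


# Edge pixel of polygon A (half covered), then polygon B drawn behind it.
px = Pixel()
draw_opaque(px, color=200, depth=5, coverage=16)
draw_opaque(px, color=100, depth=8)
print(resolve(px))   # 150
```

Because the blend is deferred to `resolve()`, drawing B before A gives the same 150, which is the glitch-free property the separate pass is there for.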

Things are funkier when translucency is involved. As far as I understand, translucent polygons don't push the existing topmost pixel down, but they blend with both topmost and bottom-most pixels. Translucent pixels that are behind the topmost pixel can still be blended with the bottom-most pixel if they come in front of it.

melonDS doesn't emulate this bit yet, but I will have to rework the renderer code at this point to make things nicer to work with.

Similarly, fog is applied to both topmost and bottom-most pixels, so that antialiased edges are still fogged properly. This part is emulated but at the expense of some duplicate code.

I'm not going to cover in detail how edge pixel coverages are calculated, because I'm still not quite sure. I haven't even gotten the edge functions perfect yet. This GPU isn't done being weird, heh.

But enough GPU work for now. There are still many other areas that need work. And I really want to put out a release with a nice UI.

As a side note: much of the research was initially done by Cydrak, and dasShiny was the first to attempt antialiasing support (albeit incomplete) (not counting DeSmuME's option to use OpenGL AA).
Sneak peek
The 3D renderer is being worked on, and for example, this is in the works:

I'll let you guess ;)

Work is also being put into figuring out the exact GPU algorithms. What has been done so far is a double bonus: not only is the renderer more accurate, it's also faster as a result of doing fewer divisions per pixel. The GPU takes shortcuts, so figuring them out allows for this kind of double-bonus optimization.

For example:

The buttons have dents in their borders, that are also visible on hardware. Those are caused by interpolation quirks. Artificially perfect interpolation will make the buttons look "perfect".

It's a tiny detail that likely doesn't matter to a whole lot of gamers, but as I said, I like figuring out the logic behind what I'm observing :P

Besides, it can't be bad to have an accurate renderer. You never know when a homebrew dev or ROM hacker may run into GPU quirks, but they may find out about those more easily if they can reproduce the quirks on an emulator.

Or you have those gamers who want the original experience, too :P

Regardless, the next thing that will be worked on after this is... the UI. This will be a surprise :)
So things have been quiet lately. My apologies for that, and I can assure you that melonDS isn't dead.

First thing is that real life is striking back. My current job is ending at the end of this month, and I need to finish their project (fresh new website) and put it live. Most of it is done already, but it takes time to check everything and ensure it's alright and looks nice and all. So this occupies my mind more.

I'm also busy with parts of the melonDS adventure that don't produce a whole lot of visual results. For one, I'm investigating interpolation, and once again we're into weird land.

Z interpolation, for example: the depth buffer is 24-bit, but in Z-buffering mode, interpolation suffers from precision loss at various stages. The output precision actually depends on the value range being interpolated over (the greater the difference, the bigger the precision loss).

W-buffering doesn't have these issues as it uses untransformed W values, and uses the regular perspective-correct interpolation path (while Z-buffering requires using special linear interpolation as transformed Z values are already linear in screen space).

Regular perspective-correct interpolation is weird too. It seems to apply correction to W values based on their first bit. I don't quite see what was intended there, if it was intended at all -- after all, it could just be a hardware glitch. But regardless, that causes slight differences that can have visible effects. It's generally a texture being one pixel off, so one would say it doesn't really matter, but I like figuring out the logic behind what I'm observing.

I also want to finally tackle the UI. melonDS has become a fairly solid emulator, but the current UI doesn't really reflect that.

I still haven't picked something to go with, though. I think I'll pick something lightweight, like byuu's hiro, and modify it to interoperate nicely with SDL. I'm not sure how well that can work across platforms, but ideally I would create an SDL window and add a menubar to it, rather than having two separate windows.

I'm also thinking about a buildbot type thing. People wouldn't have to wait for proper releases, but on the other hand, I fear it reduces or kills my incentive to do proper releases.
Why there is no 32-bit build of melonDS
The main reason is that I don't have an incentive to provide 32-bit builds. Most people already have 64-bit OSes.

That being said, melonDS can currently run on 32-bit platforms. It may be less performant, as the 3D renderer does a lot of 64-bit math, but it is still possible.

But if I ever decide to implement a JIT, for example, there will be no 32-bit version of it.

If you're stuck on a 32-bit OS for hardware reasons, your computer will not be fast enough to run melonDS at playable speeds.

melonDS will be optimized, it will run faster, but it will also tend towards more accuracy. So I can't tell how fast it will be in the end. But I highly doubt it will run well on a PC from 2004. Maybe it will, if a JIT is made, but that's not a high priority task.

If you are stuck on such hardware, NO$GBA is a better choice for you. Or NeonDS if you don't mind lacking sound. Or hell, the commonly mentioned method of running DraStic in an Android emulator -- those who bash DeSmuME at every turn claim it's fast.

Truth is, emulating the DS is not a walk in the park. People tend to assume it should be easy to emulate fast because the main CPU is clocked at a measly 66MHz. Let's see:

There are two CPUs. ARM9 and ARM7, 66MHz and 33MHz respectively. Which means you need to keep them more or less in sync. Each time you stop emulating one CPU to go emulate the other (or to emulate any other hardware), performance takes a hit, but synchronizing too loosely can cause games to break. So you need to find the right compromise.
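
The compromise can be illustrated with a toy Python scheduler. This is not the melonDS scheduler: the function name, the slice size parameter and the callbacks are all hypothetical, and the CPUs are reduced to "run for N cycles" callbacks with the ARM7 running at half the ARM9 clock.

```python
def run_interleaved(total_arm9_cycles, slice_cycles, run_arm9, run_arm7):
    """Alternate between both CPUs in slices and count the switches."""
    arm9_time = 0
    arm7_time = 0
    switches = 0
    while arm9_time < total_arm9_cycles:
        step = min(slice_cycles, total_arm9_cycles - arm9_time)
        run_arm9(step)
        arm9_time += step
        # Let the ARM7 catch up to the same point in time (half the clock).
        arm7_target = arm9_time // 2
        run_arm7(arm7_target - arm7_time)
        arm7_time = arm7_target
        switches += 2
    return switches


def ignore(cycles):
    pass


# Tighter sync: the CPUs drift by at most 10 cycles, at the cost of 200 switches.
print(run_interleaved(1000, 10, ignore, ignore))    # 200
# Looser sync: only 20 switches, but up to 100 cycles of drift.
print(run_interleaved(1000, 100, ignore, ignore))   # 20
```

The slice size is the knob: smaller slices keep the two CPUs closer together but pay the switching cost more often.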

The ARM7 generally handles tasks like audio playback, low-level wifi access, and accessing some other peripherals (power management, firmware FLASH...). All commercial games use the same ARM7 program, because Nintendo never provided another one or allowed game devs to write their own. This means that in theory the ARM7 could be emulated in HLE. In practice, this has never been attempted, unless DraStic happens to do it. It's also worth noting that it would be incompatible with homebrew, since they don't use Nintendo's ARM7 program.

If it is possible to get ARM7 timings reasonably accurate without too much effort, the ARM9 is another deal. It is clocked at 66MHz, but the bus speed is 33MHz, so the CPU needs to adjust to that when accessing external memory. Accesses to main RAM are slow, more so than on the ARM7, due to what appears to be bad hardware design. But the ARM9 has caches which can attenuate the problem (provided the program is built right). When the caches are used, timings can vary dramatically between cache hits and cache misses.

So emulating ARM9 timings is a choice between two unappealing options: 1) emulating the whole memory protection unit and caches, or 2) staying grossly inaccurate. I went with the second option with melonDS, but I'm considering attempting MPU/cache emulation to see how much it would affect performance.

Note that Rayman DS is an example of a timing-sensitive game: when timings are bad enough, it will start doing weird things. Effects get worse as timings get worse, and can range from text letters occasionally jumping out of place to camera tracking shots going haywire. I have observed polygons jumping out in melonDS, so the timings aren't good enough for this game.

And then you have all sorts of hardware attached to the CPUs. Timers, DMA channels, or the video hardware-- oldschool 2D tile engines and a primitive, quirky custom 3D GPU. The 2D GPUs need to be emulated at least once per scanline as games can do scanline effects by changing registers midframe (it is less common than on the SNES for example, but it's still a thing). The 3D GPU is another deal: geometry is submitted to it by feeding commands to a FIFO. You need to take care of running commands often enough to avoid the FIFO getting full, especially as games often use DMA to fill it.

Oh by the way, the 2D GPUs are actually pretty complex, supporting a variety of BG modes and sizes, multiple ways to access graphics for sprites, and so on. The 3D GPU isn't any better, I think we have already established that it's a pile of quirks.

So yeah, with the sheer amount of hardware that must be emulated, the DS isn't a piece of cake.
melonDS 0.4 -- It's here, finally!

melonDS 0.4 was long awaited, and finally, it's here!

So here's a quick rundown of the changes since 0.3. I'm keeping the best for the end.

The infamous boxtest bug that plagued several games has finally been fixed. The bug generally resulted in missing graphics.

The boxtest feature of the DS lets you determine whether a given box is within the view volume. Several games use it to avoid submitting geometry that would end up completely offscreen. A prime example would be Nanostray, which uses it for everything, including 2D menu elements that are always visible.

Technically, you send XYZ coordinates and sizes to the GPU, which calculates the box vertices from that. The box faces are then transformed and clipped like regular geometry, and the test returns true if any geometry makes it through the process (which would mean that it appears onscreen). This also means that the result will be false if the view volume is entirely contained within the box.

I had no idea where the bug was, as melonDS did all that correctly, and some tests with the libnds BoxTest sample revealed nothing. It turned out that the issue lay in the initial calculation of the box coordinates. When melonDS calculated, say, "X + width", it did so with 32-bit precision. However, the hardware calculates it with 16-bit precision, so if the result overflows, it just gets truncated. And, surprise, games tend to rely on that behavior. Getting it wrong means you end up testing a different box, and getting different results. Hence the bug.
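
The difference between the two calculations fits in a few lines of Python. The function names are made up, the sample values are arbitrary, and treating the truncated result as a signed 16-bit value is an assumption made for the example.

```python
def to_signed16(value):
    """Truncate to 16 bits and reinterpret as a signed value."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def far_edge_wide(x, width):
    """The buggy version: the sum keeps its full precision."""
    return x + width


def far_edge_truncated(x, width):
    """The hardware-like version: the sum wraps around at 16 bits."""
    return to_signed16(x + width)


x, width = 0x7000, 0x2000
print(far_edge_wide(x, width))        # 36864
print(far_edge_truncated(x, width))   # -28672
```

With these inputs the two versions describe very different boxes, so the test can come out differently, which is how a game relying on the wraparound ends up with missing graphics.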

There have been various other improvements to the 3D renderer. Things have been revised to be closer to the hardware.

As a result, the Pokémon games look nicer, they don't have those random black dots/lines all over the place anymore. The horrendous Z-fighting observed in Black/White is also gone.

Other games that suffered from random dots/lines around things should be fixed too.

As well as things like wrong layering in UI screens rendered with 3D polygons.

The 2D renderer got its share of fixes too. Mainly related to bitmap BGs, but also a silly copypaste bug where rotscaled bitmap sprites didn't show up.

But there have also been improvements to an obscure feature: the main memory display FIFO. If you happen to remember:
"The display FIFO is another obscure feature -- I don't know of any retail game that uses it, but I have yet to be surprised."

And, when debugging rendering issues in Splinter Cell, I have been surprised -- that game does use the display FIFO when using the thermal/night vision modes.

The issue in that game was unrelated to that feature (it was bad ordering of rendering and capture that caused flickering), but it was a good occasion to improve display FIFO emulation. It at least allowed me to finally axe the last DMA hack as the associated DMA now works like on hardware. The FIFO is also sampled to a scanline-sized buffer with the same level of accuracy.

This doesn't matter when the FIFO is fed via DMA, but it enables some creative use of the feature -- for example, you can write 16 pixels to the FIFO to render a vertical stripe pattern on the whole screen. Or you will get bad results should you try to use it the wrong way. All of which is similar to what happens on hardware.
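
The stripe trick can be pictured with a small sketch. Everything here is an assumption made for illustration: the `DisplayFIFO` struct, its 16-entry size (inferred from the "16 pixels" above) and the one-pixel-per-write interface are hypothetical, and this is not how melonDS lays it out internally.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model of a 16-entry circular display FIFO (illustration only).
struct DisplayFIFO
{
    std::array<uint16_t, 16> entries{};
    unsigned writePos = 0;
    unsigned readPos = 0;

    // Store one pixel and advance the write position, wrapping after 16.
    void Write(uint16_t pixel)
    {
        entries[writePos] = pixel;
        writePos = (writePos + 1) & 0xF;
    }

    // Fetch one pixel and advance the read position, wrapping after 16.
    uint16_t Read()
    {
        uint16_t pixel = entries[readPos];
        readPos = (readPos + 1) & 0xF;
        return pixel;
    }

    // Sample a full 256-pixel scanline. If nothing refills the FIFO,
    // the same 16 entries get read over and over.
    void SampleScanline(std::array<uint16_t, 256>& line)
    {
        for (uint16_t& px : line)
            px = Read();
    }
};
```

Since 256 is a multiple of 16, every scanline starts at the same FIFO entry, so 16 pixels written once repeat identically on every line, which is where the vertical stripes come from.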

Pushing for accuracy, the last few big things that magically happened instantly now have their proper delays emulated, namely SPI transfers (including backup memory SPI) and hardware division/sqrt.

Firmware write was implemented, meaning you can change the firmware settings from the firmware itself (to run the firmware with no game: System -> Run). A backup of your firmware image will be made should anything go wrong.

And, last but not least: working wifi multiplayer. It was already spoiled, but regardless, it's the first time DS emulation gets this far. And it doesn't require an unofficial build! It's there and ready to use.

It's not perfect though. But it's a start. Pictochat, NSMB and Pokémon are known to be working, but you might encounter spurious disconnects (or, more likely, lag). Mario Kart and Rayman RR2 refuse to work.

I added a setting ("Wifi: bind socket to any address") which you can play with to try making wifi work better. It can make things better or worse. Try it out if needed. Leaving it unchecked should be optimal, but I was told that it doesn't work under Linux.

With all this, no work was done on the UI, despite what I had stated. My apologies for this. But the UI will be worked on, I promise!

You can find melonDS 0.4 on the downloads page.

Patreon for melonDS if you're feeling generous.
Nightmare in viewport street
And another bug bites the dust... a quite old one, by the way.

Namely, it was reported two months ago in issue #18: missing character models in Homie Rollerz's character select screen. Actually, if you look closely, you can see they were there, but they got compressed to one scanline at the top, which was caused by a bug in the viewport transform.

In 3D graphics terms, the viewport defines how normalized device coordinates of polygons are transformed to screen coordinates, which can be used to render the polygons.
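
For reference, here is the textbook form of that transform as a minimal sketch. It uses floating point and made-up names (`ViewportTransform`, `ScreenPos`), and it leaves out the Y reversal and any hardware quirks, so it shows the general idea rather than what the DS does internally.

```cpp
// Textbook viewport transform (illustration only, not the DS fixed-point math).
struct ScreenPos
{
    double x;
    double y;
};

// ndcX/ndcY are normalized device coordinates in the range [-1, 1].
// (vx, vy) is the viewport origin, (vw, vh) its width and height.
inline ScreenPos ViewportTransform(double ndcX, double ndcY,
                                   double vx, double vy,
                                   double vw, double vh)
{
    // Map [-1, 1] to [0, size], then offset by the viewport origin.
    return { vx + (ndcX + 1.0) * vw / 2.0,
             vy + (ndcY + 1.0) * vh / 2.0 };
}
```

With a fullscreen 256x192 viewport at the origin, the center of the view volume (0, 0) lands at (128, 96).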

Most games specify a standard fullscreen viewport, but there are games that pull tricks. Homie Rollerz is one of them: its character select screen uses a 'bad' viewport. But, unlike high-level graphics APIs, the DS has no concept of bad viewport. You can input whatever viewport coordinates you want, and it doesn't reject or correct them.

So how does it behave? Well, if you've been following this, you surely know that the DS looks normal and friendly on the surface, but if you look deeper, you find out that everything is weird and quirky. Viewports are no exception.

That's why the bug stayed unfixed for so long. GBAtek doesn't document these cases, so it's a matter of running hardware tests, observing results, and doing it again and again until you figure out the logic.

For example, here's a test: the viewport is set to range from 64,0 to 128,192. Nothing special there.

shitty triangle

Now, we change the viewport to range from 192,0 to 128,192, which results in a width of -64 (which, by the way, OpenGL would reject). One would say that such a viewport results in graphics getting mirrored, like this:

shitty triangle but mirrored

It looks correct, but that's not what the hardware outputs. In reality, it looks more like this:

shitty triangle stretched

The GPU doesn't support negative screen coordinates or viewport sizes, so they wrap around to large positive values. The range for X coordinates is 0-511, so for example trying to specify a viewport width of -1 results in a 511 pixel wide viewport. Such values are likely to cause the viewport transform calculations to overflow, causing coordinates to wrap around in a similar fashion. This is what happens in the example above -- the green vertex actually goes above 511.
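
A minimal sketch of that wraparound, assuming it behaves like keeping only the low 9 bits of the value (the helper names are hypothetical, and the real hardware calculation may differ in details):

```cpp
#include <cstdint>

// Illustration only: X values are assumed to be kept to 9 bits (0-511),
// so negative values wrap around to large positive ones.
inline uint32_t WrapX(int32_t value)
{
    return static_cast<uint32_t>(value) & 0x1FF;
}

// Viewport width as the wrapped difference between the two X coordinates.
inline uint32_t ViewportWidth(int32_t x0, int32_t x1)
{
    return WrapX(x1 - x0);
}
```

Under this assumption, a width of -1 becomes 511, and the 192-to-128 viewport from the test above ends up 448 pixels wide instead of -64.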

Y coordinates work in the same way but with a range of 0-255. However, one has to take into account the fact that they're reversed -- the Y axis for 3D graphics goes upwards, but the 3D scene is rendered from top to bottom. The viewport is also specified upside-down. Different ways of reversing the screen coordinates give slightly different results, so I had to take a while to figure out the proper way. Y coordinates are reversed before the viewport transform, by inverting their sign. Viewport Y coordinates are reversed too, as such:

finalY0 = 191 - Y1;
finalY1 = 191 - Y0;

Past that, the viewport transform is the same, no matter the viewport coordinates and vertices. The only special case is that polygons with any Y coordinate greater than 192 aren't rendered (but they are still stored in vertex/polygon RAM). This is probably because the hardware can't tell whether the coordinate should have been negative (as said before, it doesn't support negative screen coordinates), and thus can't guess what the correct result should be, so it takes the easy way out. But this is a bit odd considering it has no problem with X coordinates going beyond 256.
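
Put together, the Y handling could be sketched like this. The names are hypothetical, and wrapping the reversed viewport coordinates to the 0-255 range is an assumption based on the description above, not a confirmed detail of the hardware.

```cpp
#include <cstdint>

// Illustration only, not melonDS's actual code.
struct ViewportY
{
    uint32_t y0;
    uint32_t y1;
};

// Reverse the viewport Y coordinates (the viewport is specified upside-down).
// Assumption: results are kept to 8 bits, matching the 0-255 Y range.
inline ViewportY FlipViewportY(int32_t y0, int32_t y1)
{
    return { static_cast<uint32_t>(191 - y1) & 0xFF,
             static_cast<uint32_t>(191 - y0) & 0xFF };
}

// The one special case: a polygon with any screen Y coordinate
// greater than 192 is not rendered.
inline bool PolygonIsRendered(const uint32_t* screenY, int vertexCount)
{
    for (int i = 0; i < vertexCount; i++)
    {
        if (screenY[i] > 192)
            return false;
    }
    return true;
}
```

Note that there is no equivalent check for X: a vertex can go past the right edge of the screen and the polygon still gets rendered.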

With that, I'm not sure how coordinates should be scaled should 3D upscaling be implemented. But, I have a few ideas. We'll see.

But regardless, after revising viewport handling according to these findings, the Homie Rollerz character select screen Just Works™, and I haven't observed any regression, so it's all good.

Also, release 0.4 is coming very soon.