< Fixes, and future of melonDS | New direction for melonDS >

Chronicles of Timings: Tales of Destruction
Mar 16th 2026, by Arisotura

Ah, timings. The infamous Horseman of Timings.

This post is going to be about a decision I'm taking for melonDS, but first, I'll write about my recent research regarding two old timing issues. I found it very interesting to dig into those games and figure out how they work.

1. Corrupted FMV audio in Over The Hedge - Hammy Goes Nuts

Basically, you start a game, and you have a bit of a video introducing the plot of the game... except the audio is covered in high-pitched beeps and loud screeching.

I first found that the game runs a video/audio decoder on the ARM9, which appears to have been largely written in ASM. The audio decoder writes its output in main RAM, where it gets sent to the audio hardware.

The way it works is interesting. The audio buffer size is calculated based on the length of the audio track, the audio frequency, and other parameters. In this case, the length is 30 blocks, one block being 128 samples. Then, the decoder is run until the buffer is filled to a certain point (23 blocks). At this point, audio playback is started. The ARM7 also starts a periodic alarm which will notify the ARM9 every time one block worth of audio has been played.

The alarm notification is used to run the audio decoder when needed. However, the audio decoder works in terms of bigger chunks: one chunk typically contains 6 blocks worth of audio. So the decoder attempts to process that many blocks in one go, but it stops processing new blocks when the output buffer is full.

The catch is that, as far as I can tell, when blocks are skipped, there is no logic to compensate for that. The next time the audio decoder runs, it will start working on the next chunk no matter what. Since the decoding process relies not only on the current input but also the previous output, skipping blocks breaks it. And that's what is happening on melonDS.

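
To make the failure mode concrete, here is a minimal Python sketch of the buffer arithmetic described above. It is a toy model, not the game's code: the three constants come from the numbers in this post, but `simulate` and its `blocks_played_between_runs` parameter are hypothetical, and the real scheduling (driven by the ARM7 alarm) is not modelled.

```python
BUFFER_BLOCKS = 30    # output buffer size in blocks (figure from the post)
PREFILL_BLOCKS = 23   # fill level at which playback starts (from the post)
CHUNK_BLOCKS = 6      # blocks per decoder chunk (typical value from the post)


def simulate(blocks_played_between_runs, runs=10):
    """Return how many blocks are dropped over `runs` decoder runs.

    `blocks_played_between_runs` is a hypothetical stand-in for how fast
    the ARM9 is relative to audio playback: a smaller value means the
    decoder runs more often per block played.
    """
    filled = PREFILL_BLOCKS
    dropped = 0
    for _ in range(runs):
        # Playback drains the buffer between two decoder runs.
        filled = max(0, filled - blocks_played_between_runs)
        # The decoder writes blocks until the chunk is done or the buffer is full.
        written = min(CHUNK_BLOCKS, BUFFER_BLOCKS - filled)
        filled += written
        # The rest of the chunk is never decoded; the next run starts on the next chunk.
        dropped += CHUNK_BLOCKS - written
    return dropped


if __name__ == "__main__":
    print("decoder paced with playback:", simulate(6), "blocks dropped")
    print("decoder running too fast:   ", simulate(2), "blocks dropped")
```

When the decoder is paced with playback, every chunk fits and nothing is dropped. When it runs too often, the buffer saturates and part of each chunk is silently lost, which is the condition that corrupts the decoder state.
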
In this situation, the ARM9 needs to run slowly enough that it doesn't get too far ahead of audio playback, and doesn't fill the output buffer entirely.

2. Freeze during tutorials in Sonic Chronicles - The Dark Brotherhood

This is another problematic one. This game uses interactive videos to explain new combat mechanics, but those seem to give emulators quite a bit of trouble. On melonDS, the freeze has manifested in different ways as timings were changed in one way or another, but it has remained so far, even though the recent changes to cartridge emulation helped.

The first observation was that the game crashes pretty badly: it jumps to a completely bogus address (the value of which is actually an ARM instruction), which means emulation derailed pretty hard. The crash comes from using an out-of-bounds index in a jump table.

Looking at the code where this happens, it appears that we are, once again, dealing with hand-written ASM. It is a bizarre system with long chains of jumps and loops. It is likely related to the tutorial videos, some kind of script interpreter or decoder, but its exact function isn't relevant to the crash. The crash occurs because garbage is being fed into the system.

After a bunch of backtracking, I was able to figure out what's going on. This whole system loads video scripts from the ROM and feeds them into the bizarre interpreter, where presumably they get turned into what appears onscreen. (I'm not sure if the term "scripts" is correct, but it will do for the explanation's sake.)

The two operations are staggered: the loader starts loading the next script from the ROM, and while that is underway, the current script is executed. To know where to load the next script, the loader needs to know how long the current script is. This is where the issue lies: the loader runs as soon as the previous script ends, and thus may run while the current script is still being loaded, but it doesn't check for that situation.

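
The race can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the `RomTransfer` class, the idea that the script length sits in word 0, the cycle numbers, and the transfer landing all at once (a real ROM transfer delivers data progressively). It only shows why reading the length without waiting for the transfer yields garbage.

```python
from dataclasses import dataclass

GARBAGE = 0xDEADBEEF  # whatever was left in the buffer; arbitrary value


@dataclass
class RomTransfer:
    dest: list      # buffer the script is being loaded into
    payload: list   # words that land in `dest` once the transfer completes
    done_at: int    # cycle at which the transfer completes

    def settle(self, now):
        """Make the payload visible in `dest` if the transfer is done by `now`."""
        if now >= self.done_at:
            self.dest[:len(self.payload)] = self.payload


def read_script_length(now, transfer, wait_for_transfer):
    """Read the script length (word 0 in this toy format) at cycle `now`."""
    if wait_for_transfer:
        now = max(now, transfer.done_at)  # stall until the data is there
    transfer.settle(now)
    return transfer.dest[0]


def make_transfer():
    # A 3-word script preceded by its length, arriving at cycle 100.
    return RomTransfer(dest=[GARBAGE] * 4, payload=[3, 10, 11, 12], done_at=100)


if __name__ == "__main__":
    # Slow CPU: reaches the read after the transfer is done, so it works by luck.
    print(hex(read_script_length(150, make_transfer(), wait_for_transfer=False)))
    # Fast CPU: reaches the read too early and picks up stale data.
    print(hex(read_script_length(50, make_transfer(), wait_for_transfer=False)))
    # Correct code would wait first, and would work at any CPU speed.
    print(hex(read_script_length(50, make_transfer(), wait_for_transfer=True)))
```
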
The loader first determines the current script's length, then sets up the ROM transfer for the next script, which will wait for any previous transfer to finish. Due to this order of operations, there is a chance the loader will try to read the current script's length before it has been loaded. You can guess how that goes.

This is a very similar problem to the above one: the ARM9 needs to run slowly enough to never get too far ahead of the ROM transfer. In fact, since the ARM9 is mostly busy writing to memory, this seems to require data cache emulation.

There are other well-known timing issues, like the DSi Sound App crash, but I had already documented that one, so I won't do that again.

Notice a common trend? Those all stem from shoddy programming. The bugs go unnoticed because things happen to work on hardware, by pure luck. The same code on a modern system would inevitably crash, leading to the bugs being found and fixed. But, unlike modern systems, the DS has very deterministic timings.

So what do we do?

We have been beating around the bush for too long here, and it's time to do something. Basically, that means emulating the ARM9 caches.

I was going to make an attempt at data cache emulation to see how it would go, when I was informed that there is already an old PR for cache emulation. (to DesperateProgrammer: I'm sorry!! should have dealt with this way sooner)

I merged it into a separate branch, and fixed a couple of bugs. So far, the results are encouraging. The first issue in this post seems to be entirely fixed. The second issue persists, but it's getting further in. The DSi Sound App is also functional.

I think the cache logic itself is good, but there are other improvements to be done as far as ARM9 timings are concerned. It will take some reworking, as the way melonDS handles memory timings isn't great. Of course, there's also room for optimization, but let's not get ahead of ourselves - get things working first, optimize later.

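
For readers wondering what cache emulation means on the timing side, here is a minimal Python sketch of a set-associative cache that only tracks tags and returns a cycle cost per access. It is not the code from the PR mentioned above. The geometry defaults (4 KB, 4 ways, 32-byte lines), the hit and miss costs, and the round-robin replacement are illustrative assumptions, and the `Cache` class is hypothetical.

```python
class Cache:
    """Tag-only set-associative cache model that returns access costs."""

    def __init__(self, size=4096, line_size=32, ways=4,
                 hit_cycles=1, miss_cycles=20):
        # All defaults are illustrative assumptions, not measured values.
        self.line_size = line_size
        self.ways = ways
        self.num_sets = size // (line_size * ways)
        self.hit_cycles = hit_cycles
        self.miss_cycles = miss_cycles
        self.tags = [[None] * ways for _ in range(self.num_sets)]
        self.next_way = [0] * self.num_sets  # round-robin victim per set

    def access(self, addr):
        """Return the cycle cost of touching `addr`, filling a line on a miss."""
        line = addr // self.line_size
        index = line % self.num_sets
        tag = line // self.num_sets
        if tag in self.tags[index]:
            return self.hit_cycles
        # Miss: replace the next way in this set, in round-robin order.
        way = self.next_way[index]
        self.tags[index][way] = tag
        self.next_way[index] = (way + 1) % self.ways
        return self.miss_cycles


if __name__ == "__main__":
    cache = Cache()
    # Sequential word accesses within one line: one miss, then hits.
    sequential = sum(cache.access(0x02000000 + i * 4) for i in range(8))
    # Accesses that all map to the same set: every one of them misses.
    stride = cache.num_sets * cache.line_size
    scattered = sum(cache.access(0x02100000 + i * stride) for i in range(8))
    print("sequential:", sequential, "cycles")
    print("scattered: ", scattered, "cycles")
```

The point of such a model is that the same number of accesses can cost very different amounts of time depending on the access pattern, which is the kind of slowdown the two games above happen to rely on.
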
When I get this to a satisfactory state, I will compare performance against stock melonDS. I can see 3 possible profiles to compare: no cache emulation (stock melonDS), instruction cache only, both caches.

Depending on how this turns out, there may or may not be options for different cache emulation profiles. For example, I could imagine instruction cache emulation becoming the baseline. Since instruction fetches are very predictable, the instruction cache is a minor performance hit. Data fetches, however, are far more unpredictable, so data cache emulation might hurt performance too much to be part of the baseline. We'll see.

With this, we can hope to finally vanquish the Horseman of Timings. Well, not quite, there's still main RAM contention, but hopefully nothing beyond the DSi menu loader relies on that.

6 comments have been posted.

poudink says: Mar 16th 2026
Jaklyy also has a pair of PRs that improve a lot of timings (2125 and 2235). Apparently the latter one fixes Sonic Chronicles (according to this comment) and even has main RAM contention (so it might even fix the DSi menu loader). Idk if the code is usable tho, since it's a draft, it apparently "slaughters performance" and it has a lot of "Known Issues".

Idk says: Mar 16th 2026
Hi

Arisotura says: Mar 16th 2026
yeah, performance is the main concern there - halving melonDS's performance wouldn't be a very popular move. #2125 might be interesting, although some of it has been done separately in master already, so it'd need an update

Regis says: Mar 16th 2026
Offtopic: I tried to read the article on my phone, but the text was too small. Could you modify your website so the font size automatically adapts to small screens?

Geraldo7 says: Mar 16th 2026
yoooo thank you for the update, please correct me if i read this wrong but in a future patch is there a realm in which that crackling/popping audio when speeding up will be fully fixed? Thanks in advance

dreamsyntax says: Mar 20th 2026
Good luck with this! Very exciting to see this being looked into after all this time.