I am fucking relieved
Dec 15th 2018, by StapleButter
You might already know that 0.7.1 sports all-new timings, and that it also causes a few issues that were not present in 0.7, ranging from NSMB's cannons putting on some weight to Pokémon characters getting magical growth to, less amusingly, dual-screen 3D shitting itself in Etrian Odyssey.
The NSMB and Etrian Odyssey issues also had the fun aspect that they didn't occur in the USA ROMs, for some reason.
Well, that's a thing of the past now.
I knew the issues appeared after the timing renovation, but that was about it. I figured that instead of blindly hodgepodging timings until the issues disappeared, and potentially getting into some whack-a-mole game, I would take the time to understand the issues.
So the first one was the NSMB cannon.
Technically, the cannon body is a billboard (3D sprite that always faces the camera). The 'big' version is just the same but with the cannon body scaled too big.
So I investigate how the cannon body is rendered. For that, we can use NO$GBA's debugger version, at least until melonDS gets similar tools :P
Conveniently, the cannon body is the first polygon in the display list. Before it gets rendered, there are some transforms set up so it will face the camera. In this case, the second-to-last scale transform sometimes got wrong values that caused the billboard to appear too big, hence the bug.
So I look into the game's code to find out where and how that scale transform is calculated. Note that debugging NSMB is made way easier by the awesome folks at the NSMB Hacking Domain, who have a complete IDA database of the game.
The game feeds some transform commands to the GX, waits until it's done, then reads the clip matrix (the product of the projection and position matrices), and derives the scale transform from it. In our case, the bug came from how melonDS handled the GXSTAT busy flag: there was a slim chance that, after submitting some commands, the game could manage to read GXSTAT before the busy flag was actually set. Thus it would decide that the GX was already done, proceed to read the clip matrix while the transforms were still executing, and get some of the values wrong.
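To make the race a bit more concrete, here is a minimal C++ sketch of that kind of bug. It is a toy model, not melonDS's actual code: `ToyGX`, `lateBusyFlag`, `ReadsClipMatrixTooEarly` and the per-command cost are all made up for illustration. The only point is the difference between a busy flag that rises when a command starts running and one that rises as soon as work has been submitted.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Toy command processor. All names here are hypothetical;
// this is not melonDS's actual GPU code.
struct ToyGX {
    static constexpr int kCmdCycles = 4;  // arbitrary per-command cost

    std::deque<uint32_t> pending;  // submitted but not started yet
    int cyclesLeft = 0;            // remaining cycles of the running command
    bool lateBusyFlag;             // true = reproduce the race

    explicit ToyGX(bool late) : lateBusyFlag(late) {}

    void Submit(uint32_t cmd) { pending.push_back(cmd); }

    // Advance by one cycle: keep running the current command,
    // or start the next one if nothing is running.
    void Step() {
        if (cyclesLeft > 0) {
            cyclesLeft--;
        } else if (!pending.empty()) {
            pending.pop_front();
            cyclesLeft = kCmdCycles;
        }
    }

    bool Busy() const {
        if (lateBusyFlag)
            return cyclesLeft > 0;                  // rises only once a command runs
        return cyclesLeft > 0 || !pending.empty();  // rises as soon as work exists
    }
};

// Mimics the game: submit transforms, then poll the busy flag right away.
// Returns true if the poll says "idle" even though work is still queued,
// i.e. the game would go on to read a half-updated clip matrix.
bool ReadsClipMatrixTooEarly(bool lateBusyFlag) {
    ToyGX gx(lateBusyFlag);
    gx.Submit(1);  // placeholder IDs standing in for transform commands
    gx.Submit(2);
    assert(!gx.pending.empty());  // sanity check: work really is queued
    // No Step() has happened yet when the game reads the flag.
    return !gx.Busy();
}
```

With the late flag, `ReadsClipMatrixTooEarly(true)` reports the early read; with the flag tied to pending work, `ReadsClipMatrixTooEarly(false)` does not. Slower CPU timings effectively inserted a few `Step()` calls before the poll, which is why the bug stayed hidden.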
The issue was unrelated to CPU/memory timings; it just happened to be a simple bug that slow enough timings kept hidden, until now. So, it was easily fixed.
The Pokémon issue is likely the same issue, and is likely fixed too. We just need to check it.
The Etrian Odyssey issue, however, is different.
Quick investigation showed that the game was almost never swapping its screens, despite attempting to do dual-screen 3D.
The DS can only render 3D to one screen at a time, so how do you do dual-screen 3D? With a bit of trickery. For example, you would render a 3D frame to the top screen, and at the same time, capture that frame to VRAM. Then, during the next frame, you swap the screens, so that you're now rendering 3D on the bottom screen, and on the top screen you render the bitmap you previously captured. With this typical method, you get 3D on both screens, but you're limited to 30 FPS.
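The alternation can be sketched in a few lines of C++. This is only an illustration of the scheme described above: the names (`Source`, `ScreenSetup`, `SetupForFrame`, `Per3DScreenRate`) are hypothetical, and the choice of which screen gets live 3D on even frames is arbitrary.

```cpp
#include <cassert>

// What each screen displays during a given frame.
enum class Source { Live3D, CapturedBitmap };

struct ScreenSetup {
    Source top;
    Source bottom;
};

// Even frames: 3D goes to the top screen (and gets captured to VRAM),
// while the bottom screen shows the bitmap captured one frame earlier.
// Odd frames: the roles are swapped.
ScreenSetup SetupForFrame(unsigned frame) {
    const bool topGets3D = (frame % 2 == 0);
    if (topGets3D)
        return { Source::Live3D, Source::CapturedBitmap };
    return { Source::CapturedBitmap, Source::Live3D };
}

// Each screen only receives a fresh 3D frame every other display frame.
int Per3DScreenRate(int displayFps) {
    assert(displayFps > 0);
    return displayFps / 2;
}
```

Assuming a nominal 60 FPS display, `Per3DScreenRate(60)` gives the 30 FPS limit mentioned above. It also shows why a missed swap hurts: if the roles don't flip, one screen keeps showing a stale capture.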
So obviously, if the game is not swapping the screens, the rendering is only going to be a trainwreck.
So, we look into the code responsible for swapping the screens and making that whole thing work. It runs upon VBlank, and basically does the following:
1. wait ~400 cycles, by running a subs/bcs loop 200 times
2. check the GXSTAT busy flag; if that is set, give up
3. swap screens, do more setup
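Those three steps boil down to something like the following C++ sketch. It is a rough model, not the game's code: `OnVBlank`, `gxBusyCycles` and the 2-cycles-per-iteration figure are assumptions (the latter is just what "~400 cycles over 200 iterations" implies), and everything is counted in one abstract cycle unit, ignoring IRQ latency and any clock differences between the CPU and the GX.

```cpp
#include <cassert>
#include <cstdint>

enum class VBlankResult { Swapped, GaveUp };

// Rough cost of one subs/bcs iteration; an assumption, not a measurement.
constexpr uint32_t kCyclesPerIteration = 2;

// gxBusyCycles: how long the GX stays busy after VBlank begins
// (hypothetical input to this model).
VBlankResult OnVBlank(uint32_t gxBusyCycles, uint32_t loopIterations = 200) {
    assert(loopIterations > 0);

    // 1. Delay loop: burn roughly loopIterations * 2 cycles.
    const uint32_t waited = loopIterations * kCyclesPerIteration;

    // 2. Check the busy flag; if the GX is still working, give up.
    const bool gxStillBusy = gxBusyCycles > waited;
    if (gxStillBusy)
        return VBlankResult::GaveUp;

    // 3. Swap screens and do the rest of the setup.
    return VBlankResult::Swapped;
}
```

For example, `OnVBlank(100)` swaps, while `OnVBlank(1000)` gives up. The outcome hinges entirely on whether the GX finishes inside the delay window, so a small change in how long the GX stays busy can flip the result.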
The base idea there was likely that if a frame took too long to render, the game would avoid swapping the screens at the wrong time and causing flickering.
In our case, however, the SWAP_BUFFERS command took a bit too long and accidentally triggered that.
It was set to take 392 cycles as per GBAtek's data, but measurements on hardware showed that that duration is more like 325 cycles. Revising our code accordingly, Etrian Odyssey finally rendered normally.
So, as the title says, I am fucking relieved there. I thought we were faced with some problem that could only be solved by emulating the ARM9 caches, which, urgl. It turned out to not be the case. And melonDS 0.7.2 is totally gonna rock!