World of shit
Dec 30th 2018, by StapleButter
I've been looking at another 'timing renovation victim' issue: dual-screen 3D shitting itself in Colour Cross.
The game draws all its animated background tiles/etc separately, giving each its own transforms. It sends unpacked GXFIFO commands and doesn't use DMA. All in all, not very efficient, but probably not a bad way to do it given what they're doing.
Anyway, the game expects to take a certain time to draw each screen (for the bottom screen, it spans two frames). If the code runs too slow or too fast, it shits itself. Given how geometry is sent, GX timings don't affect this.
It doesn't have to be precise, but there's a window within which this has to fall, so we can't make it absurdly fast or absurdly slow and hope to fix it.
On the other hand, we have the Spellbound issue which I mentioned in the previous post. There, the code has to run slowly enough; I can't see any other way out. Both DMA and GX timings are correct there. I could always try dumping the display list and running it on hardware under the same conditions to see how it goes, but I'm sure the issue is with code timings.
So I guess this is it. No amount of hodgepodging timings will get us out of this, as I feared. And I really want to avoid 'solutions' like game-specific hacks or 'toggle this timing setting and see if it fixes your game'.
So, we have to emulate the ARM9 instruction cache.
On the plus side, since code fetches are mostly sequential, this shouldn't be a big performance penalty. The cache only needs to be checked upon branches and when crossing cache line boundaries (on the DS, that is every 32 bytes). The ARM9 caches have 4-way associativity, which means a measly 4 cache lines to check before we know whether the current address is cached. So I believe we can afford to emulate it without killing performance.
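To make the idea concrete, here is a minimal sketch of such a lookup. It is not melonDS code: the `ICacheSketch` name, the 8 KB total size, the round-robin replacement policy, and the hit/miss cycle counts passed in by the caller are all assumptions made for illustration. Only the 32-byte lines and the 4 ways come from the paragraph above.

```cpp
#include <array>
#include <cstdint>

// Hypothetical sketch of a 4-way set-associative instruction cache.
// Assumed for illustration: 8 KB total size, round-robin replacement,
// caller-supplied hit/miss costs (placeholders, not measured timings).
struct ICacheSketch {
    static constexpr uint32_t kLineSize = 32;  // bytes per line (from the post)
    static constexpr uint32_t kWays     = 4;   // associativity (from the post)
    static constexpr uint32_t kSets     = 8192 / (kLineSize * kWays);
    static constexpr uint32_t kInvalid  = 0xFFFFFFFF;

    std::array<std::array<uint32_t, kWays>, kSets> tags;
    std::array<uint8_t, kSets> nextWay{};  // next way to replace, per set
    uint32_t curLine = kInvalid;           // line of the previous fetch
    int hitCycles;
    int missCycles;

    ICacheSketch(int hit, int miss) : hitCycles(hit), missCycles(miss) {
        for (auto& set : tags) set.fill(kInvalid);
    }

    // Full lookup: at most 4 tag comparisons. Fills a line on a miss.
    bool Lookup(uint32_t line) {
        uint32_t set = line % kSets;
        for (uint32_t tag : tags[set])
            if (tag == line) return true;
        tags[set][nextWay[set]] = line;
        nextWay[set] = (nextWay[set] + 1) % kWays;
        return false;
    }

    // Cost of one code fetch. While execution stays inside the line of the
    // previous fetch, that line is known to be cached and the lookup is
    // skipped. A branch or a line-boundary crossing lands in another line
    // and triggers the full lookup.
    int Fetch(uint32_t addr) {
        uint32_t line = addr / kLineSize;
        if (line == curLine) return hitCycles;
        curLine = line;
        return Lookup(line) ? hitCycles : missCycles;
    }
};
```

With 4-byte ARM instructions, the fast path covers up to 8 sequential fetches per line, so the 4-way search only runs on a small fraction of fetches. A real implementation would also need to reset `curLine` whenever the cache is invalidated.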
The data cache would be the one killing performance; because data accesses are far less predictable, the cache would have to be checked upon every memory access. For now, it appears we can get away with not emulating it, so let's pray.
For less performant devices, we could always have a 'performance' profile that uses the old timing model (2 cycles per code fetch in cached memory, always). It's less exact and more likely to break something else, but it seems to run most things reasonably well, so it's a good candidate for a performance/accuracy compromise.
I still have to look into Spellbound, because of course making code fetches take 2c doesn't fix it. They need to take at least 5c for it to work correctly, but 5c breaks other things, no surprise there.
Quick testing confirms what I suspected there: it only ran correctly in older melonDS versions because the GXFIFO DMA was too slow, taking 3c per word. On hardware, transferring from main RAM to IO registers (GXFIFO or whatever else) takes 2c per word, which is also the case in melonDS after the timing renovation. So, well, we can't take it back.
(the difference looks puny, but when transferring hundreds of units, it quickly adds up)
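As a quick back-of-the-envelope check, the gap between the two timings can be computed directly. The helper names below are hypothetical, and the word count in the usage note is only an example, not a measurement from Spellbound.

```cpp
#include <cstdint>

// Total cycles for a DMA transfer of `words` units at a fixed per-word cost.
constexpr uint64_t TransferCycles(uint64_t words, uint64_t cyclesPerWord) {
    return words * cyclesPerWord;
}

// Extra cycles the old 3c-per-word timing spent compared to the
// hardware-accurate 2c-per-word timing.
constexpr uint64_t ExtraCyclesOldTiming(uint64_t words) {
    return TransferCycles(words, 3) - TransferCycles(words, 2);
}
```

For a hypothetical 500-word display list, `ExtraCyclesOldTiming(500)` gives 500 extra cycles on top of the 1000 the transfer takes at 2c per word, which is half as long again.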
In brighter news, well, 0.7.3 will come out soon-ish, once I come up with something for the ARM9 code cache.
There are also a few improvements making it more user-friendly. For example, savemem relocation when loading savestates is now disabled by default, as I figure that having it enabled by default can be confusing.
We're also trying to fix the input config dialog crashes under Linux. I wasn't able to reproduce them because I wasn't looking in the right place: they happen when mapping joystick buttons. Taking that into consideration, I reproduced the crash and could fix it. The cause was quite silly: we use an SDL timer callback to sense joystick input, and were updating the UI directly from that timer callback. I had forgotten that GTK is not thread-safe and that the UI shouldn't be manipulated outside of the UI thread, even after running into similar issues with the main window. Using uiQueueMain() lets us get around this and fixes the crash.
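The general pattern behind that fix can be sketched without any UI toolkit. The `MainThreadQueueSketch` class below is a hypothetical stand-in, not libui or melonDS code: `Post()` plays the role that uiQueueMain() plays in the fix, and `Drain()` stands for whatever the UI event loop does to run the queued work on its own thread.

```cpp
#include <functional>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical stand-in for a toolkit's "run this on the UI thread" queue.
class MainThreadQueueSketch {
public:
    // Safe to call from any thread (e.g. a timer callback): it only
    // records the job and never touches the UI itself.
    void Post(std::function<void()> job) {
        std::lock_guard<std::mutex> lock(mutex_);
        jobs_.push(std::move(job));
    }

    // Must be called from the UI thread only. Runs every queued job and
    // returns how many were run.
    int Drain() {
        std::queue<std::function<void()>> pending;
        {
            // Take the jobs out under the lock, then run them unlocked so
            // a job can itself call Post() without deadlocking.
            std::lock_guard<std::mutex> lock(mutex_);
            std::swap(pending, jobs_);
        }
        int ran = 0;
        while (!pending.empty()) {
            pending.front()();
            pending.pop();
            ++ran;
        }
        return ran;
    }

private:
    std::mutex mutex_;
    std::queue<std::function<void()>> jobs_;
};
```

The point of the design is that the timer callback only describes the UI change it wants, and the change itself always executes on the one thread that owns the widgets.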
I'm still not 100% sure about that one, though: there was a crash under FreeBSD that appeared to be different.
There's also a little improvement to joystick input: it can now detect if the joystick is disconnected/reconnected while melonDS is running.
It's still limited to one joystick, though. I'm not quite sure how to go about handling multiple joysticks, or whether we can trust the OS not to go and change their indexes.
I also want to work on network support, adding the interface selector and built-in DHCP/NAT, but maybe for 0.7.4. I already have enough on my plate here, and that's without getting into the 0.8 hardware renderer.