Jul 22nd 2021, by Arisotura
Apologies for the slow Summer! We don't have air conditioners in the melonDS HQ. The current climate is causing the team to slowly melt.
Anyway, audio interpolation is one of the emulation improvements that have been requested for melonDS. My general policy for emulation improvements is that they should allow for keeping the accurate code paths, and they shouldn't add too much complexity to the code. Audio interpolation is well within these bounds. Actually, I had implemented it in DeSmuME back then, and due to the way DeSmuME's mixer works, it was quickly done.
So I figured I would give it a try in melonDS.
The basic idea behind audio interpolation is to smooth out the audio samples as they're being upsampled. DS games may have downsampled audio to save on space and bandwidth, and the DS mixer doesn't perform any interpolation, which can lead to rough-sounding samples. The reason the DS does no interpolation is most likely due to how its mixer hardware works, but obviously as an emulator we can ignore these constraints and do a better job.
It's also worth noting that, as far as melonDS is concerned, there are two parts we need to take care of: the DS mixer and the audio output.
In the DS, the mixer is driven by the system clock, like nearly everything else. If you have ever coded for the DS, you might have wondered why the frequency registers for the audio channels are weird:
40004x8h - NDS7 - SOUNDxTMR - Sound Channel X Timer Register (W)
Bit0-15 Timer Value, Sample frequency, timerval=-(33513982Hz/2)/freq
The PSG Duty Cycles are composed of eight "samples", and so, the frequency for Rectangular Wave is 1/8th of the selected sample frequency.
For PSG Noise, the noise frequency is equal to the sample frequency.
The SOUNDxTMR registers directly control the channel timers, which are driven at half the system clock. These work like the general purpose timers: they are incremented at half the system clock, and every time they overflow, they are reloaded to the SOUNDxTMR value and the channel advances to the next sample.
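To make the formula above a bit more concrete, here is a minimal sketch of how a reload value could be derived from a desired sample frequency. The names (`kTimerClockHz`, `TimerReloadForFreq`, `TicksPerSample`) are made up for illustration and are not taken from melonDS or any SDK.

```cpp
#include <cassert>
#include <cstdint>

// Channel timers tick at half of the 33513982 Hz system clock.
constexpr uint32_t kTimerClockHz = 33513982 / 2;

// Hypothetical helper: SOUNDxTMR value for a desired sample frequency,
// following timerval = -(33513982Hz/2)/freq, kept to 16 bits.
uint16_t TimerReloadForFreq(uint32_t freq)
{
    assert(freq > 0);
    uint32_t ticksPerSample = kTimerClockHz / freq;
    // The timer is 16 bits wide, so the period must fit in 1..0x10000 ticks.
    assert(ticksPerSample >= 1 && ticksPerSample <= 0x10000);
    return static_cast<uint16_t>(0x10000 - ticksPerSample);
}

// Number of timer ticks between two overflows for a given reload value.
uint32_t TicksPerSample(uint16_t reload)
{
    return 0x10000u - reload;
}
```

The timer counts up from the reload value and overflows at 0x10000, which is why the register holds the negated period rather than the period itself.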
This is a fairly simple and efficient design, but you can probably guess why it doesn't lend itself to interpolation. Basically, to get the sub-sample position you need for interpolation at any given time, you would need to subtract the reload value from the current timer value, then divide that by 0x10000 minus the reload value. That division isn't convenient to implement in hardware.
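As a rough illustration, the sub-sample position could be computed like this in software. `SubSamplePosition` is a hypothetical helper, and it assumes the timer counts up from the reload value to 0xFFFF before overflowing.

```cpp
#include <cassert>
#include <cstdint>

// Fraction (0.0 up to just under 1.0) of the way from the current sample
// to the next one. `timer` is the current 16-bit counter value and
// `reload` is the SOUNDxTMR value.
double SubSamplePosition(uint16_t timer, uint16_t reload)
{
    assert(timer >= reload);                  // the counter never sits below its reload value
    uint32_t elapsed = uint32_t(timer) - reload;  // ticks since the last overflow
    uint32_t period  = 0x10000u - reload;         // ticks between two overflows
    return static_cast<double>(elapsed) / period;
}
```

The period is arbitrary (it depends on whatever the game wrote to SOUNDxTMR), so this is a genuine division, not a shift. That is cheap for an emulator but would need a divider circuit per channel on real hardware.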
The mixer in melonDS works in a similar way, although it is only sample-accurate, for several reasons: sample accuracy is good enough for DS games, we don't know how the mixer operates on a per-cycle basis, and of course, performance reasons. To reach its sample rate of approximately 32.7 kHz, the DS needs to output one audio sample every 1024 system-clock cycles, and that is how often we run the mixer in melonDS. We have to be a bit smart about updating our channel timers, but it works well enough.
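Here is one way the "being smart about channel timers" part could look. This is a sketch under my own assumptions, not the actual melonDS code: the `ChannelTimer` struct and `RunChannelForOneOutputSample` are hypothetical. Instead of ticking the timer cycle by cycle, it advances by a whole output sample (1024 system cycles, so 512 timer ticks) and counts how many overflows happened in between.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-channel state for this sketch.
struct ChannelTimer
{
    uint32_t reload;    // SOUNDxTMR value (0..0xFFFF)
    uint32_t counter;   // current timer value (reload..0xFFFF)
    uint32_t position;  // index of the current sample in the channel's data
};

// One output sample is 1024 system cycles; timers tick at half that rate.
constexpr uint32_t kTicksPerOutputSample = 1024 / 2;

// Advance a channel by one output sample's worth of timer ticks.
// Every overflow moves the channel to its next sample.
void RunChannelForOneOutputSample(ChannelTimer& ch)
{
    assert(ch.reload <= 0xFFFF);
    assert(ch.counter >= ch.reload && ch.counter <= 0xFFFF);

    uint32_t period = 0x10000 - ch.reload;
    uint32_t total  = (ch.counter - ch.reload) + kTicksPerOutputSample;

    ch.position += total / period;             // number of overflows
    ch.counter   = ch.reload + total % period; // leftover ticks carry over
}
```

A channel with a period shorter than 512 ticks skips samples, and one with a longer period holds the same sample across several output samples. The second case is where upsampling happens and where interpolation would kick in.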
However, this design means the output sample rate of the melonDS core depends on how fast it's running. Basically, melonDS runs 560190 cycles per frame and outputs one audio sample every 1024 cycles, like the real thing. Assuming a framerate of 60 FPS (which is a bit faster than the real thing), this means an audio output rate of 32823.6328125 Hz.
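For the curious, the arithmetic can be checked with a few lines of code. `OutputRateHz` is just an illustrative helper name.

```cpp
#include <cassert>

// Samples per second produced by a core that emits one sample every
// `cyclesPerSample` cycles and runs `cyclesPerSecond` cycles per second.
double OutputRateHz(double cyclesPerSecond, double cyclesPerSample)
{
    assert(cyclesPerSample > 0.0);
    return cyclesPerSecond / cyclesPerSample;
}

// Emulator locked to 60 FPS: 560190 cycles per frame, 60 frames per second.
double EmulatedRateHz()
{
    return OutputRateHz(560190.0 * 60.0, 1024.0);  // 32823.6328125
}

// Real hardware: the system clock itself, 33513982 Hz.
double HardwareRateHz()
{
    return OutputRateHz(33513982.0, 1024.0);       // 32728.498046875
}
```

The two results differ by roughly 95 Hz, which is the "a bit faster than the real thing" part, and neither of them is an integer.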
Well, yeah. Generally, you can't go and ask your audio library for a weird non-integer sample rate.
So what do we do, here? Well, early melonDS versions would just pick the closest integer sample rate, send out the audio output as-is, and pray. You guessed it: it didn't work that well. Not only was it impossible to attain perfect sync, but on some platforms we just could not get a sample rate of 32824 Hz.
Hence, a proper audio output stage was added. It lets us pick a more standard output rate of 48 kHz, lets the audio driver give us another sample rate if that one isn't available, then it resamples melonDS's audio output to match that output rate. The resampler also supports a small margin, which can make up for small variations in framerate.
This resampler would be another point of concern: currently, it upsamples audio with no interpolation, so there's room for improvement here too.
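To show what "upsampling with no interpolation" means in practice, here is a minimal sketch of such a resampler. It is not the melonDS resampler: `ResampleHold` is a hypothetical name, it works on a mono buffer for simplicity, and it leaves out the framerate margin mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Resample `in` from `inRate` to `outRate` by picking, for each output
// sample, the input sample at or just before the same point in time.
// No blending between neighbours takes place.
std::vector<int16_t> ResampleHold(const std::vector<int16_t>& in,
                                  double inRate, double outRate)
{
    assert(inRate > 0.0 && outRate > 0.0);

    std::vector<int16_t> out;
    const double step = inRate / outRate;  // input samples per output sample

    for (std::size_t i = 0;; ++i)
    {
        // Recompute the position from i to avoid accumulating rounding error.
        std::size_t idx = static_cast<std::size_t>(i * step);
        if (idx >= in.size())
            break;
        out.push_back(in[idx]);
    }
    return out;
}
```

When going from about 32.8 kHz to 48 kHz, the step is roughly 0.68, so some input samples get emitted once and others twice. That uneven repetition is the kind of roughness interpolation would smooth out.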
Anyway, I made a quick proof-of-concept in a separate branch. For now, it applies linear interpolation to all channels, and seems to work decently well. A few notes on this:
1. PSG channels are quite muffled. They should not be interpolated, but I'm partly tempted to keep that as a fun option.
2. Linear interpolation is the easiest but certainly not the best. I could implement better algorithms: cosine, cubic, Gaussian...
3. Of course, the feature would be made optional, and disabled by default.
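For reference, linear and cosine interpolation between two neighbouring samples could look roughly like the sketch below. All names here are hypothetical and this is not the code from the proof-of-concept branch. `frac` is the sub-sample position, from 0.0 (exactly on the first sample) to 1.0 (exactly on the second). `ChannelOutput` shows one possible way to honour the notes above: the feature is off unless enabled, and PSG channels bypass it.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Straight line between samples a and b.
int32_t InterpLinear(int32_t a, int32_t b, double frac)
{
    assert(frac >= 0.0 && frac <= 1.0);
    return static_cast<int32_t>(std::lround(a + (b - a) * frac));
}

// Same endpoints, but the blend factor follows half a cosine wave,
// which eases in and out of each sample.
int32_t InterpCosine(int32_t a, int32_t b, double frac)
{
    assert(frac >= 0.0 && frac <= 1.0);
    const double pi = 3.14159265358979323846;
    double t = (1.0 - std::cos(frac * pi)) * 0.5;
    return static_cast<int32_t>(std::lround(a + (b - a) * t));
}

// Hypothetical channel output stage: interpolation is optional, and PSG
// channels keep their raw sample so their square waves stay sharp.
int32_t ChannelOutput(int32_t a, int32_t b, double frac,
                      bool interpolate, bool isPsg)
{
    if (!interpolate || isPsg)
        return a;
    return InterpLinear(a, b, frac);
}
```

With `interpolate` set to false, the output is exactly what the accurate code path would produce, which keeps the feature in line with the policy of not disturbing accurate behaviour.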
I might also add an option for interpolation in the resampler, or keep the two tied together for simplicity? Not sure. Note that interpolation makes things sound smoother but can also muffle the sound to an extent. Your input is welcome!