Have audio pitch change with speed? (like in Dolphin, mGBA, bsnes, and Duckstation)

Dolphin, mGBA, bsnes, and Duckstation all have the ability to change the audio pitch in accordance with the emulation speed - that is, if you set emulation speed to 50%, then the audio will play back at half the pitch. (Heck, Dolphin at least even has both a “pitch matches game speed” option and an “audio stretching” (pitch correction) option.)

So, simply put, is there a way to have this sort of “audio pitch matches game speed” in yuzu?

Now, I don’t know much about software development but, at least in the audio world, if you have a 10kHz sine wave sampled at 48kHz and play it back at a sample rate of 24kHz (i.e. 50% speed), the end result is a bog-standard 5kHz sine wave.
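To put numbers on the sine wave example, here is a rough Python sketch (not anything taken from yuzu or the other emulators). It generates a 10kHz tone at 48kHz, then estimates the pitch of the exact same samples when they are treated as 48kHz and as 24kHz audio. The helper names `sine_samples` and `estimate_frequency` are made up for illustration, and counting zero crossings is only a crude pitch estimate.

```python
import math

def sine_samples(freq_hz, sample_rate_hz, count, phase=0.1):
    """Return `count` samples of a sine wave.

    The small phase offset (an arbitrary choice) keeps samples from
    landing exactly on zero, which would confuse the crossing count.
    """
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz + phase)
            for n in range(count)]

def estimate_frequency(samples, playback_rate_hz):
    """Estimate pitch from zero crossings over the playback duration."""
    crossings = sum(1 for a, b in zip(samples, samples[1:])
                    if (a < 0) != (b < 0))
    duration_s = len(samples) / playback_rate_hz
    # Two zero crossings per cycle.
    return crossings / 2 / duration_s

# One second of a 10 kHz tone sampled at 48 kHz.
samples = sine_samples(10_000, 48_000, 48_000)

# Same samples, untouched; only the playback rate differs.
native = estimate_frequency(samples, 48_000)
half_speed = estimate_frequency(samples, 24_000)

print(f"played at 48 kHz: about {native:.0f} Hz")
print(f"played at 24 kHz: about {half_speed:.0f} Hz")
```

The sample data is never modified; only the rate at which it is interpreted changes, and the estimated pitch halves along with it.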

I say this because Yuzu seems to instead automatically be applying some sort of audio-stretching and/or pitch-correction which, at least in the audio world, actually requires more work than if you just left the waveform as-is playing back at whatever speed. I mean, it’s my impression that both mGBA and bsnes support only “audio pitch matches game speed” and don’t even support audio stretching/pitch correction which points to it actually taking more work to implement audio stretching/pitch correction than to not.

In order for us to provide better support, we need to see the log generated by yuzu. This guide will walk you through how you can obtain the log file: Getting Log Files - yuzu

This may be a limitation caused by keeping threads in sync with multicore. Let me ask around.

We used to adjust audio speed according to emulation speed. But sound quality was terrible if your PC had even a little lag. Since most Switch titles are pretty demanding, it’s just better to keep the bit rate the same all the time.

Your reply is very interesting since, outside of Yuzu, I find it to be just the opposite - having audio “stretched” / “pitch corrected” to me makes the audio sound terrible when running at less-than-native speeds. Heck, Dolphin even defaults to having the audio match the game speed, and the “time stretching” / pitch correction is actually the alternative, non-default setting (which admittedly can make more sense for faster-than-native speeds if running at much higher speeds like 1.5x/90fps).

Conversely, having the audio pitch match the game speed sounds perfectly fine fidelity-wise in the aforementioned emulators, albeit at a lower pitch if you can at least lock to a lower frame rate (e.g. 0.75x/45fps, 0.667x/40fps, 0.5x/30fps, or various fractions of your refresh rate like 0.625x/37.5fps on 75Hz). On Dolphin this function is what I commonly use for fast-paced racing games (F-Zero GX, Fast Racing League) that I want to play with other people who can’t really handle the full speed of the games (I’m not kidding) as this is something you can really only properly do via emulation, and taking advantage of this for games like Fast RMX would be very nice (especially since, it being fast-paced, means that running at a slower speed will not make the game itself feel sluggish). Heck I’ve even used the Wii virtual console version of F-Zero X for this purpose because I didn’t know of any N64 emulator that had aforementioned function.

I imagine the only issue is that having the pitch wildly varying from high to low with an uncapped framerate sounds musically terrible, which is what I’m guessing you meant by “sound quality was terrible” (though “quality” implies fidelity and I would expect that fidelity would actually sound better without the time stretching/pitch correction…), but my interest was with regards to capping at a lower frame rate anyway.

We don’t stretch or modify audio. We play the original sound file as it is. This means that regardless of the fps the sound is exactly the same. The only issue is on time critical games like Voez where the music often ends earlier than intended.

This is also a very interesting response!

Perhaps it’s my background of working with audio waveforms that is causing confusion, as it’s starting to sound like things may work very differently between PCM audio waveforms and emulation audio output, even when dealing with streamed audio (e.g. my aforementioned F-Zero GX example) rather than sampled audio (e.g. my aforementioned F-Zero X example…though even sampled audio in Dolphin gets down-pitched when the emulator is run at slower speeds).

At least when dealing with PCM audio waveforms, changing the playback rate, i.e. “how many samples are played within a second”, automatically changes the pitch because pitch is determined by the number of samples played within a second.

A program like Audacity can demonstrate what I mean, or the ancient but simple tool “Header Investigater” can also demonstrate this - you input a WAV file, set the sample rate to something else and, when you play back the resulting WAV file, you’ll notice the pitch is different. But if you set the sample rate back to the original and compare to the actual original file, the waveform is 100% identical and no resampling is occurring. (The file checksum may even be identical! Also, you can tell that no actual audio processing, e.g. resampling, is occurring because the sample rate-changing process is stupid-fast even on very low-end hardware like a 1st gen single core 1.6GHz Intel Atom - note the XP-era screenshots and the EXE’s ‘modified’ timestamp of 2003.)
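For anyone who wants to try the header trick without either of those tools, below is a small sketch using Python’s standard `wave` module. It is not how Header Investigater itself works (I have no idea what it does internally); it just shows the same idea: the sample data is copied untouched and only the sample rate written in the WAV header changes. The helper names `make_wav` and `read_wav` and the 440Hz test tone are made up for the example.

```python
import io
import math
import struct
import wave

def make_wav(sample_rate_hz, frames):
    """Wrap raw 16-bit mono PCM bytes in a WAV container, in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)       # mono
        w.setsampwidth(2)       # 16-bit samples
        w.setframerate(sample_rate_hz)
        w.writeframes(frames)
    return buf.getvalue()

def read_wav(data):
    """Return (sample rate from the header, raw PCM bytes)."""
    with wave.open(io.BytesIO(data), "rb") as w:
        return w.getframerate(), w.readframes(w.getnframes())

# A short 440 Hz test tone as 16-bit little-endian PCM at 48 kHz.
pcm = b"".join(
    struct.pack("<h", int(20000 * math.sin(2 * math.pi * 440 * n / 48_000)))
    for n in range(4_800)
)

original = make_wav(48_000, pcm)

# "Change the sample rate": same bytes, different number in the header.
_, frames = read_wav(original)
retagged = make_wav(24_000, frames)

# Set the rate back again and compare with the original file.
_, frames_again = read_wav(retagged)
restored = make_wav(48_000, frames_again)

print("header rate after retagging:", read_wav(retagged)[0])
print("PCM data unchanged:", read_wav(retagged)[1] == pcm)
print("restored file identical to original:", restored == original)
```

A player reading the retagged file feeds the same samples to the sound card at half the rate, so the audio comes out an octave lower and twice as long with no resampling involved.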

Conversely, changing the playback time of a PCM waveform but keeping the pitch the same requires applying a secondary “pitch-correction” process which, especially at slower-than-original speeds, rarely ever sounds all that good.

I noticed that the latest progress report makes mention that the audio system in yuzu is supposedly being reworked.

I don’t suppose that this re-working of yuzu’s audio will allow this pitch-changes-with-speed function to be much more feasible to have? I know you mentioned performance reasons but, as I mentioned, my main interest is using it to run games at slower speeds anyway, so that by definition would mean that you would not need as much performance.

(to be clear, I figure any such “pitch-changes-with-speed” function would be a toggle-able setting just like it is in Dolphin so that you can still have the audio pitch not change with speed if you so choose)

…also it’s looking like both AMD and Intel are going to continue providing substantial gen-over-gen performance improvements with each subsequent CPU generation, so I question if the performance concerns will even be a concern anymore in just a couple of CPU generations - the early/mid 2010s this is not looking to be! (Haswell’s unforeseen substantial improvement in emulation performance notwithstanding)

I just realized that I never replied to this statement. Perhaps you already realize this, but it should probably be stated somewhere in this thread that you can’t even set a custom emulation speed limit currently unless you disable multicore.

It’s still a subject under debate. We plan to improve the kernel enough to either allow the speed limiter to work again or remove the option entirely, depending on how feasible it is.