A 20 Year Old Chipset Workaround Has Been Hurting Modern AMD Linux Systems
With this workaround still applied even to modern AMD systems, K Prateek Nayak discovered: “Sampling certain workloads with IBS on AMD Zen3 system shows that a significant amount of time is spent in the dummy op, which incorrectly gets accounted as C-State residency. A large C-State residency value can prime the cpuidle governor to recommend a deeper C-State during the subsequent idle instances, starting a vicious cycle, leading to performance degradation on workloads that rapidly switch between busy and idle phases. One such workload is tbench where a massive performance degradation can be observed during certain runs.”
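To make the quoted mechanism concrete, here is a toy model of how inflated residency readings can push a governor toward a deeper C-state. It is only a sketch: the two-state table, the averaging predictor, and every number in it are invented for illustration and do not reflect the kernel's actual cpuidle governors or real hardware latencies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CState:
    name: str
    target_residency_us: float  # minimum idle time for the state to pay off
    exit_latency_us: float      # cost of waking up from the state


# Hypothetical two-state table; the numbers are illustrative only.
STATES = (
    CState("C1", target_residency_us=2.0, exit_latency_us=1.0),
    CState("C2", target_residency_us=400.0, exit_latency_us=100.0),
)


def pick_state(predicted_idle_us):
    """Return the deepest state whose target residency fits the prediction."""
    chosen = STATES[0]
    for state in STATES:
        if state.target_residency_us <= predicted_idle_us:
            chosen = state
    return chosen


def simulate(true_idle_us, dummy_op_us, rounds=10, history=4):
    """Run a toy governor and return (picked state names, total exit latency)."""
    recent = []  # last few *measured* residencies
    picks = []
    total_exit_latency = 0.0
    for _ in range(rounds):
        # Predict the next idle period as the mean of recent measurements.
        predicted = sum(recent) / len(recent) if recent else 0.0
        state = pick_state(predicted)
        picks.append(state.name)
        total_exit_latency += state.exit_latency_us
        # The dummy op's duration is wrongly booked as idle residency.
        measured = true_idle_us + dummy_op_us
        recent = (recent + [measured])[-history:]
    return picks, total_exit_latency


if __name__ == "__main__":
    for dummy in (0.0, 500.0):
        picks, latency = simulate(true_idle_us=50.0, dummy_op_us=dummy)
        print(f"dummy op {dummy:.0f} us -> {picks}, exit latency {latency:.0f} us")
```

In this model the real idle periods are identical in both runs. Only the time misattributed to the dummy op differs, yet that alone is enough to move the toy governor from the shallow state to the deep one and multiply the wake-up cost.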
At least for tbench, this long-standing, unconditional workaround in the Linux kernel has been hurting AMD Ryzen / Threadripper / EPYC performance in select workloads. It has not affected modern Intel systems, since those newer platforms use the alternative MWAIT-based intel_idle driver code path instead. The AMD patch evolved into a patch by Intel Linux engineer Dave Hansen that limits the “dummy wait” workaround to old systems, and it is already queued in TIP’s x86/urgent branch. Because it is going through x86/urgent and fixes an overzealous workaround that isn’t needed on modern hardware, the patch will likely still be submitted this week for the Linux 6.0 kernel rather than waiting for the next (v6.1) merge window.
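For readers who want to see which idle driver their own machine uses, the following sketch reads the driver name the kernel exposes through sysfs. The helper names are hypothetical, and the "acpi_idle" driver name is an assumption not stated in the article. A system reporting that driver is at most a candidate for the affected path; this check does not prove the workaround is being hit.

```python
from pathlib import Path

# Standard sysfs location on Linux; it may be absent if cpuidle is not built in.
CPUIDLE_DRIVER_PATH = Path("/sys/devices/system/cpu/cpuidle/current_driver")


def current_cpuidle_driver(path=CPUIDLE_DRIVER_PATH):
    """Return the active cpuidle driver name, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip() or None
    except OSError:
        return None


def uses_acpi_idle(path=CPUIDLE_DRIVER_PATH):
    """True if the ACPI idle driver is active (assumed name: 'acpi_idle')."""
    return current_cpuidle_driver(path) == "acpi_idle"


if __name__ == "__main__":
    driver = current_cpuidle_driver()
    print(f"cpuidle driver: {driver or 'unknown'}")
    if uses_acpi_idle():
        print("ACPI idle path in use; the dummy wait may apply on older kernels.")
```

Modern Intel systems would be expected to report intel_idle here, which matches the article's point that they bypass the workaround.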