MuQSS and Wayland = audio glitches
Tomoms
Status: Curious
Joined: 20 Mar 2020
Posts: 5
Hi everybody, I'm an openSUSE Tumbleweed user and I run a self-built kernel that is basically openSUSE's default kernel plus all of the zen patches from the various branches of the zen-kernel repo. Since the Linux 5.3 release, if I recall correctly, I've been facing a weird issue that persists to this day: if the kernel is built with MuQSS enabled and I log into a KDE Plasma + Wayland session, the audio output is very glitchy. It often skips and replays a very short snippet of a sound that was played a few seconds before the glitch. It seems very similar to what is described in the first paragraph of this Reddit post: [link], except that the glitches are very frequent and seem to be triggered by any CPU load spike, even the smallest one: they occur when opening a program, when switching workspaces, when clicking on window titlebars, basically whenever I interact with the desktop environment.
This doesn't happen if I apply all of zen's patches leaving out MuQSS. Kernels built using MuQSS don't cause this problem if I use Xorg. This seems to be a very weird bug and I'm not sure if you guys are the ones to whom I should be reporting it, but anyway, I hope someone can give me their opinion on this.
Thanks :)
damentz
Status: Assistant
Joined: 09 Sep 2008
Posts: 1122
Are you using a port of Liquorix on openSUSE or a stock kernel + zen patches? To take advantage of MuQSS properly, you'll want:

    High resolution timeouts - this escapes the HZ granularity of the kernel tick
    Forced threaded IRQs - moves scheduling of IRQs to MuQSS

And in userspace, you'll want pulseaudio to start with the highest priority your system lets you. If you can get a niceness of -19, that's preferred. If you can get pulseaudio to start with SCHED_FIFO or SCHED_RR, that's even better.

From a terminal, you can make pulseaudio run with isochronous scheduling (SCHED_ISO) by executing schedtool -I $(pidof pulseaudio). This can be done as your regular user without escalating privileges, and gets you the same behavior as realtime scheduling without the chance of a runaway process taking down your system.

Lastly, run your system with the performance governor in cpufreq. A lot of distributions like to start systems with ondemand or intel_pstate's powersave, even though the kernel's preferred configuration is to run in performance mode. The catch is that MuQSS schedules processes in a way that prevents these governors from properly ramping up the CPU frequency. There's a chance you're running at your processor's lowest frequency, and at that speed your CPU is not capable of filling your audio buffers while multitasking. See if running your processor at max speed fixes the problem.

Can you verify that any of the above fixes your audio xruns?

EDIT: One other thing you can try, considering you're running a stock kernel with zen patches + MuQSS: play with the yield value in /proc/sys/kernel/yield_type. It could be that Wayland is doing something nasty with yield (an ambiguous function call), in which case changing how yield behaves may solve your problem.
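
For reference, a minimal sketch of how the governor and yield checks above could be done from a shell. The cpufreq path is the standard sysfs location, and /proc/sys/kernel/yield_type only exists on MuQSS kernels. The function names and the optional path arguments are made up for illustration.

```shell
#!/bin/sh
# Print the active cpufreq governor of each CPU.
# $1 is an optional base directory (illustration-only parameter);
# it defaults to the standard sysfs location.
show_governors() {
    base="${1:-/sys/devices/system/cpu}"
    for f in "$base"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue
        cpu="${f#"$base"/}"
        printf '%s: %s\n' "${cpu%%/*}" "$(cat "$f")"
    done
}

# Print the current MuQSS yield type, or a note if the sysctl is absent
# (it is absent on kernels built without MuQSS).
show_yield_type() {
    file="${1:-/proc/sys/kernel/yield_type}"
    if [ -r "$file" ]; then
        printf 'yield_type: %s\n' "$(cat "$file")"
    else
        printf 'yield_type: not available\n'
    fi
}

show_governors
show_yield_type
```

To switch governors for a quick test, something like echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor should work until the next reboot. Writing a different value to yield_type as root works the same way.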
Tomoms
Status: Curious
Joined: 20 Mar 2020
Posts: 5
Thanks for the tips. I'll try them and let you know if they are of any help.
Latest news on this issue
Tomoms
Status: Curious
Joined: 20 Mar 2020
Posts: 5
Ok, so I've rebuilt with forced IRQ threading enabled. What I am using is: stock openSUSE kernel (5.5.11) + zen patches from all branches + missing Con Kolivas patches (basically only one of them is missing in zen, i.e. the one that sets Preemptible as the recommended preemption type for desktops, in place of Voluntary Preemption).
I've tried your tweaks and found that running pulseaudio with that schedtool command has no effect. Playing with yield_type has no effect either. The only thing that seems to contain the issue is running the CPU with intel_pstate's performance governor instead of powersave, which is indeed the default governor in openSUSE. This does not fix the issue, but the audio skips definitely happen less often with performance. The problem is that my PC is a laptop, so running the CPU at max frequency all the time is not an option for me, as it would lead to sky-high battery drain.

Anyway, when logging into an Xorg KDE session, audio works perfectly even with MuQSS and the powersave governor. This issue is driving me crazy...
damentz
Status: Assistant
Joined: 09 Sep 2008
Posts: 1122
So, I tried using Wayland myself with KDE. I'm running Arch and the latest stable packages for all the major components.

One thing I noticed is that kwin for Wayland automatically sets its own priority to SCHED_RR. As a side effect, child processes seem to also get this higher priority. Open up ksysguard and sort by niceness; you should see a lot of processes running at RR 1.

Here's the commit that changed its behavior: phabricator.kde.org/R108:7c8003f7f6212ccad7de652943f94d501365d30f
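
If ksysguard isn't handy, roughly the same list can be pulled from ps. This is a minimal sketch, assuming the procps version of ps (which supports the cls and rtprio columns). The function names are made up for illustration.

```shell
#!/bin/sh
# Keep the header row plus rows whose scheduling class (2nd column)
# is RR (SCHED_RR) or FF (SCHED_FIFO).
filter_realtime() {
    awk 'NR == 1 || $2 == "RR" || $2 == "FF"'
}

# List realtime processes: PID, class, realtime priority, nice, name.
list_realtime() {
    ps -eo pid,cls,rtprio,ni,comm | filter_realtime
}
```

Calling list_realtime inside the Wayland session should then show kwin and whatever inherited its scheduling class, with the realtime priority in the RTPRIO column.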

As an experiment, what happens if you reset all these processes to a nice of 0 with the normal scheduler?

Unfortunately, I'm not able to reproduce the sound skipping that you're getting. Maybe it's the version and/or implementation openSUSE chose?

Here's my system for comparison:

:: Code ::
$ inxi -b
System:    Host: steven-thinkpad-x1c7 Kernel: 5.5.11-lqx1-1-lqx x86_64 bits: 64 Desktop: KDE Plasma 5.18.3
           Distro: Antergos Linux 16.4-ISO-Rolling
Machine:   Type: Laptop System: LENOVO product: 20QD001VUS v: ThinkPad X1 Carbon 7th serial: <root required>
           Mobo: LENOVO model: 20QD001VUS v: SDK0J40697 WIN serial: <root required> UEFI: LENOVO v: N2HET46W (1.29 )
           date: 02/21/2020
Battery:   ID-1: BAT0 charge: 49.1 Wh condition: 49.1/51.0 Wh (96%)
CPU:       Quad Core: Intel Core i7-8565U type: MT MCP speed: 1501 MHz min/max: 400/4600 MHz
Graphics:  Device-1: Intel UHD Graphics 620 driver: i915 v: kernel
           Display: wayland server: X.Org 1.20.7 driver: modesetting resolution: 1920x1080~60Hz, 1920x1200~60Hz
           OpenGL: renderer: Mesa Intel UHD Graphics (Whiskey Lake 3x8 GT2) v: 4.6 Mesa 19.3.4
Network:   Device-1: Intel Cannon Point-LP CNVi [Wireless-AC] driver: iwlwifi
           Device-2: Intel Ethernet I219-V driver: e1000e
Drives:    Local Storage: total: 476.94 GiB used: 306.84 GiB (64.3%)
Info:      Processes: 357 Uptime: 10m Memory: 15.45 GiB used: 4.76 GiB (30.8%) Shell: bash inxi: 3.0.37

Tomoms
Status: Curious
Joined: 20 Mar 2020
Posts: 5
It is indeed true that in a Wayland session I have lots of processes with RR 1 niceness (kwin, plasmashell, latte-dock and several others).

You suggested that I try to reset their niceness to 0 with the normal scheduler, but I've never done such a thing before, so I'd like to ask you if this is the command I should run:
:: Code ::
renice -n 0 -p $(pgrep name_of_the_process)

Should I run it manually for each of the RR 1 processes? And what do you mean by saying "with the normal scheduler"? No MuQSS?

My system is described here:
:: Code ::
> inxi -b
System:    Host: x556uv.fritz.box Kernel: 5.5.11-Tom x86_64 bits: 64 Desktop: KDE Plasma 5.18.3
           Distro: openSUSE Tumbleweed 20200322
Machine:   Type: Laptop System: ASUSTeK product: X556UV v: 1.0 serial: <superuser/root required>
           Mobo: ASUSTeK model: X556UV v: 1.0 serial: <superuser/root required> UEFI: American Megatrends v: X556UV.316
           date: 01/25/2019
Battery:   ID-1: BAT0 charge: 8.5 Wh condition: 16.7/38.0 Wh (44%)
CPU:       Dual Core: Intel Core i5-6200U type: MT MCP speed: 500 MHz min/max: 400/2300 MHz
Graphics:  Device-1: Intel Skylake GT2 [HD Graphics 520] driver: i915 v: kernel
           Device-2: NVIDIA GM108M [GeForce 920MX] driver: nouveau v: kernel
           Display: wayland server: X.org 1.20.7 driver: modesetting,nouveau unloaded: fbdev,vesa
           resolution: <xdpyinfo missing>
           OpenGL: renderer: Mesa DRI Intel HD Graphics 520 (SKL GT2) v: 4.6 Mesa 20.0.1
Network:   Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet driver: r8169
           Device-2: Qualcomm Atheros QCA9377 802.11ac Wireless Network Adapter driver: ath10k_pci
Drives:    Local Storage: total: 232.89 GiB used: 42.82 GiB (18.4%)
Info:      Processes: 254 Uptime: N/A Memory: 11.57 GiB used: 1.45 GiB (12.6%) Shell: bash inxi: 3.0.38


Thanks for dedicating some of your time to my tricky question :)
damentz
Status: Assistant
Joined: 09 Sep 2008
Posts: 1122
If you don't already have schedtool, you'll want to install that since it supports changing the scheduling type per process.

As for the command, it'll look like: schedtool -N -n0 $(pidof name_of_process)

However, I'd recommend trying to reset the priorities through ksysguard if you can. Ksysguard throws up a dialog where you can change niceness, scheduling type, and IO scheduling. For what you're doing it'll probably be a lot easier than collecting all the PIDs.
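
If you'd rather script it, here is a rough sketch that applies the same schedtool call to every process currently in the RR class. It assumes procps ps and schedtool are installed. The function names and the DRY_RUN switch are made up for illustration, and processes you don't own may need root.

```shell
#!/bin/sh
# Print the PIDs of all processes currently in the round-robin class.
rr_pids() {
    ps -eo pid=,cls= | awk '$2 == "RR" { print $1 }'
}

# Reset every PID read from stdin to SCHED_NORMAL with nice 0.
# Set DRY_RUN=1 to print the commands instead of running them.
reset_to_normal() {
    while read -r pid; do
        [ -n "$pid" ] || continue
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "schedtool -N -n 0 $pid"
        else
            schedtool -N -n 0 "$pid"
        fi
    done
}
```

Run rr_pids | reset_to_normal, ideally after a first pass with DRY_RUN=1 to check what would be touched. Note that rr_pids also picks up kernel threads or other services that use SCHED_RR on purpose, so review the dry-run output before applying it.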

EDIT:
:: Code ::
Battery:   ID-1: BAT0 charge: 8.5 Wh condition: 16.7/38.0 Wh (44%)

You should get a replacement for that battery!
Latest news
Tomoms
Status: Curious
Joined: 20 Mar 2020
Posts: 5
Ok, so after setting normal scheduler & 0 nice level for all those RR 1 processes, and after enabling realtime scheduling with realtime prio = 3 in /etc/pulse/daemon.conf (the "0 nice level" tweak alone wasn't enough), I managed to listen to 4 minutes of music without skips or glitches, which is a good result. I'll keep testing this setup to see if the good result is confirmed.
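
For anyone following along: the daemon.conf change amounts to having the lines realtime-scheduling = yes and realtime-priority = 3 uncommented (a leading ; marks a commented-out default). A small sketch to check which scheduling settings are actually active; the function name and the optional path argument are made up for illustration.

```shell
#!/bin/sh
# Print the uncommented scheduling-related settings from a daemon.conf.
# $1 defaults to the system-wide file mentioned above.
pulse_sched_settings() {
    conf="${1:-/etc/pulse/daemon.conf}"
    [ -r "$conf" ] || { echo "cannot read $conf" >&2; return 1; }
    grep -E '^[[:space:]]*(high-priority|nice-level|realtime-scheduling|realtime-priority)[[:space:]]*=' "$conf"
}
```

Call pulse_sched_settings after editing the file. The daemon has to be restarted for the change to apply, e.g. with pulseaudio -k on setups where it respawns automatically.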

OT: yes, the battery is old, and the level and capacity percentages are very unreliable. The capacity reported by inxi swings between 44% and 65%. I'll definitely replace it when it no longer lasts long enough for my needs, but for now, it does.
damentz
Status: Assistant
Joined: 09 Sep 2008
Posts: 1122
Ah, that's great! My suspicion is that the exaggerated priority levels for kwin, and the fact that they leak to child applications, are fooling MuQSS into thinking audio is less important.

In general, MuQSS takes priority differences much more seriously than CFS. This means running desktop applications and the window manager at SCHED_RR can completely interrupt pulseaudio, even if you're running a nice of -19 without realtime priorities. That's how you're probably getting audio skips.

This is especially true for your CPU: the i5-6200U has only two physical cores, which doesn't give you much headroom when many applications are allowed to clobber the deadlines of everything else on the system.

Crossing my fingers this was the problem!