Occasional lockups
gelabs
Status: Interested
Joined: 02 May 2017
Posts: 16
Hello,

My PC often locks up hard, and today I managed to grab part of the log before it froze:

:: Code ::
May  2 13:53:08 xxx kernel: ppdev: user-space parallel port driver
May  2 13:53:09 xxx kernel: lp: driver loaded but no devices found
May  2 13:53:09 xxx kernel: st: Version 20160209, fixed bufsize 32768, s/g segs 256
May  2 13:53:13 xxx kernel: NET: Registered protocol family 17
May  2 13:54:42 xxx kernel: ------------[ cut here ]------------
May  2 13:54:42 xxx kernel: invalid opcode: 0000 [#1] PREEMPT SMP
May  2 13:54:42 xxx kernel: Modules linked in: af_packet joydev st sr_mod cdrom lp parport_pc ppdev parport fuse dm_mod vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) cpuid snd_seq_dummy snd_hrtimer xt_multiport iptable_filter uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev media input_leds hid_generic usbhid pci_stub binfmt_misc iTCO_wdt iTCO_vendor_support intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm irqbypass eeepc_wmi crct10dif_pclmul asus_wmi crc32_pclmul sparse_keymap rfkill ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd snd_ice1712 snd_cs8427 snd_i2c snd_ice17xx_ak4xxx snd_ak4xxx_adda snd_mpu401_uart glue_helper snd_ac97_codec cryptd ac97_bus sg lpc_ich shpchp battery tpm_infineon evdev acpi_pad tpm_tis tpm_tis_core tpm snd_seq_midi snd_seq_midi_event
May  2 13:54:42 xxx kernel: snd_aloop snd_usb_audio snd_usbmidi_lib snd_hwdep snd_rawmidi snd_pcm asus_atk0110 coretemp snd_seq snd_seq_device snd_timer snd soundcore ip_tables x_tables ipv6 crc_ccitt autofs4 ext4 crc16 jbd2 fscrypto mbcache sd_mod mxm_wmi ahci libahci crc32c_intel i915 i2c_i801 libata scsi_mod r8169 intel_gtt i2c_algo_bit mii drm_kms_helper ehci_pci ehci_hcd drm xhci_pci xhci_hcd i2c_core fan thermal rtc_cmos wmi fjes video button [last unloaded: parport_pc]
May  2 13:54:42 xxx kernel: CPU: 0 PID: 13572 Comm: kworker/0:1 Tainted: G           O    4.10.0-11.1-liquorix-amd64 #1 liquorix 4.10-2
May  2 13:54:42 xxx kernel: Hardware name: ASUS All Series/Z97-K, BIOS 2604 05/20/2015
May  2 13:54:42 xxx kernel: Workqueue: cgroup_destroy css_killed_work_fn
May  2 13:54:42 xxx kernel: task: ffff88019befe800 task.stack: ffffc9000db80000
May  2 13:54:42 xxx kernel: RIP: 0010:bfq_entity_service_tree+0x136/0x1c0
May  2 13:54:42 xxx kernel: RSP: 0018:ffffc9000db83b70 EFLAGS: 00010046
May  2 13:54:42 xxx kernel: RAX: 0000000000000000 RBX: ffff8803e802a810 RCX: 0000000000000001
May  2 13:54:42 xxx kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8803e802a810
May  2 13:54:42 xxx kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: ffff88041fa177a0
May  2 13:54:42 xxx kernel: R10: 0000000000000008 R11: ffffffff81c8d8c0 R12: 0000000000000000
May  2 13:54:42 xxx kernel: R13: ffff880265aa3eb8 R14: ffff8803e802a810 R15: 0000000000000000
May  2 13:54:42 xxx kernel: FS:  0000000000000000(0000) GS:ffff88041fa00000(0000) knlGS:0000000000000000
May  2 13:54:42 xxx kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May  2 13:54:42 xxx kernel: CR2: 00007fd2b9935000 CR3: 0000000001c09000 CR4: 00000000001406f0
May  2 13:54:42 xxx kernel: Call Trace:
May  2 13:54:42 xxx kernel: ? __schedule+0x69c/0xf80
May  2 13:54:42 xxx kernel: ? _raw_spin_unlock+0x11/0x30
May  2 13:54:42 xxx kernel: ? bfq_pd_offline+0x169/0x360
May  2 13:54:42 xxx kernel: ? resched_best_idle+0x151/0x1c0
May  2 13:54:42 xxx kernel: ? blkg_destroy+0x61/0x340
May  2 13:54:42 xxx kernel: ? try_preempt+0x123/0x160
May  2 13:54:42 xxx kernel: ? __schedule+0x69c/0xf80
May  2 13:54:42 xxx kernel: ? ttwu_do_wakeup+0x8a/0xa0
May  2 13:54:42 xxx kernel: ? _raw_spin_unlock_irqrestore+0x1b/0x30
May  2 13:54:42 xxx kernel: ? try_to_wake_up+0x201/0x4a0
May  2 13:54:42 xxx kernel: ? schedule_preempt_disabled+0x17/0x150
May  2 13:54:42 xxx kernel: ? _raw_spin_lock+0xe/0x30
May  2 13:54:42 xxx kernel: ? preempt_count_add+0x44/0xa0
May  2 13:54:42 xxx kernel: ? __mutex_lock_slowpath+0x6c/0x310
May  2 13:54:42 xxx kernel: ? wake_up_q+0x4f/0x70
May  2 13:54:42 xxx kernel: ? blkcg_css_offline+0x4a/0x90
May  2 13:54:42 xxx kernel: ? css_killed_work_fn+0x58/0x100
May  2 13:54:42 xxx kernel: ? process_one_work+0x1e8/0x4b0
May  2 13:54:42 xxx kernel: ? worker_thread+0x42/0x540
May  2 13:54:42 xxx kernel: ? kthread+0x13f/0x180
May  2 13:54:42 xxx kernel: ? process_one_work+0x4b0/0x4b0
May  2 13:54:42 xxx kernel: ? __kthread_create_on_node+0x150/0x150
May  2 13:54:42 xxx kernel: ? SyS_exit_group+0x2f/0x90
May  2 13:54:42 xxx kernel: ? ret_from_fork+0x26/0x40
May  2 13:54:42 xxx kernel: Code: b8 08 08 00 00 48 85 ff 75 25 48 8d 44 6d 00 48 c1 e0 04 49 8d 44 04 10 e9 58 ff ff ff 0f 0b 0f 0b 31 c0 eb b3 48 8b 42 08 eb a5 <0f> 0b 0f 0b 89 e8 8b 93 cc 00 00 00 4c 8d 44 24 10 48 8d 04 40
May  2 13:54:42 xxx kernel: ---[ end trace 92b76b92dc07ab10 ]---
May  2 13:54:42 xxx kernel: note: kworker/0:1[13572] exited with preempt_count 2
May  2 13:59:15 xxx kernel: Task dump for CPU 0:
May  2 13:59:15 xxx kernel: pool            R  running task        0 13591      1 0x00000008
May  2 13:59:15 xxx kernel: Call Trace:
May  2 13:59:15 xxx kernel: ? sd_ioctl+0x7b/0x100 [sd_mod]
May  2 13:59:15 xxx kernel: ? blkdev_ioctl+0x882/0xb70
May  2 13:59:15 xxx kernel: ? _raw_spin_lock+0xe/0x30
May  2 13:59:15 xxx kernel: ? _raw_spin_unlock+0x11/0x30


I think it happens mostly when I log out of an SSH session while I am away from home.
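For triage, the most useful fields in a trace like the one above are the faulting symbol on the `RIP:` line, the `Workqueue:` line, and the function names in the first call trace. The sketch below pulls those out of syslog-style text. The function name `summarize_oops` and the regular expressions are illustrative assumptions written against the format of this particular log, not a general oops parser. Note that entries prefixed with `?` are frames the kernel's unwinder could not verify, so they are hints rather than a guaranteed call chain.

```python
import re

# Patterns written against the log format shown in this thread (assumption).
RIP_RE = re.compile(r"RIP: [0-9a-f]{4}:(?P<symbol>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")
FRAME_RE = re.compile(r"kernel: \? (?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")
WORKQUEUE_RE = re.compile(r"Workqueue: (?P<name>\S+) (?P<fn>\S+)")


def summarize_oops(log_text):
    """Extract the faulting symbol, workqueue and first call trace from an oops."""
    summary = {"rip": None, "workqueue": None, "frames": []}
    in_trace = False
    for line in log_text.splitlines():
        if "end trace" in line:
            break  # ignore later dumps (e.g. the "Task dump" that follows)
        if summary["rip"] is None:
            match = RIP_RE.search(line)
            if match:
                summary["rip"] = match.group("symbol")
        if summary["workqueue"] is None:
            match = WORKQUEUE_RE.search(line)
            if match:
                summary["workqueue"] = (match.group("name"), match.group("fn"))
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            match = FRAME_RE.search(line)
            if match:
                summary["frames"].append(match.group("symbol"))
            else:
                in_trace = False  # first non-frame line ends the trace
    return summary
```

Feeding it the saved log text (for example the contents of a file copied from the syslog) returns a small dictionary that is easier to compare against other reports than the raw dump.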

:: Code ::

inxi -bxx
System:    Host: xxx Kernel: 4.10.0-11.1-liquorix-amd64 x86_64 (64 bit gcc: 6.3.0)
           Desktop: KDE Plasma 5.8.6 (Qt 5.7.1) dm: sddm,sddm Distro: Debian GNU/Linux 9 (stretch)
Machine:   Device: desktop System: ASUS product: All Series
           Mobo: ASUSTeK model: Z97-K v: Rev X.0x UEFI [Legacy]: American Megatrends v: 2604 date: 05/20/2015
CPU:       Quad core Intel Core i7-4790K (-MCP-) speed/max: 4353/4400 MHz
Graphics:  Card: Intel Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller
           bus-ID: 00:02.0 chip-ID: 8086:0412
           Display Server: X.Org 1.19.3 driver: modesetting Resolution: 1920x1080@60.00hz
           GLX Renderer: Mesa DRI Intel Haswell Desktop GLX Version: 3.0 Mesa 13.0.6 Direct Rendering: Yes
Network:   Card: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
           driver: r8169 v: 2.3LK-NAPI port: e000 bus-ID: 03:00.0 chip-ID: 10ec:8168
Drives:    HDD Total Size: 1006.2GB (43.5% used)
Info:      Processes: 189 Uptime: 20 min Memory: 734.4/15487.9MB
           Init: systemd v: 232 runlevel: 5 Gcc sys: 6.3.0 alt: 4.8/4.9/5
           Client: Shell (bash 4.4.111 running in konsole) inxi: 2.3.5


HyperThreading and SpeedStep/C-States are disabled in the BIOS.
Jackd2 is running in realtime mode with a2jmidi.
Some USB devices are always plugged in:

:: Code ::

Audio:  Card-1 VIA ICE1712 [Envy24] PCI Multi-Channel I/O Controller driver: snd_ice1712
           Card-2 Logitech Webcam C170 driver: USB Audio
           Card-3 Harman driver: USB Audio
           Card-4 Roland UM-2(C/EX) driver: USB Audio
           Card-5 EGO SYStems driver: USB Audio
           Sound: Advanced Linux Sound Architecture v: k4.10.0-11.1-liquorix-amd64


Please let me know if you need additional information.
damentz
Status: Assistant
Joined: 09 Sep 2008
Posts: 1117
Hi gelabs,

I looked this over briefly with Con, and it doesn't look like a bug caused by any specific feature in Liquorix or MuQSS. Searching online, this bug has surfaced before but apparently never reached a kernel developer:

bugs.launchpad.net/ubuntu/+source/linux/+bug/1315736

The only thing your report has in common with what I've seen online is that some modification to cgroups is happening, maybe at the request of a process you're running. Can you think of any process that might be trying to create its own cgroup to protect its CPU usage? I see you mentioned JackD. Maybe its realtime configuration is spawning and destroying cgroups?

Also, MuQSS doesn't respect the rules of cgroups, so they can be created and destroyed, but CPU time won't be shared out the way CFS would allocate it.
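One rough way to see whether cgroups are being created and destroyed around the time of a lockup is to watch the `num_cgroups` column of `/proc/cgroups`. The sketch below is only an illustration: the helper names `parse_proc_cgroups`, `diff_counts` and `watch` are made up for this example, and it reports only that a count changed, not which process caused it.

```python
import time
from pathlib import Path


def parse_proc_cgroups(text):
    """Return {controller: num_cgroups} from /proc/cgroups-style text."""
    counts = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip the header and blank lines
        fields = line.split()
        if len(fields) < 4:
            continue
        # Columns: subsys_name, hierarchy, num_cgroups, enabled
        counts[fields[0]] = int(fields[2])
    return counts


def diff_counts(before, after):
    """Return {controller: delta} for controllers whose cgroup count changed."""
    changes = {}
    for name in set(before) | set(after):
        delta = after.get(name, 0) - before.get(name, 0)
        if delta != 0:
            changes[name] = delta
    return changes


def watch(iterations, interval=1.0, path=Path("/proc/cgroups")):
    """Poll the file and print any change in per-controller cgroup counts."""
    previous = parse_proc_cgroups(path.read_text())
    for _ in range(iterations):
        time.sleep(interval)
        current = parse_proc_cgroups(path.read_text())
        changes = diff_counts(previous, current)
        if changes:
            print(time.strftime("%H:%M:%S"), changes)
        previous = current
```

Running `watch(60)` in one terminal while logging in and out over SSH in another should show whether the `blkio` count drops at logout, which is the path the trace goes through (`blkcg_css_offline`).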

EDIT: One interesting thing in your stack trace is the call to bfq_pd_offline. Is your system perhaps configured to disconnect or spin down drives around that time? This could be a bug in BFQ where some latent process still thinks the group BFQ destroyed exists.
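To check whether BFQ is actually the active I/O scheduler on each drive, the kernel exposes the list in `/sys/block/<device>/queue/scheduler`, with the active entry in square brackets. A minimal sketch follows. The function names are hypothetical, and the available scheduler names depend on how the kernel was built.

```python
from pathlib import Path


def active_scheduler(text):
    """Return the bracketed scheduler name from a queue/scheduler line, or None."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None  # no bracketed entry, e.g. a device without a scheduler


def schedulers_by_device(sys_block=Path("/sys/block")):
    """Map each block device name to its active I/O scheduler."""
    result = {}
    for sched_file in sorted(sys_block.glob("*/queue/scheduler")):
        device = sched_file.parent.parent.name
        try:
            result[device] = active_scheduler(sched_file.read_text())
        except OSError:
            continue  # device vanished or file not readable
    return result
```

Calling `schedulers_by_device()` returns something like a device-to-scheduler mapping for the machine it runs on. If BFQ shows up for the SSDs, writing another scheduler name listed in the same file (as root) and seeing whether the lockups stop would be one way to test the BFQ theory.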
gelabs
Status: Interested
Joined: 02 May 2017
Posts: 16
Hello Damentz,

Thank you for your insight. I will definitely look into those cgroups. As for bfq_pd_offline, I'm not sure, since I only have SSD drives, but I will look into it anyway.

Thank you for this kernel, by the way. It's very nice and powerful for working with music software (MIDI/audio) :)

All times are GMT - 8 Hours