nVidia module fails to load: Unknown symbol mtrr_del
Mono
Status: Contributor
Joined: 21 Jun 2012
Posts: 62
Hi everyone. I've had one of those terrifying, but rather normal problems with the nVidia drivers for ages now. Some time ago I discovered that the nVidia drivers would ONLY install on kernel 3.16, and not on any other kernel no matter what I did. I gave up, the computer still worked and I had better things to do.

In all that time I got slightly more clever. I looked at the sgfxi log and realized that the module was building without errors, and problems occurred only when trying to load the module. So I ran dmesg | grep nvidia and found several error messages like this:

nvidia: Unknown symbol mtrr_del (err 0)
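
For anyone hitting the same load failure, here is a minimal sketch of how the missing symbols could be pulled out of the kernel log in one go. The helper name missing_symbols is made up for illustration, and the sketch assumes the log lines have the same "Unknown symbol NAME (err N)" shape as the message above.

```shell
#!/bin/sh
# List the unique symbol names a module failed to resolve.
# Reads kernel log text on stdin. "missing_symbols" is a hypothetical helper name.
missing_symbols() {
    module="$1"
    # Keep only "<module>: Unknown symbol NAME (err N)" lines and print NAME.
    sed -n "s/.*${module}: Unknown symbol \([A-Za-z0-9_]*\).*/\1/p" | sort -u
}

# Typical use on a live system (reading the kernel log may require root):
#   dmesg | missing_symbols nvidia
```

The function only reads stdin, so it can also be pointed at a saved log file instead of live dmesg output.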

It looks like a patch is available for this problem, but the patch must be applied to the kernel:

https://bugs.archlinux.org/task/47092

Can either sgfxi or the Liquorix kernel be updated to fix this please?

My problem specifically is with mtrr_del and mtrr_add, and seems to be caused by this change in the kernel source, which can simply be reverted:

git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=2baa891e42d84159b693eadd44f6fe1486285bdc
techAdmin
Status: Site Admin
Joined: 26 Sep 2003
Posts: 3735
Location: East Coast, West Coast? I know it's one of them.
I read that thread, and it seems like as many people, or more, had total failures using that patch than had success.

Also, sgfxi does NOT patch the kernel, though it will now and then apply a patch to the nvidia driver itself when required.

If you have not filed a bug report with nvidia, you should do so immediately: https://devtalk.nvidia.com/default/topic/522835/linux/if-you-have-a-problem-please-read-this-first/

Follow those directions. Generating the nvidia bug report only takes a few minutes.

Of course, the bug report tool needs to capture this failure: boot into the current kernel, install the current nvidia driver, and when the install fails, run the bug report tool from the terminal.

sgfxi can't do anything about this, and given how unreliable the kernel patch appears to be, I suspect it's unlikely Liquorix will add it to their kernel.
Mono
Status: Contributor
Joined: 21 Jun 2012
Posts: 62
I made a report here:

https://devtalk.nvidia.com/default/topic/909059/linux/nvidia-kernel-module-fails-to-load-since-kernel-3-16-missing-mtrr-symbols/

I don't expect much of a response since it is a known issue. I will try out the patch if I can figure out how to do that.
Mono
Status: Contributor
Joined: 21 Jun 2012
Posts: 62
There is another patch here:

https://anonscm.debian.org/viewvc/pkg-nvidia/packages/nvidia-graphics-drivers-legacy-304xx/trunk/debian/module/debian/patches/disable-mtrr.patch?view=markup
Mono
Status: Contributor
Joined: 21 Jun 2012
Posts: 62
It works! I figured that patch was for the Debian default drivers, so I used nvidia-detect. If I use sgfxi it still gives the missing symbol error.

However, if I understand it right, old AGP cards will suffer a performance loss because they depend on MTRR, which is disabled by this patch. The only solution for that will be to wait for nVidia to update their drivers, assuming they won't just drop support for AGP cards.

:: Code ::
~$ inxi -bxx
System:    Host: ronin Kernel: 4.4-0.dmz.2-liquorix-amd64 x86_64 (64 bit gcc: 5.3.1)
           Desktop: Xfce 4.12.3 (Gtk 2.24.28) dm: lightdm Distro: Debian GNU/Linux stretch/sid
Machine:   Mobo: ASUSTeK model: M5A99FX PRO R2.0 v: Rev 1.xx Bios: American Megatrends v: 2501 date: 04/07/2014
CPU:       Octa core AMD FX-8350 Eight-Core (-MCP-) speed/max: 1400/4000 MHz
Graphics:  Card: NVIDIA G73 [GeForce 7600 GS] bus-ID: 01:00.0 chip-ID: 10de:0392
           Display Server: X.Org 1.17.3 driver: nvidia Resolution: 1280x960@60.00hz
           GLX Renderer: GeForce 7600 GS/PCIe/SSE2 GLX Version: 2.1.2 NVIDIA 304.131 Direct Rendering: Yes
Network:   Card: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
           driver: r8169 v: 2.3LK-NAPI port: a000 bus-ID: 0a:00.0 chip-ID: 10ec:8168
Drives:    HDD Total Size: 521.6GB (27.8% used)
Info:      Processes: 326 Uptime: 0 min Memory: 667.8/7887.2MB
           Init: systemd v: 228 runlevel: 5 default: 2 Gcc sys: 5.3.1 alt: 4.4/4.6/4.8/4.9
           Client: Shell (bash 4.3.421 running in xfce4-terminal) inxi: 2.2.28

techAdmin
Status: Site Admin
Joined: 26 Sep 2003
Posts: 3735
Location: East Coast, West Coast? I know it's one of them.
Is this a patch for the nvidia driver or the kernel? It's not clear to me.

sgfxi can add patches, but I have no way to make it know when the card is an AGP card.

https://devtalk.nvidia.com/default/topic/893282/304-128-and-kernel-4-3-can-compile-but-cannot-insert-it-mtrr-symbols-related-errors-/

That's clearer: the kernel must be patched, so nothing sgfxi does on its end would matter. OK.

My guess is there will be a driver out with this support fairly soon.
Mono
Status: Contributor
Joined: 21 Jun 2012
Posts: 62
It's a patch for the drivers. The file patched is:

/usr/src/nvidia-legacy-304xx-304.131/nv-linux.h
techAdmin
Status: Site Admin
Joined: 26 Sep 2003
Posts: 3735
Location: East Coast, West Coast? I know it's one of them.
No, that link clearly stated that the patched driver without the kernel patched doesn't work.

If you can find me a non-Debian nvidia driver patch, aka a real patch for the actual nvidia driver itself, I'll look at it and add it to sgfxi, but I didn't find anything except for kernel patches when I searched.
Mono
Status: Contributor
Joined: 21 Jun 2012
Posts: 62
That link is old and not entirely accurate - the kernel patch was just a first attempt to hack the problem away. Maybe I'm just thick, but I just posted an inxi with a working nvidia driver using an unpatched Liquorix kernel. Text of the patch:

:: Code ::
+/*
+ * As of version 304.131, os-agp.c and os-mtrr.c still use deprecated
+ * kernel APIs for mtrr which are no longer exported since 4.3, causing
+ * the module to error out when loaded.
+ */


Only the AGP code in the driver uses mtrr, and new kernels no longer export those mtrr functions anyway, so this patch disables mtrr in the nvidia driver so that it won't look for the mtrr symbols that are now missing from the kernel.

Disabling mtrr doesn't even disable the AGP drivers, just slows them down.
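
To check whether a given kernel still exports these symbols before building the driver, something like the sketch below might help. It is only an illustration: symbol_exported is a hypothetical helper name, and it assumes the kernel headers are installed so that a Module.symvers file (symbol name in the second column) is available, which is commonly under /lib/modules/$(uname -r)/build/ but may differ on your system.

```shell
#!/bin/sh
# Return 0 if SYMBOL is listed as exported in a Module.symvers file, non-zero otherwise.
# "symbol_exported" is a hypothetical helper name, not part of any existing tool.
symbol_exported() {
    symbol="$1"
    symvers="$2"
    # Column 2 of Module.symvers holds the exported symbol name.
    awk -v sym="$symbol" '$2 == sym { found = 1 } END { exit !found }' "$symvers"
}

# Typical use (the path is an assumption and requires installed kernel headers):
#   symvers="/lib/modules/$(uname -r)/build/Module.symvers"
#   for s in mtrr_add mtrr_del; do
#       if symbol_exported "$s" "$symvers"; then echo "$s: exported"; else echo "$s: not exported"; fi
#   done
```

If both symbols come back as not exported, an unpatched 304.xx module would be expected to fail at load time with the Unknown symbol errors described at the top of the thread.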