
minor nvidia woes again
hoodwink
Status: Interested
Joined: 17 Sep 2008
Posts: 21
Location: Denver Area
During the smxi d-u I completed a few minutes ago, I had no errors until the graphics selection. When I relinked nvidia, the install failed with return code 1. After a reboot and relink, all is well.

/var/log/smxi.log doesn't show any errors related to this, just

...
gfx command string: sgfxi -c -o 177.80 -j 1 -DX
...

and the usual utility start/end messages.

hoodwink
Status: Interested
Joined: 17 Sep 2008
Posts: 21
Location: Denver Area
Not obvious from this post, but I'm running sidux and using aptitude.

techAdmin
Status: Site Admin
Joined: 26 Sep 2003
Posts: 4124
Location: East Coast, West Coast? I know it's one of them.
I believe I may know what is wrong. I had a similar obscure problem that I finally figured out.

I had taken a motherboard that I'd replaced for a client. It had exhibited weird behavior, so we just upgraded since it was time anyway, and I then used the old board as an nvidia test box.

Very soon after, I realized it was locking up at random intervals. I checked dmesg and the error output in the terminal, and after a while I concluded that it almost certainly has a bad but not dead IDE controller chip, which makes writes to the hard disks fail at random intervals while succeeding at others.

Things like driver installs triggered that issue, since more data than usual gets written quickly.

Given your consistent issues, which only you are experiencing, I'm going to guess you have a similar subtle hardware problem. A very close study of your dmesg logs might, but might not, show evidence of it.

hoodwink
Status: Interested
Joined: 17 Sep 2008
Posts: 21
Location: Denver Area
I'm guessing that this is not the problem.

1. I'm not experiencing any random lockups
2. My drives are sata
3. No hard drive (or any other) errors in my logs
4. Memory replaced a couple of months ago.
5. I would expect something else in the hundreds of packages installed every week to fail if this were a hardware problem.

I don't know whether this is an smxi problem or not, but is there any way for you to capture more informative error messages from the nVidia installer?

techAdmin
Status: Site Admin
Joined: 26 Sep 2003
Posts: 4124
Location: East Coast, West Coast? I know it's one of them.
The nvidia installer has its own log. After a failure, copy it from /var/log/nvidia ... (I can't remember its file name). It gets overwritten on each run, so make sure to get the current one.

An exit code of 1 is one of the more useless pieces of information a program can give. It's about as bad as apt's single error 100 (used for all internal apt errors), or 1 (user exit of apt), which is likewise useless.
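
A minimal sketch of saving that log before the next installer run overwrites it. Because the exact file name isn't given here, it copies every file in the log directory whose name starts with `nvidia`. The function name `preserve_nvidia_logs`, the prefix match, and the default destination directory are assumptions made for illustration, not part of smxi or sgfxi.

```python
import os
import shutil
import time


def preserve_nvidia_logs(log_dir="/var/log", dest_dir="/root/nvidia-logs", prefix="nvidia"):
    """Copy every regular file in log_dir whose name starts with prefix.

    Each copy gets a timestamp suffix so a later installer run cannot
    overwrite it. Returns the list of paths written.
    (dest_dir and prefix are illustrative defaults, not smxi settings.)
    """
    os.makedirs(dest_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    saved = []
    for name in sorted(os.listdir(log_dir)):
        src = os.path.join(log_dir, name)
        if name.startswith(prefix) and os.path.isfile(src):
            dst = os.path.join(dest_dir, f"{name}.{stamp}")
            shutil.copy2(src, dst)  # copy2 also keeps the modification time
            saved.append(dst)
    return saved
```

Run as root right after the failure, calling `preserve_nvidia_logs()` and printing the returned paths shows which files were kept.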

hoodwink
Status: Interested
Joined: 17 Sep 2008
Posts: 21
Location: Denver Area
Thanks, I'll be sure to look for that if the installer fails again. I'm sure the one I have now has already been overwritten.

hoodwink
Status: Interested
Joined: 17 Sep 2008
Posts: 21
Location: Denver Area
OK, here's the scoop. The nvidia installer log reports

...
ERROR: Unable to find the kernel source tree for the currently running kernel.
...

So, apparently when a new kernel has been installed, the script is attempting to build the nvidia module based on the running kernel, not the newly installed kernel.

Hope this helps.

As usual, rebooting and then running smxi/sgfxi works without a hitch.
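
To illustrate the mismatch described above: the nvidia installer needs a kernel source or headers tree for the kernel it builds against, and on most Linux systems that tree is reachable through the `build` entry under `/lib/modules/<version>/`. The sketch below lists which installed kernels have a usable `build` entry and marks the running one. The function names are hypothetical, and this is only a diagnostic aid under that assumption, not what sgfxi itself does.

```python
import os


def kernel_build_trees(modules_root="/lib/modules"):
    """Map each kernel version under modules_root to True/False.

    True means <modules_root>/<version>/build exists and resolves.
    os.path.exists follows symlinks, so a dangling link counts as missing.
    """
    result = {}
    for version in sorted(os.listdir(modules_root)):
        version_dir = os.path.join(modules_root, version)
        if os.path.isdir(version_dir):
            result[version] = os.path.exists(os.path.join(version_dir, "build"))
    return result


def report(running, trees):
    """Return one human-readable line per kernel version."""
    lines = []
    for version, has_tree in trees.items():
        marker = " (running)" if version == running else ""
        state = "build tree found" if has_tree else "NO build tree"
        lines.append(f"{version}{marker}: {state}")
    if running not in trees:
        lines.append(f"{running} (running): no entry under modules directory")
    return lines
```

For example, `print("\n".join(report(os.uname().release, kernel_build_trees())))` run right after the d-u and before the reboot would show whether the running kernel still has its build tree while the newly installed one does.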

techAdmin
Status: Site Admin
Joined: 26 Sep 2003
Posts: 4124
Location: East Coast, West Coast? I know it's one of them.
There's something wrong with your system, but what it is, I can't tell you.

This script is run about 2000 or more times a week, and about 2/3 of those runs use the smxi installer, which installs the driver to the latest kernel.

If you can figure out what is wrong with the system, that would be useful.

You're not using some weird file system or something, are you?

Again, the driver install has no choice. If you are doing this right (i.e. install the kernel, do not exit smxi, use smxi to launch sgfxi, no exits), then smxi + sgfxi MUST install to the new kernel; that is not optional.

If you are starting sgfxi manually, then you of course have to use the -K option to do that, in which case sgfxi MUST install to the new kernel prior to reboot. This is not optional; it can't sometimes happen and sometimes not happen.

So there is something wrong either in how you are doing this or in your system. I think you should give some careful thought to what that is: look for differences, for non-standard things you are doing with your setup. Again, only you have this problem, so it's very safe to say that only your system has whatever it is that makes this happen. The trick is to find it.

Assuming you are always doing the above steps correctly, your system almost certainly has some obscure hardware problem.

If you are installing new kernels manually, you may be forgetting to install the kernel headers. I really can't say. Please try to figure out what it is that you are doing that could be different; that's how you will find a solution. This is not related to any default sgfxi/smxi behavior. If it's some obscure issue that only your system triggers, it would be good to know what it is, but since I have no access to your system, I can't tell you what that is.

hoodwink
Status: Interested
Joined: 17 Sep 2008
Posts: 21
Location: Denver Area
OK,

1. Nothing weird or unstable about my system.
2. No changes ever to the approach used for maintenance:
   init 3
   login
   smxi - d-u followed by nvidia latest install
   sometimes reboot, sometimes just start X
3. Plain ole ext3 file system.
4. This system was rebuilt at erebos level.
5. The only change I've noted over the months is that you (the smxi coder) used to install kernel-related upgrades, then the kernel, then the d-u, then nvidia. For some time now, the upgrades to the kernel have come near the end of the d-u.
6. The only change I've made to my system in recent months (other than adding a few applications here and there) is switching to aptitude as the d-u method in smxi.
7. Here's my smxi configuration, in case that helps:
:: Code ::
apt-type=aptitude
busybox-fix-1
debian-mirrors-1
du-connection-drop-1
gpm-fix-1
kernel-directory-update-1
kernel-metapackage-1
keyrings-sidux-1
keyrings-update-4
libc6-fix-4
meta-package-selection=automatic
nvidia-sse-test-1
RememberResponses
script-last-du-complete=2008-11-01-13:46
script-last-used=2008-11-01-13:50
smxi-kernel-mirror-2

8. I've never installed a kernel manually. I have removed older kernels (using smxi) from time to time.
9. I've never installed nvidia manually.
10. I'm not blaming you, but something is amiss. I have no problem with rebooting to get nvidia linked, but I shouldn't have to do that.
11. I fail to understand how anything I've done has anything to do with a d-u that reports no errors followed by an nvidia attempt (within smxi) that can't find the kernel source that was just installed by the d-u within the same smxi run.

techAdmin
Status: Site Admin
Joined: 26 Sep 2003
Posts: 4124
Location: East Coast, West Coast? I know it's one of them.
Are you using kernel metapackages?

smxi doesn't know about those kernels, and I don't try to keep up with them.

It sounds like you are.

smxi still offers the kernel install option pre d-u unless you have an older kernel, in which case it tells you so, and you need to install the kernel post d-u.

There are only two things that make this option not appear: one, if you are using kernel metapackages, and two, if you manually turned off the show kernel feature, which it appears you did not do. So if you see no pre d-u kernel install option, then you almost certainly are using metapackages.

There is an advanced tweak kernel install to remove those.

If this is the case, then I think there is a small bug in how the packages are getting handled, but I need more information to figure it out.

<update>I missed it. As I thought, you are in fact using metapackages, which do not work with the installer, or shouldn't.

meta-package-selection=automatic

This question was asked when you first ran smxi.

Personally, on sidux, the first thing I do is remove kernel metapackages. sidux releases new kernels sometimes once a day, or a few times a week, which is why I prefer to do the manual kernel install.

However, it does look like there may be a bug in the smxi detection of kernels and metapackages. As far as I remember, in no case should it be trying to install nvidia to the new kernel at all if you used kernel metapackages.

Just get rid of the metapackages using the built-in smxi metapackage remover, and your problems will go away.

Since the very first thing I do on all my installs is remove metapackages, I never get much data about that stuff from my own systems.
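
As a rough illustration of spotting that setting, the sketch below parses a configuration listing in the `key=value` / bare-flag form posted earlier in this thread and returns the value of `meta-package-selection`, if any. The parser and its function names are assumptions made for this example. The text is passed in directly because the location of the smxi configuration file isn't given here, and the only value seen in this thread is `automatic`.

```python
def parse_smxi_config(text):
    """Parse lines like 'apt-type=aptitude' or bare flags like 'gpm-fix-1'.

    key=value lines map key -> value; bare flags map flag -> True.
    Only the first '=' splits, so values may themselves contain '='.
    """
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        config[key] = value if sep else True
    return config


def metapackage_setting(config):
    """Return the meta-package-selection value, or None if it is absent."""
    return config.get("meta-package-selection")


sample = """apt-type=aptitude
kernel-metapackage-1
meta-package-selection=automatic
RememberResponses
"""
print(metapackage_setting(parse_smxi_config(sample)))
```

With the sample lines taken from the listing above, this prints `automatic`.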