Re: lockups with 2.4.20 (tg3? net/core/dev.c|deliver_to_old_ones)

Pete Zaitcev (zaitcev@redhat.com)
Fri, 14 Feb 2003 16:02:16 -0500


> Since sometime in December two systems we have on site using P4 HT (one
> Dell 2650 and one Dell 4600, both dual CPU, both ht/mce capable) have been
> locking up without any kernel output and without sysrq keys working (the
> keyboard is locked solid).
>[...]
> Using nmi_watchdog I've managed to get a stack track and ran ksymoops over
> it (attached).

Good report. To tell the truth, I know that this lockup exists,
there's an RH issue-tracker item against me on this.
It is different from the old "porkchop" lockup, which DaveM and
Jeff Garzik fixed.

The stumbling block is that NMI oopser catches a thread which
gets stuck because of the lock, but this does not explain
how the lock was taken.

I think the best resolution would be an instrumentation patch
which records lock takers, and prints them when the thing is
forcefuly oopsed. I should come with it eventually, if someone
does not beat me to it (I wish they did, actually :-)

-- Pete
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/