Re: NMI watchdog stability

Richard B. Johnson (root@chaos.analogic.com)
Thu, 19 Sep 2002 09:20:07 -0400 (EDT)


On Thu, 19 Sep 2002, John Levon wrote:

> On Wed, Sep 18, 2002 at 04:55:13PM -0700, Jonathan Lundell wrote:
>
> > >It was causing SMP boxes to crash mysteriously after
> > >several hours or days. Quite a lot of them. Nobody
> > >was able to explain why, so it was turned off.
> >
> > This was in the context of 2.4.2-ac21. More of the thread, with no
> > conclusive result, can be found at
> > http://www.uwsg.iu.edu/hypermail/linux/kernel/0103.2/0906.html
> >
> > Was there any resolution? Was the problem real, did it get fixed, and
>
> Some machines corrupt %ecx on the way back from an NMI. Perhaps that was
> the factor all the people with problems saw.
>
> regards
> john
>

How so? The handler saves and restores the register values. Taking an
interrupt does not by itself change the contents of the general
registers. The CPU only pushes EFLAGS, CS, and EIP (plus SS and ESP on
a privilege change) and loads a new CS:EIP for the handler. If ECX was
being destroyed, it was software that did it, not some "machine".
Which kernel version destroys ECX upon NMI?
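
To make the save/restore argument concrete, here is a minimal user-space
sketch in C. It is only a model, not kernel code: the names gp_regs,
nmi_body, nmi_entry, and demo_ecx_after_nmi are made up for illustration,
and the struct copy merely stands in for the register push/pop that an
entry stub would do around the handler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the i386 general registers (not a kernel type). */
struct gp_regs {
    uint32_t eax, ebx, ecx, edx, esi, edi, ebp;
};

/* Stand-in for the handler body: it may clobber any register it likes. */
static void nmi_body(struct gp_regs *cpu)
{
    cpu->eax = 0;
    cpu->ecx = 0xdeadbeef;
    cpu->edx = 0x12345678;
}

/*
 * Models the entry/exit stub: save everything, run the body, restore.
 * Roughly the role a SAVE_ALL/RESTORE_ALL-style macro pair plays.
 */
static void nmi_entry(struct gp_regs *cpu)
{
    struct gp_regs saved = *cpu;   /* "push" all registers on entry */
    nmi_body(cpu);
    *cpu = saved;                  /* "pop" them again before returning */
}

/* Returns the interrupted code's ECX as seen after the simulated NMI. */
static uint32_t demo_ecx_after_nmi(void)
{
    struct gp_regs cpu = { 1, 2, 3, 4, 5, 6, 7 };
    nmi_entry(&cpu);
    assert(cpu.eax == 1 && cpu.edx == 4);  /* clobbers were undone */
    return cpu.ecx;
}
```

Calling demo_ecx_after_nmi() returns 3, the value ECX held before the
simulated NMI. In this model ECX can only come back wrong if the
save/restore pair itself is wrong or skipped.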

Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
The US military has given us many words, FUBAR, SNAFU, now ENRON.
Yes, top management were graduates of West Point and Annapolis.
