Re: [patch] serial console vs NMI watchdog

george anzinger (george@mvista.com)
Sun, 11 Mar 2001 20:43:16 -0800


Keith Owens wrote:
>
> On Sun, 11 Mar 2001 08:44:24 +0100 (CET),
> Ingo Molnar <mingo@elte.hu> wrote:
> >Andrew,
> >
> >your patch looks too complex, and doesnt cover the case of the serial
> >driver deadlocking. Why not add a "touch_nmi_watchdog_counter()" function
> >that just changes last_irq_sums instead of adding locking? This way
> >deadlocks will be caught in the serial code too. (because touch_nmi() will
> >only "postpone" the NMI watchdog lockup event, not disable it.)
>
> kdb has to completely disable the nmi counter while it is in control.
> All interrupts are disabled, all but one cpus are spinning, the control
> cpu does busy wait while it polls the input devices. With that model
> there is no alternative to a complete disable.
>
Consider this: why not use the NMI to sync the cpus?  Kdb would have a
function that is called on each NMI.  If kdb is doing nothing, the
function just returns false.  Otherwise, if kdb is waiting for this cpu,
well, here it is: put it in a spin AFTER saving where it came from, so
the operator can figure out what it was doing.  In kgdb I just put the
interrupt registers in the task_struct, where they are put when a
context switch is done.  Then the debugger can do a trace, etc. on that
task.  A global var that the debugger can see is also set to that cpu's
"current".

If the cpu is already spinning, return to the nmi code with a true flag
which will cause it to ignore the nmi. Same thing if it is the cpu that
is doing debug i/o.
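
The hook described above can be sketched roughly as follows.  This is a
user-space model, not kdb or kgdb code: debugger_nmi_hook(), the
cpu_state[] array, struct saved_regs and NR_CPUS are all hypothetical
names made up for illustration, and the "spin" is only recorded as a
state change so the example terminates.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4  /* assumed cpu count, for illustration only */

/* Hypothetical per-cpu debugger states. */
enum cpu_state { CPU_RUNNING, CPU_SPINNING, CPU_DEBUG_IO };

/* Hypothetical stand-in for the interrupted register set. */
struct saved_regs {
    unsigned long ip;
    unsigned long sp;
};

static bool debugger_active;               /* debugger wants the cpus */
static enum cpu_state cpu_state[NR_CPUS];  /* zero-init: CPU_RUNNING */
static struct saved_regs saved[NR_CPUS];   /* where each cpu came from */

/*
 * Called on every NMI.  Returns false when the debugger has no
 * interest in this NMI (normal watchdog handling continues), true
 * when the NMI code should ignore it.
 */
static bool debugger_nmi_hook(int cpu, const struct saved_regs *regs)
{
    assert(cpu >= 0 && cpu < NR_CPUS);

    if (!debugger_active)
        return false;

    /* Already parked, or the cpu doing debug i/o: ignore the NMI. */
    if (cpu_state[cpu] == CPU_SPINNING || cpu_state[cpu] == CPU_DEBUG_IO)
        return true;

    /* Save the interrupted context BEFORE parking the cpu. */
    saved[cpu] = *regs;
    cpu_state[cpu] = CPU_SPINNING;

    /* A real hook would busy-wait here until the debugger lets go. */
    return true;
}
```

With debugger_active clear, every call returns false.  Once it is set,
the first NMI on a running cpu saves its registers and parks it, and
later NMIs on that cpu return true without touching the saved context.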

I went to this for kgdb after the system failed to return from the call
to force the other cpus to execute a function (which means they have to
be alive).  For extra safety I also time the sync.  If one or more
expected cpus don't show up while the code loops reading the cycle
counter, it just continues without the sync.
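
A minimal sketch of such a timed sync, again as a user-space model.
The names wait_for_cpus(), cpu_arrived[], read_cycles() and
SYNC_TIMEOUT_CYCLES are assumptions for illustration; read_cycles() is
a fake counter standing in for a real cycle counter read, and in real
code cpu_arrived[] would be set by the other cpus from their NMI path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4                    /* assumed cpu count */
#define SYNC_TIMEOUT_CYCLES 1000u    /* assumed timeout, arbitrary */

/* Set by each cpu when it checks in; static here, shared in real code. */
static bool cpu_arrived[NR_CPUS];

/* Fake cycle counter: advances by 10 on every read. */
static uint64_t fake_cycles;

static uint64_t read_cycles(void)
{
    fake_cycles += 10;
    return fake_cycles;
}

/*
 * Wait until every cpu in expected_mask has checked in, or until the
 * timeout expires.  Returns the mask of expected cpus that showed up;
 * the caller carries on either way.
 */
static unsigned wait_for_cpus(unsigned expected_mask)
{
    uint64_t start = read_cycles();

    for (;;) {
        unsigned arrived = 0;

        for (int cpu = 0; cpu < NR_CPUS; cpu++) {
            if ((expected_mask & (1u << cpu)) && cpu_arrived[cpu])
                arrived |= 1u << cpu;
        }

        if (arrived == expected_mask)
            return arrived;          /* everyone is here */

        if (read_cycles() - start >= SYNC_TIMEOUT_CYCLES)
            return arrived;          /* give up, continue without sync */
    }
}
```

The point of the timeout is that a dead cpu can never hang the
debugger: the returned mask tells the caller which cpus actually
stopped, and the rest are simply treated as not synced.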

George