>On Thu, Oct 24, 2002 at 10:36:22AM -0500, Corey Minyard wrote:
>
>  
>
>>Is there any way to detect if the nmi watchdog actually caused the 
>>timeout?  I don't understand the hardware well enough to do it without 
>>    
>>
>
>You can check if the counter used overflowed :
>
>#define CTR_OVERFLOWED(n) (!((n) & (1U<<31)))
>#define CTRL_READ(l,h,msrs,c) do {rdmsr(MSR_P6_PERFCTR0, (l), (h));} while (0)
>
>                CTR_READ(low, high, msrs, i);
>                if (CTR_OVERFLOWED(low)) {
>			... found
>
>like oprofile does.
>
Ok, thanks, I'll add that to the nmi_watchdog code.
>
>I've accidentally deleted your patch, but weren't you unconditionally
>returning "break out of loop" from the watchdog ? I'm not very clear on
>the difference between NOTIFY_DONE and NOTIFY_OK anyway...
>
The comments on these are:
#define NOTIFY_DONE        0x0000        /* Don't care */
#define NOTIFY_OK        0x0001        /* Suits me */
#define NOTIFY_STOP_MASK    0x8000        /* Don't call further */
I'mt taking these to mean that NOTIFY_DONE means you didn't handle it, 
NOTIFY_OK means you did handle it, and you "or" on NOTIFY_STOP_MASK if 
you want it to stop.  I'm thinking that stopping is probably a bad idea, 
if the NMI is really edge triggered.
>
>  
>
>>Plus, can't you get more than one cause of an NMI?  Shouldn't you check 
>>them all?
>>    
>>
>
>Shouldn't the NMI stay asserted ? At least with perfctr, two counters
>causes two interrupts (actually there's a bug in mainline oprofile on
>that that I'll fix when Linus is back)
>
There's a comment in do_nmi() that says that the NMI is edge triggered. 
 If it is, then you have to call everything.  I'd really like a manual 
on how the timer chip works, I'll see if I can hunt one down.
-Corey
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/