>
> lm@bitmover.com said:
> > I'm 100% in agreement with the idea that all code paths through the
> > kernel should be short and sweet, but that isn't always the case. All
> > it takes is one misbehaving driver that hangs onto the CPU too long
> > and you missed your deadline.
>
> Don't Do That Then. The driver is broken, not the model. Fix it.
>
> It ought to be possible to devise a test for debugging drivers, similar in
> concept to the slab poisoning, which will BUG() if a process has blocked for
> too long. Could the NMI watchdog or something similar be extended to do
> this? Would we get a useful backtrace?
>
> This can automatically find many of the instances of badly-behaved code
> which causes high latency, but can also be turned off in production kernels.
One IBM operating system, used mainly for running airline reservation
systems, used the algorithm of killing tasks if they took more than N (I
think it was 100) milliseconds. I remember asking a software expert what
the airline users did about it. If the response didn't arrive at the
expected time, the user learned to press enter again.
So I think time poisoning would be an excellent debugging technique to
flush out outliers.
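
To make the idea concrete, here is a rough user-space sketch of what such
a time check might look like. It is not kernel code and not anything that
exists today: the names latency_probe, probe_begin, probe_end and
LATENCY_BUDGET_NS are all made up for illustration, and the 100 ms budget
is just the figure I recalled above. A kernel version would presumably
call BUG() or dump a backtrace where this one only prints a message.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() under strict -std= */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical budget: 100 ms, expressed in nanoseconds. */
#define LATENCY_BUDGET_NS (100LL * 1000 * 1000)

/* Hypothetical bookkeeping for one timed code path. */
struct latency_probe {
    const char *name;        /* label printed when the budget is blown */
    struct timespec start;   /* monotonic time at probe_begin() */
};

/* Nanoseconds from *a to *b (b is expected to be the later time). */
static int64_t elapsed_ns(const struct timespec *a, const struct timespec *b)
{
    return (int64_t)(b->tv_sec - a->tv_sec) * 1000000000LL
         + (int64_t)(b->tv_nsec - a->tv_nsec);
}

/* Record the start of the code path being checked. */
static void probe_begin(struct latency_probe *p, const char *name)
{
    p->name = name;
    clock_gettime(CLOCK_MONOTONIC, &p->start);
}

/*
 * Check the code path against its budget.
 * Returns 1 and complains on stderr if it ran too long, 0 otherwise.
 */
static int probe_end(const struct latency_probe *p, int64_t budget_ns)
{
    struct timespec now;
    int64_t ns;

    assert(budget_ns > 0);
    clock_gettime(CLOCK_MONOTONIC, &now);
    ns = elapsed_ns(&p->start, &now);
    if (ns > budget_ns) {
        fprintf(stderr, "time poison: %s ran %lld ns (budget %lld ns)\n",
                p->name, (long long)ns, (long long)budget_ns);
        return 1;
    }
    return 0;
}
```

You would wrap the suspect path in probe_begin(&p, "some_path") and
probe_end(&p, LATENCY_BUDGET_NS), and compile the checks out of
production builds, in the same spirit as slab poisoning.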
john alvord