Re: scheduling problem?

Anton Blanchard (anton@linuxcare.com.au)
Wed, 3 Jan 2001 01:01:03 +1100



Hi Mike,

> I am seeing (what I believe is;) severe process CPU starvation in
> 2.4.0-prerelease. At first, I attributed it to semaphore troubles
> as when I enable semaphore deadlock detection in IKD and set it to
> 5 seconds, it triggers 100% of the time on nscd when I do sequential
> I/O (iozone eg). In the meantime, I've done a slew of tracing, and
> I think the holder of the semaphore I'm timing out on just flat isn't
> being scheduled so it can release it. In the usual case of nscd, I
> _think_ it's another nscd holding the semaphore. In no trace can I
> go back far enough to catch the taker of the semaphore or any user
> task other than iozone running between __down() time and timeout 5
> seconds later. (trace buffer covers ~8 seconds of kernel time)

Did this just appear in recent kernels? Maybe bdflush was hiding the
situation in earlier kernels as it would cause io hogs to block when
things got only mildly interesting.

You might be able to get some useful information with ps axl and checking
the WCHAN value. Of course it wont be possible if like nscd you cant get
ps to schedule :)

Anton
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/