Re: all processes waiting in TASK_UNINTERRUPTIBLE state

Jeff Dike (jdike@karaya.com)
Fri, 29 Jun 2001 15:40:16 -0500


To recap the story so far, a number of people have reported seeing processes
hang indefinitely in TASK_UNINTERRUPTIBLE in __wait_on_buffer or __lock_page,
both on physical boxes and under UML. It was found to be reproducable under
UML, but not on a native kernel as far as I know.

I've fixed this problem in UML.

The short story :

The bug was UML-specific and specific in such a way that I don't think it's
possible to find the bug in the native kernel by making analogies from the UML
bug.

The long story :

First, two pieces of background information
- UML's ubd block driver performs asynchronous I/O by using a separate thread
to perform the I/O. The driver's request routine writes the request to the
I/O thread over a file descriptor. The thread performs the request by calling
either read() or write() on the host. When that call returns, it writes the
results back to UML, causing a SIGIO. That goes through the normal IRQ system
and ends up in the ubd interrupt handler, which finishes the request, and, if
the device request queue isn't empty, starts the next request.
- A couple of weeks ago, I made a change to reduce the number of clock ticks
that UML loses under load. The clock is implemented with SIGALRM and
SIGVTALRM. The change involved leaving those signals enabled all the time,
and having the handler decide whether it was safe to invoke the timer irq. If
not, it bumped a missing_ticks counter and returned. The missing ticks are
fully accounted for the next time the handler finds that it can call into the
irq system.

The ubd request routine runs with interrupts off, but now, the alarms could at
least fire, even if they didn't do anything but increment a counter. This
occasionally caused the write of the I/O request from the request routine to
the I/O thread to return -EINTR. The return value wasn't checked, so the I/O
thread didn't get the request and the request routine had no idea that
anything was wrong. This caused disk I/O to permanently shut down, so pending
requests stayed pending, and processes waiting on them waited forever.

The key piece of this bug was a signal causing a crucial communication between
two UML threads to be lost. I don't see any analogies between this and
anything that happens on a physical system, so it looks to me like the problem
that people are seeing there is completely different.

Thanks to James Stevenson for figuring out how to reproduce the bug under UML,
and to Daniel Phillips and Al Viro for help in tracking this problem from the
fs and mm systems into the block and driver layers.

Jeff

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/