Re: How does the disk buffer cache work?

Andrew Morton (akpm@digeo.com)
Mon, 30 Dec 2002 17:24:23 -0800


Matthew Zahorik wrote:
>
> Earlier I wrote to the list where my SS10 hung on the partition check
> if a bad disk was installed.
>
> This behavior is new to the 2.4.20 kernel. I previously ran 2.2.20 on the
> machine (the default in a Debian 3.0r0 install). I can't vouch for 2.4
> kernels before 2.4.20.
>
> I have traced the problem to a hang in one of the disk buffer caches.
>
> Can anyone tell me how to correct the behavior so that:
>
> 1. I don't break things for other parts of the kernel
> 2. The disk cache returns an error for a hung disk?
>
> Here's the tail of the console with debugging printk's inserted:
>
> ...
> [.. the next function call in read_cache_page() is lock_page(), which we
> hang forever on ..]

lock_page() sleeps until the page is unlocked. The page is unlocked
from end_buffer_io_sync(), which is called in the context of the disk
device driver's interrupt handler. If that completion interrupt never
arrives, the page stays locked and lock_page() never returns.

This is probably a device driver or interrupt routing problem: the disk
controller hardware interrupts are not making it through to the CPU.
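
To make the dependency concrete, here is a rough userspace model in
Python. It is not kernel code: the Page class, read_page() and the
timeout parameter are all made up for illustration. A waiter blocks on
a locked page, and only a "completion interrupt" running on another
thread can unlock it. The timeout exists only so the demonstration
terminates; lock_page() itself has no timeout.

```python
import threading


class Page:
    """Toy stand-in for a page-cache page (hypothetical, not a kernel type)."""

    def __init__(self):
        # Event set   -> page is unlocked
        # Event clear -> page is locked, I/O in flight
        self._unlocked = threading.Event()
        self._unlocked.set()

    def lock_for_io(self):
        # Models submitting a read: the page stays locked until I/O completes.
        self._unlocked.clear()

    def unlock(self):
        # Models what the I/O completion handler does from interrupt context.
        self._unlocked.set()

    def wait_unlocked(self, timeout=None):
        # timeout=None models the behaviour described above: sleep until
        # someone unlocks the page. Returns False if the timeout expires.
        return self._unlocked.wait(timeout)


def read_page(page, interrupt_delivered, timeout):
    """Start a read and wait for it; return True if the page got unlocked."""
    page.lock_for_io()
    timer = None
    if interrupt_delivered:
        # The completion "interrupt" fires shortly afterwards on another thread.
        timer = threading.Timer(0.05, page.unlock)
        timer.start()
    unlocked = page.wait_unlocked(timeout)
    if timer is not None:
        timer.join()
    return unlocked


if __name__ == "__main__":
    print("interrupt delivered:", read_page(Page(), True, 1.0))
    print("interrupt lost:     ", read_page(Page(), False, 0.2))
```

With the interrupt delivered the waiter wakes up promptly. With the
interrupt lost, the only thing that ends the wait is the artificial
timeout, which is the situation the hang above corresponds to.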