Re: DVD blockdevice buffers

Stephen C. Tweedie (sct@redhat.com)
Wed, 23 May 2001 20:57:48 +0100


Hi,

On Wed, May 23, 2001 at 11:12:00AM -0700, Linus Torvalds wrote:
>
> On Wed, 23 May 2001, Stephen C. Tweedie wrote:
> No, you can actually do all the "prepare_write()"/"commit_write()" stuff
> that the filesystems already do. And you can do it a lot _better_ than the
> current buffer-cache-based approach. Done right, you can actually do all
> IO in page-sized chunks, BUT fall down on sector-sized things for the
> cases where you want to.

Right, but you still lose the caching in that case. The write works,
but the "cache" becomes nothing more than a buffer.

This actually came up recently, after the first posting of the
bdev-on-pagecache patches, when somebody was getting lousy database
performance from an application I think they had developed from
scratch: it was using 512-byte blocks as the basic write alignment
and was relying on the kernel caching those writes. In fact, in that
case even our old buffer cache was failing, due to the default
blocksize of 1024 bytes, and he had had to add an ioctl to force the
blocksize down to 512 bytes before the application would perform at
all well on Linux.

So we do have at least one real-world application which will fail if
we increase the IO granularity. We may well decide that the pain is
worth it, but right now the page cache really cannot deal with this
properly without uptodate labeling at finer granularity than the page
(which would be unnecessary ugliness in most cases).

--Stephen