Re: linux-2.4.10-pre5

Linus Torvalds (torvalds@transmeta.com)
Sat, 8 Sep 2001 21:54:54 -0700 (PDT)


On Sat, 8 Sep 2001, Andreas Dilger wrote:
>
> So basically - when we move block devices to the page cache, get rid of
> buffer cache usage in the filesystems as well? Ext2 is nearly there at
> least. One alternative is as Daniel Phillips did in the indexed-ext2-
> directory patch, where he kept the "bread" interface, but backed it
> with the page cache, so it required relatively little change to the
> filesystem.

This might be a really easy solution. We might just make sure that the
buffer manipulation interfaces we export to filesystems (and there aren't
actually all that many of them - it's mainly bread and getblk) always end
up using the page cache, and just return the buffer head that is embedded
inside the page cache.

That way we don't have any new aliasing issues _at_all_. The user-mode
accesses to the block devices would always end up using the same buffers
that the low-level filesystem does.

Hmm.. That actually would have another major advantage too: the whole
notion of an "anonymous buffer page" would just disappear. Because there
would be no interfaces to even create them - buffer pages would always be
associated with a mapping.

Andrea(s) - interested in pursuing this particular approach? In fact,
since "bread()" uses "getblk()", it is almost sufficient to just make
getblk() use the page cache, and the rest will follow... You can even get
rid of the buffer hash etc, and make the buffer head noticeably smaller.

[ Yeah, I'm being a bit optimistic - you also end up having to re-write
"get_hash_table()" to use a page cache lookup etc. So it's definitely
some major surgery in fs/buffer.c, but "major" might actually be just a
couple of hundred lines ]
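
To make the shape of this concrete, here is a minimal user-space sketch in
C of what "getblk() on top of the page cache" could look like. It is a toy
model, not kernel code: every toy_* name, the fixed 1K block / 4K page
sizes, and the in-memory "disk" are assumptions made for illustration, and
locking, reference counts and real I/O are left out entirely.

```c
/* Toy user-space model, NOT kernel code. Every toy_* name is hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE       4096u
#define TOY_BLOCK_SIZE      1024u
#define TOY_BLOCKS_PER_PAGE (TOY_PAGE_SIZE / TOY_BLOCK_SIZE)
#define TOY_BUCKETS         64u

struct toy_page;

struct toy_buffer_head {
    unsigned long blocknr;
    char *data;             /* points into page->data */
    struct toy_page *page;  /* always set: no anonymous buffer pages */
    int uptodate;
};

struct toy_page {
    unsigned long index;    /* page number within the device mapping */
    struct toy_page *next;  /* hash chain in the toy page cache */
    struct toy_buffer_head bh[TOY_BLOCKS_PER_PAGE];
    char data[TOY_PAGE_SIZE];
};

struct toy_mapping {
    struct toy_page *bucket[TOY_BUCKETS];
    const char *disk;       /* memory standing in for the block device */
    unsigned long nr_blocks;
};

/* Pure page cache lookup; NULL if the page is not cached. */
static struct toy_page *toy_find_page(struct toy_mapping *m, unsigned long index)
{
    struct toy_page *p = m->bucket[index % TOY_BUCKETS];

    while (p && p->index != index)
        p = p->next;
    return p;
}

/* Stand-in for a buffer hash lookup: find the page, return its embedded bh. */
static struct toy_buffer_head *toy_get_hash_table(struct toy_mapping *m,
                                                  unsigned long blocknr)
{
    struct toy_page *p = toy_find_page(m, blocknr / TOY_BLOCKS_PER_PAGE);

    return p ? &p->bh[blocknr % TOY_BLOCKS_PER_PAGE] : NULL;
}

/* getblk: look up the block, creating the backing page if it is missing. */
static struct toy_buffer_head *toy_getblk(struct toy_mapping *m,
                                          unsigned long blocknr)
{
    struct toy_buffer_head *bh = toy_get_hash_table(m, blocknr);
    unsigned long index = blocknr / TOY_BLOCKS_PER_PAGE;
    struct toy_page *p;
    unsigned i;

    if (bh)
        return bh;
    p = calloc(1, sizeof(*p));
    if (!p)
        return NULL;
    p->index = index;
    for (i = 0; i < TOY_BLOCKS_PER_PAGE; i++) {
        p->bh[i].blocknr = index * TOY_BLOCKS_PER_PAGE + i;
        p->bh[i].data = p->data + i * TOY_BLOCK_SIZE;
        p->bh[i].page = p;
    }
    p->next = m->bucket[index % TOY_BUCKETS];
    m->bucket[index % TOY_BUCKETS] = p;
    bh = &p->bh[blocknr % TOY_BLOCKS_PER_PAGE];
    assert(bh->page == p);  /* every buffer belongs to a mapped page */
    return bh;
}

/* bread: getblk, then fill the block from the "device" if not up to date. */
static struct toy_buffer_head *toy_bread(struct toy_mapping *m,
                                         unsigned long blocknr)
{
    struct toy_buffer_head *bh;

    if (blocknr >= m->nr_blocks)
        return NULL;
    bh = toy_getblk(m, blocknr);
    if (bh && !bh->uptodate) {
        memcpy(bh->data, m->disk + blocknr * TOY_BLOCK_SIZE, TOY_BLOCK_SIZE);
        bh->uptodate = 1;
    }
    return bh;
}

/* User-mode style access by byte offset ends up in the very same pages. */
static char *toy_user_ptr(struct toy_mapping *m, unsigned long offset)
{
    struct toy_buffer_head *bh = toy_bread(m, offset / TOY_BLOCK_SIZE);

    return bh ? bh->data + offset % TOY_BLOCK_SIZE : NULL;
}
```

In this model toy_bread(&m, 5) and toy_user_ptr(&m, 5 * TOY_BLOCK_SIZE)
return pointers into the same page, so a write through one is immediately
visible through the other. There is also no separate buffer hash: the
lookup is just a page lookup plus an index into the page's buffer heads.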

The good news here is that once it works (and you've destroyed your
filesystem about fifty times debugging it :), it's pretty much guaranteed
not to introduce any new and "interesting" interactions between
filesystems and user-level programs accessing the device.

And no filesystem should ever notice. They can still access the buffer
head as if it was just a buffer head, and wouldn't care about the fact
that it happens to be part of a mapping.

Any pitfalls?

[ I can see at least one already: __invalidate_buffers() and
set_blocksize() would both have to be re-done, probably along the lines
of "invalidate_inode_pages()" and "fsync+truncate_inode_pages()"
respectively. ]
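
A rough user-space sketch of those two pieces, again in C and again only a
toy: the mini_* names, the single linked list of pages and the in-memory
"disk" are all assumptions for illustration. It also assumes that
invalidation drops only clean pages and leaves dirty ones alone, and that
changing the block size means "write everything back, then throw every
cached page away".

```c
/* Toy user-space model, NOT kernel code. Every mini_* name is hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MINI_PAGE_SIZE 4096u

struct mini_page {
    unsigned long index;
    int dirty;
    struct mini_page *next;
    char data[MINI_PAGE_SIZE];
};

struct mini_mapping {
    struct mini_page *pages;  /* all cached pages of the device */
    char *disk;               /* memory standing in for the block device */
    unsigned long nr_pages;
    unsigned blocksize;
};

/* Find a cached page, or create it and fill it from the "device". */
static struct mini_page *mini_grab_page(struct mini_mapping *m,
                                        unsigned long index)
{
    struct mini_page *p;

    if (index >= m->nr_pages)
        return NULL;
    for (p = m->pages; p; p = p->next)
        if (p->index == index)
            return p;
    p = calloc(1, sizeof(*p));
    if (!p)
        return NULL;
    p->index = index;
    memcpy(p->data, m->disk + index * MINI_PAGE_SIZE, MINI_PAGE_SIZE);
    p->next = m->pages;
    m->pages = p;
    return p;
}

/* Invalidate: drop clean pages, keep dirty ones. Returns pages dropped. */
static unsigned mini_invalidate(struct mini_mapping *m)
{
    struct mini_page **pp = &m->pages;
    unsigned dropped = 0;

    while (*pp) {
        struct mini_page *p = *pp;

        if (p->dirty) {
            pp = &p->next;
            continue;
        }
        *pp = p->next;
        free(p);
        dropped++;
    }
    return dropped;
}

/* The "fsync" half: write every dirty page back and mark it clean. */
static void mini_sync(struct mini_mapping *m)
{
    struct mini_page *p;

    for (p = m->pages; p; p = p->next) {
        if (!p->dirty)
            continue;
        memcpy(m->disk + p->index * MINI_PAGE_SIZE, p->data, MINI_PAGE_SIZE);
        p->dirty = 0;
    }
}

/* set_blocksize as "fsync + truncate": sync, drop all pages, switch size. */
static int mini_set_blocksize(struct mini_mapping *m, unsigned size)
{
    if (size < 512 || size > MINI_PAGE_SIZE || (size & (size - 1)))
        return -1;            /* must be a power of two in [512, page size] */
    mini_sync(m);
    mini_invalidate(m);       /* everything is clean now, so all pages go */
    assert(m->pages == NULL);
    m->blocksize = size;
    return 0;
}
```

The point of the ordering in mini_set_blocksize() is that no cached page
laid out for the old block size survives the switch, and no dirty data is
lost on the way.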

Comments?

Linus
