Re: [prepatch] address_space-based writeback

Andrew Morton (akpm@zip.com.au)
Thu, 11 Apr 2002 16:03:23 -0700


Anton Altaparmakov wrote:
>
> ...
> It would be great to be able to submit variable size "io entities" even
> greater than PAGE_CACHE_SIZE (by giving a list of pages, starting offset
> in first page and total request size for example) and saying write that to
> the device starting at offset xyz. That would suit ntfs perfectly. (-:
>

Yes, I'll be implementing that. Writeback will hit the filesystem
via a_ops->writeback_mapping(mapping, nr_pages). The filesystem
will then call into generic_writeback_mapping(), which will walk
the pages, assemble BIOs and send them off.
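
A rough user-space sketch of that walk-and-assemble step, for
illustration only. Everything named toy_* is hypothetical; these are
not the kernel's struct page or struct bio, and the real
generic_writeback_mapping() may look nothing like this. The sketch
assumes nr_pages acts as a budget on how many pages get written.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's struct page / struct bio. */
struct toy_page {
    unsigned long index;   /* page offset within the file */
    int dirty;
};

struct toy_bio {
    unsigned long first_index;  /* file index of the first page */
    unsigned int nr_pages;      /* pages covered by this request */
};

/*
 * Walk pages[0..nr) and group dirty pages with consecutive file
 * indices into toy_bios. Stops once nr_to_write pages have been
 * queued or max_bios requests have been filled.
 * Returns the number of toy_bios assembled.
 */
static size_t toy_writeback_mapping(struct toy_page *pages, size_t nr,
                                    size_t nr_to_write,
                                    struct toy_bio *bios, size_t max_bios)
{
    size_t nr_bios = 0;
    int open = 0;   /* is the last bio still being extended? */

    for (size_t i = 0; i < nr && nr_to_write > 0; i++) {
        if (!pages[i].dirty) {
            open = 0;       /* a clean page ends the current run */
            continue;
        }
        if (open && pages[i].index ==
            bios[nr_bios - 1].first_index + bios[nr_bios - 1].nr_pages) {
            bios[nr_bios - 1].nr_pages++;   /* extend the current bio */
        } else {
            if (nr_bios == max_bios)
                break;                      /* no room for another bio */
            bios[nr_bios].first_index = pages[i].index;
            bios[nr_bios].nr_pages = 1;
            nr_bios++;
            open = 1;
        }
        pages[i].dirty = 0;     /* page is now queued for I/O */
        nr_to_write--;
    }
    return nr_bios;
}
```

With dirty pages at indices 0, 1, 5 and 6 and a clean page at 2, this
produces two requests: one covering pages 0-1 and one covering 5-6.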

The filesystem needs to pass a little state structure into the
generic writeback function. An important part of that is a
semaphore which is held while writeback is locking multiple
pages, to avoid AB/BA deadlocks.
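
A minimal sketch of the locking idea, using pthread mutexes in user
space as stand-ins. The struct and function names are hypothetical,
and holding the outer lock until the pages are unlocked again is an
assumption made here for simplicity; the actual state structure is
not described in this mail.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a page with a per-page lock. */
struct toy_locked_page {
    pthread_mutex_t lock;
    int locked;             /* bookkeeping for the example only */
};

/* Hypothetical stand-in for the state the filesystem passes in. */
struct toy_writeback_state {
    pthread_mutex_t multi_page_sem; /* serialises multi-page lockers */
};

/*
 * Take the outer lock first, then the page locks. Only one caller at
 * a time can be in the middle of locking several pages, so two
 * callers asking for (A, B) and (B, A) cannot each hold one page
 * while waiting for the other.
 */
static void toy_lock_pages(struct toy_writeback_state *st,
                           struct toy_locked_page **pages, size_t nr)
{
    pthread_mutex_lock(&st->multi_page_sem);
    for (size_t i = 0; i < nr; i++) {
        pthread_mutex_lock(&pages[i]->lock);
        pages[i]->locked = 1;
    }
}

/* Release the page locks, then the outer lock. */
static void toy_unlock_pages(struct toy_writeback_state *st,
                             struct toy_locked_page **pages, size_t nr)
{
    for (size_t i = 0; i < nr; i++) {
        pages[i]->locked = 0;
        pthread_mutex_unlock(&pages[i]->lock);
    }
    pthread_mutex_unlock(&st->multi_page_sem);
}
```

This only protects against other multi-page lockers that go through
the same outer lock; it says nothing about code that locks a single
page directly.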

The current implementation of this bio-assembler is for no-buffer
(delalloc) filesystems only. It needs to be enhanced (or forked)
to also cope with buffer-backed pages. It will definitely need to
skip unmapped buffers, and may also need to peek inside the
buffer-heads to detect clean ones. When such a buffer
is encountered the BIO will be sent off and a new one will be started.
The code for this is going to be quite horrendous. I suspect
the comment/code ratio will exceed 1.0, which is a bad sign :)
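
To make the send-off-and-restart rule concrete, here is a toy
version at buffer granularity. The toy_buffer and toy_run types are
hypothetical and are not struct buffer_head. Also ending a run when
the disk block numbers stop being consecutive is an extra assumption
of this sketch, not something stated above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the bits of a buffer we care about. */
struct toy_buffer {
    int mapped;             /* has a disk block assigned */
    int dirty;              /* needs writing */
    unsigned long blocknr;  /* disk block, valid only if mapped */
};

/* One run of buffers that would go out as a single request. */
struct toy_run {
    size_t first;   /* index of the first buffer in the run */
    size_t count;   /* number of buffers in the run */
};

/*
 * Split bufs[0..nr) into runs of mapped, dirty, disk-contiguous
 * buffers. An unmapped or clean buffer closes the current run and
 * is skipped. Returns the number of runs, at most max_runs.
 */
static size_t toy_split_runs(const struct toy_buffer *bufs, size_t nr,
                             struct toy_run *runs, size_t max_runs)
{
    size_t nr_runs = 0;
    int open = 0;   /* was bufs[i - 1] added to the last run? */

    for (size_t i = 0; i < nr; i++) {
        if (!bufs[i].mapped || !bufs[i].dirty) {
            open = 0;   /* "send off" whatever was being built */
            continue;
        }
        if (open && bufs[i].blocknr == bufs[i - 1].blocknr + 1) {
            runs[nr_runs - 1].count++;
        } else {
            if (nr_runs == max_runs)
                break;
            runs[nr_runs].first = i;
            runs[nr_runs].count = 1;
            nr_runs++;
            open = 1;
        }
    }
    return nr_runs;
}
```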

One thing upon which I am undecided: for syncalloc filesystems
like ext2, do we attach buffers at ->readpages() time, or do
we just leave the page bufferless?

That's a hard call. Leaving the page bufferless helps the common
case, but in the uncommon case (we overwrite the file after reading
it), we need to run get_block again.
