Well. There's no way in which we can get effective writeback against
200 spindles by relying on pdflush, so that daemon is mainly there
to permit background writeback under light-to-moderate loads.
Once things get heavy, the only sane approach is to use the actual
caller of write(2) as the resource for performing the writeback.
As we're currently doing, in balance_dirty[_pages](). But the
problem there is that in both 2.4 and 2.5, a caller to that function
can easily get stuck on the wrong queue, and bandwidth really suffers.
I've been working on changing 2.5 so that the write(2) caller no
longer performs a general "writeback of everything" - that caller
instead performs writeback specifically against the queue which
he just dirtied. Do this by using the address_space->backing_dev_info
as a key during a search across the superblocks and blockdev inodes.
That works quite well.
But there's still a problem where pdflush goes to writeback a queue
and fills it up, so the userspace program ends up blocking (due to
pdflush's activity) when it really should not. Still undecided about
what to do about that.
And yes, point taken on the LVM thing. If the chunk size is reasonably
small (a few megabytes) then we should normally get decent concurrency,
but there will be corner-cases.
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to firstname.lastname@example.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/