Re: Possible Idea with filesystem buffering.

Chris Mason (mason@suse.com)
Mon, 21 Jan 2002 16:53:07 -0500


On Monday, January 21, 2002 11:41:44 PM +0300 Hans Reiser
<reiser@namesys.com> wrote:

> I read this and it sounds like you are agreeing with me, which is
> confusing;-),

No, no, you're agreeing with me ;-)

> help me to understand what you mean by triggered. Do you
> mean VM sends pressure to the FS? Do you mean that VM understands what a
> transaction is? Is this that generic journaling layer trying to come
> alive as a piece of the VM? I am definitely confused.
>
The vm doesn't know what a transaction is. But, the vm might know that
a) this block is pinned by the FS for write ordering reasons
b) the cost of writing this block is X
c) calling page->somefunc will trigger writes on those blocks.

The cost could be in order of magnitude, the idea would be to give the FS
the chance to say 'one a scale of 1 to 10, writing this block will hurt
this much'. Some blocks might have negative costs, meaning they don't
depend on anything and help free others.

The same system can be used for transactions and delayed allocation,
without telling the VM about any specifics.

> I think what I need to understand, is do you see the VM as telling the FS
> when it has (too many dirty pages or too many clean pages) and letting
> the FS choose to commit a transaction if it wants to as its way of
> cleaning pages, or do you see the VM as telling the FS to commit a
> transaction?

I see the VM calling page->somefunc to flush that page, triggering whatever
events the FS feels are necessary. We might want some way to differentiate
between periodic writes and memory pressure, so the FS has the option of
doing fancier things during write throttling.

>
> If you think that VM should tell the FS when it has too many pages, does
> that mean that the VM understands that a particular page in the subcache
> has not been accessed recently enough? Is that the pivot point of our
> disagreement?

Pretty much. I don't think the VM should say 'you have too many pages', I
think it should say 'free this page'.

>>
>>
>> For write clustering, we could add an int clusterpage(struct page *p)
>> address space op that allow the FS to find pages close to p, or the FS
>> could choose to cluster in its own writepage func.
>>
> What you are proposing is not consistent with how Marcello is doing write
> clustering as part of the VM, you understand that, yes? What Marcello is
> doing is fine for ReiserFS V3 but won't work well for v4, do you agree?

Well, my only point is that it is possible to make an interface for write
clustering that gives the FS the freedom to do what it needs, but still
keep the intelligence about which pages need freeing first in the VM.

-chris

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/