Re: need help interpreting 'free' output.

Linus Torvalds (torvalds@transmeta.com)
Tue, 30 Oct 2001 09:53:28 -0800 (PST)


On Tue, 30 Oct 2001, Andrea Arcangeli wrote:
>
> On Tue, Oct 30, 2001 at 09:28:29AM -0800, Linus Torvalds wrote:
> > Does anybody see why we have to remove it from the swap cache at all?
>
> the only reason is to avoid wasting the swap space, so at least Rik's
> vm_swap_full logic should be added to it.

I agree, but that's true both for reads and writes, and then we want to
delete it. So the logic might be something like

remove = 0;
if ((vm_swap_full() && (remove = exclusive_swap_cache_delete())) ||
only_swap_user()) {
pte = mk_pte(page, vma->vm_page_prot);
if (remove || write_access)
pte = pte_mkdirty(pte);
if (vma->vm_page_prot & VM_WRITE)
pte = pte_mkwrite(pte);
install_pte();
return;
}

ie we _remove_ it if we're low on swap entries and it is exclusive (that
doesn't really save memory, but it allows us to re-use the swap entries
for "better" pages), and we just re-use it without removing it if we're
the only users (it doesn't even have to be a write access - we can do it
even for reads, as if we're the only user we might as well just give the
page to the process anyway - and let fork() do the thing it does in any
case.

Then we'll just trust the dirty bit when shared, like we always have done
before anyway (we need to set it on removal, and we want to set it early
on a write access to avoid unnecessary faults on architectures which do
the dirty bit in software - that's why we have the "remove ||
write_access" test there.

> The only advantage of dirty swap cache persistence is that it will
> maintain the same position on disk across a swapin/swapout cycle.

Well, the _big_ advantage is not the persistence, but the fact that the
page might be in-flight when the user wants to use it, and the swap cache
is just busy. Right now we _wait_ for the write to complete, which is
silly. We might as well just let the user start using the page (including
writing more stuff to it), and later on write it again.

So right now the "remove from swap cache" is actually a IO-serializing
operation, and we're doing it for no really good reason.

Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/