Re: need help interpreting 'free' output.

Andrea Arcangeli (andrea@suse.de)
Tue, 30 Oct 2001 21:05:53 +0100


On Tue, Oct 30, 2001 at 11:21:46AM -0800, Linus Torvalds wrote:
>
> On Tue, 30 Oct 2001, Andrea Arcangeli wrote:
> >
> > So in short we only need to replace the lock_page with a TryLockPage
> > (plus your wait_on_page if page is not uptodate to catch the major
> > faults) and here we go, faster than pre5.
>
> Wrong.
>
> If _anybody_ accesses the page unlocked, you cannot do the swap_count() at
> all, because then you don't have anything that serializes the accesses to
> swap_count vs page_count any more.

incidentally if trylock fails do_wp_page doesn't even try to check the
swap count, it just lefts the swap cache there. same thing do_swap_page
can do at the early-cow stage. this is the only point I'm making.

and as said if you want to do any remove_exclusive_swap_page() in
do_swap_page as you claimed in earlier email you also need to get the
page lock.

As far I can tell here the magic key is "trylock" and nothing else, it's
not that the remove_exclusive_swap_page or the avoidance of the
early-cow per se can make any difference (let's ignore swapoff) except
running slower, here the only improvement during swapout load is that
you're delegating the work of remove_exclusive_swap_page to do_wp_page
that will do a trylock instead of a lock_page as far I can tell. This is
the only point I'm making.

Go ahead and implement this thing in do_swap_page:

remove = 0;
if ((vm_swap_full() && (remove = exclusive_swap_cache_delete())) ||
only_swap_user()) {
pte = mk_pte(page, vma->vm_page_prot);
if (remove || write_access)
pte = pte_mkdirty(pte);
if (vma->vm_page_prot & VM_WRITE)
pte = pte_mkwrite(pte);
install_pte();
return;
}

and you'll find yourself grabbing the page lock somehow first in the
do_swap_page path, or exclusive_swap_cache_delete will obviously BUG()
on you.

This is why I'm saying the real magic is to conver the lock_page of pre4
in a TryLockPage, all other changes are not interesting in real load and
I obviously agree that's very good idea to fix the minor faults, that
in pre4 (and all previous kernels including all -ac and -aa) are running
as slow as major faults!

Now about the real need of exclusive_swap_cache_delete compared to
exclusive_swap_page I need to think a little more about it to be sure.

In sort previously we run exclusive_swap_page only with the page lock,
page->buffers is constant if the page is locked. And swap count and page
count _can't_ increase under us if the page happen to be exclusive once.
This was the previous rule at least, but as usual there's the swapoff
evil caming out and doing the lookup on a exclusive swap page... Hugh
may provide more hints on this case.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/