Re: MM patches against 2.5.31

Daniel Phillips (phillips@arcor.de)
Tue, 27 Aug 2002 18:48:50 +0200

Messages sorted by: [ date ][ thread ][ subject ][ author ]
Next message: Peter T. Breuer: "Re: block device/VM question"
Previous message: Christoph Hellwig: "Re: Loop devices under NTFS"
Next in thread: Christian Ehrhardt: "Re: MM patches against 2.5.31"
Reply: Christian Ehrhardt: "Re: MM patches against 2.5.31"

On Monday 26 August 2002 22:58, Christian Ehrhardt wrote:
> On Mon, Aug 26, 2002 at 10:09:38PM +0200, Daniel Phillips wrote:
> > On Monday 26 August 2002 22:00, Christian Ehrhardt wrote:
> > > On Mon, Aug 26, 2002 at 07:56:52PM +0200, Daniel Phillips wrote:
> > > > On Monday 26 August 2002 17:29, Christian Ehrhardt wrote:
> > > > > On Mon, Aug 26, 2002 at 04:22:50PM +0200, Daniel Phillips wrote:
> > > > > > On Monday 26 August 2002 11:10, Christian Ehrhardt wrote:
> > > > > > > + * A special Problem is the lru lists. Presence on one of these lists
> > > > > > > + * does not increase the page count.
> > > > > >
> > > > > > Please remind me... why should it not?
> > > > >
> > > > > Pages that are only on the lru but not reference by anyone are of no
> > > > > use and we want to free them immediatly. If we leave them on the lru
> > > > > list with a page count of 1, someone else will have to walk the lru
> > > > > list and remove pages that are only on the lru.
> > > >
> > > > I don't understand this argument. Suppose lru list membership is worth a
> > > > page count of one. Then anyone who finds a page by way of the lru list can
> > >
> > > This does fix the double free problem but think of a typical anonymous
> > > page at exit. The page is on the lru list and there is one reference held
> > > by the pte. According to your scheme the pte reference would be freed
> > > (obviously due to the exit) but the page would remain on the lru list.
> > > However, there is no point in leaving the page on the lru list at all.
> >
> > If you want the page off the lru list at that point (which you probably do)
> > then you take the lru lock and put_page_testzero.
>
> Could you clarify what you mean with "at that point"? Especially how
> do you plan to test for "this point". Besides it is illegal to use
> the page after put_page_testzero (unless put_page_testzero returns true).

> > > If you think about who is going to remove the page from the lru you'll
> > > see the problem.
> >
> > Nope, still don't see it. Whoever hits put_page_testzero frees the page,
> > secure in the knowlege that there are no other references to it.
>
> Well yes, but we cannot remove the page from the lru atomatically
> at page_cache_release time if we follow your proposal. If you think we can,
> show me your implementation of page_cache_release and I'll show
> you where the races are (unless you do everything under the lru_lock
> of course).

void page_cache_release(struct page *page)
{
spin_lock(&pagemap_lru_lock);
if (PageLRU(page) && page_count(page) == 2) {
__lru_cache_del(page);
atomic_dec(&page->count);
}
spin_unlock(&pagemap_lru_lock);
if (put_page_testzero(page))
__free_pages_ok(page, 0);
}

This allows the following benign race, with initial page count = 3:

spin_lock(&pagemap_lru_lock);
if (PageLRU(page) && page_count(page) == 2) /* false */
spin_unlock(&pagemap_lru_lock);
spin_lock(&pagemap_lru_lock);
if (PageLRU(page) && page_count(page) == 2) /* false */
spin_unlock(&pagemap_lru_lock);
if (put_page_testzero(page))
__free_pages_ok(page, 0);
if (put_page_testzero(page))
__free_pages_ok(page, 0);

Neither holder of a page reference sees the count at 2, and so the page
is left on the lru with count = 1. This won't happen often and such
pages will be recovered from the cold end of the list in due course.

The important question is: can this code ever remove a page from the lru
erroneously, leaving somebody holding a reference to a non-lru page? In
other words, can the test PageLRU(page) && page_count(page) == 2 return
a false positive? Well, when this test is true we can account for both
both references: the one we own, and the one the lru list owns. Since
we hold the lru lock, the latter won't change. Nobody else has the
right to increment the page count, since they must inherit that right
from somebody who holds a reference, and there are none.

We could also do this:

void page_cache_release(struct page *page)
{
if (page_count(page) == 2) {
spin_lock(&pagemap_lru_lock);
if (PageLRU(page) && page_count(page) == 2) {
__lru_cache_del(page);
atomic_dec(&page->count);
}
spin_unlock(&pagemap_lru_lock);
}
if (put_page_testzero(page))
__free_pages_ok(page, 0);
}

Which avoids taking the lru lock sometimes in exchange for widening the
hole through which pages can end up with count = 1 on the lru list.

Let's run this through your race detector and see what happens.

-- 
Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Next message: Peter T. Breuer: "Re: block device/VM question"
Previous message: Christoph Hellwig: "Re: Loop devices under NTFS"
Next in thread: Christian Ehrhardt: "Re: MM patches against 2.5.31"
Reply: Christian Ehrhardt: "Re: MM patches against 2.5.31"