Re: Race in pagevec code

Andrew Morton (akpm@zip.com.au)
Wed, 21 Aug 2002 15:52:35 -0700


Christian Ehrhardt wrote:
>
> On Wed, Aug 21, 2002 at 10:41:48AM -0700, Andrew Morton wrote:
> > Christian Ehrhardt wrote:
> > >
> > > ...
> > > Both processors succeeded in bringing the page_count to zero,
> > > i.e. both processors will add the page to their own
> > > pages_to_free_list.
> >
> > This is why __pagevec_release() has the refcount check inside the lock.
> > If someone else grabbed a ref to the page (also inside the lock) via
> > the LRU, __pagevec_release doesn't free it.
>
> I saw this check but this doesn't help. There is no guarantee that this
> other reference that someone grabbed is still beeing held at the time
> where we do the check:
> The problem is if this newly grabbed reference is again dropped BEFORE
> the check for page_count == 0 but AFTER put_page_test_zero. In this
> case there can be TWO execution paths the BOTH think that they dropped
> the last reference, i.e. both call __free_pages_ok for the same page.
> See?

shrink_cache() detects that, inside the lock, and puts the page back
if it has a zero refcount.

refill_inactive doesn't need to do that, because it calls page_cache_release(),
which should look like this:

void __page_cache_release(struct page *page)
{
unsigned long flags;

spin_lock_irqsave(&_pagemap_lru_lock, flags);
if (TestClearPageLRU(page)) {
if (PageActive(page))
del_page_from_active_list(page);
else
del_page_from_inactive_list(page);
}
if (page_count(page) != 0)
page = NULL;
spin_unlock_irqrestore(&_pagemap_lru_lock, flags);
if (page)
__free_pages_ok(page, 0);
}

If the page count and non-LRUness are both seen inside the lock,
the page is freeable.

We do a similar thing with inodes, via atomic_dec_and_lock.

Despite the transformations, it's based on the 2.4 approach. But you've
successfully worried me, and I'm not really sure it's right, and I'm
dead sure it's too hairy. Something simpler-but-not-sucky is needed.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/