Re: SMP race in munmap/ftruncate paths

Philip R. Auld (prauld@egenera.com)
Fri, 21 Dec 2001 10:28:26 -0500


Hi again,

"Philip R. Auld" wrote:
>
> Hi,
>
> It looks like there is an SMP locking problem and race
> the munmap and truncate paths. I've hit this problem using a tree
> based on the redhat 2.4.7-10 kernel, which in turn is 2.4.7-ac3 plus
> some other patches. It looks like 2.4.16 still has these issues although
> this code has changed a bit.

[stuff snipped]

As a follow-up to this I noticed that the lock hierarchy in this
area is broken as well. This problem _has_ been fixed in later kernels
(at least as of 2.4.16). I'm just offering this up for completeness and in
case someone else is not it a position to switch code bases at will.

truncate_list_pages is called with the mapping->page-lock held and
since it violates the ordering it does a try_lock on the pagecache_lock.

mm/filemap.c
375 pg_lock = PAGECACHE_LOCK(page);
376
377 if (!spin_trylock(pg_lock)) {
378 return 1;
379 }
380

But since the the caller (truncate_inode_pages) doesn't release the
mapping->pagelock between calls the try_lock is useless. We can still
dead-lock. The patch I sent yesterday to bandaid (I won't say fix because
I don't think it solves the real problem...) the race I was seeing
exacerbates this. One solution is to move the mapping lock acquisition
into truncate_list_pages:

Index: mm/filemap.c
===================================================================
RCS file: /build/vault/linux2.4/mm/filemap.c,v
retrieving revision 1.5
diff -u -r1.5 filemap.c
--- mm/filemap.c 2001/10/19 03:48:00 1.5
+++ mm/filemap.c 2001/12/21 14:59:23
@@ -361,6 +366,7 @@
struct list_head *curr;
struct page * page;

+ spin_lock(&mapping->page_lock);
curr = head->next;
while (curr != head) {
unsigned long offset;
@@ -375,6 +381,7 @@
pg_lock = PAGECACHE_LOCK(page);

if (!spin_trylock(pg_lock)) {
+ spin_unlock(&mapping->page_lock);
return 1;
}

@@ -402,10 +409,10 @@
}
curr = curr->next;
}
+ spin_unlock(&mapping->page_lock);
return 0;
out_restart:
page_cache_release(page);
- spin_lock(&mapping->page_lock);
return 1;
}

@@ -425,7 +432,6 @@
unsigned partial = lstart & (PAGE_CACHE_SIZE - 1);
int complete;

- spin_lock(&mapping->page_lock);
do {
complete = 1;
while (truncate_list_pages(mapping,&mapping->clean_pages, start, &partial))
@@ -436,7 +442,6 @@
complete = 0;
} while (!complete);
/* Traversed all three lists without dropping the lock */
- spin_unlock(&mapping->page_lock);
}

/*

------------------------------------------------------
Philip R. Auld, Ph.D. Technical Staff
Egenera Corp. pauld@egenera.com
165 Forest St, Marlboro, MA 01752 (508)786-9444
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/