Bug with shared memory.

Martin Schwidefsky (schwidefsky@de.ibm.com)
Tue, 14 May 2002 17:13:06 +0200


Hi,
we managed to hang the kernel with a db/2 stress test on s/390. The test
was done on 2.4.7 but the problem is present on all recent 2.4.x and 2.5.x
kernels (all architectures). In short a schedule is done while holding
the shm_lock of a shared memory segment. The system call that caused
this has been sys_ipc with IPC_RMID and from there the call chain is
as follows: sys_shmctl, shm_destroy, fput, dput, iput, truncate_inode_pages,
truncate_list_pages, schedule. The scheduler picked a process that called
sys_shmat. It tries to get the lock and hangs.

One way to fix this is to remove the schedule call from truncate_list_pages:

--- linux-2.5/mm/filemap.c~ Tue May 14 17:04:14 2002
+++ linux-2.5/mm/filemap.c Tue May 14 17:04:33 2002
@@ -237,11 +237,6 @@

page_cache_release(page);

- if (need_resched()) {
- __set_current_state(TASK_RUNNING);
- schedule();
- }
-
write_lock(&mapping->page_lock);
goto restart;
}

Another way is to free the lock before calling fput in shm_destroy but the
comment says that this functions has to be called with shp and shm_ids.sem
locked. Comments?

blue skies,
Martin

Linux/390 Design & Development, IBM Deutschland Entwicklung GmbH
Schönaicherstr. 220, D-71032 Böblingen, Telefon: 49 - (0)7031 - 16-2247
E-Mail: schwidefsky@de.ibm.com

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/