Re: Bug with shared memory.

Andrew Morton (akpm@zip.com.au)
Wed, 15 May 2002 16:07:16 -0700


Mike Kravetz wrote:
>
> On Tue, May 14, 2002 at 12:33:23PM -0700, Andrew Morton wrote:
> > Martin Schwidefsky wrote:
> > > The system call that caused
> > > this has been sys_ipc with IPC_RMID and from there the call chain is
> > > as follows: sys_shmctl, shm_destroy, fput, dput, iput, truncate_inode_pages,
> > > truncate_list_pages, schedule. The scheduler picked a process that called
> > > sys_shmat. It tries to get the lock and hangs.
> >
> > There's no way the kernel can successfully hold a spinlock
> > across that call chain.
> >
>
> Is adding a check for this type of situation (under CONFIG_DEBUG_SPINLOCK
> of course) worth the effort? One would simply add a 'locks_held' count
> for each task and check for zero at certain places such as return to
> user mode, and during context switching.

I think it would be worth the effort. One approach would be to
create a `can_sleep()' macro. Add that to functions which may
schedule. It's useful for documentation purposes as well as runtime
checks.
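
As a rough illustration of how the two ideas could fit together,
here is a minimal user-space sketch. Everything in it is hypothetical:
the `struct task` layout, the `current_task` stand-in for the kernel's
current task, the `debug_spin_lock()`/`debug_spin_unlock()` wrappers and
the reporting helper are assumptions made for the example, not existing
kernel interfaces. It only shows the bookkeeping: a per-task count of
held locks, and a `can_sleep()' check placed at the top of a function
which may schedule.

```c
#include <stdio.h>

/* Hypothetical per-task debug state; in a kernel this would sit in the
 * task structure and be compiled in only under CONFIG_DEBUG_SPINLOCK. */
struct task {
    int locks_held;        /* spinlocks currently held by this task */
    int sleep_violations;  /* number of times can_sleep() has fired */
};

/* Stand-in for the kernel's notion of the current task (assumption). */
static struct task current_task;

/* Stand-in lock wrappers: only the accounting is shown, no real locking. */
static void debug_spin_lock(void)
{
    current_task.locks_held++;
}

static void debug_spin_unlock(void)
{
    current_task.locks_held--;
}

/* Record and report a call site that may sleep while a lock is held. */
static void report_sleep_violation(const char *file, int line)
{
    current_task.sleep_violations++;
    fprintf(stderr, "%s:%d: may sleep with %d spinlock(s) held\n",
            file, line, current_task.locks_held);
}

/* Place at the top of any function which may schedule. It documents
 * the fact and checks it at runtime. */
#define can_sleep()                                         \
    do {                                                    \
        if (current_task.locks_held != 0)                   \
            report_sleep_violation(__FILE__, __LINE__);     \
    } while (0)

/* Example of a function which may schedule (hypothetical name). */
static void truncate_pages_example(void)
{
    can_sleep();
    /* ... work that may block would go here ... */
}
```

Calling truncate_pages_example() between debug_spin_lock() and
debug_spin_unlock() prints a warning with the file and line of the
check; calling it with no locks held is silent. The same counter could
be tested for zero on return to user mode and at context switch, as
suggested above.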

The Stanford checker caught a lot of these, but it seems that
the (high) amount of source-level obfuscation in the ipc code
defeated it.

> One would think these types of things are easily found, but this example
> suggests otherwise. Has anyone run the kernel through an extensive
> (stress) test suite with any of the kernel debug options enabled?

There are at present no tools in the tree to trap this
problem.
