Re: [patch] exit_free(), 2.5.31-A0

mgross (mgross@unix-os.sc.intel.com)
Thu, 15 Aug 2002 16:51:35 -0400


On Tuesday 13 August 2002 02:17 pm, Linus Torvalds wrote:
> On Tue, 13 Aug 2002, Ingo Molnar wrote:
> > > That still doesn't make it any les crap: because any thread that exits
> > > without calling the "magic exit-flag interface" will then silently be
> > > lost, with no information left around anywhere.
> >
> > that should be a pretty rare occurance: with the upcoming signals patch
> > any segmentation fault zaps all threads and does a proper (and
> > deadlock-free) multithreaded coredump.
>
> That still doesn't change the fact that the interface is broken
> _by_design_.
>
> If the parent wants to get notified on child death, it should damn well
> get notified on child death. Not "in case the child exists politely".
>
> We don't depend on processes calling "exit()" to clean up all the stuff
> they left behind. The VM gets cleaned up even for bad processes.
>

What's been missing is the there is no option for synchronization as part of
the parent notification or exit processing.

What's needed, for TCore, is something like "when a process dies, signal
parents to do X and then wait on optional semaphore" before losing all the
process data to do_exit.

The big problem I've been had with TCore is that that there is no polite way
of synchronization for the dumping process with the parent and sibling
processes while they are getting a signal storm as they all go down.

I've come up with a few brutish ways to suspend these processes (with some
risk taken with semaphore locks from a maintenance point of view ;).
However; these approaches still sometimes don't get to run before some of the
siblings exit.

There is no good way, especially on SMP setups with a large multi-threaded
applications, to guarantee the signals don't get where they are going before
the core dump data is gathered.

--mgross

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/