Re: Multithreaded core dumps (CLONE_THREAD and elf)

Terje Eggestad (terje.eggestad@scali.no)
30 Aug 2001 10:16:34 +0200


Annoying, isn't it....

I just browsed through the code; I got curious myself, since Linux has a
very flexible way of specifying the degree of sharing. If two procs share
VM, it does not necessarily follow that both should exit if one does.

There is a CLONE_THREAD flag to clone() that sets up a linked list through
all the procs (shared VM or not). In 2.4.3, which I currently run, it
doesn't seem to be ready for general use. I managed to get this:
te 31504 31504 0 10:03 pts/0 00:00:00 [clone2 <defunct>]
te 31505 31504 0 10:03 pts/0 00:00:00 [clone2 <defunct>]

Here a zombie is waiting for the parent to receive its SIGCHLD, but
it's its own parent. Kinda cute; guess it's time to reboot....

I don't know what kind of behaviour it's supposed to have, but if it's
intended to mark all the clones to die with the coredumping proc, the
data structures are properly in place, and all do_coredump needs to do
is follow the linked list (thread_group), preempt if SMP, and read off
the regs (plus all the pitfalls I don't see, of course).
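
To make the idea concrete, the walk could look roughly like the user-space
sketch below. This is not kernel code: struct task, struct regs and
snapshot_thread_group() are made-up names that only model a circular
thread_group list, and the sketch skips the hard part entirely (stopping
the other threads on SMP before their registers are read).

```c
#include <stddef.h>

/* Hypothetical stand-in for a saved register set. */
struct regs {
    unsigned long pc;
    unsigned long sp;
};

/* Hypothetical stand-in for a task; `next` models the circular
 * thread_group list that links all clones in one group. */
struct task {
    int pid;
    struct regs regs;
    struct task *next;
};

/* One per-thread record, as a core dumper might collect it. */
struct thread_snapshot {
    int pid;
    struct regs regs;
};

/*
 * Walk the list starting at the dumping task and copy each task's
 * registers into out[]. Stores at most `max` entries and returns the
 * number of tasks visited, so the caller can detect a short buffer.
 */
size_t snapshot_thread_group(const struct task *dumper,
                             struct thread_snapshot *out, size_t max)
{
    size_t n = 0;
    const struct task *t = dumper;

    if (dumper == NULL)
        return 0;
    do {
        if (n < max) {
            out[n].pid = t->pid;
            out[n].regs = t->regs;
        }
        n++;
        t = t->next;
    } while (t != NULL && t != dumper);  /* stop after one full lap */
    return n;
}
```

The dumping task comes first in out[], which would be a natural way to
tell a debugger which thread actually took the fault.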

A COMPLETELY different matter is whether the ELF format supports multiple
register sets.... look it up (I'm not...)
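
As a pointer for anyone who does look it up: an ELF core file keeps its
register sets as NT_PRSTATUS notes inside the PT_NOTE segment, and
nothing in the note layout limits how many of them there can be. The
sketch below counts them in a buffer that already holds the note segment.
count_prstatus_notes() is a made-up helper, it assumes host byte order
and the 4-byte padding that Linux core files use, and it does not check
the note name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NOTE_TYPE_PRSTATUS 1u  /* same value as NT_PRSTATUS in <elf.h> */

/* Round up to the 4-byte padding used between note fields. */
static size_t align4(size_t n)
{
    return (n + 3u) & ~(size_t)3u;
}

/*
 * Count the NT_PRSTATUS notes in a buffer holding the contents of a
 * PT_NOTE segment. Each note is three 32-bit words (namesz, descsz,
 * type) followed by the padded name and the padded descriptor.
 * Stops quietly at the first truncated or corrupt note.
 */
size_t count_prstatus_notes(const unsigned char *buf, size_t len)
{
    size_t off = 0, count = 0;

    while (len - off >= 12) {
        uint32_t namesz, descsz, type;

        /* memcpy avoids unaligned reads from the byte buffer. */
        memcpy(&namesz, buf + off, 4);
        memcpy(&descsz, buf + off + 4, 4);
        memcpy(&type, buf + off + 8, 4);

        size_t body = align4(namesz) + align4(descsz);
        if (body > len - off - 12)
            break;  /* note claims more bytes than the buffer has */
        if (type == NOTE_TYPE_PRSTATUS)
            count++;
        off += 12 + body;
    }
    return count;
}
```

A count greater than one would mean the core carries more than one
register set, which is what "info threads" in gdb would need.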

TJ

On 30 Aug 2001 00:21:06 -0500, Elan Feingold wrote:
> Hi,
>
> First post (although lurking on and off since 0.99pl14 or so), so please
> be gentle.
>
> We recently convinced my company to move from VxWorks to (embedded)
> Linux. For better or worse, our application is heavily multithreaded.
> It seems that current versions of Linux dump single-threaded core. I've
> done a bit of research and it seems that:
>
> - GDB is more than happy to load multiple register sets per core file.
>
> - There are patches to make the kernel dump multiple core files, one per LWP.
> This is not really helpful in my case, since we're on an embedded
> platform with little Flash to store cores. Besides, loading up gdb on
> 10-20 core files looking for the one that had the problem doesn't sound
> very fun, as opposed to saying "info threads".
>
> Because I am (not so) young and (very) foolish, it doesn't sound that
> hard, at first (and uneducated) glance, to dump register sets for all
> related LWPs to a single core file, even in a portable way across x86
> and PPC architectures. Under SMP, it might be a bit trickier, but
> Solaris manages to do it, so it can't be impossible, and capturing any
> state for each thread would seem better than nothing at all, since all
> the LWPs are about to die anyways. I'm considering using some of my
> (copious) spare time and trying to patch the kernel to do this. I have
> a few questions:
>
> 0. Am I wrong or confused about the state of postmortem multithreaded
> debugging under Linux?
>
> 1. Is anyone else working on this? If not, why not? Am I missing
> something?
>
> 2. If this is simply something that nobody is working on because other
> things are more interesting, can anybody give me a few pointers on where
> to start?
>
> Best regards,
>
> -elan
> Aspiring Kernel Hacker
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

-- 
_________________________________________________________________________

Terje Eggestad                  terje.eggestad@scali.no
Scali Scalable Linux Systems    http://www.scali.com

Olaf Helsets Vei 6              tel: +47 22 62 89 61 (OFFICE)
P.O.Box 70 Bogerud                   +47 975 31 574 (MOBILE)
N-0621 Oslo                     fax: +47 22 62 89 51
NORWAY
_________________________________________________________________________
