fork then wait4 bug when being traced?

Chris Wedgwood (cw@f00f.org)
Sun, 2 Jul 2000 03:24:09 +1200


It appears that when a process is being traced, a race of sorts can
occur which causes wait4 to fail erroneously. Basically, you can fork
and then wait4 on the child, yet wait4 returns ECHILD, which is
something you would not normally expect.

I can only replicate this when tracing the process, and then only
some of the time (50% or so), as it appears to be fairly
timing-critical. This is on Linux 2.4.0-test3-pre2+some-bits, UP.

Here is a trace that shows it happening:

9839 fork() = 9840
9839 close(4) = 0
9839 fstat(5, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
9839 old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40015000
9839 _llseek(5, 0, 0xbfffea24, SEEK_CUR) = -1 ESPIPE (Illegal seek)
9839 write(5, "To: cw\n\nDecoded Message:\n\n", 26) = 26
9839 close(5) = 0
9839 wait4(9840, 0xbfffec68, WNOHANG, NULL) = -1 ECHILD (No child processes)
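
For reference, here is a minimal C sketch of the call pattern in the
trace above: fork, then poll the child with wait4 and WNOHANG. The
helper name fork_then_poll and the 100 ms sleep in the child are
assumptions made for illustration; they are not taken from the traced
program. Untraced, wait4 would be expected to return 0 (child still
running) or the child's pid, not -1 with ECHILD.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original program): fork a child,
 * then immediately poll it with wait4(..., WNOHANG, NULL), as the
 * traced parent does.
 *
 * Returns wait4's return value, or -1 if fork itself failed.
 * *saved_errno receives errno on failure, or 0 otherwise.
 */
pid_t fork_then_poll(int *saved_errno)
{
    int status = 0;
    pid_t child, r;

    assert(saved_errno != NULL);

    child = fork();
    if (child < 0) {
        *saved_errno = errno;
        return -1;
    }
    if (child == 0) {
        /* Child: stay alive briefly (assumed delay), then exit. */
        usleep(100000);
        _exit(0);
    }

    /* Parent: non-blocking wait on the specific child pid. */
    r = wait4(child, &status, WNOHANG, NULL);
    *saved_errno = (r < 0) ? errno : 0;

    /* If the child was still running, reap it so no zombie is left. */
    if (r == 0)
        (void)waitpid(child, &status, 0);

    return r;
}
```

Calling this in a loop from main() while the process is being traced
would be one way to look for the ECHILD case; whether this exact
sketch triggers the race is untested.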

--cw
