Re: SMP-ix86-threads-fork: Linux 2.4.x kernel problem identified

Wolfram Gloger (wg@malloc.de)
Tue, 11 Sep 2001 00:56:31 +0200


Hi,

> So it's perfectly possible to observe the behavior you are seeing
> if __libc_read() returns -1. Because then sz_read will acquire the
> value -1, and the guard expression sz_read < sizeof(request) will yield
> zero, terminating the loop.

Argh! That indeed seems to be the case: _no_ kernel problem appears to
be involved. Sorry for the false alarm. read() 'just' returns -1 with
errno set to EINTR. One should never code up a loop like that at
the last minute (I only found the effect yesterday).

> Could you recode the test patch to eliminate these suspicions and re-test?

The analysis was nevertheless correct: LinuxThreads gets out of sync,
with disastrous effects. I was puzzled how read() could return -1;
however, it may make sense with a __pthread_sig_cancel pending in the
manager thread due to a child exiting (?). But then, why wasn't this
possibility discovered before?

Below is a corrected patch for glibc-2.2.4. I've run fork-malloc with
this for a couple of hours.

Thanks,
Wolfram.

2001-09-11 Wolfram Gloger <wg@malloc.de>

* manager.c (__pthread_manager): When reading from pipe, account
for possible error return from read().

--- linuxthreads/manager.c.orig Mon Jul 23 19:54:13 2001
+++ linuxthreads/manager.c Tue Sep 11 00:47:48 2001
@@ -150,8 +150,19 @@
     }
     /* Read and execute request */
     if (n == 1 && (ufd.revents & POLLIN)) {
-      n = __libc_read(reqfd, (char *)&request, sizeof(request));
-      ASSERT(n == sizeof(request));
+      int sz_read = 0;
+
+      while (sz_read < sizeof(request)) {
+        n = __libc_read(reqfd, (char *)&request + sz_read,
+                        sizeof(request) - sz_read);
+        if (n < 0) {
+#ifdef DEBUG
+          char d[64];
+          sprintf(d, "*** read err %d\n", errno);
+#endif
+        } else
+          sz_read += n;
+      }
       switch(request.req_kind) {
       case REQ_CREATE:
         request.req_thread->p_retcode =
