> Hi,
>
>> So it's perfectly possible to observe the behavior you are seeing
>> if __libc_read() returns -1. Because then sz_read will acquire the
>> value -1, and the guard expression sz_read < sizeof(request) will yield
>> zero, terminating the loop.
>
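
The guard behaviour described above can be reproduced in isolation. This is
only a sketch: guard_holds() is a hypothetical helper, not code from
manager.c, and it assumes sz_read is a plain int (as in the patch below)
on a platform where size_t is at least as wide as unsigned int.

```c
#include <stddef.h>

/* Hypothetical helper mirroring the guard `sz_read < sizeof(request)`.
   `want` stands in for sizeof(request), which has type size_t. */
static int guard_holds(int sz_read, size_t want)
{
    /* The usual arithmetic conversions turn sz_read into a size_t before
       the comparison, so -1 becomes SIZE_MAX and the test is false. */
    return sz_read < want;
}
```

With sz_read == -1 the guard is false, so the loop ends even though no
complete request has been read.
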
> Argh!  That indeed seems to be the case, _no_ apparent kernel problem
> is involved, sorry for the false alarm.  read() 'just' returns -1 with
> errno becoming EINTR.  One should never code up a loop like that in
> the last minute (I only found the effect yesterday)..
>
>> Could you recode the test patch to eliminate these suspicions and re-test?
>
> The analysis was nevertheless correct, LinuxThreads gets out of synch
> with disastrous effects.  I was puzzled how read() could return -1,
> however it may make sense with a __pthread_sig_cancel pending in the
> manager thread due to a child exiting (?).  But then, why wasn't this
> possibility discovered before?
Because we never compiled with -DDEBUG, so the ASSERT was never
triggered?
There's another such place (line 135) with __libc_read in
__pthread_manager:
  /* Synchronize debugging of the thread manager */
  n = __libc_read(reqfd, (char *)&request, sizeof(request));
  ASSERT(n == sizeof(request) && request.req_kind == REQ_DEBUG);
We should account for a return value of -1 here also, shouldn't we?
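
One way to handle both places would be a small helper that retries on
EINTR. What follows is a minimal sketch, not a proposed patch: read_full()
is a hypothetical name, it calls plain read() rather than __libc_read(),
and returning -1 on other errors is an assumption about what the manager
would want.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper, not part of LinuxThreads: read exactly `len` bytes.
   Returns len on success, a short count if EOF is hit first, or -1 on any
   error other than EINTR (errno is left as set by read()). */
static ssize_t read_full(int fd, void *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = read(fd, (char *)buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: just retry */
            return -1;      /* genuine error */
        }
        if (n == 0)
            break;          /* EOF: the write end was closed */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

The debugging read could then check read_full(reqfd, &request,
sizeof(request)) == sizeof(request) instead of asserting on a single
read() call.
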
> Below is a corrected patch for glibc-2.2.4.  I've run fork-malloc with
> this for a couple of hours.
Is it still running? ;-)  That would be excellent!
Thanks a lot Wolfram,
Andreas
> Thanks,
> Wolfram.
>
> 2001-09-11  Wolfram Gloger  <wg@malloc.de>
>
> 	* manager.c (__pthread_manager): When reading from pipe, account
> 	for possible error return from read().
>
> --- linuxthreads/manager.c.orig	Mon Jul 23 19:54:13 2001
> +++ linuxthreads/manager.c	Tue Sep 11 00:47:48 2001
> @@ -150,8 +150,19 @@
>      }
>      /* Read and execute request */
>      if (n == 1 && (ufd.revents & POLLIN)) {
> -      n = __libc_read(reqfd, (char *)&request, sizeof(request));
> -      ASSERT(n == sizeof(request));
> +      int sz_read = 0;
> +
> +      while (sz_read < sizeof(request)) {
> +	n = __libc_read(reqfd, (char *)&request + sz_read,
> +			sizeof(request) - sz_read);
> +	if (n < 0) {
> +#ifdef DEBUG
> +	  char d[64];
> +	  sprintf(d, "*** read err %d\n", errno);
> +#endif
> +	} else
> +	  sz_read += n;
> +      }
>        switch(request.req_kind) {
>        case REQ_CREATE:
>          request.req_thread->p_retcode =
>
-- 
 Andreas Jaeger
  SuSE Labs aj@suse.de
   private aj@arthur.inka.de
    http://www.suse.de/~aj