asmlinkage long sys_nanosleep(struct timespec *rqtp, struct timespec
*rmtp)
{
	struct timespec t;
	unsigned long expire;
+	struct pt_regs * regs = (struct pt_regs *) &rqtp;
	if(copy_from_user(&t, rqtp, sizeof(struct timespec)))
		return -EFAULT;
	if (t.tv_nsec >= 1000000000L || t.tv_nsec < 0 || t.tv_sec < 0)
		return -EINVAL;
	if (t.tv_sec == 0 && t.tv_nsec <= 2000000L &&
	    current->policy != SCHED_OTHER)
	{
		/*
		 * Short delay requests up to 2 ms will be handled with
		 * high precision by a busy wait for all real-time processes.
		 *
		 * Its important on SMP not to do this holding locks.
		 */
		udelay((t.tv_nsec + 999) / 1000);
		return 0;
	}
	expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec);
-	current->state = TASK_INTERRUPTIBLE;
-	expire = schedule_timeout(expire);
+	regs->eax = -EINTR;
+	do {
+               current->state = TASK_INTERRUPTIBLE;
+       } while((expire = schedule_timeout(expire)) &&
!do_signal(regs,NULL));
	if (expire) {
		if (rmtp) {
			jiffies_to_timespec(expire, &t);
			if (copy_to_user(rmtp, &t, sizeof(struct timespec)))
				return -EFAULT;
		}
		return -EINTR;
	}
	return 0;
}
BUT, it turns out that do_signal() is in the "arch" code, and further
that different arch's have different calling sequences (and, of course,
pt_regs is different also).  This is the ONLY place in the kernel where
platform independent code needs to call do_signal() :(  There does not
seem to be a clean answer for this issue.  I suppose we could put
something like this in timer.c:
#ifndef _do_signal
#define _do_signal(a,b) 1
#endif
and then leave it to the platform code to define a uniform
_do_signal(a,b) interface.  This way each platform will work as it does
now and better once they define the macro.  If this is the path to take,
what should the parameters "a" & "b" be?  We need to cover all platforms
needs.
Another way to solve this issue is to move nano_sleep into the "arch"
signal.c file, but then each would have to change and things would be
broken (i.e. nano_sleep would not work) until the platform made the
move.  I suppose the current nano_sleep could stay in the kernel and
each platform could implement one in their area with a different name. 
When all were done, the current code could be deleted.
How should this be approached?
George
george anzinger wrote:
> 
> george anzinger wrote:
> >
> > Bruce Janson wrote:
> > >
> > > In article <20010814092849.E13892@pc8.lineo.fr>,
> > > christophe barbé  <christophe.barbe@lineo.fr> wrote:
> > > ..
> > > >Le lun, 13 aoû 2001 10:29:32, Bruce Janson a écrit :
> > > ..
> > > >>     The following program behaves incorrectly when traced:
> > > ..
> > > >Have you receive off-line answers?
> > > ..
> > >
> > > No, though I did receive an offline reply from someone who appeared
> > > to have misunderstood the post.  In case it wasn't clear, the problem
> > > is that the above program behaves differently when traced to how it
> > > behaves when not traced.  (I do realise that in general, under newer
> > > Unices, when not ignored, a SIGCHLD signal may accompany the death of
> > > a child.)
> > >
> > > >I guess that it's certainly more a strace issue and that it's perhaps
> > > ..
> > >
> > > It's not clear to me whether it is a kernel, glibc or strace bug, but
> > > it does appear to be a bug.
> >
> > I don't have the code for usleep() handy and the man page is not much
> > help, but here goes:
> >
> > I think strace is using ptrace() which causes signals to be redirected
> > to wake up the parent (strace in this case).  In particular, blocked
> > signals are no longer blocked.  What this means is that a.) SIG CHILD is
> > posted, b.) the signal, not being blocked, the child is wakened, c.)
> > ptrace returns to the parent, d) the parent does what ever and tells the
> > kernel (ptrace) to continue the child with the original mask, e.) the
> > signal code returns 0 with out delivering the signal to the child.
> > Looks good, right?  Wrong!  The wake up at (b) pulls the child out of
> > the timer queue so when signal returns, the sleep (I assume nano sleep
> > is actually used here) call returns with the remaining sleep time as a
> > value.
> >
> > This is an issue for debugging also (same ptrace...).  The fix is to fix
> > nano_sleep to match the standard which says it should only return on a
> > signal if the signal is delivered to the program (i.e. not on internal
> > "do nothing" signals).  Signal in the kernel returns 1 if it calls the
> > task and 0 otherwise, thus nano sleep might be changed as follows:
> >
> >         expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec);
> >
> >         current->state = TASK_INTERRUPTIBLE;
> > -       expire = schedule_timeout(expire);
> > +       while (expire = schedule_timeout(expire) && !signal());
> Still not quite right.  regs needs to be dummied up (see sys_sigpause)
> and then:
> +       while (expire = schedule_timeout(expire) && !do_signal(regs,
> NULL));
> >
> >         if (expire) {
> >                 if (rmtp) {
> >                         jiffies_to_timespec(expire, &t);
> >                         if (copy_to_user(rmtp, &t, sizeof(struct timespec)))
> >                                 return -EFAULT;
> >                 }
> >                 return -EINTR;
> >         }
> >         return 0;
> >
> > This code is in ../kernel/timer.c
> >
> > Note that this assumes that nano_sleep() underlies usleep().  If
> > setitimer (via sleep() or otherwise) is used, the problem and fix is in
> > the library.  In that case, the code needs to notice that it was
> > awakened but the alarm handler was not called.  Still, with out the full
> > spec on usleep() it is not clear what it should do.
> >
> > In any case, this is a bug in nano_sleep(), where the spec is clear on
> > this point.
> >
> > George
> > -
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/