Re: [PATCH][RFC] Signal-per-fd for RT signals

Dan Kegel (dank@kegel.com)
Fri, 14 Sep 2001 18:33:51 -0700


Vitaly Luban <vitaly@luban.org> wrote:
> Attached patch is an implementation of "signal-per-fd"
> enhancement to kernel RT signal mechanism, AFAIK first
> proposed by A. Chandra and D. Mosberger ...
> which should dramatically increase linux based network
> servers scalability.
> [ Patch lives at http://www.luban.org/GPL/gpl.html ]

I have been using variations on this patch while trying
to benchmark an FTP server at a load of 10000 simultaneous
sessions (at 1 kilobyte/sec each), and noticed a few issues:

1. If a SIGINT comes in, t->files may be null, so where
   send_signal() says
       if( (info->si_fd < files->max_fds) &&
   it should say
       if( files && (info->si_fd < files->max_fds) &&
   otherwise there will be a null pointer oops.
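
To make the ordering in that check concrete, here is a minimal userspace
sketch. It is not the patch itself: the *_model structs and the helper
can_use_per_fd_slot() are hypothetical stand-ins that model only the
fields named above (files, max_fds, si_fd).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; only the fields
 * mentioned in the message are modelled. */
struct files_model   { int max_fds; };
struct task_model    { struct files_model *files; };  /* may be NULL */
struct siginfo_model { int si_fd; };

/* Returns 1 if the per-fd path may be taken, 0 otherwise.
 * The name and return convention are assumptions for illustration. */
static int can_use_per_fd_slot(const struct task_model *t,
                               const struct siginfo_model *info)
{
    const struct files_model *files;

    assert(t != NULL && info != NULL);
    files = t->files;

    /* Test the pointer first: && short-circuits, so files->max_fds
     * is never read when files is NULL. */
    return files != NULL && info->si_fd < files->max_fds;
}
```

With files set to NULL the helper returns 0 without touching max_fds,
which is the behaviour the extra "files &&" term is meant to give.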

2. If a signal has been queued and a reference to it stored in
   filp->f_infoptr, and the signal is then removed from the queue
   without going through collect_signal(), filp->f_infoptr is left
   holding a stale pointer, which can cause a wild pointer oops.
   There are two places this can happen:
   a. if send_signal() returns -EAGAIN because we're out of memory or queue space
   b. if the user sets the signal handler to SIG_IGN, triggering a call
      to rm_sig_from_queue()
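
One way to picture the stale-pointer hazard, as a userspace sketch only.
This is not necessarily the fix I wrote: the *_model structs, the
"owner" back-reference, and both helper names are hypothetical. The idea
shown is that every removal path clears f_infoptr before the queue entry
is freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct sigqueue_model;

struct file_model {
    struct sigqueue_model *f_infoptr;  /* pending per-fd signal, or NULL */
};

struct sigqueue_model {
    struct file_model *owner;          /* hypothetical back-reference */
    int si_fd;
};

/* Queue an entry for filp. Returns NULL if allocation fails. */
static struct sigqueue_model *queue_for_file(struct file_model *filp, int fd)
{
    struct sigqueue_model *q = malloc(sizeof *q);

    if (q == NULL)
        return NULL;
    q->owner = filp;
    q->si_fd = fd;
    /* Publish the pointer only once the entry really exists, so a
     * failed allocation (the -EAGAIN case) cannot leave a dangling
     * reference behind. */
    filp->f_infoptr = q;
    return q;
}

/* Single removal helper: clears the file's reference, then frees.
 * Any path that drops an entry (delivery, SIG_IGN, error) would
 * call this instead of freeing the entry directly. */
static void discard_entry(struct sigqueue_model *q)
{
    assert(q != NULL);
    if (q->owner != NULL && q->owner->f_infoptr == q)
        q->owner->f_infoptr = NULL;
    free(q);
}
```

After discard_entry() the file no longer points at freed memory, whichever
path caused the removal.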

I have seen the above problems in the field in my version of the patch,
and have written and tested fixes for them. (Ah, the joys of ksymoops.)

3. Any reference to t->files probably needs to be protected by
acquiring t->files->file_lock, else when the file table is
expanded, any filp in use will become stale.
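
A rough userspace model of the race, under stated assumptions: the
atomic_flag spinlock merely stands in for files->file_lock (it is not
the kernel's lock type), and the struct layout and the names lookup_fd()
and expand_table() are made up for illustration. The point is only that
lookup and expansion take the same lock, so a reader never indexes an
array that has just been replaced.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

struct file_model { int id; };

struct files_model {
    atomic_flag file_lock;     /* stand-in for files->file_lock */
    int max_fds;
    struct file_model **fd;    /* replaced wholesale on expansion */
};

static void table_lock(struct files_model *f)
{
    while (atomic_flag_test_and_set(&f->file_lock))
        ;                      /* spin until the flag was clear */
}

static void table_unlock(struct files_model *f)
{
    atomic_flag_clear(&f->file_lock);
}

/* Look up fd under the lock so the array cannot be swapped mid-read. */
static struct file_model *lookup_fd(struct files_model *f, int fd)
{
    struct file_model *filp = NULL;

    table_lock(f);
    if (fd >= 0 && fd < f->max_fds)
        filp = f->fd[fd];
    table_unlock(f);
    return filp;
}

/* Grow the table to new_max slots. Returns 0 on success, -1 on failure. */
static int expand_table(struct files_model *f, int new_max)
{
    struct file_model **bigger;

    assert(new_max > 0);
    bigger = calloc((size_t)new_max, sizeof *bigger);
    if (bigger == NULL)
        return -1;

    table_lock(f);
    if (new_max <= f->max_fds) {       /* already large enough */
        table_unlock(f);
        free(bigger);
        return 0;
    }
    if (f->max_fds > 0)
        memcpy(bigger, f->fd, (size_t)f->max_fds * sizeof *bigger);
    free(f->fd);                       /* the old array is gone from here on */
    f->fd = bigger;
    f->max_fds = new_max;
    table_unlock(f);
    return 0;
}
```

Without the lock in lookup_fd(), a reader could fetch f->fd just before
expand_table() frees the old array and then index into freed memory.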

I have seen this problem in my version of the patch, but have not yet tackled it.
Is there any good guidance out there for how the various spinlocks
interact? Documentation/spinlocks.txt and Documentation/DocBook/kernel-locking.tmpl
are the best I've seen so far, but they don't get into specifics about, say,
files->file_lock and task->sigmask_lock. Guess I'll just have to read the source.

Also, while I have verified that the patch significantly reduces
reliable signal queue usage, I have not yet been able to measure
a reduction in CPU time in a real app. Presumably the benefits
are in response time, which I am not set up to measure yet.

This is my first excursion into the kernel, so please be gentle.
- Dan