Re: Synchronous signal delivery..

Jamie Lokier (jamie@shareable.org)
Fri, 14 Feb 2003 02:40:46 +0000


Linus Torvalds wrote:
> The above example program is obviously totally useless, and any real use
> would have to expand the implementation with addign the full siginfo to
> the packet read (which is trivial apart from deciding on what format to
> use - it would be good to not have it be architecture-dependent and in
> particular it would be horrible to have different formats for different
> compatibility layers).

siginfo is _almost_ architecture independent now. There are some
quirks. siginfo is bad though, because it has to be compatible with
whatever standard it came from, so nobody dares to extend it. (See
dnotify and sigsegv below).

I see that there are several fairly general event sources now:

- signals
- epoll events
- async I/O events
- posix timers

More events that don't provide enough information:

- dnotify details (siginfo doesn't say enough)
- sigsegv read/write? (siginfo doesn't say enough)

More events that should be accessible but aren't:

- vm paging like crazy, please release some memory

Your synchronous signals code effectively makes signals work with
select/poll/epoll nicely. Async I/O still reports events in its own
way.

Suggestion du jour
------------------

I think that epoll and sigfd and async I/O could meld into a nice and
extensible, and fast, event interface. You'd first create an "event
fd" (i.e. very similar to epoll_create or sigfd), then you'd add event
sets of interest (which include signals and file descriptors), then
you'd wait for them using read().

All event sources (such as timers) would have a facility for adding
themselves to a given sigfd.

Such fds could be passed around in the usual ways (as you described
for sigfds).

Async I/Os would be queued with reference to a sigfd, instead of an
aio descriptor (which is just an integer anyway) as they are now.

read() would report return one or more structure containing pending
events, in the format: LENGTH, CATEGORY, REST... with the first word
saying which category of events (and giving the format), the next
giving the length, and the rest in much the same format as epoll or
async I/O do now, and whatever format is appropriate for signals.

Davide has already worked out the tricky logic of attaching fd-poll
events to an event reporter, although a special system call is used to
get the events instead of just read(). The async I/O folk have
already done similar for async I/Os, except a different special system
call is used. But both basically return an array of bytes containing
event structures.

In summary:

sigfd(...) // Create an event reporter fd.
sigfd_sigmask(fd,...) // Attach a signal mask to the reporter.
epoll_ctl(fd,...) // Attach an fd watcher to the reporter.
fcntl(...DNOTIFY_*,fd) // Attach a directory watcher to the reporter.
futex_event(...,fd) // Attach a futex waiter to the reporter.

My main reason for wanting a little unity, rather than lots of
different event reporter types (i.e. epoll, aio, sigfd) as now is
mainly that async I/O doesn't fit at all because it doesn't use an fd
to report its events. Anything fd-based can obviously be monitored
using epoll (or poll/select).

Obviously, all that's needed there is for async I/O to report _its_
events using a file descriptor and read().

However, once you've got a kernel data structure for queuing events
(epoll's logic is close to ideal IMHO), it seems clean and efficient
to generalise the code, rather than having several different event
queuing and reporting subsystems.

And when that's done you have some nice bonuses:

- All event types are reported equally fast, and in a single
system call (read()).

- The order in which events occurred is preserved.
(This is lost when you have to scan multiple queues).

- Hierarchies of event sets of any kind are possible.
(epoll has solved the logical problems of this already).

- Less code duplicated.

- Adding new kinds of kernel events becomes _very_ simple.

-- Jamie
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/