> I have a question about:
>
> > struct epoll_fd {
> > 	int fd;
> > 	unsigned short events;
> > 	unsigned short revents;
> > 	__uint64_t obj;
> > };
>
> What value does the `fd' field have when a file descriptor being
> polled has been renumbered (by dup/close or dup2/close or
> fcntl(F_DUPFD)/close or passing through a unix domain socket)?
>
> If we are honest, the `obj' field is absolutely essential as its the
> only value which uniquely identifies the file descriptor if you have
> done anything unusual with the fds.
>
> The `fd' field, on the other hand, is not guaranteed to correspond
> with the correct file descriptor number.  So.... perhaps the structure
> should contain an `obj' field and _no_ `fd' field?
>
> This doesn't affect applications.  Those which use `obj' for something
> interesting (i.e. a pointer) will have the `fd' value stored in the
> pointed-to data structure, while simple applications can just store
> the original `fd' value in `obj' in the first place.
Even if I agree with you here, this will make the API asymmetrical. We
will have :
struct epoll_fd {
	unsigned short events;
	unsigned short revents;
	__uint64_t obj;
};
int epoll_ctl(int epfd, int op, int fd, struct epoll_fd *pfd);
Where the "fd" is used only for EPOLL_CTL_ADD, and "obj" for EPOLL_CTL_DEL
and EPOLL_CTL_MOD.
> > It'll be possible to add epfd1 inside epfd2, not epfd1 inside epfd1.
>
> Beware of overflowing the kernel stack.  If epfd4 becomes readable,
> and wakes up epfd3, which wakes up epfd2, which wakes up epfd1...  If
> that is implemented recursively than I can write malicious code which
> will crash the kernel.  Note that this isn't a cycle.  It's possible
> to code the wakeups so this cannot happen but still have the expected
> behaviour.
>
> A circular arrangement should be fine, if silly.  The semantics are
> quite logical and don't require special cases: epfd2 becoming readable
> will trigger epfd1 to become readable if it isn't already.  If you
> make a cycle, that's silly but still behaves as you'd expect.  If
> epfd1 becomes readable, it wakes up epfd1...  which is already
> readable so nothing further happens.  Similarly with larger cycles.
> Assuming you've avoided stack overflow for acyclic graphs, there won't
> be any problem with cyclic ones.
This is a problem with the new callback'd wake_up(). I'd be tempted to not
permit epoll fd inclusion inside other epoll fds.
- Davide
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/