RE: CLOSE_WAIT bug?

David Schwartz (davids@webmaster.com)
Tue, 12 Jun 2001 19:07:09 -0700


> One server socket created and listening, blocking on select(),
> once a client connect to that port, there is another thread in server
> side issues a close() to the new connection.
> After the client close the connection. The connection in server side will
> stuck on CLOSE_WAIT forever until the program being killed.

This isn't something you should ever do. There is no way the kernel can
guarantee a sane reaction to this since the 'close' could occur _before_ you
even enter 'select'.

There is no atomic 'release mutex and select' function so you can never
know for sure whether the 'close' will occur before or after the other
thread enters 'select'. There's also the possibility that another thread
will open a new connection onto the same descriptor before the thread
blocked in 'select' gets a chance to notice that the descriptor is being
closed.

It's also not clear what the 'close' does in this case. An attempt to close
a descriptor is not supposed to close the underlying connection unless it
closes the last reference. It's not clear whether 'select' represents a
reference or not, and it's not clear what should happen if the descriptor
table changes before the 'select' thread gets woken up even if the 'close'
call schedules it.

One can argue that 'select' should return because a 'read' or 'write' to
the connection wouldn't block. But that's only true after 'select' returns.
While the endpoint is in use (and 'select' is using it), 'close' shouldn't
necessarily close the underlying connection. So this argument require
bootstrapping.

For TCP, use 'shutdown'. Don't 'close' the descriptor until you are sure no
thread is using it. This is as serious an error as 'free'ing memory that
another thread is using.

So your code is buggy. So long as the kernel doesn't lose track of the
resources entirely, it's behavior is (at least to me) acceptable. In fact, I
wish it would punish errors like this more severely, as this would reduce
the amount of code out there that has them.

DS

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/