So, I caught it. We had modified the stack to put half-open and
established-but-unaccepted connections on separate queues (we also
added a BSD-like hashing scheme to the half-open connection queue
so that we don't have to do a linear search).
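For readers who have not seen the BSD-style approach, here is a minimal sketch of a hashed half-open queue. It is not our actual patch: the names (synq_hash, synq_insert, synq_lookup, struct open_req), the bucket count, and the hash function are all hypothetical, chosen only to show how a bucket lookup replaces the linear search.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SYNQ_HASH_SIZE 64u  /* hypothetical bucket count, must be a power of two */

/* Hypothetical half-open (SYN received, handshake not finished) entry. */
struct open_req {
    uint32_t raddr;          /* remote IPv4 address */
    uint16_t rport;          /* remote port */
    struct open_req *next;   /* next entry in the same bucket */
};

struct syn_table {
    struct open_req *bucket[SYNQ_HASH_SIZE];
};

/* Illustrative hash: fold the remote address and port into a bucket index. */
static unsigned synq_hash(uint32_t raddr, uint16_t rport)
{
    uint32_t h = raddr ^ rport;
    h ^= h >> 16;
    h ^= h >> 8;
    return h & (SYNQ_HASH_SIZE - 1);
}

/* Push a new half-open request onto the head of its bucket chain. */
static void synq_insert(struct syn_table *t, struct open_req *req)
{
    unsigned i;

    assert(t != NULL && req != NULL);
    i = synq_hash(req->raddr, req->rport);
    req->next = t->bucket[i];
    t->bucket[i] = req;
}

/* Walk only the one bucket the peer hashes to, not the whole queue. */
static struct open_req *synq_lookup(const struct syn_table *t,
                                    uint32_t raddr, uint16_t rport)
{
    struct open_req *req;

    for (req = t->bucket[synq_hash(raddr, rport)]; req; req = req->next)
        if (req->raddr == raddr && req->rport == rport)
            return req;
    return NULL;
}
```

With a reasonable spread across buckets, each lookup touches only the few entries that share a bucket, which matters when the half-open queue is long.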
The RST problem is being caused by our changes. So it looks like
you guys did the same splitting (of half-open and established but
unaccepted connections) in 2.3, and that's why you said that you
saw the problem in 2.3.
Did the split bring up other races in the code?
(By the way, we don't see this problem in the field. I guess
we will see it only on heavily loaded servers where the clients
are on low-latency links!)
Anyway, thanks a lot for your comments and help.
(Adding the code to do an established hash queue lookup
in backlog processing fixes the problem!)
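To make the shape of the fix concrete, here is a rough sketch under stated assumptions. It is not the actual patch. It assumes the race goes like this: a segment is queued on the listening socket's backlog while the connection is still half-open, the handshake ACK then moves the connection to the established queue, and by the time the backlog is processed the half-open lookup misses and an RST goes out. All names (struct listener, promote, backlog_rcv, the verdict values) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EST_HASH_SIZE 16u  /* hypothetical bucket count, power of two */

enum verdict { TO_HALF_OPEN, TO_ESTABLISHED, SEND_RST };

struct conn {
    uint32_t raddr;      /* remote IPv4 address */
    uint16_t rport;      /* remote port */
    struct conn *next;
};

struct listener {
    struct conn *half_open;                /* half-open queue (a plain list here) */
    struct conn *est_hash[EST_HASH_SIZE];  /* established but not yet accepted */
};

static unsigned est_hashfn(uint32_t raddr, uint16_t rport)
{
    return (raddr ^ (raddr >> 16) ^ rport) & (EST_HASH_SIZE - 1);
}

static struct conn *find(struct conn *head, uint32_t raddr, uint16_t rport)
{
    for (; head; head = head->next)
        if (head->raddr == raddr && head->rport == rport)
            return head;
    return NULL;
}

/* Handshake completed: unlink from half-open, add to the established hash. */
static void promote(struct listener *l, struct conn *c)
{
    struct conn **pp;
    unsigned i;

    assert(l != NULL && c != NULL);
    for (pp = &l->half_open; *pp; pp = &(*pp)->next) {
        if (*pp == c) {
            *pp = c->next;
            break;
        }
    }
    i = est_hashfn(c->raddr, c->rport);
    c->next = l->est_hash[i];
    l->est_hash[i] = c;
}

/* Backlog processing for a segment that was queued on the listener. */
static enum verdict backlog_rcv(struct listener *l, uint32_t raddr, uint16_t rport)
{
    if (find(l->half_open, raddr, rport))
        return TO_HALF_OPEN;
    /* The added step: the connection may have been promoted while the
     * segment sat in the backlog, so check the established hash too. */
    if (find(l->est_hash[est_hashfn(raddr, rport)], raddr, rport))
        return TO_ESTABLISHED;
    return SEND_RST;  /* genuinely unknown peer */
}
```

Without the second lookup, a segment for a just-promoted connection falls through to SEND_RST, which would match the spurious resets we were seeing.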
Rajeev