Re: Fw: memory corruption in tcp bind hash buckets on SMP?

kuznet@ms2.inr.ac.ru
Fri, 1 Mar 2002 22:07:03 +0300 (MSK)


Hello!

> > > inverted order of insertion into the lists in tw_hashdance() is probably
> > > cleaner fix than inverted order of removal..
> >
> > Why are they not equivalent? Good question? :-)
>
> they are. with "if (!tb->tb) return;" instead of "if (!tb->pprev) return;".
> silly me was thinking of literal cut-n-paste invertion of the oder :->

However, you were 100% right. They are really not equivalent. :-)

Right solution is to exchange order of insertion. tcp_timewait_kill()
is right and need not changes.

Proof follows:

Main invariant: that guy who inserts/removes socket to/from established
hash table, must make this for binding table. That guy who did not find
socket in established hash, must not touch binding table.

In this case concurrent tcp_timewait_kill()s are happy: socket will
be removed from binding table once and only once, when it is removed
from established hash. The second tcp_timewait_kill() does not find
socket in established table and it must _not_ touch binding table,
despite of socket can be there at the moment.

The mess happens while concurrent remove and insert: insert adds
socket to established table and then to binding table. Racing remove
removes it from established table, but cannot satisfy invariant
because socket is still not in binding table. (This place should
be asserted with a BUG() for future)

The second statement, which completes the proof: removing is possible
only after the socket is added to established hash table (it is evident,
until this time bucket is private to creator).

Alexey
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/