> > > the reference count is exactly one. This is correct in most cases.
> > > If Philip found out all possible drops this may be a reason.
> >
> > I'm beginning to suspect we have a kernel/interrupt context problem here and
> > the reference counters aren't safe
>
> Scratch that it seems.
I've checked ipv4/*.c for places where we accidentally drop or overwrite a
route, but apart from Daniel's fixes everything seemed to be fine. One
thing I was unable to verify: when we destroy a socket, do we always
'rt_put()' its sk->ip_cached_route? [This should make no difference,
though, as the activity done by the 'nc' script does not create as many
sockets as there are leaks.]
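
To make the socket question concrete, here is a minimal user-space sketch of
the invariant I mean. It is not the kernel code: 'struct route', 'struct
sock_model', route_get(), route_put(), sock_cache_route() and sock_destroy()
are made-up names for illustration only. The point is just that every
reference taken for a socket's cached route has to be dropped again on
destroy, otherwise the count never gets back to its base value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model, NOT the kernel's real structures. */
struct route {
    int refcnt;                 /* number of holders of this route */
};

struct sock_model {
    struct route *cached_route; /* stands in for sk->ip_cached_route */
};

/* Take one reference on a route. */
static void route_get(struct route *rt)
{
    rt->refcnt++;
}

/* Drop one reference; a NULL route is silently ignored. */
static void route_put(struct route *rt)
{
    if (rt == NULL)
        return;
    assert(rt->refcnt > 0);     /* an underflow would be a bug of its own */
    rt->refcnt--;
}

/* Cache a route in the socket, releasing whatever was cached before. */
static void sock_cache_route(struct sock_model *sk, struct route *rt)
{
    route_get(rt);              /* get first, so re-caching the same route is safe */
    route_put(sk->cached_route);
    sk->cached_route = rt;
}

/* Destroy path: forgetting this put is exactly the leak in question. */
static void sock_destroy(struct sock_model *sk)
{
    route_put(sk->cached_route);
    sk->cached_route = NULL;
}
```

If sock_destroy() skipped the route_put(), each destroyed socket would leave
one stale reference behind, so the number of leaks could never exceed the
number of sockets created.
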
What I suspect is a nontrivial race somewhere in the tcp+ip send/receive
path, especially when adding cached routes. Since there are no crashes, it
must be a 'clean' race, i.e. some list element gets lost when adding a new
element to the route hash, or similar?
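
As an illustration of the kind of 'clean' race I have in mind, here is a
small user-space sketch that replays one bad interleaving deterministically.
It is only a model under my own assumptions: 'struct rt_entry',
insert_prepare(), insert_commit() and the two replay functions are
hypothetical and do not correspond to the actual route hash code. An
unlocked head insert is split into its two steps, and an "interrupt" runs
its whole insert between them.

```c
#include <stddef.h>

/* Hypothetical hash-chain element, NOT the kernel's route structure. */
struct rt_entry {
    struct rt_entry *next;
    int id;
};

/* Step 1 of an unlocked head insert: link the new entry to the current head. */
static void insert_prepare(struct rt_entry **head, struct rt_entry *e)
{
    e->next = *head;
}

/* Step 2: publish the new entry as the head of the chain. */
static void insert_commit(struct rt_entry **head, struct rt_entry *e)
{
    *head = e;
}

/* Count the entries still reachable from the head. */
static int chain_length(const struct rt_entry *head)
{
    int n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}

/* Bad interleaving: an interrupt inserts 'b' between the two steps of 'a'. */
static int replay_lost_insert(void)
{
    struct rt_entry *head = NULL;
    struct rt_entry a = { NULL, 1 }, b = { NULL, 2 };

    insert_prepare(&head, &a);  /* process context reads the old head */
    insert_prepare(&head, &b);  /* interrupt: complete insert of 'b' */
    insert_commit(&head, &b);
    insert_commit(&head, &a);   /* process context overwrites head, 'b' is lost */
    return chain_length(head);
}

/* Same two inserts, but each one runs to completion before the next. */
static int replay_serialized_insert(void)
{
    struct rt_entry *head = NULL;
    struct rt_entry a = { NULL, 1 }, b = { NULL, 2 };

    insert_prepare(&head, &a);
    insert_commit(&head, &a);
    insert_prepare(&head, &b);
    insert_commit(&head, &b);
    return chain_length(head);
}
```

In the first replay only one of the two entries stays reachable, and nothing
crashes: the chain is still well formed, the lost entry is simply never found
again, so its reference is never dropped. The serialized replay keeps both.
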
-- mingo