Re: Examine route.c! -- two bugs fixed -- patch attached!

Wolfgang Walter (wolfgang.walter@stusta.mhn.de)
Thu, 4 Dec 1997 21:41:38 +0100


On Thu, Dec 04, 1997 at 09:54:59PM +0100, Ingo Molnar wrote:
>
> On Thu, 4 Dec 1997, Alan Cox wrote:
>
> > > fine. One thing i was unable to verify, when we destroy a socket, do we
> > > always 'rt_put()' it's sk->ip_cached_route? [but this should make no
> > > difference as the activity done by the 'nc' script does not create as
> > > many sockets as there are leaks].
> >
> > It might be worth putting in code to check we always do this (make the rt_put
> > of sk->ip_cached_route an inline and NULL the pointer). Also to trap
> > any cases where we do something like retransmit a tcp frame and perhaps relock
> > the route we already had ?
>
> i've checked the latter one (via codereading that is), and it looks like
> (with Philip's patch) there is no ipv4/*.c code that has such problems. In
> all cases where sk->rt_cached is overwritten, we first drop the old route
> properly. Initialization is fine too. And we never seem to look up a temp
> route without dropping it afterwards.
>
> What i havent yet checked is:
>
> - wether there is race window in the route cache itself (all the
> complex logic with the route backlog bh and stuff, i have looked
> into it, but no full coverage yet).
>
> i've sent this small route.c patch to Daniel that made every new route
> RTF_NOTCACHED:
>
> #if 0
> if (dev != NULL)
> #endif
> rth->rt_flags |= RTF_NOTCACHED;
> #if 0
> else
> rt_cache_add(hash, rth);
> #endif
>
> but it made no difference, still leak. Does this mean the route cache is
> innocent?
>

Hmm, all code calling ip_rt_put will use the inline function as net/route.h is
included and that version does not call rt_free() for RTF_NOTCACHED cases.
Maybe it is worth just to see what happens if the inline function in net/route.h
is changed as the version in route.c by adding

/* If this rtable entry is not in the cache, we'd better free it once the
* refcnt goes to zero, because nobody else will... */
if ( rt && (rt->rt_flags & RTF_NOTCACHED) && (!rt->rt_refcnt) )
rt_free(rt);

I just wonder why that has been added from 2.0.30 to 2.0.31.

I'm not very experienced with this code, but I think:

rt_kick_free_queue is the bh and frees the elemensts of the
rt_free_queue list

Only rt_free adds elements to that list

Now rt_free ist called in ip_rt_check_expire, but only for those
which are cached. The same holds for rt_cache_flush and
rt_garbage_collect_1.

The call in rt_cache_add only is for error-handling and this is true
for ip_rt_slow_route, too.

So I assume who ever supplied that for 2.0.32 was right, he just
missed the inline function.

What is interesting is that from 2.0.30 to 2.0.31 ip_rt_slow_route() has been
changed:

if (ip_rt_lock == 1)
- rt_cache_add(hash, rth);
+ {
+ /* Don't add this to the rt_cache if a device was specified,
+ * because we might have skipped better routes which didn't
+ * point at the right device. */
+ if (dev != NULL)
+ rth->rt_flags |= RTF_NOTCACHED;
+ else
+ rt_cache_add(hash, rth);
+ }

In 2.0.30 there seems to be no uncached routes... Therefor I assume that both
changes are related and made by the same person. As I don't have the
pre-patches for 2.0.31 any more, I cannot say when this has been added.

If this is a leak it should happen if ip_rt_slow_route is called with the third
argument != NULL. This is the case if ip_rt_route is called with the
third argument != 0 and the entry is not found in the cache. ip_rt_route
is again called from ip_check_route with third argument != NULL if it is
itself called with third argument != NULL.

If we assume not to use sockets bound to a certain device, I think this is only
done by ip_build_header in ip_output.c if a device is supplied as argument.
And this seems to be done under certain circumstances in tcp_send_reset (if
dev = &loopback_dev;
tmp = prot->build_header(buff,sk->saddr, sk->daddr, &dev,
IPPROTO_TCP, sk->opt,
sizeof(struct tcphdr),sk->ip_tos,sk->ip_ttl,&sk->ip_route_cache);
gets executed) and maybe tcp_write_wakeup, further maybe in do_tcp_sendmsg.

Seems to me that RTF_NOTCACHED-feature was added for binding a device to a
socket and has some sideeffects on tcp and cacheing routes. But I don't know
if this can lead to a such memory loss.

By the way, ee are running here a lot of routers with 2.0.32, some of them
with firewall, all of them are uptime now for almost 15-16 days.
One of this router almost constantly forwards about 500 MBytes/s.
What is specific here is that all routes are static and have not been changed
since booting.

> [i might have missed something]
>
>
> Wolfgang Walter's comment, the thing about route.h <=> route.c, it does
> look suspicious, since we do not make that module/nomodule distinction in
> ip_rt_route(). (but i guess no current module uses this function so this
> isnt a problem?)
>
> -- mingo
>

Wolfgang Walter