Re: R: Kernel bug handling TCP_RTO_MAX?

Matti Aarnio (matti.aarnio@zmailer.org)
Fri, 13 Dec 2002 05:39:28 +0200


On Thu, Dec 12, 2002 at 06:26:45PM -0800, Nivedita Singhvi wrote:
> Alan Cox wrote:
> > The default is too short ?
>
> Short?? :). On the contrary...
>
> here's what it would roughly look like:
>
> assuming HZ = 100 (2.4)
>
> tcp_retries2 = 15 (default) /* The # of retransmits */
>
> TCP_RTO_MAX = 120*HZ = 120 seconds = 120000ms
> TCP_RTO_MAX2 = 6*HZ = 6 seconds = 6000 ms /* modified value */
>
> TCP_RTO_MIN = HZ/5 = 200ms
>
> Assuming you are on a local lan, your round trip
> times are going to be much less than 200 ms, and
> so using the TCP_RTO_MIN of 200ms ("The algorithm
> ensures that the rto cant go below that").

The RTO steps in only when there is a need to RETRANSMIT.
For that reason, it makes no sense to place its start
any shorter.

> At each retransmit, TCP backs off exponentially:
>
> Retransmission # Default rto (ms) With TCP_RTO_MAX(2) (ms)
> 1 200 200
...
> 14 120000 6000
> 15 120000 6000
>
> Total time = 804.6 seconds 66.2 seconds
> 13.4 minutes
>
> So the minimum total time to time out a tcp connection
> (barring application close) would be ~13 minutes in the
> default case and 66 seconds with a modified TCP_RTO_MAX
> of 6*HZ.

You can have this by doing carefull non-blocking socket
coding, and protocol traffic monitoring along with
protocol level keepalive ping-pong packets to have
something flying around (like NJE ping-pong, not
that every IBM person knows what that is/was..)

> I can see the argument for lowering both, the TCP_RTO_MAX
> and the TCP_RTO_MIN default values.

I don't.

> I just did a bunch of testing over satellite, and round trip
> times were of the order of 850ms ~ 4000ms.
>
> The max retransmission timeout of 120 seconds is two orders of
> magnitude larger than really the slowest round trip times
> probably experienced on this planet..(Are we trying to make this
> work to the moon and back? Surely NASA has its own code??)

We try not to kill overloaded network routers while they
are trying to compensate some line breakage and doing
large-scale network topology re-routing.

> Particularly since we also retransmit 15 times, cant we conclude
> "Its dead, Jim" earlier??

No. I have had LAN spanning-tree flaps taking 60 seconds
(actually a bit over 30 seconds), and years ago Linux's
TCP code timed out in that. It was most annoying to
use some remote system thru such a network...

> 200ms is for the minimum retransmission timeout is roughly a
> thousand times, if not more, the round trip time on a
> fast lan. Since the algorithm is adaptive (a function of the
> measured round trip times), what would be the negative
> repercussions of lowering this?

When things _fail_ in the lan, what would be sensible value ?
How long will such abnormality last ?

In overload, resending quickly won't help a bit, just raise
the backoff (and prolong overload.)

Loosing a packet sometimes, and that way needing to retransmit
is the gray area I can't define quickly. If it is rare, it
really does not matter. If it happens often, there could be
so serious trouble that having quicker retransmit will only
aggreviate the trouble more.

> It may not be a good idea to make either tunable, but what about
> the default init rto value, TCP_TIMEOUT_INIT, since that would allow a
> starting point of something close to a suitable value?
>
> The problem with all of the above is that the TCP engine is
> global and undifferentiated, and tuning for at least these parameters
> is the same regardless of the interface or route or environment..

You are looking for "STP" perhaps ?
It has a feature of waking all streams retransmits, in between
particular machines, when at least one STP frame travels in between
the hosts.

I can't find it now from my RFC collection. Odd at that..
Neither as a draft. has it been abandoned ?

> Yes, we should and want to meet the standards for the internet, and
> behave in a network friendly fashion. But all networks != internet.
>
> I'm thinking for eg of a dedicated fast gigabit or better connection
> between a tier 2 webserver and a backend database, for example, that
> has every need of performance and few of standards compliance..
>
> It would be wonderful if we could tune TCP on a per-interface or a
> per-route basis (everything public, for a start, considered the
> internet, and non-routable networks (10, etc), could be configured
> suitably for its environment. (TCP over private LAN - rfc?). Trusting
> users would be a big issue..
>
> Any thoughts? How stupid is this? Old hat??

More and more of STP ..

> thanks,
> Nivedita

/Matti Aarnio
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/