IPV4 socket layer, was: nfs problem: aix-server --- linux 2.4.15pre5 client

Birger Lammering (b.lammering@science-computing.de)
Mon, 19 Nov 2001 11:20:23 +0100


Hi,

there seems to be a problem with (at least) Linux 2.4.13, 2.4.15pre3
and network connections to AIX.

Trond thinks, it's not in the nfs3 bit and I found the problem with
these driver/nic combinations:
1. e100 driver (from Intel) on a EtherExpress Pro 100
2. kernel-3c59x on a 3Com 905C
3. recent Donald Becker-3c59x on a 3Com 905C.

on 10Mbit/HalfDuplex and 100MBit/FullDuplex

So it seems to be somewhere in between (IPV4 socket layer???).

Here is what happens:

When copying a file of >900kb into a nfs3-exported Directory of an AIX
nfs3-Server we get on the second attempt:

nfs: server camc1083 not responding, still trying

and after a while:

nfs: server camc1083 OK

The tcpdump during a copy looks like this:
tcpdump on AIX (caes04):
13:47:28.337776317 truncated-ip - 18 bytes missing!capc25.muc.796 > caes04.muc.shilp: P 2059179904:2059180060(156) ack 4022052897 win 17520 (DF)
13:47:28.337860266 caes04.muc.shilp > capc25.muc.796: P 1:121(120) ack 156 win 60032
13:47:28.343619224 capc25.muc.796 > caes04.muc.shilp: . ack 121 win 17520 (DF)
13:47:28.344042473 truncated-ip - 50 bytes missing!capc25.muc.796 > caes04.muc.shilp: P 156:344(188) ack 121 win 17520 (DF)
13:47:28.357398139 truncated-ip - 138 bytes missing!caes04.muc.shilp > capc25.muc.796: P 121:397(276) ack 344 win 60032
13:47:28.364496982 truncated-ip - 1322 bytes missing!capc25.muc.796 > caes04.muc.shilp: . 344:1804(1460) ack 397 win 17520 (DF)
13:47:28.364872917 truncated-ip - 1322 bytes missing!capc25.muc.796 > caes04.muc.shilp: . 1804:3264(1460) ack 397 win 17520 (DF)
13:47:28.365142046 truncated-ip - 1322 bytes missing!capc25.muc.796 > caes04.muc.shilp: . 3264:4724(1460) ack 397 win 17520 (DF)
13:47:28.482136300 caes04.muc.shilp > capc25.muc.796: . ack 4724 win 55652
13:47:28.489517784 truncated-ip - 1322 bytes missing!capc25.muc.796 > caes04.muc.shilp: . 4724:6184(1460) ack 397 win 17520 (DF)
13:47:28.490018916 truncated-ip - 1322 bytes missing!capc25.muc.796 > caes04.muc.shilp: . 6184:7644(1460) ack 397 win 17520 (DF)
13:47:28.490313868 truncated-ip - 1322 bytes missing!capc25.muc.796 > caes04.muc.shilp: . 7644:9104(1460) ack 397 win 17520 (DF)
13:47:28.490566573 truncated-ip - 1322 bytes missing!capc25.muc.796 > caes04.muc.shilp: . 9104:10564(1460) ack 397 win 17520 (DF)
13:47:28.685246999 caes04.muc.shilp > capc25.muc.796: . ack 2059190468 win 49812

and on Linux (capc25):
13:47:28.324042 > capc25.muc.796 > caes04.muc.nfs: P 2059179904:2059180060(156) ack 4022052897 win 17520 (DF)
13:47:28.330599 < caes04.muc.nfs > capc25.muc.796: P 1:121(120) ack 156 win 60032
13:47:28.330620 > capc25.muc.796 > caes04.muc.nfs: . 156:156(0) ack 121 win 17520 (DF)
13:47:28.330857 > capc25.muc.796 > caes04.muc.nfs: P 156:344(188) ack 121 win 17520 (DF)
13:47:28.350291 < caes04.muc.nfs > capc25.muc.796: P 121:397(276) ack 344 win 60032
13:47:28.350556 > capc25.muc.796 > caes04.muc.nfs: . 344:1804(1460) ack 397 win 17520 (DF)
13:47:28.350569 > capc25.muc.796 > caes04.muc.nfs: . 1804:3264(1460) ack 397 win 17520 (DF)
13:47:28.350581 > capc25.muc.796 > caes04.muc.nfs: . 3264:4724(1460) ack 397 win 17520 (DF)
13:47:28.475691 < caes04.muc.nfs > capc25.muc.796: . 397:397(0) ack 4724 win 55652
13:47:28.475724 > capc25.muc.796 > caes04.muc.nfs: . 4724:6184(1460) ack 397 win 17520 (DF)
13:47:28.475734 > capc25.muc.796 > caes04.muc.nfs: . 6184:7644(1460) ack 397 win 17520 (DF)
13:47:28.475743 > capc25.muc.796 > caes04.muc.nfs: . 7644:9104(1460) ack 397 win 17520 (DF)
13:47:28.475752 > capc25.muc.796 > caes04.muc.nfs: . 9104:10564(1460) ack 397 win 17520 (DF)
13:47:28.678570 < caes04.muc.nfs > capc25.muc.796: . 397:397(0) ack 10564 win 49812
13:47:28.678604 > capc25.muc.796 > caes04.muc.nfs: . 10564:12024(1460) ack 397 win 17520 (DF)
13:47:28.678614 > capc25.muc.796 > caes04.muc.nfs: . 12024:13484(1460) ack 397 win 17520 (DF)
13:47:28.678623 > capc25.muc.796 > caes04.muc.nfs: . 13484:14944(1460) ack 397 win 17520 (DF)
13:47:28.678632 > capc25.muc.796 > caes04.muc.nfs: . 14944:16404(1460) ack 397 win 17520 (DF)
13:47:28.678642 > capc25.muc.796 > caes04.muc.nfs: . 16404:17864(1460) ack 397 win 17520 (DF)
13:47:28.884628 < caes04.muc.nfs > capc25.muc.796: . 397:397(0) ack 17864 win 42512

I have no clue on how to debug a IPV4 socket layer, so I'm grateful
for any comment on how to provide better information...

Regards,
Birger

Trond Myklebust writes:
> Odd... That dump seems to indicate a problem with TCP in which the
> server is seeing bad packets. That's not an NFS problem per se, as we
> just use the standard socket functions. IOW it's either a bug in the
> IPV4 socket layer or a bug with your network card driver.
>
> Have you tried changing the networking card and/or its driver?
>
> Cheers,
> Trond
>

ps: Trond: the email I sent earlier was written after too little
testing...

--
Dr. Birger Lammering                         
science+computing ag                        fon: 089 356386-15
Geschaeftsstelle Muenchen                   fax: 089 356386-37
Ingolstaedter Str. 22
D-80807 Muenchen         mailto:B.Lammering@science-computing.de

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/