2.2.19 + reiserfs 3.5.32 nfsd wait_on_buffer/down_failed

Michael Stiller (michael@ping.de)
Tue, 8 May 2001 16:42:43 +0200


Hi,

We run an NFS server on kernel 2.2.19 + ReiserFS version 3.5.32 on a
P3 550 machine. The disk subsystem is a GDT7518RN using 4 UW disks as a
RAID 5 device. Since upgrading from 2.2.17 + reiserfs to 2.2.19 we see
many more problems (far more than with 2.2.17) with our NFS clients,
about 12 of them (all Linux). The network is 100 Mbit full duplex, switched.
I do not think this is network related, because ping -f does not show any
packet loss.

Under only moderate I/O on the exported fs,
one nfsd thread seems to be waiting for the disk:

621 root 1 0 0 0 wait_on_b DW 6.2 0.0 1:49 nfsd

and the other threads are waiting in down_fail:

610 root 0 0 0 0 down_fail DW 0.0 0.0 1:52 nfsd
611 root 0 0 0 0 down_fail DW 0.0 0.0 1:40 nfsd
612 root 0 0 0 0 down_fail DW 0.0 0.0 1:41 nfsd
613 root 0 0 0 0 down_fail DW 0.0 0.0 1:48 nfsd
614 root 0 0 0 0 down_fail DW 0.0 0.0 1:45 nfsd
615 root 0 0 0 0 down_fail DW 0.0 0.0 1:43 nfsd
616 root 0 0 0 0 down_fail DW 0.0 0.0 1:50 nfsd
617 root 0 0 0 0 down_fail DW 0.0 0.0 1:42 nfsd
618 root 0 0 0 0 down_fail DW 0.0 0.0 1:44 nfsd
619 root 0 0 0 0 down_fail DW 0.0 0.0 1:42 nfsd
620 root 0 0 0 0 down_fail DW 0.0 0.0 1:47 nfsd
622 root 0 0 0 0 down_fail DW 0.0 0.0 1:47 nfsd
623 root 0 0 0 0 down_fail DW 0.0 0.0 1:43 nfsd
624 root 0 0 0 0 down_fail DW 0.0 0.0 1:48 nfsd
609 root 0 0 0 0 down_fail DW 0.0 0.0 1:50 nfsd
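
A listing like the one above can be summarised per wait channel with a few lines of Python. This is only a sketch: the sample rows are copied from the listing, the column order (PID USER PRI NI SIZE RSS WCHAN STAT %CPU %MEM TIME COMMAND) is an assumption based on what the rows look like, and the function name `tally_wait_channels` is made up for illustration.

```python
from collections import Counter

# Three rows copied from the listing above. The assumed column order is
# PID USER PRI NI SIZE RSS WCHAN STAT %CPU %MEM TIME COMMAND.
SAMPLE = """\
621 root 1 0 0 0 wait_on_b DW 6.2 0.0 1:49 nfsd
610 root 0 0 0 0 down_fail DW 0.0 0.0 1:52 nfsd
611 root 0 0 0 0 down_fail DW 0.0 0.0 1:40 nfsd
"""


def tally_wait_channels(listing, command="nfsd"):
    """Count processes in uninterruptible sleep (state D) per wait channel."""
    counts = Counter()
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) != 12:
            continue  # skip blank or malformed lines
        wchan, stat, cmd = fields[6], fields[7], fields[11]
        if cmd == command and stat.startswith("D"):
            counts[wchan] += 1
    return counts


print(tally_wait_channels(SAMPLE))
```

Fed the full listing, this should make the pattern explicit: one thread in wait_on_b and all the others in down_fail.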

During this event:

- If I check the disk I/O with e.g. vmstat 1, the machine is doing about 200 bi
per second, which is not much, I guess.
- The client machines hang, as expected:

nfs: server foo is not responding
nfs: server foo still not responding
nfs: server foo OK
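
To log the bi figure from the vmstat check above over a longer period, the column can be picked out by name instead of by position, since the vmstat column layout differs between procps versions. A minimal sketch; the function name `bi_values` is hypothetical, and it assumes the output contains a header row with a column literally named "bi".

```python
def bi_values(vmstat_text):
    """Return the 'bi' column of vmstat output as a list of ints."""
    header = None
    values = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        if "bi" in fields:
            header = fields  # column-name row; vmstat repeats it periodically
            continue
        if header is None or len(fields) != len(header):
            continue  # group-title row, blank line, or malformed line
        try:
            values.append(int(fields[header.index("bi")]))
        except ValueError:
            continue  # non-numeric field, ignore the row
    return values
```

Captured output from a run such as `vmstat 1 60` can be passed in as one string; the returned list can then be compared between a stall and normal operation.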

Our plan is to revert to 2.2.17, because its behaviour was much better.

How can I debug this? Is there any tuning I can do?
Should I revert to an older kernel?
Are there any patches for this problem? Does anyone have the same or
a related problem?

Any pointers would be useful.

TIA and cheers,

-Michael

-- 
In a world where an admin is rendered useless when the ball in his mouse
has been taken out, it's good to know that I know UNIX.