fsck, raid reconstruction & bad bad 2.4.3

Linas Vepstas (linas@backlot.linas.org)
Sun, 15 Apr 2001 13:18:25 -0500 (CDT)


Hi,
I want to report a trio of raid-related problems. The third one is
very serious, and effectively prevents 2.4.3 from being usable (by me).

First problem: In kernel-2.4.2 and earlier, if the machine is not cleanly
shut down, then upon reboot, RAID reconstruction is automatically started.
(For RAID-1, this more-or-less amounts to copying the entire contents
of one disk partition on one disk to another). The reconstruction
code seems to be clever: it will try to use the full bandwidth when
the system is idle, and it will throttle back when busy. It will
only throttle back so far: it tries to maintain at least a minimum amount
of work going, in order to guarantee forward progress even on a busy system.
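
As a rough illustration of the behaviour described above, here is a toy
model in Python. It is only a sketch of what the throttling appears to do
from the outside, not the md driver's actual code; the function name,
parameter names and rate figures are all made up for the example.

```python
def resync_rate(other_io_kb_s, min_rate_kb_s=100, max_rate_kb_s=100_000):
    """Toy model: bandwidth (KB/s) handed to RAID reconstruction.

    All names and numbers are hypothetical; this is not kernel code.
    """
    # Whatever bandwidth other I/O is not using is offered to the resync...
    spare = max_rate_kb_s - other_io_kb_s
    # ...but the resync never drops below the floor, so it keeps making
    # forward progress even when the system is busy.
    return max(min_rate_kb_s, min(spare, max_rate_kb_s))
```

In this model an idle system gives reconstruction the full rate, and a
saturated one still leaves it the minimum. That floor is the part that
competes with fsck for the disk heads.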

The problem: this dramatically slows fsck after an unclean shut-down.
You can hear the drives machine-gunning. I haven't stop-watch timed it,
but it's on the order of 5x slower to fsck a raid partition when there's
reconstruction going on than when the raid thinks it's clean. This
makes unclean reboots quite painful.

(There is no config file to disable or alter this, and no work-around
that I know of.)

--------
The second problem: parallelizing fsck doesn't realize that different
/dev/md raid volumes are on the same physical disks, and thus tries
to parallelize them, again slowing things down. There is a work-around:
modify /etc/fstab to set the order of the fscks. However, I doubt the HOWTO
really gets into this; it would be nice to get fsck to 'do the
right thing'.
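
The fstab work-around relies on the sixth field of each /etc/fstab line
(the fsck pass number): filesystems with the same pass number may be
checked in parallel, while different pass numbers are checked one after
another. A minimal sketch of how one might pick those numbers by hand is
below, in Python. The function name, the device layout and the disk names
are hypothetical, chosen only to illustrate the idea; fsck itself does
nothing like this for md volumes.

```python
def assign_pass_numbers(volumes, first_pass=2):
    """Give each volume an fsck pass number so that no two volumes
    sharing a physical disk end up in the same pass.

    `volumes` maps a device name to the set of physical disks under it.
    Pass 1 is conventionally left for the root filesystem.
    """
    passes = {}
    for dev, disks in volumes.items():
        # Pass numbers already used by volumes that share a disk with this one.
        taken = {passes[other] for other in passes if volumes[other] & disks}
        n = first_pass
        while n in taken:
            n += 1
        passes[dev] = n
    return passes


# Hypothetical layout: two RAID-1 volumes on the same pair of disks,
# plus a third volume on a separate pair.
layout = {
    "/dev/md0": {"hda", "hdc"},
    "/dev/md1": {"hda", "hdc"},
    "/dev/md2": {"hde", "hdg"},
}
print(assign_pass_numbers(layout))
```

With this layout, /dev/md0 and /dev/md2 both get pass 2 (different disks,
so they can safely run in parallel) and /dev/md1 gets pass 3, so it waits
until /dev/md0 is done.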

----------

Third problem:

I just tried booting 2.4.3 today (after an unclean shutdown). fsck runs
at a crawl on my RAID-1 volume. It would take all day (literally!)
to fsck. The disk-drive activity light flashes about once a second,
maybe once every two seconds (with a corresponding click from the
drive).

On 2.4.2 kernels, the disk activity light is constantly on... and the
fsck proceeds apace.

Whatever it is that changed in 2.4.3, it makes unclean reboots
impossible ...

--linas
