Re: Horrible drive performance under concurrent i/o jobs (dlh problem?)

Andrew Morton (akpm@digeo.com)
Wed, 18 Dec 2002 15:46:57 -0800


Torben Frey wrote:
>
> Hi Andrew, hi Con,
>
> > Here's a diff against base 2.4.20. It may be a little out of date
> > wrt Andrea's latest but it should tell us if we're looking in the
> > right place.
> Ok, I did not run the complete 2.4.20aa1 kernel yet since I am not sure
> if it is intended to be used, but I applied your patch, Andrew (thanks
> for mailing it). It still does not fix the problem. One job doing much
> I/O starts with about 80% CPU but then drops down to about 30% in the
> first 40 seconds. Load goes from 0.00 to 2.4 within that time.
>
> And I can see bdflush and my process marked with "D" in the process list.
>
> Catting the device to /dev/null only made it worse :-(
>
> Creating a 1GB file using dd takes about 1 minute compared to 16 seconds
> without other jobs running.

err, now hang on.

I thought you said that this simple dd sometimes takes 14 seconds,
and sometimes takes 4 minutes.

Please describe _exactly_ what activity is happening, and against
what disks when this problem exhibits. The "other jobs".

> Do you think it could be a ReiserFS problem on a RAID?

Doubtful. As far as reiserfs (and the block layer) is concerned,
your raid array is just a single disk (is this correct??)

> Do you know of anything else I could try?

Try a different filesystem

Try a dd to the blockdevice itself (or cat /dev/zero > /dev/sda1)

Run `vmstat 1' and send us the output which corresponds to
the poor throughput.

Try a different RAID mode.

Pull some disks out.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/