Re: 3x slower file reading oddity

Andrew Morton (akpm@zip.com.au)
Mon, 17 Jun 2002 17:36:12 -0700


dean gaudet wrote:
>
> ...
> > You'll get best throughput with a single read thread.
>
> what if you have a disk array with lots of spindles? it seems at some
> point that you need to give the array or some lower level driver a lot of
> i/os to choose from so that it can get better parallelism out of the
> hardware.

mm. For that particular test, you'd get nice speedups from striping
the blockgroups across disks, so each `cat' is probably talking to
a different disk. I don't think I've seen anything like that proposed
though.

But regardless of the disk topology, the sanest way to get good IO
scheduling is to throw a lot of requests at the block layer. That's
simple for writes. But for reads, it's harder.

You could fork one `cat' per file ;) (Not so silly, really. But if
you took this approach, you'd need "many" more threads than blockgroups).

Or teach `cat' to perform asynchronous (aio) reads. You'd need async
opens, too. But generally we get a good cache hit rate against the
data which is needed to open a small file.

hmm. What else? Physical readahead - read metadata into the block
device's pagecache and flip pages from there into directories and
files on-demand. Fat chance of that happening.

Or change ext2/3 to not place directories in different block groups
at all. That's super-effective, but does cause somewhat worse long-term
fragmentation.

You can probably lessen the seek-rate by accessing the files in the correct
order. Read all the files from a directory before descending into any of
its subdirectories. Can find(1) do that? You should be able to pretty
much achieve disk bandwidth this way - it depends on how bad the inter-
and intra-file fragmentation has become.

-
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/