Re: 3x slower file reading oddity

dean gaudet (dean-list-linux-kernel@arctic.org)
Mon, 17 Jun 2002 19:08:13 -0700 (PDT)


On Mon, 17 Jun 2002, Andreas Dilger wrote:

> On Jun 17, 2002 17:36 -0700, Andrew Morton wrote:
> > You can probably lessen the seek-rate by accessing the files in the correct
> > order. Read all the files from a directory before descending into any of
> > its subdirectories. Can find(1) do that? You should be able to pretty
> > much achieve disk bandwidth this way - it depends on how bad the inter-
> > and intra-file fragmentation has become.
>
> Just FYI - "find -depth" will do that, from find(1):
> -depth Process each directory's contents before the directory itself.

not quite... even with -depth, find processes a directory's contents in
the order the filenames happen to appear on disk, and it will still
recurse into subdirs before it has finished with all the plain files in
the current directory.

what i think andrew was suggesting is to process all of a directory's
non-directory entries before descending into any of its subdirectories.
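
a rough bash sketch of that traversal (untested; it uses glob order
rather than on-disk order, skips dotfiles, and has no guard against
symlink loops):

walk() {
    local f
    # plain files in this directory first...
    for f in "$1"/*; do
        [ -f "$f" ] && printf '%s\n' "$f"
    done
    # ...and only then descend into subdirectories
    for f in "$1"/*; do
        [ -d "$f" ] && walk "$f"
    done
}
walk .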

i'm working with something like this:

# prefix each pathname with its component count, reverse-sort
# numerically on that count (then lexically on the pathname),
# and finally strip the count back off:
find -type f -print \
| perl -ne '@c = split(/\//); print $#c . " $_";' \
| sort -k 1,1nr -k 2,2 \
| perl -pe 's#^\d+ ##;' \
> filelist

(deliberately not using perl's sort because i want the thing to scale,
of course... sort(1) can do an external merge through temp files rather
than holding the whole list in memory ;)

that reverse sorts by number of path components first, pathname second.
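
on a hypothetical tree, that puts the deepest directories first, with
files from the same directory grouped together:

./a/b/deep1
./a/b/deep2
./a/mid1
./a/mid2
./top1
./top2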

it seems to be worth another 10% or so of improvement on the
single-reader test.
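
for reference, a single-reader pass over the list can be as simple as
this (not necessarily the exact harness used here):

time sh -c 'while IFS= read -r f; do cat "$f"; done < filelist > /dev/null'

which streams every file once, in list order, and throws the data away.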

i figured handling the leaves first was best... it's hard to tell
whether there's any difference if i remove the "r" and sort
shallowest-first instead. (this is on a live system with other loads
perturbing the results, unfortunately.)

-dean
