Re: [reiserfs-dev] Re: Ext2 directory index: ALS paper and benchmarks

Ragnar Kjørstad (reiserfs@ragnark.vestdata.no)
Sat, 8 Dec 2001 20:16:39 +0100


On Sat, Dec 08, 2001 at 03:15:50AM +0300, Hans Reiser wrote:
> >>no, actually this is a problem for v3. stat data are time of creation
> >>ordered (very roughly speaking)
> >>and directory entries are hash ordered, meaning that ls -l suffers a
> >>major performance penalty.
> >
> >Yes, just remember that file-body ordering also has the same problem.
> >(ref the "find . -type f | xargs cat > /dev/null" test wich I think
> >represent maildir performance pretty closely)
>
> So is this a deeply inherent drawback of offering readdir name orders
> that differ hugely from time of creation order?

It should not be, if:
* If the cache was smart enough to detect that the user is reading all
the files in a directory and started reading in files into memory
ahead of time. It sounds ugly, so I don't know if I like it.
or
* file-bodies were ordered by hash as well.

> I suppose we could use objectids based on the hash of the first
> assigned filename plus a 60 bit global to the FS counter....
>
> but it is too many bits I think. I think that using substantially less
> than the full hash of the name that is used for directory entry keys
> doesn't work.... Comments welcome.

The abould stort the file-bodies in optimal order in the three, but
block-allocation is a seperate issue, right? Even if block-allocation
would take objectids into account it would be nearly impossible to get
the optimal order of the file-bodies, because you don't know the number
of files and their sizes before allocating the blocks for the first
files. (unless you would move files around after they are allocated)

So, I think the _only_ way to get the optimal performance for a growing
directory is to do allocation and ordering by creating-time.

That said, maybe trying to find algorithms that are order sub-sets of
the directories entries in optimal order rather than the whole directory
is perhaps more constructive? Or look at repackers or other utilities to
reorder data in batch instead?

So how is this done in ext2/3 with directory indexes, Daniel? Are there
any papers available, or would I have to decifer the source?

-- 
Ragnar Kjørstad
Big Storage
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/