Re: [rfc] Near-constant time directory index for Ext2

Daniel Phillips (phillips@innominate.de)
Tue, 20 Feb 2001 22:41:56 +0100


On Tue, 20 Feb 2001, Linus Torvalds wrote:
> In article <01022020011905.18944@gimli>,
> Daniel Phillips <phillips@innominate.de> wrote:
> >Earlier this month a runaway installation script decided to mail all its
> >problems to root. After a couple of hours the script aborted, having
> >created 65535 entries in Postfix's maildrop directory. Removing those
> >files took an awfully long time. The problem is that Ext2 does each
> >directory access using a simple, linear search though the entire
> >directory file, resulting in n**2 behaviour to create/delete n files.
> >It's about time we fixed that.
>
> Interesting.
>
> However, if you're playing with the directory structure, please consider
> getting rid of the "struct buffer_head"-centricity, and using the page
> cache instead. The page cache has much nicer caching semantics, and
> looking up data in the page cache is much faster because it never needs
> to do the "virtual->physical" translation.

Oh yes, I was planning on it. I started with the buffers version
for two main reasons version: 1) it's simple and solid and 2) it
provides the basis for a backport to 2.2 - after the 2.4/2.5 version is
complete of course.

> Talk to Al Viro about this - he's already posted patches to move the
> regular ext2 directory tree into the page cache, and they weren't
> applied to 2.4.x only because there was no great feeling of "we _must_
> do this for correctness".
>
> I see that you already considered this issue, but I wanted to bring it
> up again simply because something like this certainly looks like a
> potential candidate for 2.5.x, but I will _refuse_ to add code that
> increases our reliance of "struct buffer_head" as a caching entity. So
> I'd rather see the page cache conversion happen sooner rather than
> later...

You are preaching to the converted.

> Also, just out of interest: if you've already been worrying about
> hashes, what's the verdict on just using the native dentry hash value
> directly? It has other constraints (_really_ low latency and absolutely
> performance critical to calculate for the common case, which is not
> needing a real lookup at all), but maybe it is good enough? And if not,
> and you have done some statistics on it, I'd love to hear about it ;)

You mean full_name_hash? I will un-static it and try it. I should have
some statistics tomorrow. I have a couple of simple metrics for
measuring the effectiveness of the hash function: the uniformity of
the hash space splitting (which in turn affects the average fullness
of directory leaves) and speed.

Let the hash races begin.

-- 
Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/