Re: [rfc] Near-constant time directory index for Ext2

Martin Mares (mj@suse.cz)
Wed, 21 Feb 2001 23:50:08 +0100


Hello!

> You're right. However, for each hash table operation to be O(1) the size
> of the hash table must be >> n.

If we are talking about average case complexity (which is the only possibility
with fixed hash function and arbitrary input keys), it suffices to have
hash table size >= c*n for some constant c which gives O(1/c) cost of
all operations.

> I suggested at one point to use B-trees with a hash value as the key.
> B-trees are extremely efficient when used on a small constant-size key.

Although from asymptotic complexity standpoint hashing is much better
than B-trees, I'm not sure at all what will give the best performance for
reasonable directory sizes. Maybe the B-trees are really the better
alternative as they are updated dynamically and the costs of successive
operations are similar as opposed to hashing which is occassionally very
slow due to rehashing unless you try to rehash on-line, but I don't
know any algorithm for on-line rehashing with both inserts and deletes
which wouldn't be awfully complex and slow (speaking of multiplicative
constants, of course -- it's still O(1) per operation, but "the big Oh
is really big there").

Have a nice fortnight

-- 
Martin `MJ' Mares <mj@ucw.cz> <mj@suse.cz> http://atrey.karlin.mff.cuni.cz/~mj/
"#define QUESTION ((bb) || !(bb))"  -- Shakespeare
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/