Re: Common hash table implementation

Daniel Phillips (phillips@bonn-fries.net)
Sun, 22 Jul 2001 00:57:05 +0200


On Saturday 21 July 2001 00:32, Brian J. Watson wrote:
> Andi Kleen wrote:
> > hash tables like the inode hash are twice as big as needed because
> > each bucket is two pointers instead of one]
>
> I agree. Hash tables such as inode_hashtable and dentry_hashtable are
> half as efficient under stress as they would otherwise be, because
> they use an array of list_heads.

Whoops, yes. The buffer_head pprev strategy gives both the
double linked list and a table vector of single links, at the expense
of a few extra tests.

For a one-size-fits-all hash strategy the question is, double-linked of
singled-linked? The advantage of a double linked list for a hash is
quick, generic deletion. Using the pprev strategy the disadvantage of
double links in the table vector can be eliminated. The main advantage
of the single link is space savings both in the table and in the
structures. The disadvantage is having to do an extra bucket search on
deletion, though this is partially offset by fewer pointer operations
to perform insertion or deletion.

It's hard to say which is fastest. It might turn out that the single
linked strategy is actually faster, it really depends on typical depth
of the buckets. A double-linked list deletion needs to follow two
pointers and set two links, the single-linked list only one of each.
There's a similar difference for insertion. So the extra overhead of
insertion and deletion can be set off against say, three or four
iterations of the bucket search loop. If buckets average six to eight
elements deep, it's a tie.

A more subtle cost of the double-link approach is the slight extra
cache pressure due to the enlarged objects. This cost is incurred
continously as the objects are used, it might very well add up to more
than the cost of the final list traversal for the delete in the
single-linked case.

How about implementing both strategies, using your generic interface to
make them look the same? And then seeing which is fastest in practice.

--
Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/