Re: [PATCH] Define hash_mem in lib/hash.c to apply hash_long to an arbitrary piece of memory.

Neil Brown (neilb@cse.unsw.edu.au)
Tue, 7 Jan 2003 17:07:06 +1100


On Monday January 6, aaronl@vitelus.com wrote:
> On Tue, Jan 07, 2003 at 04:03:28PM +1100, Neil Brown wrote:
> > I did a little testing and found that on a list of 2 million
> > basenames from a recent backup index (800,000 unique):
> >
> > hash_mem (as included here) is noticeably faster than HASH_HALF_MD4 or
> > HASH_TEA:
> >
> > hash_mem: 10 seconds
> > DX_HASH_HALF_MD4: 14 seconds
> > DX_HASH_TEA: 15.2 seconds
>
> I'm curious how the hash at
> http://www.burtleburtle.net/bob/hash/doobs.html would fare. He has a
> 64-bit version at
> http://www.burtleburtle.net/bob/c/lookup8.c.

Performing the same tests (producing 8-bit hashes from 800,000
filenames):

Speed is 10 seconds, comparable to hash_mem.

The normalised standard deviation of frequencies is 0.0171039,
which is in the same ball park as the hashes ext3 uses
(they gave 0.0169 and 0.0182; hash_mem gave 0.02255).

So (on this set of values at least) it does seem to be a better hash
function than hash_mem.

I might look more closely at it.

Thanks,
NeilBrown