Re: [RFC][PATCH] Faster generic_fls

Daniel Phillips (dphillips@sistina.com)
Fri, 2 May 2003 02:43:15 +0200


On Friday 02 May 2003 02:10, Willy Tarreau wrote:
> At first, I thought you had coded an FFS instead of an FLS. But I realized
> it's valid, and is fast because one half of the numbers tested will not even
> take one iteration.

Aha, and duh. At 1 million iterations, my binary search clobbers the shift
version by a factor of four. At 2**31 iterations, my version also wins, by
20%.

I should note that the iterations parameter to my benchmark program is deeply
flawed, since it changes the nature of the input set, not just the number of
iterations. Fortunately, all tests I've done have been with 2**32
iterations, giving a perfectly uniform distribution of input values.

For uniformly distributed inputs the simple shift loop does well, but for
calculating floor(log2(x)) it's much worse than the fancier alternatives. I
suspect typical usage leans more to the latter than the former.

Regards,

Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/