Aha, and duh. At 1 million iterations, my binary search clobbers the shift
version by a factor of four. At 2**31 iterations, my version also wins, by
20%.
I should note that the iterations parameter to my benchmark program is deeply
flawed, since it changes the nature of the input set, not just the number of
iterations. Fortunately, all tests I've done have been with 2**32
iterations, giving a perfectly uniform distribution of input values.
For uniformly distributed inputs the simple shift loop does well, but for
calculating floor(log2(x)) it's much worse than the fancier alternatives. I
suspect typical usage leans more to the latter than the former.
Regards,
Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/