You're comparing apples to oranges.
Clearly you're not going to make _one_ load to get fls, since having a
4GB lookup array for a 32-bit fls would be "somewhat" wasteful.
So the lookup table would probably look up just the last 8 bits, after
you've already narrowed the word down to the byte that holds the highest
set bit.
So the lookup table version is several instructions in itself, doing about
half of what the calculating version needs to do _anyway_. Including those
data-dependent branches.
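
To make the comparison concrete, here is a rough sketch of the two variants.
It is an illustration only, not the code from any particular tree: the names
fls8_table, fls8_table_init, fls_table and fls_calc are made up for this
example, and the table is assumed to be 256 one-byte entries (versus the
2^32 one-byte entries a single-load table would need).

```c
#include <stdint.h>

/* Hypothetical 256-entry table: fls8_table[b] is the 1-based position of
 * the highest set bit in byte b, and 0 when b == 0. */
static uint8_t fls8_table[256];

/* Fill the table: the top bit of i sits one position above the top bit
 * of i >> 1. */
static void fls8_table_init(void)
{
	for (int i = 1; i < 256; i++)
		fls8_table[i] = (uint8_t)(fls8_table[i >> 1] + 1);
}

/* Table version: still has to branch on the data to find the right byte
 * before the one load it saves the work for. */
static int fls_table(uint32_t x)
{
	int base = 0;

	if (x & 0xffff0000u) { x >>= 16; base += 16; }
	if (x & 0x0000ff00u) { x >>= 8;  base += 8;  }
	return base + fls8_table[x];	/* x < 256 here; gives 0 for x == 0 */
}

/* Calculating version: the same first two steps, then three more
 * compare-and-shift steps instead of the load. */
static int fls_calc(uint32_t x)
{
	int r = 32;

	if (!x)
		return 0;
	if (!(x & 0xffff0000u)) { x <<= 16; r -= 16; }
	if (!(x & 0xff000000u)) { x <<= 8;  r -= 8;  }
	if (!(x & 0xf0000000u)) { x <<= 4;  r -= 4;  }
	if (!(x & 0xc0000000u)) { x <<= 2;  r -= 2;  }
	if (!(x & 0x80000000u)) { r -= 1; }
	return r;
}
```

In this sketch both versions share the 16-bit and 8-bit narrowing steps, so
the table only replaces the last three tests, and it pays for that with a
memory load that may miss in the cache.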
Linus