Re: [RFC][PATCH] Faster generic_fls

Linus Torvalds (torvalds@transmeta.com)
Wed, 30 Apr 2003 07:11:47 -0700 (PDT)


On 30 Apr 2003, Falk Hueffner wrote:
>
> gcc 3.4 will have a __builtin_ctz function which can be used for this.
> It will emit special instructions on CPUs that support it (i386, Alpha
> EV67), and use a lookup table on others, which is very boring, but
> also faster.

Classic mistake. Lookup tables are only faster in benchmarks; in real
life they are almost always slower. You only need to miss in the cache
_once_ on the lookup to lose all the time you won on the previous one
hundred calls.
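
To make the trade-off concrete, here is a rough sketch of the two styles
for a 32-bit "find last set". It is not the code from the patch under
discussion: the names fls_branchy, fls_lookup, fls_table, fls_table_init
and fls_selfcheck are made up for illustration, and the 256-entry table
size is an assumption. The first version touches no data memory at all;
the second is only fast while its 256 bytes stay in the cache.

```c
#include <assert.h>

/* Table-free: binary search over the bits. Returns 0 for x == 0,
 * otherwise the 1-based index of the most significant set bit. */
static int fls_branchy(unsigned int x)
{
    int r = 32;

    if (!x)
        return 0;
    if (!(x & 0xffff0000u)) { x <<= 16; r -= 16; }
    if (!(x & 0xff000000u)) { x <<= 8;  r -= 8;  }
    if (!(x & 0xf0000000u)) { x <<= 4;  r -= 4;  }
    if (!(x & 0xc0000000u)) { x <<= 2;  r -= 2;  }
    if (!(x & 0x80000000u)) { x <<= 1;  r -= 1;  }
    return r;
}

/* Table-based: one byte per possible 8-bit value (hypothetical layout). */
static unsigned char fls_table[256];

/* Fill the table: fls(i) = 1 + fls(i / 2), with fls(0) = 0. */
static void fls_table_init(void)
{
    int i;

    fls_table[0] = 0;
    for (i = 1; i < 256; i++)
        fls_table[i] = 1 + fls_table[i >> 1];
}

/* Pick the highest non-zero byte, then do a single table load. That
 * load is the part that can miss in the cache. */
static int fls_lookup(unsigned int x)
{
    if (x & 0xffff0000u) {
        if (x & 0xff000000u)
            return 24 + fls_table[x >> 24];
        return 16 + fls_table[x >> 16];
    }
    if (x & 0x0000ff00u)
        return 8 + fls_table[x >> 8];
    return fls_table[x];
}

/* Check that both versions agree for every possible top-bit position. */
static void fls_selfcheck(void)
{
    int i;

    for (i = 0; i < 32; i++) {
        unsigned int x = 1u << i;

        assert(fls_branchy(x) == i + 1);
        assert(fls_lookup(x) == i + 1);
        /* Lower bits must not change the answer. */
        assert(fls_lookup(x | (x - 1)) == i + 1);
    }
}
```

fls_table_init() has to run once before fls_lookup() is used. In a
tight benchmark loop the table stays hot and the lookup version can
look good; whether that holds in real code depends on what else is
competing for the cache.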

"Small and simple" is almost always better than the alternatives. I
suspect that's one reason why older versions of gcc often generate code
that actually runs faster than newer versions: the newer versions _look_
like they do a better job, but..

Linus
