Re: [RFC][PATCH] Faster generic_fls

Aaron Lehmann (aaronl@vitelus.com)
Wed, 30 Apr 2003 01:36:07 -0700


On Wed, Apr 30, 2003 at 09:19:07AM +0200, Willy Tarreau wrote:

> But gcc's optimizer seems to be even worse with each new version. The NIL
> checksum takes 3 times more to complete on gcc3/athlon than gcc2/athlon !!!
> And on some test, the P4 goes faster when the code is optimized for athlon
> than for P4 !

It would be interesting to compare the assembly output of the
compilers to see what's happening. The gcc folks appreciate test cases
and this function seems simple and sensitive enough to make a good one.

> ===== gcc-2.95.3 -march=i686 -O2 / athlon MP/2200 (1.8 GHz) =====

-march=athlon would give gcc a chance to make better scheduling
decisions, which could make the results more sensible.

> ==== gcc 3.2.3 -march=i686 -O3 / Pentium 4 / 2.0 GHz ====

This version of gcc accepts -march=pentium4, which again might make
gcc's behavior less bizarre.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/