Re: [RFC][PATCH] Faster generic_fls

Daniel Phillips (dphillips@sistina.com)
Wed, 30 Apr 2003 15:51:54 +0200


On Wednesday 30 April 2003 09:19, Willy Tarreau wrote:
> Oh, and the new2 function is another variant I have been using by the past,
> but it is slower here.

It's really cute.

> It performed well on older machines with gcc 2.7.2.
> Its main problem may be the code size, due to the 32 bits masks. Your code
> has the advantage of being short and working on small data sets.

My code just looks small. At -O2 it expands to 4 times the size of the
existing fls and 5 times at -O3. The saving grace is that, being faster, it
doesn't have to be inline, saving code size overall.

> [code]
>
> Here are the results...

The overall trend seems to be that my version is faster with newer compilers
and higher optimization levels, and overall, the newer compilers are
producing faster code. The big anomally is where the original version turns
in the best time of the lot, running on PIV, compiled at -O3 with 2.95.3, as
you pointed out.

Well, it raises interesting questions, but I'm not entirely sure what they are
;-)

Regards,

Daniel

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/