Re: copy-bit macro

dinon (dinon@csie.ncu.edu.tw)
Tue, 9 Dec 1997 18:18:50 +0800 (CST)


On Mon, 8 Dec 1997, Colin Plumb wrote:

> I notice that a lot of code contains long lines of
>
> if (input & INPUT_FLAG_FOO)
>     output |= OUTPUT_FLAG_FOO;
> if (input & INPUT_FLAG_BAR)
>     output |= OUTPUT_FLAG_BAR;
> if (input & INPUT_FLAG_BAZ)
>     output |= OUTPUT_FLAG_BAZ;
> etc.
>
> The GCC output from this on the x86 is full of (slow) jumps.
> Also, jumps inhibit optimization.
> More efficient code, which can also be optimized around more,
> is produced by
>
> output |= ((input / INPUT_FLAG_FOO) & 1) * OUTPUT_FLAG_FOO;
> output |= ((input / INPUT_FLAG_BAR) & 1) * OUTPUT_FLAG_BAR;
> output |= ((input / INPUT_FLAG_BAZ) & 1) * OUTPUT_FLAG_BAZ;
>
> Without having to define bit-shift amounts, GCC nicely optimizes that
> into, e.g.
> movl %edx,%eax
> sall $17,%eax
> andl $16777216,%eax
> orl %eax,%ecx
>

Which options did you pass to gcc? Even with -O6, I can't reproduce the
output above.
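
For anyone who wants to try reproducing it, here is a minimal sketch of the
divide/mask/multiply form wrapped in a macro. COPY_BIT is an assumed name,
and the flag values are hypothetical, picked only so that the input bit sits
17 positions below the output bit, as in the quoted assembly.

```c
#include <assert.h>

/* Hypothetical flag values: input bit 7, output bit 24 (17 positions apart). */
#define INPUT_FLAG_FOO  0x00000080u
#define OUTPUT_FLAG_FOO 0x01000000u

/* COPY_BIT is an assumed name, not an existing kernel macro.
 * Both flags must be single-bit constants and `in` must be unsigned,
 * so that a compiler can reduce the division and multiplication to shifts. */
#define COPY_BIT(in, inflag, outflag) ((((in) / (inflag)) & 1u) * (outflag))

/* Branching version, as in the original post. */
static unsigned translate_branchy(unsigned input)
{
    unsigned output = 0;
    if (input & INPUT_FLAG_FOO)
        output |= OUTPUT_FLAG_FOO;
    return output;
}

/* Branch-free version using the divide/mask/multiply form. */
static unsigned translate_branchless(unsigned input)
{
    unsigned output = 0;
    output |= COPY_BIT(input, INPUT_FLAG_FOO, OUTPUT_FLAG_FOO);
    return output;
}

/* Checks that both versions agree on a few sample inputs. */
static void check_equivalence(void)
{
    static const unsigned samples[] = { 0u, 0x80u, 0x7fu, 0xffu, 0xffffffffu };
    unsigned i;
    for (i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(translate_branchy(samples[i]) == translate_branchless(samples[i]));
}
```

Compiling this with something like "gcc -O2 -S" and reading the generated
assembly should show whether a given compiler emits jumps for either
function. The result depends on the gcc version and target, so treat it as
an experiment rather than a guarantee.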

If each INPUT_FLAG_x has the same value as the matching OUTPUT_FLAG_x,
I think it is better to write:
output |= input & INPUT_FLAG_FOO;
output |= input & INPUT_FLAG_BAR;
output |= input & INPUT_FLAG_BAZ;

or just:
output |= input & (INPUT_FLAG_FOO | INPUT_FLAG_BAR | INPUT_FLAG_BAZ);
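
The mask forms above give the same result as the if-chain only when every
input flag sits at the same bit position as its output flag. A small sketch
under that assumption (the flag values and the name COPY_MASK are
hypothetical, chosen for illustration):

```c
#include <assert.h>

/* Hypothetical layout: each input flag has the same value as its output flag. */
#define INPUT_FLAG_FOO  0x01u
#define INPUT_FLAG_BAR  0x04u
#define INPUT_FLAG_BAZ  0x10u
#define OUTPUT_FLAG_FOO INPUT_FLAG_FOO
#define OUTPUT_FLAG_BAR INPUT_FLAG_BAR
#define OUTPUT_FLAG_BAZ INPUT_FLAG_BAZ

/* COPY_MASK is an assumed name for the combined mask. */
#define COPY_MASK (INPUT_FLAG_FOO | INPUT_FLAG_BAR | INPUT_FLAG_BAZ)

/* The if-chain from the original post. */
static unsigned copy_with_ifs(unsigned output, unsigned input)
{
    if (input & INPUT_FLAG_FOO)
        output |= OUTPUT_FLAG_FOO;
    if (input & INPUT_FLAG_BAR)
        output |= OUTPUT_FLAG_BAR;
    if (input & INPUT_FLAG_BAZ)
        output |= OUTPUT_FLAG_BAZ;
    return output;
}

/* Single AND plus OR; valid only because the bit positions coincide. */
static unsigned copy_with_mask(unsigned output, unsigned input)
{
    return output | (input & COPY_MASK);
}

/* Compares both versions over every 5-bit input value. */
static void check_mask_copy(void)
{
    unsigned input;
    for (input = 0; input <= 0x1fu; input++)
        assert(copy_with_ifs(0u, input) == copy_with_mask(0u, input));
}
```

If the positions differ, as the shift by 17 in the quoted assembly suggests
they do in Colin's case, the mask form would set the wrong output bits and
a per-bit shift is still needed.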

> Might such a copy_bit macro be a useful general kernel utility?
> I know Linus hates slow code, and a jump is slower than just about
> anything.
> --
> -Colin
>

Email: dinon@csie.ncu.edu.tw