Re: [PATCH] 2.3.40: <linux/linkage.h> generates incorrect cache

Chris Sears (cbsears@ix.netcom.com)
Thu, 27 Jan 2000 23:27:15 -0800


Werner,

that was pretty quick analysis. I think that what you were saying
is that 3/5% of the kernel footprint is duplicated strings!? Wow.
I certainly wouldn't chase it down, but it is interesting.

Actually, I should have gone a little further with my example.
The gratuitous alignments show up in basic blocks of code
as well.

...
orl %edx,(%ebp)
jmp .L1482
.p2align 4,,7
.L1470:
notl %edx
andl %edx,(%ebp)
.L1482:
addl $4,%ebp
addl $-32,%ebx
.L1468:
xorl %edx,%edx
...

and

.size sys_ioperm,.Lfe2-sys_ioperm
.align 4
.globl sys_iopl
.type sys_iopl,@function
sys_iopl:
pushl %ebx
movl 8(%esp),%ecx

This is probably the entry to a loop. The .p2align 4,,7 says,
if I can trust the documentation, to pad modulo 16,
with an upper bound of 7 bytes. Again, I didn't ask
the compiler for this special service and it seems to be
beyond my control.

What I like is that gcc aligns for function entry points.
You being called from a random place, have the cache
load 32 bytes of the new instruction stream. But function
alignments are unfortunately modulo *4* rather
than to a cache line. .align takes the number of bytes
and .p2align takes log2 the number of bytes.
Well now, that's a good idea.

What I really don't like is alignment for basic blocks,
which strikes me as misguided. What is the point?
It just wastes cache footprint.

Chris Sears
cbsears@ix.netcom

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/