Re: Compiling x86 with and without frame pointer

Mark Mielke (mark@mark.mielke.cc)
Thu, 21 Nov 2002 00:06:07 -0500


A few weeks ago I was surprised to find that code compiled with
-fomit-frame-pointer reliably executed a few percent slower.
Since the functions I was testing were nowhere near big enough to
fill even the L1 instruction cache, I wrote it off as 'the CPU is
obviously optimized to expect certain instruction sequences after
call and before ret'. Something to think about anyway...

mark

On Thu, Nov 21, 2002 at 03:47:13PM +1100, Keith Owens wrote:
> The conventional wisdom is that compiling x86 without frame pointers
> results in smaller code. It turns out to be the opposite: compiling
> with frame pointers results in a smaller kernel. gcc version 3.2
> 20020822 (Red Hat Linux Rawhide 3.2-4).
>
> # size 2.4.20-rc2-*/vmlinux
>    text    data     bss     dec     hex filename
> 2669584  337972  402697 3410253  34094d 2.4.20-rc2-fp/vmlinux
> 2676919  337972  402697 3417588  3425f4 2.4.20-rc2-nofp/vmlinux
>
> Without frame pointers, the vmlinux text is about 7K bigger. The
> difference is that code with frame pointers can use ebp to directly
> access the stack; without frame pointers it has to use esp with an
> index.
>
> With frame pointers:
>
> 00000c10 <inet_dgram_connect>:
>  c10:   55              push   %ebp
>  c11:   89 e5           mov    %esp,%ebp
>  c13:   83 ec 14        sub    $0x14,%esp
>  c16:   89 75 fc        mov    %esi,0xfffffffc(%ebp)
>  c19:   8b 45 08        mov    0x8(%ebp),%eax
>  c1c:   8b 75 0c        mov    0xc(%ebp),%esi
>  c1f:   89 5d f8        mov    %ebx,0xfffffff8(%ebp)
>  c22:   8b 58 18        mov    0x18(%eax),%ebx
>  c25:   66 83 3e 00     cmpw   $0x0,(%esi)
>  c29:   74 3d           je     c68 <inet_dgram_connect+0x58>
>
> Without frame pointers:
>
> 00000c10 <inet_dgram_connect>:
>  c10:   83 ec 14        sub    $0x14,%esp
>  c13:   8b 44 24 18     mov    0x18(%esp,1),%eax
>  c17:   89 74 24 10     mov    %esi,0x10(%esp,1)
>  c1b:   8b 74 24 1c     mov    0x1c(%esp,1),%esi
>  c1f:   89 5c 24 0c     mov    %ebx,0xc(%esp,1)
>  c23:   8b 58 18        mov    0x18(%eax),%ebx
>  c26:   66 83 3e 00     cmpw   $0x0,(%esi)
>  c2a:   74 44           je     c70 <inet_dgram_connect+0x60>
>
> The difference is that stack accesses via ebp are 3 bytes, stack
> accesses via esp+index are 4 bytes. On any function with a large
> number of stack accesses, this quickly outweighs the extra prologue
> code for frame pointers.
>
> The smaller code will improve icache usage. Whether this is offset
> by the increased register pressure is something for benchmarking.
> Would any of the benchmarkers care to test x86 kernels with and
> without frame pointers?
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
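
The byte counts in the two listings above can be turned into a rough
break-even estimate. The sketch below is only an illustration, not
anything taken from gcc: the function names (entry_bytes,
break_even_accesses) are made up for this note, it counts only the
prologue and the disp8 stack accesses shown in the listings, and it
ignores epilogues, larger displacements and any change in register
allocation. The extra byte per esp-relative access is the SIB byte
that esp-based addressing requires.

```python
# Byte counts taken from the two disassembly listings quoted above.
PUSH_EBP = 1          # 55
MOV_ESP_EBP = 2       # 89 e5
SUB_IMM8_ESP = 3      # 83 ec 14
EBP_DISP8_ACCESS = 3  # opcode + ModRM + disp8, e.g. 8b 45 08
ESP_DISP8_ACCESS = 4  # opcode + ModRM + SIB + disp8, e.g. 8b 44 24 18


def entry_bytes(stack_accesses: int, frame_pointer: bool) -> int:
    """Bytes for the prologue plus `stack_accesses` disp8 stack accesses.

    Hypothetical helper: epilogue code is deliberately not modelled.
    """
    if frame_pointer:
        prologue = PUSH_EBP + MOV_ESP_EBP + SUB_IMM8_ESP
        return prologue + stack_accesses * EBP_DISP8_ACCESS
    return SUB_IMM8_ESP + stack_accesses * ESP_DISP8_ACCESS


def break_even_accesses() -> int:
    """Smallest access count at which the frame-pointer code is strictly smaller."""
    n = 0
    while entry_bytes(n, True) >= entry_bytes(n, False):
        n += 1
    return n


if __name__ == "__main__":
    for n in range(6):
        print(n, entry_bytes(n, True), entry_bytes(n, False))
    print("frame pointer wins from", break_even_accesses(), "accesses")
    # Text-segment difference from the quoted `size` output.
    print("vmlinux text delta:", 2676919 - 2669584, "bytes")
```

Under these assumptions the two variants tie at three stack accesses
and the frame-pointer version is one byte smaller at four, which is
what the listings show: cmpw sits at c25 with frame pointers and at
c26 without.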

-- 
mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them...

http://mark.mielke.cc/



                                                                 58   1   0  99
 0  0  0      0 811688   3828 103516   0   0     0     1  175    96   1   0  99
 0  0  0      0 811688   3828 103516   0   0     0     0  171    90   0   1  99
 0  0  0      0 811924   3828 103516   0   0     0     0  173   204   1   5  94
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 1  0  0      0 811924   3828 103516   0   0     0     0  174    90   1   0  99
 0  0  0      0 811924   3828 103516   0   0     0    24  182    93   1   0  99
 1  0  0      0 811924   3828 103516   0   0     0     0  173   142   0   1  99
 0  0  0      0 811924   3828 103516   0   0     0     0  171    93   1   0  99
 0  0  0      0 811924   3828 103516   0   0     0     0  175    94   0   1  99
 1  0  0      0 811924   3828 103516   0   0     0     2  173    92   1   0  99
 0  0  0      0 811924   3828 103516   0   0     0     0  175   151   1   0  99
 0  0  0      0 811924   3828 103516   0   0     0     0  173    87   1   0  99
 0  0  1      0 811924   3828 103516   0   0     0     0  173    89   1   0  99
 0  0  0      0 811924   3828 103516   0   0     0     1  171   142   0   1  99

bash# vmstat 1
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 5  0  1      0 815896   3656 102228   0   0   156  1076  129   917   6  34  60
 2  0  1      0 812860   3656 102992   0   0   737     0  252  1015  36  40  25
 0  0  0      0 813104   3656 102992   0   0     0     0  187   129   2   1  97
 0  0  0      0 813180   3656 102996   0   0     4     0  168    93   0   1  99
 6  8  1      0 802392   3656 103220   0   0   222     0  251   757  62  37   1
 0 10  0      0 804588   3684 103252   0   0   133    40  214   871  50  43   8
 2  8  1      0 804324   3712 103272   0   0   162    40  222   817  43  40  18
 9  4  0      0 805400   3712 103288   0   0     4     0  196   276  13   8  79
 9  0  1      0 804724   3812 103348   0   0   644    96  297  1255  41  59   0
14  0  1      0 804144   3816 103344   0   0     0     0  167   888  56  44   0
10  0  1      0 804448   3820 103340   0   0     0     0  171   873  57  43   0
 0  1  0      0 812288   3828 103436   0   0    97     0  222  1051  53  42   5
 0  0  0      0 811868   3828 103476   0   0    84     0  395   429   0   3  97
 1  0  0      0 811868   3828 103476   0   0     0     0  167    91   0   1  99
 1  0  0      0 811868   3828 103476   0   0     0     0  167   141   1   0  99
 0  0  0      0 811868   3828 103476   0   0     0     0  177   100   1   0  99
 0  0  0      0 811868   3828 103476   0   0     0     0  171    96   0   1  99
 1  0  0      0 811868   3828 103476   0   0     0     0  171   132   1   0  99
 0  0  0      0 811868   3828 103476   0   0     0     0  170   100   1   0  99
 0  0  0      0 811868   3828 103476   0   0     0     0  167    90   1   0  99
 0  0  0      0 811868   3828 103476   0   0     0     0  171    93   1   0  99
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0      0 811868   3828 103476   0   0     0     0  170   148   1   0  99
 0  0  0      0 811868   3828 103476   0   0     0     0  176    83   0   1  99
 0  0  0      0 811868   3828 103476   0   0     0     0  167    86   0   1  99
 0  0  0      0 811868   3828 103476   0   0     0     0  169   146   0   1  99
 0  0  0      0 811832   3828 103512   0   0     0   173  168    84   1   0  99
 0  0  0      0 811832   3828 103512   0   0     0     0  178   104   1   0  99
 0  0  0      0 811832   3828 103512   0   0     0     0  167   141   0   1  99
 0  0  0      0 811832   3828 103512   0   0     0     0  173    92   1   2  97
 0  0  0      0 811832   3828 103512   0   0     0     1  169    89   0   2  98
 1  0  0      0 811832   3828 103512   0   0     0     0  173   138   2   3  95
 1  0  0      0 811684   3828 103516   0   0     0     0  183   132   0   1  99
 0  0  0      0 811668   3828 103516   0   0     0     0  178   103   0   1  99
 0  0  0      0 811668   3828 103516   0   0     0    19  180   108   1   0  99
 0  0  0      0 811668   3828 103516   0   0     0     0  169   148   1   0  99
 0  0  0      0 811668   3828 103516   0   0     0     0  169    88   0   1  99
 0  0  0      0 811668   3828 103516   0   0     0     0  171    94   1   0  99
 0  0  0      0 811668   3828 103516   0   0     0     1  170   147   1   0  99
 0  0  0      0 811668   3828 103516   0   0     0     0  168    94   0   1  99
 0  0  0      0 811668   3828 103516   0   0     0     0  171    95   1   0  99
 0  0  0      0 811668   3828 103516   0   0     0     0  173   175   3   0  97
 0  0  0      0 811668   3828 103516   0   0     0    15  169    89   1   0  99
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0      0 811668   3828 103516   0   0     0     0  168    87   1   0  99
 1  0  0      0 811668   3828 103516   0   0     0     0  178   156   2   1  97
 0  0  0      0 811668   3828 103516   0   0     0     0  175   122   0   1  99
 0  0  0      0 811668   3828 103516   0   0     0    15  167    86   1   0  99
 0  0  0      0 811668   3828 103516   0   0     0     0  171    92   0   1  99
 0  0  0      0 811668   3828 103516   0   0     0     0  177   151   0   1  99
 0  0  0      0 811668   3828 103516   0   0     0     0  169    93   0   1  99
 0  0  0      0 811668   3828 103516   0   0     0     1  169    86   0   1  99
 0  0  0      0 811668   3828 103516   0   0     0     0  167   146   0   1  99
 0  0  0      0 811668   3828 103516   0   0     0     0  172    95   0   1  99
 0  0  0      0 811668   3828 103516   0   0     0     0  167    84   1   0  99
 0  0  0      0 811668   3828 103516   0   0     0     1  178   189   1   0  99
 0  0  0      0 811668   3828 103516   0   0     0     0  171    89   1   0  99
 0  0  0      0 811668   3828 103516   0   0     0     0  171    89   0   1  99
 1  0  0      0 811668   3828 103516   0   0     0     0  171   124   1   0  99
 0  0  0      0 811668   3828 103516   0   0     0     1  183   127   0   1  99

bash# df
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/ram1               125011     85304     33154  73% /
coreserver:/var/cores
                      74858752    475144  70580928   1% /var/.automount/cores
/dev/sdj1              7827172        24   7429540   1% /disks/disk10.1
/dev/sdi1              7827172        24   7429540   1% /disks/disk9.1
/dev/sdl1              7827172        24   7429540   1% /disks/disk12.1
/dev/sdk1              7827172        24   7429540   1% /disks/disk11.1
/dev/sdh1              7827172        24   7429540   1% /disks/disk8.1
/dev/sdf1              7827172        24   7429540   1% /disks/disk6.1
/dev/sde1              7827172        24   7429540   1% /disks/disk5.1
/dev/sdb1              7827172        24   7429540   1% /disks/disk2.1
/dev/sdg1              7827172        24   7429540   1% /disks/disk7.1
/dev/sda1              7827172        24   7429540   1% /disks/disk1.1
/dev/sdd1              7827172        24   7429540   1% /disks/disk4.1
/dev/sdc1              7827172        24   7429540   1% /disks/disk3.1
bash#

bash# vmstat 1
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0      0 755440   3924 105192   0   0    83  8108  353   655   5  25  69
 0  0  1      0 755440   3924 105192   0   0     0     0  167    70   0   0 100
 0  0  0      0 755440   3924 105192   0   0     0    24  159    82   0   1  99
 0  0  0      0 755052   3924 105192   0   0     0     0  162   133   0   0 100
 0  0  0      0 755052   3924 105192   0   0     0     0  156    68   0   1  99
 0  0  0      0 755052   3924 105192   0   0     0     0  155    59   0   0 100
 0  0  0      0 755052   3924 105192   0   0     0     1  157   124   0   0 100
 0  0  0      0 755052   3924 105192   0   0     0     0  174    82   0   1  99
 0  0  0      0 755052   3924 105192   0   0     0     0  161    73   0   0 100
 1  0  0      0 755052   3924 105192   0   0     0     0  159   101   0   0 100
 0  0  0      0 755052   3924 105192   0   0     0     1  155    92   0   0 100
 0  0  0      0 755052   3924 105192   0   0     0     0  155    57   0   0 100
 0  0  0      0 755052   3924 105192   0   0     0     0  155    67   1   0  99
 0  0  0      0 755052   3924 105192   0   0     0     0  155   112   0   0 100
 0  0  0      0 754440   3924 105192   0   0     0     6  157    67   0   0 100
 0  0  0      0 754440   3924 105192   0   0     0     0  155    62   0   0 100
 0  0  0      0 754440   3924 105192   0   0     0     0  157   128   0   0 100
 0  0  0      0 754440   3924 105192   0   0     0     0  160    66   0   1  99
 0  0  0      0 754440   3924 105192   0   0     0     1  157    72   0   0 100
 0  0  0      0 754440   3924 105192   0   0     0     0  155   117   0   0 100
 0  0  0      0 754440   3924 105192   0   0     0     0  155    71   0   0 100
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0      0 754440   3924 105192   0   0     0     0  157    66   0   0 100
 1  0  0      0 754440   3924 105192   0   0     0     1  166   114   0   1  99
 0  0  0      0 754440   3924 105192   0   0     0     0  158    93   0   0 100

there are no hangs. On startup, I am running mke2fs in parallel across
all the drives. The 3ware 4-port controller shows that the LEDs are on.
I have tried replacing the controllers, but that does not help either ...

Thanks
Manish
