Scalability problem (kmap_lock) with -aa kernels

Martin J. Bligh (Martin.Bligh@us.ibm.com)
Mon, 18 Mar 2002 20:25:42 -0800


OK, I finally got the -aa kernel series running in conjunction with the
NUMA-Q discontigmem stuff. For some reason that I haven't debugged yet,
2.4.19-pre3-aa2 won't boot on the NUMA-Q, even without the discontigmem
stuff in, so I went back to 2.4.19-pre1-aa1, which I knew worked from
last time around (thanks again for that patch).

So just comparing aa+discontigmem to standard 2.4.18+discontigmem, I see
kernel compile times of about 35s for -aa vs 26.5s for standard ... hmmm.
Looking at the top part of the profiles, I see this:

standard:

23991 total 0.0257
7679 default_idle 147.6731
3044 _text_lock_dcache 8.7221
2340 _text_lock_swap 43.3333
1160 do_anonymous_page 3.4940
776 d_lookup 2.8116
650 __free_pages_ok 1.2405
627 lru_cache_add 6.8152
608 do_generic_file_read 0.5468
498 __generic_copy_from_user 4.7885
480 lru_cache_del 21.8182
437 atomic_dec_and_lock 6.0694
426 schedule 0.3017
402 _text_lock_dec_and_lock 16.7500
...
109 kmap_high 0.3028
46 _text_lock_highmem 0.4071

andrea:

38549 total 0.0405
13102 _text_lock_highmem 108.2810
8627 default_idle 165.9038
2578 kunmap_high 14.3222
2556 kmap_high 6.0857
1242 do_anonymous_page 3.2684
1052 _text_lock_swap 22.8696
942 _text_lock_dcache 2.4987
683 do_page_fault 0.4337
587 pte_alloc 1.2332
535 __generic_copy_from_user 5.1442
518 d_lookup 1.8768
443 __free_pages_ok 0.7745
422 lru_cache_add 2.7763
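
To put a rough number on the difference, here is a minimal sketch that
works out what fraction of all profile ticks lands in the highmem mapping
path for each run. The tick counts are copied from the two profiles above.
Grouping _text_lock_highmem, kmap_high and kunmap_high into one "kmap path"
is an assumption made for this sketch, and since kunmap_high does not appear
in the standard excerpt, the standard figure is only a lower bound.

```python
# Tick counts copied from the two profile excerpts above.
standard = {
    "total": 23991,
    "kmap_high": 109,
    "_text_lock_highmem": 46,
    # kunmap_high is not listed in the standard excerpt, so it is omitted.
}
andrea = {
    "total": 38549,
    "_text_lock_highmem": 13102,
    "kunmap_high": 2578,
    "kmap_high": 2556,
}

# Assumption for this sketch: these symbols together form the kmap path.
KMAP_SYMBOLS = ("_text_lock_highmem", "kmap_high", "kunmap_high")


def kmap_share(profile):
    """Return the fraction of all profile ticks spent in the kmap symbols."""
    ticks = sum(profile.get(sym, 0) for sym in KMAP_SYMBOLS)
    return ticks / profile["total"]


for name, profile in (("standard", standard), ("andrea", andrea)):
    print(f"{name}: {kmap_share(profile):.1%} of ticks in the kmap path")
```

On these numbers the kmap path goes from under 1% of ticks on standard to
a bit under half of all ticks on -aa.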

Looking at the disassembly, _text_lock_highmem appears to be kmap_lock.
Recompiling with the trusty lockmeter, I see this (on -aa):

33.4% 63.5% 5.4us(7893us) 155us( 16ms)(37.8%) 2551814 36.5% 63.5% 0% kmap_lock_cacheline
17.4% 64.9% 5.7us(7893us) 158us( 16ms)(19.7%) 1275907 35.1% 64.9% 0% kmap_high+0x34
16.0% 62.1% 5.2us( 982us) 152us( 13ms)(18.1%) 1275907 37.9% 62.1% 0% kunmap_high+0x40

Ick. On a vaguely comparable mainline kernel we're looking at:

1.6% 2.7% 0.5us(4208us) 28us(3885us)(0.14%) 716044 97.3% 2.7% 0% kmap_lock
1.2% 2.9% 0.9us(4208us) 35us(3885us)(0.09%) 358022 97.1% 2.9% 0% kmap_high+0x10
0.33% 2.5% 0.2us( 71us) 21us(2598us)(0.05%) 358022 97.5% 2.5% 0% kunmap_high+0xc
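
A rough sketch comparing the two lockmeter summaries. The figures are
copied from the kmap_lock_cacheline and kmap_lock lines above. The reading
of the columns is an assumption here: the large integer is taken as total
acquisitions, the percentage after it pairs with the fraction that had to
spin, and the first figure of the second time pair is taken as the mean wait
per contended acquisition. The spin-time estimate is only as good as that
reading.

```python
# Figures copied from the lockmeter summary lines above.
# Column meanings are assumed, not taken from lockmeter documentation.
aa = {"total": 2551814, "spin_frac": 0.635, "mean_wait_us": 155}
mainline = {"total": 716044, "spin_frac": 0.027, "mean_wait_us": 28}


def spin_time_us(lock):
    """Estimate total spin time: contended acquisitions times mean wait."""
    return lock["total"] * lock["spin_frac"] * lock["mean_wait_us"]


# How many more times kmap_lock is taken on -aa than on mainline.
acq_ratio = aa["total"] / mainline["total"]

# How much more time is spent spinning on the lock, under the assumptions above.
spin_ratio = spin_time_us(aa) / spin_time_us(mainline)

print(f"acquisitions: {acq_ratio:.1f}x mainline")
print(f"estimated spin time: {spin_time_us(aa) / 1e6:.0f}s vs "
      f"{spin_time_us(mainline) / 1e6:.2f}s ({spin_ratio:.0f}x)")
```

So the lock is taken roughly 3.6 times as often on -aa, but the estimated
time spent spinning on it grows by a far larger factor, which suggests the
contention rather than the raw call count is what hurts.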

Andrea - is this your new highmem pte stuff doing this?
Or is that not even in your tree as yet? It would be a shame if that's
the problem, as I really want to get the highmem pte stuff - it allows
me to put processes' pagetables on their own nodes ....

Thanks,

Martin.
