Re: [Bug report] System lockups on Tyan S2469 and lots of io [smp boot time problems too :(]

Vincent Touquet (vincent.touquet@pandora.be)
Mon, 7 Jul 2003 18:48:31 +0200


Ok, so I forgot the vmstat output too :/
It looks much like the vmstat output from the 2.4.21 hang.

With 2.4.19 it takes longer to bring the system down: for a long time
data actually reaches the array, then only a few blocks get written
out at a time, until the system finally hangs.

kalimero:~# cat vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 1 424 9440 11656 943172 0 0 511 521 73 72 0 1 98 0
0 2 424 9568 11676 942972 0 0 3852 24576 365 323 0 4 96 0
1 1 424 9564 11732 943548 0 0 12180 6636 502 816 0 9 91 0
2 1 424 10488 11772 942796 0 0 32032 0 749 2013 0 22 77 0
0 2 552 10376 11816 942984 0 0 26140 53220 671 1646 0 23 76 0
0 2 552 10512 11856 942732 0 0 16788 24448 547 1094 1 20 79 0
0 2 552 10492 11864 942732 0 0 0 20232 262 70 0 2 98 0
1 2 552 10120 11944 942664 0 0 24632 21012 695 1580 1 24 75 0
0 3 552 9164 11948 939296 0 0 30240 16708 789 2101 1 23 76 0
0 2 552 9996 12000 939788 0 0 30628 0 758 2119 0 18 82 0
4 0 552 3324 12076 926332 0 0 23672 40244 605 1451 1 25 74 0
2 3 552 9456 12140 926544 0 0 26532 22560 708 1659 1 22 77 0
2 2 552 8840 12352 938000 0 0 16912 26104 661 1173 3 16 80 0
0 5 552 9308 12560 940076 0 0 19988 21176 693 1503 3 17 80 0
0 3 552 9512 12244 942024 0 0 16912 21820 582 1169 1 15 84 0
0 3 552 9576 12272 941988 0 0 8336 16588 451 593 1 6 92 0
0 3 552 9432 12288 942928 0 0 6920 0 395 463 1 6 93 0
1 1 552 9556 12376 943136 0 0 28400 48644 778 1974 0 22 78 0
0 2 552 9488 12476 943072 0 0 13448 17392 559 972 1 12 87 0
0 2 552 9484 12480 943072 0 0 4 20608 277 61 0 1 99 0
0 2 552 9484 12480 943072 0 0 0 28420 303 72 0 3 97 0
0 2 552 9472 12484 943072 0 0 4 2068 269 82 0 3 97 0
0 2 552 9468 12496 943072 0 0 8 0 261 68 1 0 99 0
2 1 552 9448 12620 944316 0 0 20492 25280 630 1372 0 15 84 0
0 2 552 9524 12684 944080 0 0 26784 41224 663 1731 0 21 79 0
1 1 552 9476 12740 943896 0 0 33472 2900 832 2140 1 22 77 0
0 2 552 9448 12780 943792 0 0 9984 0 393 686 0 7 93 0
1 1 552 9548 12860 943640 0 0 18972 104 443 1245 1 13 86 0
1 1 552 9532 12916 943340 0 0 27676 49432 642 1772 2 21 77 0
0 2 552 9972 12968 942372 0 0 30620 20480 743 1959 0 24 75 0
0 2 680 10308 12968 942060 0 256 7304 10708 371 504 0 8 91 0
1 4 680 10428 13148 941512 0 0 22040 928 681 1585 3 20 77 0
0 1 680 10424 13464 941384 0 0 15508 996 628 1530 3 21 75 0
0 2 680 10540 13504 940960 0 0 9996 41460 376 701 1 14 85 0
0 2 680 8736 13544 936476 0 0 29980 22400 759 1940 0 38 62 0
2 1 680 6504 13576 923180 0 0 17296 22656 552 1227 0 18 82 0
1 2 680 7704 13656 924124 0 0 29644 7108 754 1844 0 22 77 0
0 2 680 19544 13688 927160 0 0 5980 0 356 380 0 4 96 0
1 2 680 9688 13768 940848 0 0 14364 54420 510 780 0 13 87 0
0 2 680 9984 13772 940848 0 0 4 24008 298 77 0 1 99 0
0 2 680 10108 13772 940848 0 0 0 12776 268 66 0 3 97 0
0 2 680 10156 13776 940848 0 0 4 0 262 57 0 1 99 0
0 1 680 9500 13844 942340 0 0 11180 180 377 743 0 8 92 0
0 2 680 9468 13868 942668 0 0 28852 32152 724 1944 1 17 82 0
1 1 680 9480 13920 942588 0 0 8680 0 479 738 0 9 91 0
0 1 680 9468 14000 942572 0 0 20216 148 637 1550 0 13 87 0
4 1 680 9444 14244 942052 0 0 20724 432 564 1598 0 11 89 0
1 0 680 9500 14856 941152 0 0 24588 1036 849 2420 4 19 76 0
0 2 680 9568 14968 940724 0 0 15028 50876 505 1058 0 11 88 0
0 2 680 9556 14972 940728 0 0 4 15260 232 67 0 1 99 0
0 2 680 9520 15048 940228 0 0 30376 16236 764 2029 1 22 77 0
0 2 680 9480 15164 940060 0 0 3176 15012 345 374 1 2 97 0
0 3 680 9388 15512 939804 0 0 344 1788 317 234 0 2 97 0
0 2 680 9480 15660 939796 0 0 5356 944 381 458 0 4 96 0
0 2 680 9440 15704 940148 0 0 31776 47596 729 2005 0 24 76 0
0 2 680 9496 15764 940076 0 0 34084 17712 812 2144 0 26 74 0
0 2 680 9552 15816 939780 0 0 28188 0 708 1807 0 19 81 0
1 1 680 9468 15820 939676 0 0 13072 0 430 864 1 7 92 0
0 2 680 9528 15876 939576 0 0 17424 41548 492 1150 1 14 85 0
1 1 808 10528 15916 938360 0 128 30116 16952 761 1984 1 25 74 0
0 2 808 10368 15600 938420 0 0 32384 24704 813 2081 1 25 74 0
0 4 808 9716 16056 938388 0 0 23464 17940 829 2069 2 28 70 0
0 4 808 9712 16056 938388 0 0 0 0 271 55 0 1 99 0
1 1 808 9452 16112 939096 0 0 23836 452 702 1684 0 16 83 0
0 2 808 7416 16168 924056 0 0 25112 59440 672 1751 0 29 71 0
1 2 808 7524 16216 918320 0 0 24772 17372 665 1631 0 23 77 0
1 3 808 8284 16236 929356 0 0 31260 20736 761 1794 1 19 80 0
0 2 808 12964 16256 930940 0 0 5388 22700 385 396 0 11 89 0
1 2 808 9160 16288 938016 0 0 5160 476 379 298 0 4 96 0
1 2 808 9232 16324 939480 0 0 31272 0 729 1932 1 22 77 0
0 3 808 9284 16312 939376 0 0 29868 53036 744 1916 1 23 76 0
0 3 808 9308 16060 939184 0 0 27720 24540 746 1798 0 21 79 0
0 3 808 9252 16064 939184 0 0 4 24320 289 82 0 4 96 0
1 2 808 9240 16128 938856 0 0 19868 20820 620 1309 0 16 84 0
0 3 808 9280 15784 939652 0 0 24676 6044 664 1594 0 20 80 0
1 3 808 9228 15840 939680 0 0 27300 376 707 1765 10 21 68 0
1 2 808 9292 15888 939256 0 0 24628 0 491 1546 2 18 80 0
0 8 808 9236 15944 938632 0 0 24224 352 574 1796 3 18 79 0
0 5 808 9448 16392 937860 0 0 18328 1144 565 1711 5 18 77 0
IMPORTANT POINT
0 5 808 9444 16392 937860 0 0 0 0 113 98 0 2 98 0
0 5 808 9432 16396 937860 0 0 0 0 108 67 1 2 97 0
0 5 808 9424 16400 937860 0 0 0 0 129 86 0 2 97 0
0 5 808 9404 16400 937860 0 0 0 0 337 506 0 2 97 0
0 6 808 9384 16416 937860 0 0 0 328 164 141 1 1 98 0
0 6 808 9384 16416 937860 0 0 0 0 101 36 1 1 98 0
0 6 808 9368 16416 937860 0 0 0 0 277 384 1 1 98 0
0 6 808 9368 16416 937860 0 0 0 0 105 42 0 1 98 0
0 6 808 9360 16416 937860 0 0 0 0 233 305 1 2 97 0
0 6 808 9328 16428 937860 0 0 0 328 119 41 0 3 97 0
0 6 808 9328 16428 937860 0 0 0 0 101 33 0 4 96 0
0 6 808 9316 16428 937860 0 0 0 0 285 400 0 0 100 0
0 6 808 9312 16428 937860 0 0 0 0 139 121 1 1 98 0
0 6 808 9312 16428 937860 0 0 0 0 105 42 0 0 100 0
0 6 808 9284 16440 937808 0 0 8 328 128 297 1 2 97 0
0 6 808 9284 16440 937808 0 0 0 0 108 50 0 0 100 0
0 6 808 9284 16440 937808 0 0 0 0 114 55 0 0 100 0
0 6 808 9216 16440 937872 0 0 0 0 114 485 0 2 98 0
0 6 808 9344 16440 937744 0 0 0 0 112 61 0 0 100 0
1 6 808 9276 16452 937804 0 0 0 548 132 346 0 2 98 0
0 6 808 9248 16452 937804 0 0 0 0 109 48 0 0 100 0
0 6 808 9248 16452 937804 0 0 0 0 108 46 0 0 100 0
0 6 808 9308 16452 937736 0 0 0 0 105 341 0 1 98 0
0 6 808 9292 16460 937744 0 0 0 0 105 49 0 0 100 0
0 6 808 9276 16476 937744 0 0 0 424 139 323 0 1 99 0
0 6 808 9272 16480 937744 0 0 0 0 104 41 0 1 99 0
0 6 808 9332 16488 937676 0 0 0 0 103 345 0 2 98 0
0 6 808 9324 16496 937676 0 0 0 0 101 32 0 0 100 0
0 6 808 9212 16500 937748 0 0 0 0 103 339 0 2 98 0
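
The point where I/O dries up can also be located mechanically rather
than by eye. What follows is only a minimal sketch: the function names
and the thresholds (max_blocks, min_run) are assumptions picked for
illustration, not anything vmstat or the kernel defines. It treats a
sample as "quiet" when processes are blocked (b > 0) while bi + bo is
tiny, and reports the first run of such samples.

```python
from typing import Dict, List, Optional

# Column order of the vmstat header shown above.
COLUMNS = ["r", "b", "swpd", "free", "buff", "cache", "si", "so",
           "bi", "bo", "in", "cs", "us", "sy", "id", "wa"]


def parse_vmstat(text: str) -> List[Dict[str, int]]:
    """Return one dict per sample line; header and note lines are skipped."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == len(COLUMNS) and all(f.isdigit() for f in fields):
            samples.append(dict(zip(COLUMNS, map(int, fields))))
    return samples


def find_stall(samples: List[Dict[str, int]],
               max_blocks: int = 1000,   # assumed threshold for "almost no I/O"
               min_run: int = 5          # assumed minimum length of a quiet run
               ) -> Optional[int]:
    """Index of the first sample in the first run of at least min_run
    consecutive samples with blocked processes and bi + bo <= max_blocks."""
    run_start = None
    for i, s in enumerate(samples):
        quiet = s["b"] > 0 and s["bi"] + s["bo"] <= max_blocks
        if quiet:
            if run_start is None:
                run_start = i
            if i - run_start + 1 >= min_run:
                return run_start
        else:
            run_start = None
    return None
```

Feeding the saved vmstat file to parse_vmstat() and then find_stall()
gives the index of the first quiet sample, or None if no run is long
enough. The thresholds will likely need tuning for other workloads.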

The file the vmstat output was written to has a modification time of
Jul 7 17:52, and one sample was written every second, so the last one
was written at 17:52:59 at the latest, when the system hung. 29 samples
follow the marker, which puts the 'important point in time' at about
17:52:59 - 29 s = 17:52:30. The closest trace I have is from 17:52:44.
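
The timestamp arithmetic can be checked with a few lines of Python.
This is only a sketch under the assumptions stated in this mail: one
sample per second, and the last sample written at 17:52:59. The helper
names are made up for illustration.

```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S"


def stall_time(last_sample: str, samples_after_marker: int,
               interval_s: int = 1) -> str:
    """Walk back from the last sample by one interval per later sample."""
    last = datetime.strptime(last_sample, FMT)
    stall = last - timedelta(seconds=samples_after_marker * interval_s)
    return stall.strftime(FMT)


def gap_seconds(earlier: str, later: str) -> int:
    """Seconds between two times of day on the same day."""
    delta = datetime.strptime(later, FMT) - datetime.strptime(earlier, FMT)
    return int(delta.total_seconds())


# 29 samples follow the marker; the last was written at 17:52:59 at most.
print(stall_time("17:52:59", 29))          # 17:52:30
print(gap_seconds("17:52:30", "17:52:44"))  # 14
```

So the closest trace was taken roughly 14 seconds after the estimated
stall, assuming the sampling really was once per second throughout.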

By then the cp process is already in state D (uninterruptible sleep).
It was not in state D in the trace at [17:50:39], when it was still on
the run queue.

Can anyone make sense of this? :))

best regards,

Vincent