long stalls

Brian Tinsley (btinsley@emageon.com)
Tue, 07 Jan 2003 18:42:27 -0600


We have been having terrible problems with long stalls, meaning from a
couple of minutes to an hour, happening when filesystem I/O load gets
high. The system time as reported by vmstat or sar will increase up to
99% and as it spreads to each procesor, the system becomes completely
unresponsive (except that it responds to pings just fine -
interesting!). When the system finally returns to the world of the
living, the only evidence that something bad has happened is the runtime
for kswapd is abnormally high. I have seen this happen with the stock
2.4.17, 2.4.19, and 2.4.20 kernels on SMP PIII and PIV machines (either
4GB or 8GB RAM, all SCSI disks, dual GigE NICs). I've searched the lkml
archives and google and have found several similar postings, but there
is never an explanation or resolution. Any help would be *very* much
appreciated! If any info from the system in question is desired, I will
be glad to provide it.

-- 

-[========================]- -[ Brian Tinsley ]- -[ Chief Systems Engineer ]- -[ Emageon ]- -[========================]-

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/