Re: Kernel deadlock using nbd over acenic driver.

Steven Whitehouse (steve@gw.chygwyn.com)
Tue, 14 May 2002 17:32:27 +0100 (BST)


Hi,

The TCP stack should auto-tune the amount of memory that it uses, so that
SO_SNDBUF, cat >/proc/sys/net/core/[rw]mem_default etc. is not required. The
important settings for TCP sockets are only /proc/sys/net/ipv4/tcp_[rw]mem
and tcp_mem I think (at least if I've understood the code correctly).

Since I think we are talking about only a single nbd device, there should
only be a single socket thats doing lots of I/O in this case, or is this
machine doing other heavy network tasks ?
>
> But how to avoid system hangs due to running out of memory?
> Is there a safe guide line? Generally slow is tolerable, but
> crash is not.
>
I agree. I also think your earlier comments about the buffer flushing are
correct as being the most likely cause.

I don't think the system has "run out" exactly, more just got itself into
a state where the code path writing out dirty blocks has been blocked
due to lack of freeable memory at that moment and where the process
freeing up memory has blocked waiting for the nbd device. It may well
be that there is freeable memory, just that for whatever reason no
process is trying to free it.

The LVM team has had a similar problem in dealing with I/O which needs
extra memory in order to complete, so I'll ask them for some ideas. Also
I'm going to try and come up with some patches to eliminate some of the
possible theories so that we can narrow down the options,

Steve

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/