Re: Kernel deadlock using nbd over acenic driver

Steven Whitehouse (steve@gw.chygwyn.com)
Fri, 17 May 2002 09:44:51 +0100 (BST)


Hi,

>
> Sorry I didn't pick this up earlier ..
>
> "Steven Whitehouse wrote:"
> > we don't want to alter that. The "priority inversion" that I mentioned occurs
> > when you get processes without PF_MEMALLOC set calling nbd_send_req() as when
>
> There aren't any processes that call nbd_send_req except the unique
> nbd client process stuck in the protocol loop in the kernel ioctl
> that it entered at startup.
>
Assuming that we are still talking about kernel nbd here and not enbd, I think
you've got that backwards. nbd_send_req() is called from do_nbd_request(),
which is the block device request function and can therefore be called
from any thread running the disk task queue. I think that would normally
mean it's a thread waiting for I/O, as in buffer.c:__wait_on_buffer().
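
To make the concern concrete, here is a toy user-space model of the kind of
watermark check being discussed. It is only a sketch: toy_zone,
toy_may_allocate() and TOY_PF_MEMALLOC are hypothetical names invented for
illustration, and this is not the actual page_alloc.c:__alloc_pages() code.
It assumes that a caller without the flag is refused once the allocation
would leave fewer than pages_min pages free, while a flagged caller may dip
into the reserve.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's PF_MEMALLOC task flag. */
#define TOY_PF_MEMALLOC 0x1u

/* Toy model of a memory zone: just a free count and a reserve mark. */
struct toy_zone {
    unsigned long free_pages; /* pages currently free */
    unsigned long pages_min;  /* reserve kept back for flagged callers */
};

/*
 * Return true if a task with the given flags may take `count` pages.
 * Assumption: unflagged tasks must leave at least pages_min pages free,
 * flagged tasks may use anything that is physically free.
 */
static bool toy_may_allocate(const struct toy_zone *z,
                             unsigned int task_flags,
                             unsigned long count)
{
    assert(z != NULL);

    if (count > z->free_pages)
        return false;               /* nothing left for anybody */

    if (task_flags & TOY_PF_MEMALLOC)
        return true;                /* allowed to eat into the reserve */

    return z->free_pages - count >= z->pages_min;
}
```

In this model, a thread that ends up in the request function without the
flag is refused at the min mark even though free_pages is non-zero, which is
the situation described above.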

The loop that the ioctl runs only does network receives and thus doesn't
do any allocations of any kind itself. The only worry on the receive side
is that buffers are not available in the network device driver, but this
doesn't seem to be a problem. There are no backed up replies in the
server (we can tell from the socket queue lengths) and we know that we
can still ping clients which are otherwise dead due to the deadlock. I
don't think that at the moment there is any problem on the receive side.

> > they call through to page_alloc.c:__alloc_pages() they won't use any memory
> > once the free pages hits the min mark even though there is memory available
> > (see the code just before and after the rebalance label).
>
> So I think the exact inversion you envisage cannot happen, but ...
>
> I think that the problem is that the nbd-client process doesn't have
> high memory priority, and high priority processes can scream and holler
> all they like and will claim more memory, but won't make anything better
> because the nbd process can't run (can't get tcp buffers), and so
> can't release the memory pressure.
>
> So I think that your PF_MEMALLOC idea does revert the inversion.
>
> Would it also be good to prevent other processes running? Or is it too
> late? Yes, I think it is too late to do any good by the time we feel
> this pressure.
>
> Peter
>
The mechanism works fine for block devices which do not need to allocate
memory in their write-out paths. Since we know there is a maximum amount
of memory required by nbd, bounded by the maximum request size plus the
small header per request, it would seem reasonable that to avoid deadlock
we simply need to raise the amount of memory reserved for low memory
situations until we've provided what nbd needs.
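
As a rough sketch of that bound, the arithmetic could look like the
following. The function name toy_nbd_reserve_pages() and the in_flight
parameter (how many requests might need memory at once) are my assumptions
for illustration; nothing here is taken from the nbd driver itself.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Estimate how many pages would have to be held in reserve so that
 * `in_flight` requests, each needing at most max_request_bytes of data
 * plus header_bytes of per-request header, can always be sent.
 * All parameters are illustrative inputs, not kernel constants.
 */
static size_t toy_nbd_reserve_pages(size_t max_request_bytes,
                                    size_t header_bytes,
                                    size_t in_flight,
                                    size_t page_bytes)
{
    assert(page_bytes > 0);

    size_t per_request = max_request_bytes + header_bytes;
    size_t total = per_request * in_flight;

    /* Round up to whole pages. */
    return (total + page_bytes - 1) / page_bytes;
}
```

For example, with a 128 KiB maximum request, a 28 byte header and 4 KiB
pages, one request in flight comes to 33 pages: 32 for the data and one
more for the header spill-over.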

Steve.
