[PATCH] sock_writeable not appropriate for TCP sockets, for 2.4.20

Chuck Lever (cel@citi.umich.edu)
Fri, 30 Aug 2002 16:51:57 -0400 (EDT)


hi marcelo-

sock_writeable determines whether there is space in a socket's output
buffer. socket write_space callbacks use it to determine whether to wake
up those that are waiting for more output buffer space.

however, sock_writeable is not appropriate for TCP sockets. because the
RPC layer's write_space callback uses it for TCP sockets, the RPC layer
hammers on sock_sendmsg with dozens of write requests that are only a few
hundred bytes long when it is trying to send a large write RPC request.
this patch adds logic to the RPC layer's write_space callback that
properly handles TCP sockets.

patch reviewed by Trond and Alexey. patch already sent to linus for 2.5.

--- 2.4.20-pre5/net/sunrpc/xprt.c.orig Fri Aug 30 15:49:28 2002
+++ 2.4.20-pre5/net/sunrpc/xprt.c Fri Aug 30 15:52:55 2002
@@ -950,6 +950,8 @@

/*
- * The following 2 routines allow a task to sleep while socket memory is
- * low.
+ * Called when more output buffer space is available for this socket.
+ * We try not to wake our writers until they can make "significant"
+ * progress, otherwise we'll waste resources thrashing sock_sendmsg
+ * with a bunch of small requests.
*/
static void
@@ -965,6 +967,13 @@

/* Wait until we have enough socket memory */
- if (!sock_writeable(sk))
- return;
+ if (xprt->stream) {
+ /* from net/ipv4/tcp.c:tcp_write_space */
+ if (tcp_wspace(sk) < tcp_min_write_space(sk))
+ return;
+ } else {
+ /* from net/core/sock.c:sock_def_write_space */
+ if (!sock_writeable(sk))
+ return;
+ }

if (!test_and_clear_bit(SOCK_NOSPACE, &sock->flags))

-- 

corporate: <cel at netapp dot com> personal: <chucklever at bigfoot dot com>

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/