From owner-cvs-sys Mon Jul 6 12:27:40 1998 Return-Path: Received: (from daemon@localhost) by hub.freebsd.org (8.8.8/8.8.8) id MAA02874 for cvs-sys-outgoing; Mon, 6 Jul 1998 12:27:40 -0700 (PDT) (envelope-from owner-cvs-sys) Received: from freefall.freebsd.org (freefall.FreeBSD.ORG [204.216.27.21]) by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id MAA02818; Mon, 6 Jul 1998 12:27:24 -0700 (PDT) (envelope-from fenner@FreeBSD.org) From: Bill Fenner Received: (from fenner@localhost) by freefall.freebsd.org (8.8.8/8.8.5) id MAA23940; Mon, 6 Jul 1998 12:27:17 -0700 (PDT) Date: Mon, 6 Jul 1998 12:27:17 -0700 (PDT) Message-Id: <199807061927.MAA23940@freefall.freebsd.org> To: cvs-committers@FreeBSD.ORG, cvs-all@FreeBSD.ORG, cvs-sys@FreeBSD.ORG Subject: cvs commit: src/sys/kern uipc_socket.c Sender: owner-cvs-sys@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk fenner 1998/07/06 12:27:16 PDT Modified files: sys/kern uipc_socket.c Log: Introduce (fairly hacky) workaround for odd TCP behavior with application writes of size (100,208]+N*MCLBYTES. The bug: sosend() hands each mbuf off to the protocol output routine as soon as it has copied it, in the hopes of increasing parallelism (see http://www.kohala.com/~rstevens/vanj.88jul20.txt ). This works well for TCP as long as the first mbuf handed off is at least the MSS. However, when doing small writes (between MHLEN and MINCLSIZE), the transaction is split into 2 small MBUF's and each is individually handed off to TCP. TCP assumes that the first small mbuf is the whole transaction, so sends a small packet. When the second small mbuf arrives, Nagle prevents TCP from sending it so it must wait for a (potentially delayed) ACK. This sends throughput down the toilet. The workaround: Set the "atomic" flag when we're doing small writes. The "atomic" flag has two meanings: 1. Copy all of the data into a chain of mbufs before handing off to the protocol. 2. Leave room for a datagram header in said mbuf chain. TCP wants the first but doesn't want the second. However, the second simply results in some memory wastage (but is why the workaround is a hack and not a fix). The real fix: The real fix for this problem is to introduce something like a "requested transfer size" variable in the socket->protocol interface. sosend() would then accumulate an mbuf chain until it exceeded the "requested transfer size". TCP could set it to the TCP MSS (note that the current interface causes strange TCP behaviors when the MSS > MCLBYTES; nobody notices because MCLBYTES > ethernet's MTU). Revision Changes Path 1.41 +2 -1 src/sys/kern/uipc_socket.c