From: Matthew Dillon
Date: Sat, 1 Dec 2001 13:21:05 -0800 (PST)
Message-Id: <200112012121.fB1LL5w36881@apollo.backplane.com>
To: Richard Sharpe
Cc: freebsd-hackers@FreeBSD.ORG
Subject: Patch #2 (was Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?))
References: <20011128153817.T61580@monorchid.lemis.com> <15364.38174.938500.946169@caddis.yogotech.com> <20011128104629.A43642@walton.maths.tcd.ie> <5.1.0.14.1.20011130181236.00a80160@postamt1.charite.de> <200111302047.fAUKlT811090@apollo.backplane.com> <200111302130.fAULUU324648@apollo.backplane.com> <3C08CF9D.2030109@ns.aus.com>

Richard (and others), please try this patch.
With this patch I get the following between two machines connected
via a 100BaseTX switch (full duplex):

----------------
test1:/home/dillon/dbench> ./tbench 1 test2
.1 clients started
..............+*
Throughput 6.13925 MB/sec (NB=7.67406 MB/sec  61.3925 MBit/sec)  1 procs
test1:/home/dillon/dbench> ./tbench 2 test2
..2 clients started
............................++**
Throughput 8.37795 MB/sec (NB=10.4724 MB/sec  83.7795 MBit/sec)  2 procs
----------------

On localhost I get:

----------------
test1:/home/dillon/dbench> ./tbench 1 localhost
.1 clients started
..............+*
Throughput 25.7156 MB/sec (NB=32.1445 MB/sec  257.156 MBit/sec)  1 procs
test1:/home/dillon/dbench> ./tbench 2 localhost
..2 clients started
............................++**
Throughput 36.5428 MB/sec (NB=45.6785 MB/sec  365.428 MBit/sec)  2 procs
test1:/home/dillon/dbench>
----------------

This is WITHOUT changing the default send and receive tcp buffers..
they're both 16384.

The bug I found is that when recv() is used with MSG_WAITALL, which
is what tbench does, soreceive() will block waiting for all available
input WITHOUT ever calling pr->pr_usrreqs->pru_rcvd(), which means
that if the sender filled up the receive buffer (16K default) the
receiver will never ack the 0 window... that is until the idle code
takes over after 5 seconds.

					-Matt

Index: uipc_socket.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/uipc_socket.c,v
retrieving revision 1.68.2.16
diff -u -r1.68.2.16 uipc_socket.c
--- uipc_socket.c	2001/06/14 20:46:06	1.68.2.16
+++ uipc_socket.c	2001/12/01 21:09:13
@@ -910,6 +910,14 @@
 		    !sosendallatonce(so) && !nextrecord) {
 			if (so->so_error || so->so_state & SS_CANTRCVMORE)
 				break;
+			/*
+			 * The window might have closed to zero, make
+			 * sure we send an ack now that we've drained
+			 * the buffer or we might end up blocking until
+			 * the idle takes over (5 seconds).
+			 */
+			if (pr->pr_flags & PR_WANTRCVD && so->so_pcb)
+				(*pr->pr_usrreqs->pru_rcvd)(so, flags);
 			error = sbwait(&so->so_rcv);
 			if (error) {
 				sbunlock(&so->so_rcv);
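
For anyone who wants to see why the missing pru_rcvd() call hurts, here is a toy userland model of the stall described above. It is only a sketch and not kernel code: the function name count_stalls() and its parameters are made up for illustration. It assumes the sender always fills whatever window it believes is open before the receiver runs, and that the sender learns about freed buffer space only when the receiver explicitly notifies it (the patched path) or when the idle code recovers (counted as one stall, roughly 5 seconds each according to the description above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model of a MSG_WAITALL receive of `want` bytes through a receive
 * buffer of `bufsize` bytes.  Hypothetical helper, not FreeBSD code.
 *
 * notify_in_loop == false models the unpatched wait loop: the receiver
 * drains the buffer but never tells the sender the window reopened.
 * notify_in_loop == true models the patched loop, which sends a window
 * update before going back to sleep.
 *
 * Returns the number of times the transfer had to wait for the idle
 * recovery because the sender still believed the window was zero.
 */
size_t count_stalls(size_t want, size_t bufsize, bool notify_in_loop)
{
    size_t buffered = 0;      /* bytes sitting in the receive buffer */
    size_t window = bufsize;  /* window as the sender currently sees it */
    size_t received = 0;      /* bytes already copied out to the caller */
    size_t stalls = 0;

    assert(bufsize > 0);      /* a zero-sized buffer could never make progress */

    while (received < want) {
        /* Sender: transmit as much as its view of the window allows. */
        size_t remaining = want - received - buffered;
        size_t n = window < remaining ? window : remaining;
        buffered += n;
        window -= n;

        if (buffered == 0) {
            /*
             * Nothing arrived because the sender sees a zero window.
             * Only the idle recovery reopens it.
             */
            stalls++;
            window = bufsize;
            continue;
        }

        /* Receiver: copy everything out, emptying the buffer. */
        received += buffered;
        buffered = 0;

        /* Patched behaviour: advertise the reopened window right away. */
        if (notify_in_loop)
            window = bufsize;
    }
    return stalls;
}
```

Under these assumptions, a 65536-byte MSG_WAITALL read through a 16384-byte buffer stalls three times in the unpatched model (once after each full buffer except the last) and not at all in the patched one. A read that fits in a single buffer never stalls in either case, which is consistent with the problem only showing up once the sender fills the receive buffer.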