Date:      Sat, 1 Dec 2001 13:21:05 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Richard Sharpe <sharpe@ns.aus.com>
Cc:        freebsd-hackers@FreeBSD.ORG
Subject:   Patch #2 (was Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?))
Message-ID:  <200112012121.fB1LL5w36881@apollo.backplane.com>
References:  <20011128153817.T61580@monorchid.lemis.com> <15364.38174.938500.946169@caddis.yogotech.com> <20011128104629.A43642@walton.maths.tcd.ie> <5.1.0.14.1.20011130181236.00a80160@postamt1.charite.de> <200111302047.fAUKlT811090@apollo.backplane.com> <200111302130.fAULUU324648@apollo.backplane.com> <3C08CF9D.2030109@ns.aus.com>

    Richard (and others), please try this patch.  With this patch I
    get the following between two machines connected via a 100BaseTX
    switch (full duplex):

	----------------

    test1:/home/dillon/dbench> ./tbench 1 test2
    .1 clients started
    ..............+*
    Throughput 6.13925 MB/sec (NB=7.67406 MB/sec  61.3925 MBit/sec)  1 procs
    test1:/home/dillon/dbench> ./tbench 2 test2
    ..2 clients started
    ............................++**
    Throughput 8.37795 MB/sec (NB=10.4724 MB/sec  83.7795 MBit/sec)  2 procs

	----------------

    On localhost I get:

	----------------

    test1:/home/dillon/dbench> ./tbench 1 localhost
    .1 clients started
    ..............+*
    Throughput 25.7156 MB/sec (NB=32.1445 MB/sec  257.156 MBit/sec)  1 procs
    test1:/home/dillon/dbench> ./tbench 2 localhost
    ..2 clients started
    ............................++**
    Throughput 36.5428 MB/sec (NB=45.6785 MB/sec  365.428 MBit/sec)  2 procs
    test1:/home/dillon/dbench> 

	----------------

    This is WITHOUT changing the default send and receive TCP buffer
    sizes; they're both 16384.

    The bug I found is that when recv() is used with MSG_WAITALL,
    which is what tbench does, soreceive() will block waiting for the
    rest of the requested input WITHOUT ever calling
    pr->pr_usrreqs->pru_rcvd(), which means that if the sender filled up
    the receive buffer (16K default) the receiver will never send the
    ack that reopens the 0 window... that is, until the idle code takes
    over after 5 seconds.
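
    For anyone who wants to poke at this from userland, here is a rough
    sketch of the call pattern described above: one recv() with
    MSG_WAITALL asking for more data than fits in a 16K receive buffer,
    over loopback TCP. It is not taken from tbench; the helper name
    waitall_roundtrip() and the 4K send chunk size are made up for
    illustration. On a kernel with the bug, the call would be expected to
    stall for seconds; otherwise it should return promptly with the full
    count.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of tbench or the patch): send `total`
 * bytes over a loopback TCP connection from a child process and read
 * them in the parent with a single recv(..., MSG_WAITALL).
 * Returns the number of bytes received, or -1 on setup failure.
 */
static ssize_t
waitall_roundtrip(size_t total)
{
	struct sockaddr_in sin;
	socklen_t slen = sizeof(sin);
	ssize_t got = -1;
	pid_t pid;
	int lfd, cfd = -1, afd = -1;
	int rcvbuf = 16384;	/* the default size quoted in the message */
	char *buf;

	assert(total > 0);
	buf = malloc(total);
	lfd = socket(AF_INET, SOCK_STREAM, 0);
	if (buf == NULL || lfd < 0)
		goto out;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = 0;	/* let the kernel pick a free port */

	/* Set the size on the listener so the accepted socket inherits it. */
	setsockopt(lfd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf));
	if (bind(lfd, (struct sockaddr *)&sin, sizeof(sin)) < 0 ||
	    listen(lfd, 1) < 0 ||
	    getsockname(lfd, (struct sockaddr *)&sin, &slen) < 0)
		goto out;

	cfd = socket(AF_INET, SOCK_STREAM, 0);
	if (cfd < 0 || connect(cfd, (struct sockaddr *)&sin, sizeof(sin)) < 0)
		goto out;
	afd = accept(lfd, NULL, NULL);
	if (afd < 0)
		goto out;

	pid = fork();
	if (pid < 0)
		goto out;
	if (pid == 0) {
		/* Child: act as the sender, writing in 4K chunks. */
		char chunk[4096];
		size_t left = total;

		close(lfd);
		close(afd);
		memset(chunk, 'x', sizeof(chunk));
		while (left > 0) {
			size_t want = left < sizeof(chunk) ? left : sizeof(chunk);
			ssize_t n = send(cfd, chunk, want, 0);

			if (n <= 0)
				break;
			left -= (size_t)n;
		}
		_exit(left == 0 ? 0 : 1);
	}

	/* Parent: drop our copy of the sending end, then block in recv(). */
	close(cfd);
	cfd = -1;
	got = recv(afd, buf, total, MSG_WAITALL);
	waitpid(pid, NULL, 0);
out:
	if (afd >= 0)
		close(afd);
	if (cfd >= 0)
		close(cfd);
	if (lfd >= 0)
		close(lfd);
	free(buf);
	return got;
}
```

    Calling waitall_roundtrip(65536) and timing it is enough to tell a
    multi-second stall from a prompt return. The request is deliberately
    several times the receive buffer size so that the sender has to fill
    the buffer more than once.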

					-Matt

Index: uipc_socket.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/uipc_socket.c,v
retrieving revision 1.68.2.16
diff -u -r1.68.2.16 uipc_socket.c
--- uipc_socket.c	2001/06/14 20:46:06	1.68.2.16
+++ uipc_socket.c	2001/12/01 21:09:13
@@ -910,6 +910,14 @@
 		    !sosendallatonce(so) && !nextrecord) {
 			if (so->so_error || so->so_state & SS_CANTRCVMORE)
 				break;
+			/*
+			 * The window might have closed to zero, make
+			 * sure we send an ack now that we've drained
+			 * the buffer or we might end up blocking until
+			 * the idle takes over (5 seconds).
+			 */
+			if (pr->pr_flags & PR_WANTRCVD && so->so_pcb)
+				(*pr->pr_usrreqs->pru_rcvd)(so, flags);
 			error = sbwait(&so->so_rcv);
 			if (error) {
 				sbunlock(&so->so_rcv);




