From: Andre Oppermann <andre@freebsd.org>
Date: Mon, 12 May 2008 15:57:59 +0200
To: Tim Gebbett
Cc: freebsd-net@freebsd.org, Deng XueFeng, Mark Hills
Subject: Re: read() returns ETIMEDOUT on steady TCP connection

Tim Gebbett wrote:
> Hi Andre, I did some careful testing yesterday and last night. I seem to
> still be hitting an unknown buffer, although the problem is much
> alleviated. The system achieved a 7-hour run at 500 Mbit/s, at which
> point ETIMEDOUT occurred. I was feeding 11 other streams to the server,
> whose counters show an uninterrupted eleven hours. The feeder streams
> are from the same source, so it is unlikely that the one feeding the
> test could have had a problem without affecting the counters of the
> others.
>
> sysctls are:
>
> (loader.conf) hw.em.txd=4096
> net.inet.tcp.sendspace=78840
> net.inet.tcp.recvspace=78840
>
> kern.ipc.nmbjumbop=51200
> kern.ipc.nmbclusters=78840
> kern.maxfiles=50000
>
> IP stats are miraculously improved, going from 10% packet loss within
> the stack (output drops) to a consistent zero at peaks of 80000 pps.
> I believe the problem is now being shunted to the NIC, based on the
> following output:
>
> dev.em.0.debug=1
>
> < em0: Adapter hardware address = 0xc520b224
> < em0: CTRL = 0x48f00249 RCTL = 0x8002
> < em0: Packet buffer = Tx=16k Rx=48k
> < em0: Flow control watermarks high = 47104 low = 45604
> < em0: tx_int_delay = 66, tx_abs_int_delay = 66
> < em0: rx_int_delay = 0, rx_abs_int_delay = 66
> < em0: fifo workaround = 0, fifo_reset_count = 0
> < em0: hw tdh = 3285, hw tdt = 3285
> < em0: hw rdh = 201, hw rdt = 200
> < em0: Num Tx descriptors avail = 4096
> < em0: Tx Descriptors not avail1 = 4591225
> < em0: Tx Descriptors not avail2 = 0
> < em0: Std mbuf failed = 0
> < em0: Std mbuf cluster failed = 0
> < em0: Driver dropped packets = 0
> < em0: Driver tx dma failure in encap = 0
>
> dev.em.0.stats=1
>
> < em0: Excessive collisions = 0
> < em0: Sequence errors = 0
> < em0: Defer count = 0
> < em0: Missed Packets = 16581181
> < em0: Receive No Buffers = 74605555
> < em0: Receive Length Errors = 0
> < em0: Receive errors = 0
> < em0: Crc errors = 0
> < em0: Alignment errors = 0
> < em0: Collision/Carrier extension errors = 0
> < em0: RX overruns = 289717
> < em0: watchdog timeouts = 0
> < em0: XON Rcvd = 0
> < em0: XON Xmtd = 0
> < em0: XOFF Rcvd = 0
> < em0: XOFF Xmtd = 0
> < em0: Good Packets Rcvd = 848158221
> < em0: Good Packets Xmtd = 1080368640
> < em0: TSO Contexts Xmtd = 0
> < em0: TSO Contexts Failed = 0
>
> Does the counter 'Tx Descriptors not avail1' indicate the number of
> times no descriptors were available, and would this be symptomatic of
> something Mark suggested:
> "(the stack) needs to handle local buffer fills not as a failed attempt
> on transmission that increments the retry counter, a possible better
> strategy required for backoff when the hardware buffer is full?"

Indeed. We have to rethink a couple of assumptions the code currently
makes and has made for the longest time. Additionally, the defaults for
the network hardware need to be better tuned for workloads like yours.
I'm on my way to BSDCan'08 soon and will discuss these issues at the
Developer Summit.

--
Andre
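
Not part of the original exchange, but for context on the subject line: a
minimal userland sketch, assuming a plain blocking-socket reader much like
the test client described above, of how this failure mode surfaces to an
application. When the stack eventually gives up on the connection, the
pending error is delivered to the next read(), which returns -1 with errno
set to ETIMEDOUT even though the peer is still alive. The address, port,
and the connect_stream() helper are placeholders, not anything from the
thread.

#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Placeholder for however the test tool sets up its TCP connection;
 * the endpoint used below is made up.
 */
static int
connect_stream(const char *ip, in_port_t port)
{
	struct sockaddr_in sin;
	int s;

	s = socket(AF_INET, SOCK_STREAM, 0);
	if (s < 0)
		return (-1);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	sin.sin_addr.s_addr = inet_addr(ip);
	if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
		close(s);
		return (-1);
	}
	return (s);
}

int
main(void)
{
	char buf[65536];
	ssize_t n;
	int s;

	s = connect_stream("192.0.2.10", 5001);		/* made-up endpoint */
	if (s < 0) {
		perror("connect");
		return (1);
	}
	for (;;) {
		n = read(s, buf, sizeof(buf));
		if (n > 0)
			continue;			/* steady stream: keep reading */
		if (n == 0) {
			fprintf(stderr, "peer closed connection\n");
			break;
		}
		if (errno == ETIMEDOUT) {
			/*
			 * The local stack dropped the connection; the peer
			 * never closed it.  Log and reconnect (and go look
			 * at the driver/stack counters as above).
			 */
			fprintf(stderr, "read: ETIMEDOUT, reconnecting\n");
			close(s);
			s = connect_stream("192.0.2.10", 5001);
			if (s < 0) {
				perror("reconnect");
				return (1);
			}
			continue;
		}
		perror("read");
		break;
	}
	close(s);
	return (0);
}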