From owner-freebsd-net@FreeBSD.ORG Fri Dec 28 04:36:47 2007
Date: Fri, 28 Dec 2007 15:36:39 +1100 (EST)
From: Bruce Evans
To: Mark Fullmer
Cc: Kostik Belousov, freebsd-net@FreeBSD.org, freebsd-stable@FreeBSD.org
Subject: Re: Packet loss every 30.999 seconds
Message-ID: <20071228143411.C3587@besplex.bde.org>
In-Reply-To: <985A3F99-B3F4-451E-BD77-E2EB4351E323@eng.oar.net>
References: <20071221234347.GS25053@tnn.dglawrence.com>
 <20071222050743.GP57756@deviant.kiev.zoral.com.ua>
 <20071223032944.G48303@delplex.bde.org>
 <985A3F99-B3F4-451E-BD77-E2EB4351E323@eng.oar.net>

On Sat, 22 Dec 2007, Mark Fullmer wrote:

> On Dec 22, 2007, at 12:08 PM, Bruce Evans wrote:
>>
>> I still don't understand the original problem, that the kernel is not
>> even preemptible enough for network interrupts to work (except in 5.2
>> where Giant breaks things).  Perhaps I misread the problem, and it is
>> actually that networking works but userland is unable to run in time
>> to avoid packet loss.
>
> The test is done with UDP packets between two servers.  The em
> driver is incrementing the received packet count correctly but
> the packet is not making it up the network stack.  If the
> application was not servicing the socket fast enough I would
> expect to see the "dropped due to full socket buffers"
> (udps_fullsock) counter incrementing, as shown by netstat -s.

I couldn't see any sign of PREEMPTION not working in 6.3-PRERELEASE.
em seemed to keep up with the maximum rate that I can easily generate
(640 kpps with tiny UDP packets), though it cannot transmit at more
than 400 kpps on the same hardware.  This is without any syncer
activity to cause glitches.  The rest of the system couldn't keep up,
and with my normal configuration of net.isr.direct=1, systat -ip
(udps_fullsock) showed too many packets being dropped, but all the
numbers seemed to add up right.  (I didn't do end-to-end packet
counts.  I'm using ttcp to send and receive packets; the receiver
loses so many packets that it rarely terminates properly, and when it
does terminate it always shows many dropped.)
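For reference, the "dropped due to full socket buffers" counter that
netstat -s and systat -ip report is just the udps_fullsock member of
struct udpstat, so a test program can poll it directly through the
net.inet.udp.stats sysctl around each run.  A rough, untested sketch,
assuming the usual <netinet/udp_var.h> layout:

/*
 * Read the same UDP statistics that netstat -s reports, via the
 * net.inet.udp.stats sysctl, and print udps_fullsock (packets
 * dropped because the receiving socket buffer was full).
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/sysctl.h>

#include <netinet/in.h>
#include <netinet/in_systm.h>
#include <netinet/ip.h>
#include <netinet/ip_var.h>
#include <netinet/udp.h>
#include <netinet/udp_var.h>

#include <err.h>
#include <stdio.h>

int
main(void)
{
	struct udpstat udpstat;
	size_t len = sizeof(udpstat);

	if (sysctlbyname("net.inet.udp.stats", &udpstat, &len,
	    NULL, 0) == -1)
		err(1, "sysctlbyname(net.inet.udp.stats)");
	printf("udps_fullsock: %lu\n",
	    (unsigned long)udpstat.udps_fullsock);
	return (0);
}

Diffing two reads taken before and after a ttcp run gives a per-run
drop count without having to watch systat.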
However, with net.isr.direct=0, packets are dropped with no sign of
the problem except a reduced count of good packets in systat -ip:

Packet rate counter      net.isr.direct=1     net.isr.direct=0
-------------------      ----------------     ----------------
netstat -I               639042               643522 (faster later)
systat -ip (total rx)    639042               382567 (dropped many b4 here)
(UDP total)              639042               382567
(udps_fullsock)          298911               70340
(diff of prev 2)         340031               312227 (300+k always dropped)
net.isr.count            small                large (seems to be correct 643k)
net.isr.directed         large (correct?)     no change
net.isr.queued           0                    0
net.isr.drop             0                    0

net.isr.direct=0 is apparently causing dropped packets without even
counting them.  However, the drop seems to be below the netisr level.

More worryingly, with full 1500-byte packets (1472 data + 28 bytes of
UDP/IP header), packets can be sent at a rate of 76 kpps (nearly 950
Mbps) with a load of only 80% on the receiver, yet the ttcp receiver
still drops about 1000 pps due to "socket buffer full".  With
net.isr.direct=0 it drops an additional 700 pps due to this.  Glitches
from sync(2) taking 25 ms increase the loss by about 1000 packets, and
using rtprio for the ttcp receiver doesn't seem to help at all.

In previous mail, you (Mark) wrote:

# With FreeBSD 4 I was able to run a UDP data collector with rtprio set,
# kern.ipc.maxsockbuf=20480000, then use setsockopt() with SO_RCVBUF
# in the application.  If packets were dropped they would show up
# with netstat -s as "dropped due to full socket buffers".
#
# Since the packet never makes it to ip_input() I no longer have
# any way to count drops.  There will always be corner cases where
# interrupts are lost and drops not accounted for if the adapter
# hardware can't report them, but right now I've got no way to
# estimate any loss.

I tried using SO_RCVBUF in ttcp (it's an old version of ttcp that
doesn't have an option for this).  With the default kern.ipc.maxsockbuf
of 256K, this didn't seem to help.  20MB should work better :-) but I
didn't try that.  I don't understand how fast the socket buffer fills
up; I would have thought that 256K was enough for tiny packets but not
for 1500-byte packets.

There seems to be a general problem: 1Gbps NICs have, or should have,
rings of size >= 256 or 512 so that they aren't forced to drop packets
when their interrupt handler has a reasonable but larger latency, yet
if we actually use this feature then we flood the upper layers with
hundreds of packets and fill up socket buffers etc. there.

Bruce
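PS: Setting SO_RCVBUF as described above is just an ordinary
setsockopt() on the receiving socket before traffic starts, with
kern.ipc.maxsockbuf raised first so that the request isn't rejected.
A rough, untested sketch; the udp_bigbuf_socket() helper name, the
port argument and the buffer size are only for illustration:

/*
 * Open a UDP socket with an enlarged receive buffer.  The SO_RCVBUF
 * request fails with ENOBUFS unless kern.ipc.maxsockbuf has been
 * raised well above the requested size (Mark used
 * kern.ipc.maxsockbuf=20480000).
 */
#include <sys/types.h>
#include <sys/socket.h>

#include <netinet/in.h>

#include <err.h>
#include <string.h>

int
udp_bigbuf_socket(u_short port, int rcvbuf)
{
	struct sockaddr_in sin;
	int s;

	if ((s = socket(AF_INET, SOCK_DGRAM, 0)) == -1)
		err(1, "socket");
	/* Must be done before packets start arriving. */
	if (setsockopt(s, SOL_SOCKET, SO_RCVBUF, &rcvbuf,
	    sizeof(rcvbuf)) == -1)
		err(1, "setsockopt(SO_RCVBUF)");

	memset(&sin, 0, sizeof(sin));
	sin.sin_len = sizeof(sin);
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	sin.sin_addr.s_addr = htonl(INADDR_ANY);
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1)
		err(1, "bind");
	return (s);
}

Of course this only helps with the udps_fullsock drops; it does
nothing for packets that never reach ip_input() in the first place,
which is the case being discussed here.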