From: Bruce Evans
Date: Thu, 20 Dec 2007 03:49:45 +1100 (EST)
To: David G Lawrence
Cc: freebsd-net@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: Packet loss every 30.999 seconds

On Wed, 19 Dec 2007, David G Lawrence wrote:

>> Debugging shows that the problem is like I said.  The loop really does
>> take 125 ns per iteration.  This time is actually not very much.  The
>
>    Considering that the CPU clock cycle time is on the order of 300ps, I
> would say 125ns to do a few checks is pathetic.

As I said, 125 nsec is a short time in this context.  It is approximately
the time for a single L2 cache miss on a machine with slow memory like
freefall (Xeon 2.8 GHz with L2 cache latency of 155.5 ns).  As I said,
the code is organized so as to give about 4 L2 cache misses per vnode
if there are more than a few thousand vnodes, so it is doing very well
to take only 125 nsec for a few checks.

>    In any case, it appears that my patch is a no-op, at least for the
> problem I was trying to solve. This has me confused, however, because at
> one point the problem was mitigated with it. The patch has gone through
> several iterations, however, and it could be that it was made to the top
> of the loop, before any of the checks, in a previous version. Hmmm.

The patch should work fine.  IIRC, it yields voluntarily so that other
things can run.  I committed a similar hack for uiomove().  It was
easy to make syscalls that take many seconds (now tenths of seconds
instead of seconds?), and without yielding or PREEMPTION or multiple
CPUs, everything except interrupts has to wait for these syscalls.  Now
the main problem is to figure out why PREEMPTION doesn't work.  I'm
not working on this directly since I'm running ~5.2 where nearly-full
kernel preemption doesn't work due to Giant locking.

Bruce
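
The voluntary yielding described above can be sketched in user space.
This is only an illustration, not the patch under discussion: it uses
POSIX sched_yield() as a stand-in for the kernel's yield, and the names
struct item, scan_with_yield() and YIELD_INTERVAL are hypothetical.  The
batch size of 500 is an assumption; at 125 ns per iteration it would
mean roughly 62.5 us of uninterrupted work between yields.

```c
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-vnode state that the loop checks. */
struct item {
    int dirty;
};

/* Assumed batch size; not a value taken from the thread. */
#define YIELD_INTERVAL 500

/*
 * Scan all items, giving up the CPU after every YIELD_INTERVAL
 * iterations so that other runnable threads are not starved by one
 * long loop.  Returns the number of dirty items and stores the number
 * of yields performed in *yields.
 */
size_t
scan_with_yield(const struct item *items, size_t n, size_t *yields)
{
    size_t dirty = 0;

    *yields = 0;
    for (size_t i = 0; i < n; i++) {
        if (items[i].dirty)
            dirty++;
        if ((i + 1) % YIELD_INTERVAL == 0) {
            sched_yield();      /* let something else run */
            (*yields)++;
        }
    }
    return dirty;
}
```

Where the yield sits relative to the checks matters: if it is only
reached on a path that the common case skips, the loop can still run
through every item without ever yielding.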