From: Scott Long
Date: Wed, 27 Sep 2006 15:20:02 -0600
To: David G Lawrence
Cc: Peter Jeremy, freebsd-stable@freebsd.org, Oliver Brandmueller, John Baldwin
Subject: Re: 6.2 SHOWSTOPPER - em completely unusable on 6.2
Message-ID: <451AEB02.2090806@samsco.org>
In-Reply-To: <20060927210349.GG14975@tnn.dglawrence.com>

David G Lawrence wrote:
>>In the past (RELENG_5) I've had major problems with syncer delaying
>>interrupt threads for long periods (I've seen 8msec).  See
>>http://lists.freebsd.org/pipermail/freebsd-stable/2005-February/012346.html
>>I'm not sure if this is still a problem (but I am still having some
>>problems which may be caused by excessive interrupt and will be doing
>>some debugging as I get time).
>
> ...
>
>>tool and then post-process the file looking for oddities.  In my case,
>>there was a _very_ high correlation between long latencies and syncer.
>>If anyone's interested in this approach, I can provide the relevant
>>code diffs.
>
>
>    I've seen this problem as well - results in around 9-10ms of occasional
> scheduling delay for a real-time streaming application that I'm developing.
> Shutting off softupdates on all of the mounted filesystems helps.
>    Note that the watchdog timeout for the network drivers is usually 8000ms
> (8 seconds), so this is unlikely to be related to that problem.
>

Well, I kinda danced around the issue before, but I'll say it now.  I,
as well as a few others, have seen instances of Giant being held by the
syncer for 5 or more seconds at a time.  I can't explain why, and I've
never been able to catch it in the act in a meaningful way.  But it is
known to happen.

My best wild guess is that the syncer is doing a lot of work (there is
no question here), and keeps on getting preempted, and as part of this,
it blocks without locks being dropped.

Actually, this is most likely exactly what is going on.  The syncer is
sending out I/O and is getting interrupted+preempted by the sata
controller+driver, and it winds up making very slow progress, while
never actually releasing Giant.  An easy way to test this would be to
turn off preemption.  Could someone with this problem remove the
'option PREEMPTION' line in their kernel config and recompile/retest?
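
For anyone running that test, a minimal sketch of the change, assuming a
custom kernel config copied from GENERIC.  The file name MYKERNEL and the
i386 path are placeholders, not something from this thread; in the stock
config the keyword is spelled "options":

```
# /usr/src/sys/i386/conf/MYKERNEL  (hypothetical name and arch)
# Comment out preemption for the test build:
#options 	PREEMPTION		# Enable kernel thread preemption
```

Then rebuild and install from /usr/src with the usual
"make buildkernel KERNCONF=MYKERNEL" and
"make installkernel KERNCONF=MYKERNEL", reboot, and retest.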

If this is in fact the root cause, then it indeed has nothing to do with
the em driver INTR_FAST changes.  The easiest fix then becomes the ichsmb
and USB driver shims that I talked about.  The longer-term fix is to
continue progress on making the syncer run without Giant and also not do
so much work.  I think that there should also be some discussion on the
locking consequences of preemption.

Scott