From: John Baldwin
To: Nate Lawson
Cc: cvs-src@freebsd.org, Darren Reed, src-committers@freebsd.org, cvs-all@freebsd.org
Date: Wed, 2 May 2007 13:14:14 -0400
Subject: Re: cvs commit: src/sys/kern kern_intr.c src/sys/sys interrupt.h
Message-Id: <200705021314.15733.jhb@freebsd.org>
In-Reply-To: <4638BE29.1020505@root.org>
References: <200705020615.l426FDo7015874@repoman.freebsd.org> <4638BAC9.7000603@root.org> <4638BE29.1020505@root.org>

On Wednesday 02 May 2007 12:36:57 pm Nate Lawson wrote:
> Nate Lawson wrote:
> > John Baldwin wrote:
> >> On Wednesday 02 May 2007 03:07:07 am Darren Reed wrote:
> >>> On Wed, May 02, 2007 at 06:15:13AM +0000, Nate Lawson wrote:
> >>>> njl 2007-05-02 06:15:13 UTC
> >>>>
> >>>> FreeBSD src repository
> >>>>
> >>>> Modified files: (Branch: RELENG_6)
> >>>> sys/kern kern_intr.c
> >>>> sys/sys interrupt.h
> >>>> Log:
> >>>> MFC: rate-check the interrupt storm message and bump the counter 500 -> 1000
> >>> Is this number, "500" or "1000" somehow "magical" for modern hardware?
> >>>
> >>> If I had a 500MHZ, 1GHz, 1.5GHz, 2GHz, 2.5GHz machines, each with the
> >>> appropriate architecture, what would the correct value for this be?
> >>> Is i always 1000 or should it be calculated?
> >> It's a SWAG and tunable for machines where it doesn't work. In practice the
> >> old setting seemed to be a bit too trigger-happy as I know my printer always
> >> triggered it, for example.
> >>
> >
> > There's more to it than just your Ghz number. It's a counter of the
> > number of times an interrupt has triggered while the previous one was
> > being serviced. The faster your kernel, the lower the number could be.
> >
> > I have a slow early SMP Celeron system with a dc(4) adapter with 4 ports
> > sharing an irq with my ata. At 3 am, the nightly script kicks off
> > enough IO that it triggers a bug in my dc(4) card that causes it to mask
> > the interrupt too long. Then, the irq storm suppression logic kicked
> > in, causing ata to timeout the request. The drive is on a mirror so I'd
> > lose half the mirror, then rebuild in the morning. With this value
> > bumped, I don't have that problem any more but the real issue is why
> > dc(4) is being so quirky under heavy shared irq load.
> >
> > This is on 6.x btw. Is there any reason why our retries is so low?
> > sys/dev/ata/ata-disk.c: request->retries = 2;

At work we up the timeout from 5 to 30, but we leave retries at 2.

> Note that I still got a timeout but it succeeded without error. I think
> this is a combination of the dc(4) and highpoint hpt366 driver
> interaction. dc(4) is probably holding Giant or something too long and
> ata is being too sensitive to the slow hw.

Neither dc(4) nor ata(4) hold Giant, FWIW.

-- 
John Baldwin
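
For readers following the thread, here is a rough user-space sketch of the mechanism discussed above: a counter of interrupts that fire again while the previous one is still being serviced, a threshold (the 1000 from the commit log), and a rate-checked warning. It is not the kern_intr.c code. Every name in it (`storm_state`, `storm_init`, `storm_check`, `STORM_THRESHOLD`) is hypothetical, and the one-message-per-second limit is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical names throughout; this is an illustration, not kernel code. */
#define STORM_THRESHOLD 1000u   /* the "500 -> 1000" value from the commit log */

struct storm_state {
    unsigned count;       /* interrupts seen back to back, with no idle gap */
    long last_warn_sec;   /* second in which a warning was last printed, -1 if never */
    unsigned warnings;    /* number of warnings actually printed */
};

static void storm_init(struct storm_state *s)
{
    assert(s != NULL);
    s->count = 0;
    s->last_warn_sec = -1;
    s->warnings = 0;
}

/*
 * Call once per serviced interrupt.  'pending' is true when the source
 * fired again while the previous interrupt was still being serviced.
 * Returns true when the caller should throttle the source.
 */
static bool storm_check(struct storm_state *s, bool pending, long now_sec)
{
    assert(s != NULL);

    if (!pending) {
        /* The source went quiet, so the run of back-to-back interrupts ends. */
        s->count = 0;
        return false;
    }
    if (s->count < STORM_THRESHOLD) {
        s->count++;
        return false;
    }
    /* Over the threshold: warn at most once per second (assumed limit). */
    if (s->last_warn_sec != now_sec) {
        s->last_warn_sec = now_sec;
        s->warnings++;
        fprintf(stderr, "interrupt storm detected; throttling source\n");
    }
    return true;
}
```

With a fixed threshold like this, a slow machine sharing one irq between several devices reaches the limit sooner than a fast one, which is why the value is a tunable guess rather than something derived from clock speed.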