From owner-freebsd-scsi@FreeBSD.ORG Mon Apr 10 18:04:38 2006
From: John Baldwin
To: Oleg Sharoiko
Cc: freebsd-scsi@freebsd.org, Andrey Beresovsky
Date: Mon, 10 Apr 2006 14:01:10 -0400
Subject: Re: Boot hangs on ips0: resetting adapter, this may take up to 5 minutes
Message-Id: <200604101401.12479.jhb@freebsd.org>
In-Reply-To: <20060406223724.S1099@wolf.os.rsu.ru>
References: <20060215102749.D58480@brain.cc.rsu.ru> <20060328201134.S763@brain.cc.rsu.ru> <20060406223724.S1099@wolf.os.rsu.ru>
List-Id: SCSI subsystem

On Thursday 06 April 2006 15:07, Oleg Sharoiko wrote:
> Hi, that's me again.
>
> John, I've got more information on my problem:
>
> It looks like the mis-routed interrupt is the one from ips. In my kernel
> ips is on vector 49 and bge is on vector 60. I've added
>
> 	if (vector == 60)
> 		vector = 49;
>
> to sys/amd64/amd64/local_apic.c and I have no more interrupt storm until
> bge really generates an interrupt. Am I right with my conclusion about ips
> interrupt being mis-directed to bge?

Well, the vector is the wrong thing to mess with, as vectors are IDT entries.

> There's also another interesting point: it looks like ips triggers
> interrupt on both vectors (49 and 60 - irq 28 and irq 16). Why do I think
> so?

This happens in several machines with Intel server chipsets due to a bug in
the PXH host bridges with no real workaround.

> 1. ips works fine even when there's no bge in kernel (I suppose irq 16 is
> not activated in this case). I suppose this should mean that interrupts
> are properly delivered to ips driver.
>
> 2. I've added debug printf to bge_intr and in single mode when preemption
> is disabled I see exactly the same number of interrupts delivered to ips
> (checked counters with showintrcnt) and to bge (incorrectly delivered -
> bge is not in UP state and bge registers say "no interrupt").
>
> This seems really strange to me, how can this be possible? Is there any
> way to fix this?

One thing you can do w/o hacking the code is to reroute ips0 to IRQ 16.
Find the dmesg line for ips0; it should say something like:

ips0 <...> ... at device 4.0 on pci2

These numbers (4 from '4.0' and 2 from 'pci2') are the slot and bus for ips0.
We'll assume INTA is being used, as single-function cards use INTA. Then, set
a tunable like so in the loader to force ips0 to use IRQ 16:

'set hw.pci2.4.INTA.irq=16'

This may not work for 6.0 but should work for 6.1 and later.

-- 
John Baldwin <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
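
A minimal sketch of keeping the tunable from the reply above across reboots, assuming the bus and slot from the example dmesg line (pci2, device 4.0), that INTA is the pin in use, and that the running release honors the tunable. A value entered with `set` at the loader prompt applies only to that boot; /boot/loader.conf is the usual place for a persistent setting. The bus and slot here are only the example values and must be replaced with the ones from your own dmesg output.

```
# /boot/loader.conf
# Assumption: ips0 is at slot 4 on bus pci2 and uses INTA (example values).
# Name format as used in the reply: hw.pci<bus>.<slot>.INT<pin>.irq
hw.pci2.4.INTA.irq="16"
```

After a reboot, the ips0 line in dmesg and the output of `vmstat -i` should show which IRQ the controller actually received.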