Date: Mon, 5 Dec 2005 19:51:29 -0600
From: Craig Boston
To: freebsd-hackers@freebsd.org
Subject: Re: Weird PCI interrupt delivery problem

On Sat, Dec 03, 2005 at 06:41:31PM -0600, Craig Boston wrote:
> I'll keep hacking on it and follow up here if I figure anything out.

Following up, I have made some interesting progress.

With the ACPI timer disabled (debug.acpi.disabled=timer), the ACPI+APIC case now behaves the same as the plain APIC case. Each IRQ gets anywhere from 10,000-500,000 interrupts before it simply stops working.

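For anyone trying to catch the moment an IRQ stops counting, a minimal sketch follows. It compares two snapshots of `vmstat -i` style output and lists the IRQs whose totals did not advance. The sample text, the function names, and the assumed column layout ("irqN: devices total rate") are illustrative assumptions, not output from the machine discussed here.

```python
import re

# Assumed line layout: "irq11: rl0 uhci0    50000    12"
# (source name, attached devices, total count, rate).
LINE_RE = re.compile(r"^(irq\d+):\s+(.*?)\s+(\d+)\s+(\d+)\s*$")

# Made-up snapshots for illustration only.
SAMPLE_BEFORE = """\
interrupt                          total       rate
irq11: rl0                            48          0
irq16: uhci0                       10000         25
Total                              10048         25
"""

SAMPLE_AFTER = """\
interrupt                          total       rate
irq11: rl0                            48          0
irq16: uhci0                       25000         30
Total                              25048         30
"""


def parse_vmstat_i(text):
    """Return {irq_name: total_count}; header and Total lines are skipped."""
    totals = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            totals[m.group(1)] = int(m.group(3))
    return totals


def stalled_irqs(before, after):
    """Return IRQs present in both snapshots whose total did not increase."""
    return sorted(
        irq for irq in before
        if irq in after and after[irq] <= before[irq]
    )


if __name__ == "__main__":
    stalled = stalled_irqs(parse_vmstat_i(SAMPLE_BEFORE),
                           parse_vmstat_i(SAMPLE_AFTER))
    print("IRQs with no new interrupts:", stalled)
```

A stalled counter only means that no interrupts were counted between the two snapshots. An idle device looks the same, so the comparison is only meaningful while the device is being exercised.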
Switching the timecounter from ACPI-fast to something else after boot also improves the situation, but not as much as disabling the timer entirely. IRQs then last only up to about 50,000 interrupts, which is still better than the <50 they would get otherwise. Switching the timecounter does not bring back any IRQs that have already "died".

Disabling the timer does not change the +ACPI -APIC case, but I've been experimenting with that mode of operation and discovered it's not quite as it first appeared. It's difficult to tell which device is producing interrupts since they all go to IRQ 11. I didn't notice before, but with +ACPI -APIC, the USB controller works fine (indefinitely).

Also, I'm not 100% sure, but I don't think the devices on pci9 are producing _ANY_ interrupts at all. With the APIC enabled, rl0 sometimes lasts long enough to get a lease, but with it disabled it has yet to manage that. Also, the irq 11 count looks too low for multiple devices.

I compared the entire PCI configuration space for the bridge with ACPI enabled and disabled, and they were identical. The only thing that struck me as suspect is that the secondary status register for the bridge has the received master abort bit set. However, that happens even when things are working, so I'm not sure it's relevant.

I get the same results after "fixing" the _PRT for bus 9. I tried both hardcoding the interrupts to 0xb and routing them via various link devices, with no luck either way.

That's all really academic; I suspect ACPI+APIC is the only configuration that has a chance of actually working, and I'm halfway there...

Craig