Date: Mon, 23 Jan 2006 20:25:11 -0600
From: Craig Boston
To: John Baldwin
Cc: freebsd-hackers@freebsd.org, Scott Long
Subject: Re: Weird PCI interrupt delivery problem (resolution, sort of)

On Fri, Jan 20, 2006 at 03:42:21PM -0500, John Baldwin wrote:
> On Thu, Jan 19, 2006 at 10:17:39PM -0700, Scott Long wrote:
> > This points to a bus coherency problem.  I wonder if your BIOS is
> > incorrectly setting the memory region of the apics as cachable.  You'll
> > want to bug Baldwin about this.
>
> Hmm, well, you can actually try the PAT patch if you are feeling brave as it
> maps all devices (including APICs) as uncacheable.

Tried the updated PAT patch (with s/pmap_unmapbios/pmap_unmap_bios/ to
get ACPI to compile).  Unfortunately, if it is a caching problem, PAT
isn't able to fix it.  Same result as the stock kernel: interrupts stop
arriving after a dozen or so.

AFAICT the local APIC is the only memory-mapped I/O region that seems to
be problematic.

Instead of writing the value twice, I also tried inserting an
__asm("nop") before the write, with no effect.  Also, a single write to
an unrelated area doesn't help:

+static volatile int dummyeoi;
+
 lapic_eoi(void)
 {
+	dummyeoi = 1;
 	lapic->eoi = 0;
+	dummyeoi = 2;
 }

I'm _reasonably_ certain that marking dummyeoi volatile and leaving it
uninitialized will prevent gcc from optimizing that out.  Forcing R/W
cycles (++dummyeoi) before and after doesn't work either.

A DELAY(1) before the lapic->eoi write does the trick, but DELAY does
lots of complicated things, so I don't know how useful a data point
that is.

I'm probably missing something, but if bad cache behavior was causing
writes to the lapic EOI register to not always take effect, wouldn't the
_next_ irq (even if it's a different line) cause the one that's
currently pending to be acknowledged?

Craig
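
A rough way to poke at that closing question is a toy user-space model of
the in-service bookkeeping. This is only a sketch under stated
assumptions: it assumes an EOI write clears the highest-priority
in-service bit, and that a vector is dispatched only when its priority
class (vector >> 4) is above the class of whatever is already in
service. The TPR and the IRR are ignored, and every name here
(mock_lapic, mock_deliver, mock_eoi, the "lost" flag, the vector
numbers) is hypothetical and not taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct mock_lapic {
	uint32_t isr[8];	/* in-service bitmap, one bit per vector 0..255 */
};

static void
mock_init(struct mock_lapic *l)
{
	memset(l, 0, sizeof(*l));
}

/* Highest vector currently marked in service, or -1 if there is none. */
static int
mock_highest_in_service(const struct mock_lapic *l)
{
	for (int v = 255; v >= 0; v--)
		if (l->isr[v / 32] & (UINT32_C(1) << (v % 32)))
			return (v);
	return (-1);
}

/*
 * Try to dispatch a vector.  Assumption: it is accepted only if its
 * priority class is strictly above the class already in service.
 * Returns 1 if dispatched, 0 if it would be held back.
 */
static int
mock_deliver(struct mock_lapic *l, int vector)
{
	int cur;

	assert(vector >= 0 && vector <= 255);
	cur = mock_highest_in_service(l);
	if (cur >= 0 && (vector >> 4) <= (cur >> 4))
		return (0);
	l->isr[vector / 32] |= UINT32_C(1) << (vector % 32);
	return (1);
}

/* An EOI clears the highest in-service bit, unless the write is "lost". */
static void
mock_eoi(struct mock_lapic *l, int lost)
{
	int cur;

	if (lost)
		return;
	cur = mock_highest_in_service(l);
	if (cur >= 0)
		l->isr[cur / 32] &= ~(UINT32_C(1) << (cur % 32));
}

/*
 * Scenario: the EOI for vector 0x41 is lost, then a higher-priority
 * vector 0x51 is serviced and acknowledged normally.  Returns the
 * vector still in service afterwards.
 */
static int
mock_lost_eoi_scenario(void)
{
	struct mock_lapic l;

	mock_init(&l);
	mock_deliver(&l, 0x41);
	mock_eoi(&l, 1);	/* this write never takes effect */
	mock_deliver(&l, 0x51);
	mock_eoi(&l, 0);	/* clears 0x51, the highest bit, not 0x41 */
	return (mock_highest_in_service(&l));
}
```

In this model a later interrupt only rescues the stuck one in limited
cases. A vector in the same or a lower priority class is never
dispatched, so its handler never issues an EOI at all; a vector in a
higher class is dispatched, but its EOI clears its own bit and leaves
the stuck one in place. Only an extra EOI write with nothing else in
service clears it. Whether the real hardware in this thread behaves
this way is exactly the open question, so treat it as a thinking aid
rather than a diagnosis.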