From owner-freebsd-stable Mon Apr 30 15:31: 6 2001
Delivered-To: freebsd-stable@freebsd.org
Received: from smtp10.phx.gblx.net (smtp10.phx.gblx.net [206.165.6.140])
	by hub.freebsd.org (Postfix) with ESMTP
	id 32A0F37B43C; Mon, 30 Apr 2001 15:31:00 -0700 (PDT)
	(envelope-from tlambert@usr01.primenet.com)
Received: (from daemon@localhost)
	by smtp10.phx.gblx.net (8.9.3/8.9.3) id PAA23744;
	Mon, 30 Apr 2001 15:30:53 -0700
Received: from usr01.primenet.com(206.165.6.201)
	via SMTP by smtp10.phx.gblx.net, id smtpdLhwREa; Mon Apr 30 15:30:47 2001
Received: (from tlambert@localhost)
	by usr01.primenet.com (8.8.5/8.8.5) id PAA12638;
	Mon, 30 Apr 2001 15:31:39 -0700 (MST)
From: Terry Lambert
Message-Id: <200104302231.PAA12638@usr01.primenet.com>
Subject: Re: Spontanous reboot of SMP system and FBSD 4.3
To: jhb@FreeBSD.ORG (John Baldwin)
Date: Mon, 30 Apr 2001 22:31:34 +0000 (GMT)
Cc: tlambert@primenet.com (Terry Lambert), freebsd-smp@FreeBSD.ORG,
	freebsd-stable@FreeBSD.ORG,
	ohartman@klima.physik.uni-mainz.de ((Hartmann O.))
In-Reply-To: from "John Baldwin" at Apr 30, 2001 03:18:46 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> > On some of the Tyan Tiger boards I've fought with, the system
> > becomes extremely unstable because the invltlb() call after
> > switching to 4M pages is not correctly communicated to the I/O
> > APIC, and unless you have enough memory allocations to force all
> > 8 4M entries out, it can lose its mind, relative to the TLB of
> > one or more of the main CPUs.
>
> Ummm, what the heck are you talking about?  The I/O APIC routes interrupts, it
> doesn't access memory.

I thought all APICs participated in the MESI coherency protocol,
and all had the TLB caches?

The system I fought with was "SMP capable", but did not have a
second CPU installed.
The only things that could get out of date would be the TLB
contents in the cache lines.

If I grab enough resources to cause age-based TLB shootdown, I
don't have the problem.  Otherwise I fault in the I/O path, when
paging in.  Supermicro motherboards don't have the problem.

Setting the "DISABLE_PSE" option makes everything happy again;
I suspect forcing a shootdown via a reload of CR3, or explicitly
invalidating the GDT, would fix my problem.  Right now, I just
up my initial resource consumption and ignore it.

Without this, I get similar symptoms to what he's reporting.

Oh... he will probably want to turn on INVARIANTS, too, and then
convert the printf() in kern/kern_malloc.c in malloc() to actually
panic for "Data modified on freelist", instead of continuing to
run to a cascade failure resulting in a spontaneous reboot...


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message
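
For reference, the two kernel options named in the message would go
into a 4.x kernel configuration file roughly as follows.  This is a
sketch and not part of the original post: MYKERNEL is a placeholder
name, and INVARIANT_SUPPORT is included on the assumption that
INVARIANTS needs it, as the LINT file of that era notes.

```
# Fragment of a kernel config file, e.g. /usr/src/sys/i386/conf/MYKERNEL
# (MYKERNEL is a placeholder; use your own config file name)
options 	DISABLE_PSE		# do not use 4M pages (PSE)
options 	INVARIANTS		# extra run-time sanity checks
options 	INVARIANT_SUPPORT	# support code assumed to be required by INVARIANTS
```

The kernel has to be rebuilt and installed before either option takes
effect.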