From owner-freebsd-alpha Sun Oct 24 12:47:10 1999
Delivered-To: freebsd-alpha@freebsd.org
Received: from duke.cs.duke.edu (duke.cs.duke.edu [152.3.140.1])
	by hub.freebsd.org (Postfix) with ESMTP id 3CD7714BD7
	for ; Sun, 24 Oct 1999 12:47:06 -0700 (PDT)
	(envelope-from gallatin@cs.duke.edu)
Received: from grasshopper.cs.duke.edu (grasshopper.cs.duke.edu [152.3.145.30])
	by duke.cs.duke.edu (8.9.1/8.9.1) with ESMTP id PAA01697;
	Sun, 24 Oct 1999 15:47:05 -0400 (EDT)
Received: (from gallatin@localhost)
	by grasshopper.cs.duke.edu (8.9.3/8.9.1) id PAA13411;
	Sun, 24 Oct 1999 15:46:35 -0400 (EDT)
	(envelope-from gallatin@cs.duke.edu)
From: Andrew Gallatin
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Date: Sun, 24 Oct 1999 15:46:35 -0400 (EDT)
To: Aernoudt Bottemanne
Cc: "freebsd-alpha@freebsd.org" , marcel@scc.nl
Subject: Re: buildworld problem + received processor correctable error message on PWS433au
In-Reply-To: <3813170B.5A22F06F@capitolonline.nl>
References: <3813170B.5A22F06F@capitolonline.nl>
X-Mailer: VM 6.43 under 20.4 "Emerald" XEmacs Lucid
Message-ID: <14355.24089.447439.7651@grasshopper.cs.duke.edu>
Sender: owner-freebsd-alpha@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Aernoudt Bottemanne writes:
 > Hi,
 >
 >
 > Now that the irq problem is fixed, I tried to build a new world:
 > (make -j 32 buildworld > make.out 2>&1 )
 >
 > It starts the job, but somewhere along the way it stops. The machine
 > does not hang, I can login on other consoles etc, but the make process
 > does not continue. During compilation I get these messages:
 > "received processor correctable errors" From the mailing list archives
 > I found that Andrew already mentioned them before, as a hardware
 > problem with ECC memory (eg ECC memory not being in good shape)

This is probably a red herring.

 > In the make.out however there is no error message on the last line, in
 > order to indicate what the problem could be.

Did the make/cc/cpp callchain die? Is it a zombie? If it is not dead,
what state is it in? Are there any jobs with a WCHAN of objtrm? (To see,
use ps axl, or break into the debugger & do a ps if ps doesn't work
because of a kernel/userland mismatch.)

Try running your buildworld with 'make buildworld' & avoid using any -j
args. Something in the VM system is not using the atomic macros to
change object state & there is a chance that under extreme load, jobs
will hang in objtrm.

BTW, the things you've highlighted in your dmesg (the nfs stuff) are
also red herrings.

 => cia0: Pyxis, pass 1
 => cia0: extended capabilities: 1
 => cia0: WARNING: Pyxis pass 1 DMA bug; no bets...

Read this as "Don't use this machine as a high volume server". ;-)

The first generation Pyxis (the chipset in your machine) has several
problems. Foremost is that PCI DMA reads that cross a page boundary
don't work right. This is not a problem for the 32-bit slots because the
PCI-PCI bridge breaks up transfers and prevents this from occurring. The
firmware prevents you from putting "unknown" cards in the 64-bit slots.
You can override this by doing 'set pci_device_override ' at the srm
console prompt.

Its other problems include piss-poor DMA performance for DMA reads and a
tendency to lock solid when the PCI bus is pushed hard.

Although they sound bad, none of these things should affect its use as a
personal workstation. In fact, I wish I had one at home ;-)

 => struct nfssvc_sock bloated (> 256bytes)
 => Try reducing NFS_UIDHASHSIZ
 => struct nfsuid bloated (> 128bytes)
 => Try unionizing the nu_nickname and nu_flag fields

This is a red herring.

------------------------------------------------------------------------------
Andrew Gallatin, Sr Systems Programmer	http://www.cs.duke.edu/~gallatin
Duke University				Email: gallatin@cs.duke.edu
Department of Computer Science		Phone: (919) 660-6590


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-alpha" in the body of the message
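
For anyone checking a stalled build for jobs sleeping in objtrm, the
"ps axl" check suggested in the message could be scripted roughly as
below. This is only a sketch under assumptions: the function names
find_by_wchan and stuck_jobs are hypothetical, the wait-channel column
is assumed to be headed WCHAN (or MWCHAN on later releases), and every
column before COMMAND is assumed to be non-empty so that whitespace
splitting lines up with the header.

```python
import subprocess


def find_by_wchan(ps_text, wchan="objtrm"):
    """Return (pid, command) pairs for rows whose wait channel equals `wchan`.

    `ps_text` is the text output of `ps axl`. Assumption: columns are
    whitespace separated and the last header column is the command.
    """
    lines = ps_text.strip().splitlines()
    if not lines:
        return []
    header = lines[0].split()
    # Assumption: the wait channel column is named WCHAN or MWCHAN.
    wchan_name = "WCHAN" if "WCHAN" in header else "MWCHAN"
    if "PID" not in header or wchan_name not in header:
        raise ValueError("unexpected ps header: %r" % lines[0])
    pid_i = header.index("PID")
    wchan_i = header.index(wchan_name)
    cmd_i = len(header) - 1
    hits = []
    for line in lines[1:]:
        # Split only up to the command column so arguments stay together.
        fields = line.rstrip().split(None, cmd_i)
        if len(fields) <= max(pid_i, wchan_i):
            continue  # short or malformed row; skip it
        if fields[wchan_i] == wchan:
            command = fields[cmd_i] if len(fields) > cmd_i else ""
            hits.append((int(fields[pid_i]), command))
    return hits


def stuck_jobs(wchan="objtrm"):
    """Run `ps axl` on the local machine and filter it by wait channel."""
    out = subprocess.run(["ps", "axl"], capture_output=True,
                         text=True, check=True).stdout
    return find_by_wchan(out, wchan)
```

Calling stuck_jobs() while the build appears stalled would list any
processes sleeping in objtrm. If ps itself fails because of a
kernel/userland mismatch, this is no help and the debugger route from
the message still applies.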