From: Trevor Johnson
Date: Thu, 16 Jan 2003 23:09:16 -0500 (EST)
To: current@FreeBSD.ORG,
Subject: Re: unexpected machine check on 5.0 alpha
In-Reply-To: <15910.50391.473362.53094@grasshopper.cs.duke.edu>
Message-ID: <20030116212057.A60768-100000@blues.jpj.net>

Andrew Gallatin wrote:

> No, that's a 660. (system machine check).
> A 670 is much more likely to be bad ram, bad cache, bad CPU, etc.
> Its not always overheating.

It's looking like at least my troubles are not from FreeBSD, but from the
hardware, probably the SCSI card. I tried "dd if=/dev/zero of=/dev/da3"
and got a pair of 670 machine checks, shown below. After I pressed the
reset button, the SRM said "I/O-detected PCI bus data parity error on
IOD0" just after looking at the Symbios SCSI card to which the hard drives
are attached (I had gotten this before, when I had tried replacing the
Ethernet card).
Then there was a 660 machine check, then the SRM crashed.

-- begin log --
(noperiph:sym1:0:-1:-1): SCSI BUS reset detected.
sym1: unable to abort current chip operation.

unexpected machine check:

    mces    = 0x1
    vector  = 0x670
    param   = 0xfffffc0000004e10
    pc      = 0xfffffc0000642970
    ra      = 0xfffffc0000406f70
    curproc = 0xfffffc001f169200
        pid = 23, comm = intr: sym1

panic: machine check
cpuid = 1;
boot() called on cpu#1

syncing disks, buffers remaining... panic: bwrite: buffer is not busy???
cpuid = 1;
boot() called on cpu#1
Uptime: 1h42m51s
(noperiph:sym1:0:-1:-1): SCSI BUS reset detected.
sym1: unable to abort current chip operation.

unexpected machine check:

    mces    = 0x1
    vector  = 0x670
    param   = 0xfffffc0000004e10
    pc      = 0xfffffc0000642970
    ra      = 0xfffffc0000406f70
    curproc = 0xfffffc001f169200
        pid = 23, comm = intr: sym1

panic: machine check
cpuid = 1;
boot() called on cpu#1
Uptime: 1h42m53s
panic: bremfree: removing a buffer not on a queue
cpuid = 1;
boot() called on cpu#1
Uptime: 1h43m16s
sym1: suspicious SCSI data while resetting the BUS.
sym1: dp1,d15-8,dp0,d7-0,rst,req,ack,bsy,sel,atn,msg,c/d,i/o = 0x7ffffff, expecting 0x100
-- end log --

-- 
Trevor Johnson