Date: Mon, 5 Feb 2001 22:26:49 -0800
From: "3Phase" <Phase3@worldnet.att.net>
To: "Mark Ibell" <marki@paradise.net.nz>
Cc: <freebsd-questions@FreeBSD.ORG>
Subject: Re: SCSI parity error
Message-ID: <04d601c09006$05377d20$4fa0480c@sisyphus2>
References: <004301c08ff0$96e0c5d0$0101a8c0@evileye>

----- Original Message -----
From: "Mark Ibell" <marki@paradise.net.nz>
To: <freebsd-questions@freebsd.org>
Sent: Monday, February 05, 2001 07:55 PM
Subject: SCSI parity error

> Hi,
>
> We've just experienced a nasty server crash on a system running 4.1-RELEASE.
> The drive configuration is 2 x Quantum Atlas 10k2 drives running off an
> Adaptec 2940U2W controller. The relevant log entries are listed below. Any
> ideas what could have caused this - both disks appear to check out ok
> according to the SCSI BIOS 'Verify Media' option.
>
> Cheers,
> Mark
>
>
> (da1:ahc0:0:6:0): parity error detected in Data-in phase. SEQADDR(0x166)
> SCSIRATE(0x93)
> ahc0:A:6: unknown scsi bus phase 0. Attempting to continue
> ahc0: WARNING no command for scb 0 (cmdcmplt)
> QOUTPOS = 195
> ahc0: WARNING no command for scb 96 (cmdcmplt)
> QOUTPOS = 196
> ...
> ahc0: WARNING no command for scb 6 (cmdcmplt)
> QOUTPOS = 219
> (da1:ahc0:0:6:0): SCB 0x13 - timed out while idle, SEQADDR == 0xb
> (da1:ahc0:0:6:0): Queuing a BDR SCB
> (da1:ahc0:0:6:0): Bus Device Reset Message Sent
> (da1:ahc0:0:6:0): no longer in timeout, status = 34c
> ahc0: Bus Device Reset on A:6. 1 SCBs aborted
> (da0:ahc0:0:5:0): SCB 0x8c - timed out while idle, SEQADDR == 0xa
> (da0:ahc0:0:5:0): Queuing a BDR SCB
> (da0:ahc0:0:5:0): Bus Device Reset Message Sent
> (da0:ahc0:0:5:0): no longer in timeout, status = 34b
> ahc0: Bus Device Reset on A:5. 7 SCBs aborted
> ...

A parity error usually means hardware. Are they 10k RPM drives? Are they
separate, or are you using them as a virtual volume? What was the machine
doing when it crashed: loafing or under heavy use?

Cheap test: Get a radio, find a frequency and listen to the machine. Give
the drives a repetitive task and you should be able to 'hear' each
sub-system operate when it reads/writes data. Walk away with the radio. If
you can hear it down the hall, it has RF problems. If it sounds 'different'
sometimes, you have a problem but error correction is masking it.
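
For the "repetitive task", any steady, repeatable read load will do. Here is a minimal sketch in Python, assuming a read-only sequential pass is enough to make the drive audible on the radio. The function names, the chunk size, and the device path in the comment are illustrative assumptions, not anything from the original report. On a raw device, keep `chunk_size` and `limit` at multiples of the sector size.

```python
import time


def read_pass(path, chunk_size=64 * 1024, limit=None):
    """Read `path` sequentially once and return the number of bytes read.

    `limit` (bytes) caps the pass so each repetition covers the same region.
    """
    total = 0
    # buffering=0 keeps Python from adding its own read-ahead buffer.
    with open(path, "rb", buffering=0) as f:
        while limit is None or total < limit:
            want = chunk_size if limit is None else min(chunk_size, limit - total)
            data = f.read(want)
            if not data:  # end of file or device
                break
            total += len(data)
    return total


def repeat_reads(path, passes=10, pause=1.0, chunk_size=64 * 1024, limit=None):
    """Run the same read pass `passes` times, pausing between passes.

    The pause gives a quiet gap, so the rhythm of each pass is easy to pick out.
    Returns the byte count of every pass; the counts should all match.
    """
    counts = []
    for _ in range(passes):
        counts.append(read_pass(path, chunk_size=chunk_size, limit=limit))
        time.sleep(pause)
    return counts


# Hypothetical usage (the device path is an assumption; run as root, read-only):
# repeat_reads("/dev/da1", passes=20, pause=2.0, limit=256 * 1024 * 1024)
```

Because the load is read-only it should be safe to run against a disk that is in use, and differing byte counts between passes would themselves be a hint that reads are failing.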

Assuming it's been running okay for a while, check the usual suspects like
loose connections, sockets, terminators, cables, heat, and good power. No
one tripped over the cord or used it as a shin-detector?

-3P

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message