From owner-freebsd-hackers Thu Mar 2 10:10:11 2000
Delivered-To: freebsd-hackers@freebsd.org
Received: from kronos.alcnet.com (kronos.alcnet.com [63.69.28.22])
        by hub.freebsd.org (Postfix) with ESMTP id A86AF37C459
        for ; Thu, 2 Mar 2000 10:10:04 -0800 (PST)
        (envelope-from kbyanc@posi.net)
X-Provider: ALC Communications, Inc. http://www.alcnet.com/
Received: from localhost (kbyanc@localhost)
        by kronos.alcnet.com (8.9.3/8.9.3/antispam) with ESMTP id NAA95671;
        Thu, 2 Mar 2000 13:09:43 -0500 (EST)
Date: Thu, 2 Mar 2000 13:09:43 -0500 (EST)
From: Kelly Yancey
X-Sender: kbyanc@kronos.alcnet.com
To: Kim Shrier
Cc: freebsd-hackers@FreeBSD.ORG
Subject: Re: disk errors
In-Reply-To: <38BEA963.DDB0EEC8@tinker.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Thu, 2 Mar 2000, Kim Shrier wrote:

[snip common complaint about -questions lacking answers]

>
> I am having some trouble with one of my SCSI disks and I am trying to
> figure out if the problem is the drive or the controller card. The
> system in question has crashed 4 times in the past year and it never
> logged anything suspicious until today. Right before the crash, these
> messages showed up in the log:
>
> Mar 1 15:50:06 hrothgar /kernel: (da0:ahc0:0:0:0): data overrun detected in
> Data-Out phase. Tag == 0x4e.
> Mar 1 15:50:06 hrothgar /kernel: (da0:ahc0:0:0:0): Have seen Data Phase.
> Length = 8192. NumSGs = 2.
> Mar 1 15:50:06 hrothgar /kernel: sg[0] - Addr 0x4926000 : Length 4096
> Mar 1 15:50:06 hrothgar /kernel: == 0x4e.
> Mar 1 15:50:06 hrothgar /kernel: (da0:ahc0:0:0:0): Have seen Data Phase.
> Length = 8192. NumSGs = 2.
> Mar 1 15:50:06 hrothgar /kernel: sg[0] - Addr 0x4926000 : Length 4096
> Mar 1 15:50:06 hrothgar /kernel: (da0:ahc0:0:0:0): data overrun detected in
> Data-Out phase. Tag == 0x4e.
> Mar 1 15:50:06 hrothgar /kernel: (da0:ahc0:0:0:0): Have seen Data Phase.
> Length = 8192. NumSGs = 2.
> Mar 1 15:50:06 hrothgar /kernel: sg[0] - Addr 0x4926000 : Length 4096
>
>
>
> The machine has an Adaptec 3940uw SCSI controller card and 3 Seagate
> ST34573W barracuda drives, 2 on the first channel and 1 on the second.
> The drive giving me the problem is the first drive on the first
> channel. What I am trying to figure out is if the problem is in the
> drive or the controller card. Following are the boot messages for the
> hardware in question:
>
>

  I used to see these exact same messages when drives overheated. Since you
are only getting the errors on the one drive, check whether it is less well
ventilated than the others (or maybe it is on top of the stack of drives in
your tower?).

  Kelly

--
Kelly Yancey  -  kbyanc@posi.net  -  Richmond, VA
Analyst / E-business Development, Bell Industries  http://www.bellind.com/
Maintainer, BSD Driver Database  http://www.posi.net/freebsd/drivers/
Coordinator, Team FreeBSD  http://www.posi.net/freebsd/Team-FreeBSD/


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message