From owner-freebsd-questions Wed Mar 1 23:58:24 2000
Delivered-To: freebsd-questions@freebsd.org
Received: from tinker.com (troll.tinker.com [204.214.7.146]) by hub.freebsd.org (Postfix) with ESMTP id 30BDB37C162 for ; Wed, 1 Mar 2000 23:58:20 -0800 (PST) (envelope-from kim@tinker.com)
Received: by localhost (8.8.5/8.8.5)
Received: by mail.tinker.com via smap (V2.0) id xma019967; Thu Mar 2 01:57:27 2000
Received: by localhost (8.8.8/8.8.8) id CAA16977 for ; Thu, 2 Mar 2000 02:01:38 -0600 (CST)
Message-ID: <38BE1EED.7D877256@tinker.com>
Date: Thu, 02 Mar 2000 01:57:33 -0600
From: Kim Shrier
Organization: Shrier and Deihl
X-Mailer: Mozilla 4.7 [en] (X11; U; FreeBSD 3.2-RELEASE i386)
X-Accept-Language: en
MIME-Version: 1.0
To: freebsd-questions@freebsd.org
Subject: disk errors
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-questions@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

I am having some trouble with one of my SCSI disks and I am trying to figure out if the problem is the drive or the controller card. The system in question has crashed 4 times in the past year and it never logged anything suspicious until today. Right before the crash, these messages showed up in the log:

Mar 1 15:50:06 hrothgar /kernel: (da0:ahc0:0:0:0): data overrun detected in Data-Out phase. Tag == 0x4e.
Mar 1 15:50:06 hrothgar /kernel: (da0:ahc0:0:0:0): Have seen Data Phase. Length = 8192. NumSGs = 2.
Mar 1 15:50:06 hrothgar /kernel: sg[0] - Addr 0x4926000 : Length 4096
Mar 1 15:50:06 hrothgar /kernel: == 0x4e.
Mar 1 15:50:06 hrothgar /kernel: (da0:ahc0:0:0:0): Have seen Data Phase. Length = 8192. NumSGs = 2.
Mar 1 15:50:06 hrothgar /kernel: sg[0] - Addr 0x4926000 : Length 4096
Mar 1 15:50:06 hrothgar /kernel: (da0:ahc0:0:0:0): data overrun detected in Data-Out phase. Tag == 0x4e.
Mar 1 15:50:06 hrothgar /kernel: (da0:ahc0:0:0:0): Have seen Data Phase. Length = 8192. NumSGs = 2.
Mar 1 15:50:06 hrothgar /kernel: sg[0] - Addr 0x4926000 : Length 4096

The machine has an Adaptec 3940UW SCSI controller card and 3 Seagate ST34573W Barracuda drives, 2 on the first channel and 1 on the second. The drive giving me the problem is the first drive on the first channel. What I am trying to figure out is whether the problem is in the drive or the controller card. Following are the boot messages for the hardware in question:

Mar 1 18:48:53 hrothgar /kernel: ahc0: rev 0x03 int a irq 11 on pci0.15.0
Mar 1 18:48:53 hrothgar /kernel: ahc0: aic7895 Wide Channel A, SCSI Id=7, 255 SCBs
Mar 1 18:48:53 hrothgar /kernel: ahc1: rev 0x03 int b irq 12 on pci0.15.1
Mar 1 18:48:53 hrothgar /kernel: ahc1: aic7895 Wide Channel B, SCSI Id=7, 255 SCBs
...
Mar 1 18:48:53 hrothgar /kernel: Waiting 5 seconds for SCSI devices to settle
Mar 1 18:48:53 hrothgar /kernel: changing root device to da0s1a
Mar 1 18:48:53 hrothgar /kernel: da2 at ahc1 bus 0 target 2 lun 0
Mar 1 18:48:53 hrothgar /kernel: da2: Fixed Direct Access SCSI-2 device
Mar 1 18:48:53 hrothgar /kernel: da2: 20.000MB/s transfers (10.000MHz, offset 8, 16bit), Tagged Queueing Enabled
Mar 1 18:48:53 hrothgar /kernel: da2: 4340MB (8888924 512 byte sectors: 255H 63S/T 553C)
Mar 1 18:48:53 hrothgar /kernel: da0 at ahc0 bus 0 target 0 lun 0
Mar 1 18:48:53 hrothgar /kernel: da0: Fixed Direct Access SCSI-2 device
Mar 1 18:48:53 hrothgar /kernel: da0: 20.000MB/s transfers (10.000MHz, offset 8, 16bit), Tagged Queueing Enabled
Mar 1 18:48:53 hrothgar /kernel: da0: 4340MB (8888924 512 byte sectors: 255H 63S/T 553C)
Mar 1 18:48:53 hrothgar /kernel: da1 at ahc0 bus 0 target 1 lun 0
Mar 1 18:48:53 hrothgar /kernel: da1: Fixed Direct Access SCSI-2 device
Mar 1 18:48:53 hrothgar /kernel: da1: 20.000MB/s transfers (10.000MHz, offset 8, 16bit), Tagged Queueing Enabled
Mar 1 18:48:53 hrothgar /kernel: da1: 4340MB (8888924 512 byte sectors: 255H 63S/T 553C)

Please send replies directly to me since I am not on this list.

Kim Shrier

--
Kim Shrier - principal, Shrier and Deihl - mailto:kim@tinker.com
Remote Unix Network Admin, Security, Internet Software Development
Tinker Internet Services - Superior FreeBSD-based Web Hosting
http://www.tinker.com/

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message
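
When weighing drive against controller, it can help to see whether overrun reports are confined to one device or spread across everything on the same controller channel. The following is a minimal Python sketch of such a tally, not a diagnostic tool. It assumes log lines in the same format as the messages quoted in this post. The names `OVERRUN_RE` and `count_overruns` are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Assumed line format, taken from the messages quoted in this post:
#   ... /kernel: (da0:ahc0:0:0:0): data overrun detected in Data-Out phase. Tag == 0x4e.
# OVERRUN_RE and count_overruns are illustrative names, not part of any tool.
OVERRUN_RE = re.compile(
    r"\((?P<dev>\w+):(?P<ctrl>\w+):\d+:\d+:\d+\): "
    r"data overrun detected in (?P<phase>[\w-]+) phase"
)


def count_overruns(lines):
    """Return a Counter keyed by (device, controller, phase)."""
    counts = Counter()
    for line in lines:
        match = OVERRUN_RE.search(line)
        if match:
            counts[(match["dev"], match["ctrl"], match["phase"])] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Mar 1 15:50:06 hrothgar /kernel: (da0:ahc0:0:0:0): data overrun "
        "detected in Data-Out phase. Tag == 0x4e.",
        "Mar 1 15:50:06 hrothgar /kernel: (da0:ahc0:0:0:0): Have seen Data "
        "Phase. Length = 8192. NumSGs = 2.",
        "Mar 1 15:50:06 hrothgar /kernel: (da0:ahc0:0:0:0): data overrun "
        "detected in Data-Out phase. Tag == 0x4e.",
    ]
    for (dev, ctrl, phase), n in count_overruns(sample).items():
        print(f"{dev} on {ctrl}: {n} overrun report(s) in {phase} phase")
```

If reports only ever name da0 while da1 on the same ahc0 channel stays quiet, that points more toward the drive, its cabling position, or termination than toward the controller as a whole, though a tally like this is only a hint and not proof.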