From owner-freebsd-scsi Tue Sep 7 13: 2:21 1999
Delivered-To: freebsd-scsi@freebsd.org
Received: from duke.cs.duke.edu (duke.cs.duke.edu [152.3.140.1])
	by hub.freebsd.org (Postfix) with ESMTP
	id 3424E14D35; Tue, 7 Sep 1999 13:02:13 -0700 (PDT)
	(envelope-from gallatin@cs.duke.edu)
Received: from grasshopper.cs.duke.edu (grasshopper.cs.duke.edu [152.3.145.30])
	by duke.cs.duke.edu (8.9.1/8.9.1) with ESMTP id QAA28237;
	Tue, 7 Sep 1999 16:01:09 -0400 (EDT)
Received: (from gallatin@localhost)
	by grasshopper.cs.duke.edu (8.9.3/8.9.1) id QAA00933;
	Tue, 7 Sep 1999 16:00:38 -0400 (EDT)
	(envelope-from gallatin@cs.duke.edu)
From: Andrew Gallatin
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Date: Tue, 7 Sep 1999 16:00:38 -0400 (EDT)
To: scsi@freebsd.org
Cc: gibbs@freebsd.org, anderson@cs.duke.edu
Subject: data corruption when using aic7890
X-Mailer: VM 6.43 under 20.4 "Emerald" XEmacs Lucid
Message-ID: <14293.26481.521753.519004@grasshopper.cs.duke.edu>
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Hi,

I have a bunch of ASUS P2B-LS motherboards with on-board AIC7890 U2
controllers. I'm running a kernel with rev 1.20 of
src/sys/pci/ahc_pci.c (i.e., after the CACHETHEN fix).

When I run a local data-integrity checking program, I'm seeing
occasional data corruption on Seagate ST39140W drives connected to the
on-board U2 controller.

This program writes a known pattern of data into a variable-size (512MB
in this case) file on disk & reads it back over & over again. If it
encounters corruption, it reports just the first word where corruption
was found & then skips to the next page, so it's hard to tell how
complete the corruption is. We see things like this:

##error 0 page 8228 expected [0x030241d8] saw [0x07c5b1d8]
##error 1 page 9718 expected [0x035f61f0] saw [0x072081f0]
##error 2 page 15719 expected [0x03d671c8] saw [0x016441c8]

The last 3 hex digits are the offset into the page.
Since they are non-zero, at least part of the data is correct. It seems
that the corruption only occurs after the first 400 or so bytes of data
in a page. It seems to be happening fairly infrequently (about every
500GB of data or so).

Most importantly, it seems to be happening only on drives connected to
the on-board U2 interfaces, so my first guess would be that we can rule
out anything but a driver or hardware problem. E.g., this machine has 2
more ST39140W drives connected to an ncr 53c875 & I've never seen any
corruption on them. Ditto for an IDE disk connected to the on-board
IDE controller.

Any ideas?

Thanks,

Drew

------------------------------------------------------------------------------
Andrew Gallatin, Sr Systems Programmer    http://www.cs.duke.edu/~gallatin
Duke University                           Email: gallatin@cs.duke.edu
Department of Computer Science            Phone: (919) 660-6590

PS: Here's a dmesg of the box:

Copyright (c) 1992-1999 The FreeBSD Project.
Copyright (c) 1982, 1986, 1989, 1991, 1993
	The Regents of the University of California. All rights reserved.
FreeBSD 4.0-CURRENT #0: Fri Sep 3 16:19:50 EDT 1999
    gallatin@grasshopper.cs.duke.edu:/freebsd/src/sys/compile/SLICEX86
Timecounter "i8254" frequency 1193182 Hz
Timecounter "TSC" frequency 300683643 Hz
CPU: Pentium III (300.68-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0x672 Stepping = 2
  Features=0x387f9ff,MMX,FXSR,>
real memory = 134205440 (131060K bytes)
avail memory = 126316544 (123356K bytes)
Preloaded elf kernel "kernel" at 0xc0325000.
Pentium Pro MTRR support enabled
npx0: on motherboard
npx0: INT 16 interface
pcib0: on motherboard
pci0: on pcib0
pcib1: at device 1.0 on pci0
pci1: on pcib1
isab0: at device 4.0 on pci0
isa0: on isab0
ata-pci0: at device 4.1 on pci0
ata-pci0: Busmastering DMA supported
ata0 at 0x01f0 irq 14 on ata-pci0
chip1: at device 4.2 on pci0
intpm0: at device 4.3 on pci0
intpm0: I/O mapped e800
intpm0: intr IRQ 9 enabled revision 0
smbus0: on intsmb0
smb0: on smbus0
intpm0: PM I/O mapped e400
ahc0: irq 12 at device 6.0 on pci0
BRDCTL = 0x2
ahc0: aic7890/91 Wide Channel A, SCSI Id=7, 16/255 SCBs
fxp0: irq 10 at device 7.0 on pci0
fxp0: Ethernet address 00:e0:18:98:12:26
ncr0: irq 12 at device 9.0 on pci0
pci0: unknown card DGH8043 (vendor=0x10e8, dev=0x8043) at 12.0 irq 11
Probing for PnP devices:
fdc0: at port 0x3f0-0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: at port 0x60-0x6f on isa0
atkbd0: irq 1 on atkbdc0
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A, console
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ata0: master: setting up UDMA2 mode on PIIX4 chip OK
ad0: ATA-4 disk at ata0 as master
ad0: 6149MB (12594960 sectors), 13328 cyls, 15 heads, 63 S/T, 512 B/S
ad0: piomode=4, dmamode=2, udmamode=2
ad0: 16 secs/int, 0 depth queue, DMA mode
da0 at ahc0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-2 device
da0: 40.000MB/s transfers (20.000MHz, offset 15, 16bit)
da0: 8683MB (17783240 512 byte sectors: 255H 63S/T 1106C)
da1 at ahc0 bus 0 target 2 lun 0
da1: Fixed Direct Access SCSI-2 device
da1: 40.000MB/s transfers (20.000MHz, offset 15, 16bit)
da1: 8683MB (17783240 512 byte sectors: 255H 63S/T 1106C)
da3 at ncr0 bus 0 target 2 lun 0
da3: Fixed Direct Access SCSI-2 device
da3: 40.000MB/s transfers (20.000MHz, offset 15, 16bit)
da3: 8683MB (17783240 512 byte sectors: 255H 63S/T 1106C)
da2 at ncr0 bus 0 target 0 lun 0
da2: Fixed Direct Access SCSI-2 device
da2: 40.000MB/s transfers (20.000MHz, offset 15, 16bit)
da2: 8683MB (17783240 512 byte sectors: 255H 63S/T 1106C)
changing root device to wd0s1a
tpz0: irq 11 at device 12.0 on pci0
tpz0: Myrinet LANai 4.3 address 00:60:dd:7f:e7:53 (M2M-PCI32c-21956)

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message
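
For readers who want to reproduce this kind of test, the checker described in the message (write a known pattern, read it back, report the first bad word in a page, then skip to the next page) might look roughly like the sketch below. It is not the program used in the report. The 4096-byte page size, the little-endian 32-bit word layout, the choice of putting the page number in the upper bits, and all function names are assumptions made for illustration; the only detail taken from the message is that the low 3 hex digits of each word hold the offset into the page.

```python
import struct

PAGE_SIZE = 4096   # assumed page size; 12 bits of offset = 3 hex digits
WORD_SIZE = 4      # assumed 32-bit pattern words


def expected_word(page, offset):
    """Hypothetical pattern: page number in the high bits, page offset in the low 12."""
    return ((page << 12) | offset) & 0xFFFFFFFF


def make_page(page):
    """Build the full pattern for one page as bytes."""
    return b"".join(
        struct.pack("<I", expected_word(page, off))
        for off in range(0, PAGE_SIZE, WORD_SIZE)
    )


def write_pattern(path, num_pages):
    """Fill the file at `path` with `num_pages` pages of the known pattern."""
    with open(path, "wb") as f:
        for page in range(num_pages):
            f.write(make_page(page))


def check_pattern(path, num_pages):
    """Return (page, expected, saw) for the first bad word of each corrupt page."""
    errors = []
    with open(path, "rb") as f:
        for page in range(num_pages):
            # Pad short reads with zeros so a truncated file shows up as a mismatch.
            data = f.read(PAGE_SIZE).ljust(PAGE_SIZE, b"\0")
            for off in range(0, PAGE_SIZE, WORD_SIZE):
                (saw,) = struct.unpack_from("<I", data, off)
                want = expected_word(page, off)
                if saw != want:
                    errors.append((page, want, saw))
                    break  # report only the first bad word, then skip to the next page
    return errors


def format_errors(errors):
    """Render errors in the same style as the report quoted in the message."""
    return [
        "##error %d page %d expected [0x%08x] saw [0x%08x]" % (n, page, want, saw)
        for n, (page, want, saw) in enumerate(errors)
    ]
```

A real run would call write_pattern() once on a file on the suspect disk and then loop over check_pattern(), printing format_errors() whenever the list is non-empty. Because only the first mismatch per page is recorded, this has the same limitation the message notes: it cannot tell how much of the page is corrupt. A test on a real disk would also need a file much larger than RAM, or some way of bypassing the buffer cache, so that reads actually hit the drive.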