From owner-freebsd-scsi Wed Sep 8 4:14:43 1999
Delivered-To: freebsd-scsi@freebsd.org
Received: from arjun.niksun.com (gw.niksun.com [206.20.52.122])
    by hub.freebsd.org (Postfix) with ESMTP id 6063E15B88
    for ; Wed, 8 Sep 1999 04:14:29 -0700 (PDT)
    (envelope-from ath@niksun.com)
Received: from stiegl.niksun.com (stiegl.niksun.com [10.0.0.44])
    by arjun.niksun.com (8.8.8/8.8.8) with ESMTP id HAA15711;
    Wed, 8 Sep 1999 07:12:43 -0400 (EDT)
Received: (from ath@localhost)
    by stiegl.niksun.com (8.9.2/8.8.7) id HAA68752;
    Wed, 8 Sep 1999 07:12:43 -0400 (EDT)
    (envelope-from ath)
To: Andrew Gallatin
Subject: Re: data corruption when using aic7890
Cc: freebsd-scsi@freebsd.org
References: <14293.26481.521753.519004@grasshopper.cs.duke.edu>
From: Andrew Heybey
Date: 08 Sep 1999 07:12:42 -0400
In-Reply-To: Andrew Gallatin's message of "Tue, 7 Sep 1999 16:00:38 -0400 (EDT)"
Message-ID: <85g10pbqs5.fsf@stiegl.niksun.com>
Lines: 49
X-Mailer: Gnus v5.5/XEmacs 20.4 - "Emerald"
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Andrew Gallatin writes:

> Hi,
>
> I have a bunch of ASUS P2B-LS motherboards with on-board AIC7890 U2
> controllers.  I'm running a kernel with rev 1.20 of
> src/sys/pci/ahc_pci.c (eg, after the CACHETHEN fix).
>
> When I run a local data-integrity checking program, I'm seeing
> occasional data corruption on Seagate ST39140W drives connected to the
> on-board U2 controller.  This program writes a known pattern of data
> into a variable size (512MB in this case) file on disk & reads it back
> over & over again.  If it encounters corruption, it reports just the
> first word where corruption was found & then skips to the next page,
> so it's hard to tell how complete the corruption is.  We see things
> like this:
>
> ##error 0 page 8228 expected [0x030241d8] saw [0x07c5b1d8]
> ##error 1 page 9718 expected [0x035f61f0] saw [0x072081f0]
> ##error 2 page 15719 expected [0x03d671c8] saw [0x016441c8]
>
> The last 3 hex digits are the offset into the page.  Since they are
> non-zero, at least part of the data is correct.  It seems that the
> corruption only occurs after the first 400 or so bytes of data in a
> page.  It seems to be happening fairly infrequently (about every 500GB
> of data or so).
>
> Most importantly, it seems to be happening only on drives connected
> to the on-board U2 interfaces, so my first guess would be that we can
> rule out anything but a driver or hardware problem.  Eg, this machine
> has 2 more ST39140W drives connected to an ncr 53c875 & I've never
> seen any corruption on them.  Ditto for an IDE disk connected to
> the on-board IDE controller.

This sounds vaguely similar to kern/10243, except that I always saw
corruption at the *end* of a page.  How much data is corrupt?  Is the
bad data recognizable as being from elsewhere in the file?

Try fiddling with the PCI bus latency setting in the BIOS (increasing
it).  However, the only sure solution that I found to my problem was
to put the disks on the regular Ultra connector and live with 40 MB/s.

I have seen the problem with both IBM DRVS drives and Seagate
ST39102LW.  I have also seen the problem on the P2B-LS and a
Supermicro motherboard with on-board AIC7890.

andrew

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message
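
For anyone who wants to try reproducing this, here is a rough sketch of the kind of write/read-back check described in the quoted message. It is not the original program, which was not posted. The pattern layout (`expected_word`), the 4096-byte page size, and all function names are assumptions made for illustration; the only property taken from the thread is that the low 3 hex digits of each word hold its byte offset within the page, and that only the first bad word per page is reported.

```python
import struct

PAGE_SIZE = 4096   # assumed page size: 12 bits of offset = 3 hex digits
WORD_SIZE = 4      # 32-bit words, as in the reported error lines


def expected_word(page, offset):
    """Hypothetical pattern: upper 20 bits derived from the page number,
    low 12 bits equal to the word's byte offset within the page."""
    upper = ((page * 0x9E3779B1) >> 12) & 0xFFFFF
    return (upper << 12) | offset


def build_page(page):
    """Return the PAGE_SIZE bytes that page number `page` should contain."""
    words = [expected_word(page, off) for off in range(0, PAGE_SIZE, WORD_SIZE)]
    return struct.pack("<%dI" % len(words), *words)


def write_pattern(path, num_pages):
    """Fill the file with the known pattern, one page at a time."""
    with open(path, "wb") as f:
        for page in range(num_pages):
            f.write(build_page(page))


def check_pattern(path, num_pages):
    """Read the file back and return (page, expected, saw) for the first
    mismatching word of each bad page, then skip to the next page."""
    errors = []
    words_per_page = PAGE_SIZE // WORD_SIZE
    with open(path, "rb") as f:
        for page in range(num_pages):
            data = f.read(PAGE_SIZE)
            if len(data) != PAGE_SIZE:
                errors.append((page, None, None))  # short read
                continue
            if data == build_page(page):
                continue
            seen = struct.unpack("<%dI" % words_per_page, data)
            for i, saw in enumerate(seen):
                want = expected_word(page, i * WORD_SIZE)
                if saw != want:
                    errors.append((page, want, saw))
                    break  # report only the first bad word in this page
    return errors
```

Each tuple returned by `check_pattern` can be printed in the same shape as the report lines above, for example with `"##error %d page %d expected [0x%08x] saw [0x%08x]"`. Note that a small test file will normally be read back from the buffer cache rather than the disk, so a real run needs a file large enough (the quoted test used 512MB) that the reads actually go through the controller.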