From owner-freebsd-hackers Thu Sep 28 12:57:51 1995
Return-Path: owner-hackers
Received: (from root@localhost) by freefall.freebsd.org (8.6.12/8.6.6) id MAA27660 for hackers-outgoing; Thu, 28 Sep 1995 12:57:51 -0700
Received: from aslan.cdrom.com (aslan.cdrom.com [192.216.223.142]) by freefall.freebsd.org (8.6.12/8.6.6) with ESMTP id MAA27654 for ; Thu, 28 Sep 1995 12:57:48 -0700
Received: from localhost.cdrom.com (localhost.cdrom.com [127.0.0.1]) by aslan.cdrom.com (8.6.12/8.6.9) with SMTP id MAA21578; Thu, 28 Sep 1995 12:56:16 -0700
Message-Id: <199509281956.MAA21578@aslan.cdrom.com>
X-Authentication-Warning: aslan.cdrom.com: Host localhost.cdrom.com didn't use HELO protocol
To: Mark Murray
cc: "Justin T. Gibbs", Andreas Klemm, hackers@freebsd.org
Subject: Re: make world on FreeBSD-stable impossible. cc1: ... signal 11
In-reply-to: Your message of "Thu, 28 Sep 1995 21:30:24 +0200." <199509281930.VAA18103@grumble.grondar.za>
Date: Thu, 28 Sep 1995 12:56:16 -0700
From: "Justin T. Gibbs"
Sender: owner-hackers@freebsd.org
Precedence: bulk

>> >You are not alone. I have a 486DX4/100/PCI and an Adaptec 2940, and I get
>> >these the whole time, as well as the signal 11's. I just tried a reboot,
>> >and the files that were corrupted before are now OK???!! They are corrupted
>> >in exactly that same sort of way you are reporting.
>> >
>> >My kernel is stable, the rest is a mix (a lot hand-installed).
>> >
>> >M
>>
>> The only known problem with the aic7xxx driver has to do with losing bytes
>> (the transfer leaves a residual of 1-13 bytes). This only happens when
>> the transfer is going faster than 10MB/s (like a wide Capella or Atlas),
>> so a narrow device shouldn't show this problem. Single bit errors are
>> almost always RAM or cache problems. The driver supports full parity
>> checking on its data up to the point that it is transferred to host memory,
>> so I don't think this is the driver's fault.
>
>I don't think you are right. I have slowed my Adaptec down to 5MB/s and I
>get (got) 2 errors:
>
>1) Single bit errors - fixed by banging the box a bit and reseating the
>   chips (cache etc)
>
>2) Missing chunks of code - 1-13 looks about right. This is with a
>   2940 and an HP SCSI-2 (narrow) disk. Replacing the 2940 with a 1542
>   fixes the problem. The problem only occurs after a great deal of
>   activity (like a big system build), and is repeatable on the broken
>   file until a reboot. After the reboot the file is fine. If a sig11
>   occurs, it is also repeatable until a reboot.

Can you turn on the ahc debug code in the driver and see if you are
getting residuals with this drive?

    #define AHC_DEBUG
    int ahc_debug = AHC_SHOWABORTS|AHC_SHOWMISC;

If you get residuals reported during normal file I/O (the residuals
during check sense retrieval and mounting partitions are normal), then
it is the same bug.

>M
>
>--
>Mark Murray
>46 Harvey Rd, Claremont, Cape Town 7700, South Africa
>+27 21 61-3768 GMT+0200
>Finger mark@grumble.grondar.za for PGP key

--
Justin T. Gibbs
===========================================
Software Developer - Walnut Creek CDROM
FreeBSD: Turning PCs into workstations
===========================================
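
For anyone unsure what the ahc_debug setting asks for, here is a minimal
sketch of how a bitmask debug variable of that kind typically gates
residual reports. It is not the driver's code. The flag values and the
residual() and report_residual() helpers are assumptions made up for
illustration; only the names AHC_SHOWABORTS, AHC_SHOWMISC and ahc_debug
come from the message, and the real definitions live in the driver
source.

```c
#include <stdio.h>

/* ASSUMPTION: placeholder bit values for illustration only;
 * check the aic7xxx driver source for the real definitions. */
#define AHC_SHOWMISC   0x0001
#define AHC_SHOWABORTS 0x0008

/* As in the message: enable the "misc" and "aborts" debug classes. */
int ahc_debug = AHC_SHOWABORTS | AHC_SHOWMISC;

/* Hypothetical helper: bytes that were requested but never transferred. */
long residual(long requested, long transferred)
{
    return requested - transferred;
}

/* Hypothetical helper: print a line and return 1 when a transfer left a
 * residual and the AHC_SHOWMISC class is enabled; otherwise return 0. */
int report_residual(long requested, long transferred)
{
    long resid = residual(requested, transferred);

    if (resid != 0 && (ahc_debug & AHC_SHOWMISC)) {
        printf("residual of %ld bytes (requested %ld, transferred %ld)\n",
               resid, requested, transferred);
        return 1;
    }
    return 0;
}
```

With these assumed helpers, report_residual(8192, 8185) would log a
7-byte residual, which falls in the 1-13 byte range described above,
while a complete transfer logs nothing.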