Date: Thu, 17 Dec 1998 08:32:58 +0000
From: Neil Conway
Organization: Fusion
To: Doug Ledford
CC: Stephane Bortzmeyer, aic7xxx@FreeBSD.ORG
Subject: Re: File corruption: how to find the guilty?

Doug Ledford wrote:
>
> Stephane Bortzmeyer wrote:
> >
> > I have a Linux box which shows random corruption of files. Example: all Perl
> > scripts suddenly die with "segmentation fault". Reinstalling the same Perl
> > package cures it. Two days ago, /etc/resolv.conf became corrupted : strange
> > characters were in it.
> >
> > I wonder what to do? Change the disk? The SCSI controller? The kernel?
> >
> > I run Linux 2.0.35 (Debian distribution 2.0), patched for the Adaptec driver
> > 5.1.2. Here is the configuration:
>
> It's memory corruption. I've seen this float through this list or that about
> 30 different times in the past. Not once has it ever been a kernel or driver
> issue. In *every* case it has been either RAM, cache, or CPU. Check the CPU
> fan, check the cache (if it isn't part of the CPU) and check your RAM.

Well, perhaps with a stable kernel this is the most likely culprit.
However, it's dangerous to make blanket assertions - they come back to haunt
you. Alan Cox was telling me last month about how 2.1.129 was causing him
random memory corruption leading to disk corruption, and this turned out to
be a kernel bug (nfs-related I think).

Neil