Date: Thu, 14 Sep 2006 22:27:10 -0400
From: Jonathan Stewart <jonathan@kc8onw.net>
Cc: freebsd-stable@freebsd.org
Subject: Re: Anyone??? (was Reproducible data corruption on 6.1-Stable)
Message-ID: <450A0F7E.7020600@kc8onw.net>
In-Reply-To: <492332980.20060914225258@rulez.sk>
References: <450752F6.4050109@kc8onw.net> <492332980.20060914225258@rulez.sk>
Daniel Gerzo wrote:
> Hello Jonathan,
>
> Wednesday, September 13, 2006, 2:38:14 AM, you wrote:
>
>> I set up a new server recently and transferred all the information from
>> my old server over. I tried to use unison to synchronize the backup of
>> pictures I have taken and noticed that a large number of pictures were
>> marked as changed on the server. After checking the pictures by hand I
>> confirmed that many of the pictures on the server were corrupted.
>
>> It appears the corruption happens during the read process because when I
>> recompare the files in a graphical diff tool between cache flushes the
>> differences move around!?!?!? The differences also appear to be very
>> small for the most part, single bytes scattered throughout the file. I
>> really have no idea what is causing the problem and would like to pin it
>> down so I can either replace hardware if it's bad or fix whatever the
>> bug is.
>
>> CPU: AMD Athlon(tm) XP 3200+ (2090.16-MHz 686-class CPU)
>> Origin = "AuthenticAMD" Id = 0x6a0 Stepping = 0
>
> I saw very similar symptoms on a P4 3.2 GHz. I was able to build world
> without any problems and the overall stability of the machine was
> completely good, but when I tried to install some ports, the md5
> sums didn't match the source and I was sure that they were all right.
>
> The following simple test demonstrates the problem I was hitting:
>
> root@[bigbang ~]# sha256 /usr/ports/distfiles/ruby/ruby-1.8.4.tar.gz
> SHA256 (/usr/ports/distfiles/ruby/ruby-1.8.4.tar.gz) = b95ddf27bc0ffa379c9aa881ca39e92a7d79e0d08999b4dff6d7d9547ee2a72d
> root@[bigbang ~]# sha256 /usr/ports/distfiles/ruby/ruby-1.8.4.tar.gz
> SHA256 (/usr/ports/distfiles/ruby/ruby-1.8.4.tar.gz) = 71432841b3965b7ab2d83f0dc7c3049195ea4e9267a8dc2d825a8a0466982930
> root@[bigbang ~]# sha256 /usr/ports/distfiles/ruby/ruby-1.8.4.tar.gz
> SHA256 (/usr/ports/distfiles/ruby/ruby-1.8.4.tar.gz) = 83e44f5301b3270e821850164c74d275f6721bed5d126480cf518a9fe5ca0d6c
> root@[bigbang ~]# md5 < /usr/ports/distfiles/ruby/ruby-1.8.4.tar.gz
> bd8c2e593e1fa4b01fd98eaf016329bb
> root@[bigbang ~]# md5 < /usr/ports/distfiles/ruby/ruby-1.8.4.tar.gz
> bd8c2e593e1fa4b01fd98eaf016329bb
> root@[bigbang ~]# md5 < /usr/ports/distfiles/ruby/ruby-1.8.4.tar.gz
> b9342bb213393238dd37322d4e2ee3fe
> root@[bigbang ~]# md5 < /usr/ports/distfiles/ruby/ruby-1.8.4.tar.gz
> 88efa7977fd3febaa8d260e3d5f21917
>
> The memtest didn't show any problems with RAM and we were unable to
> clarify what was really going on. Then we managed to get the machine
> replaced with completely new hardware and the problem was gone.
> Later, I was told that it is some kind of known bug in older P4
> BIOSes (and advised to update the BIOS, which should have been fixed
> in the meantime) but we were unable to find any information about
> the problem. Fortunately the colo company replaced the hardware with
> no problems. So far so good, and the box is running flawlessly.
>

I don't think it's quite the same as my problem, as I have to use dd on a
large file to flush the cache and force FreeBSD to go back to the disk
before the checksum changes. At this point I think I need to further
narrow down where the error is occurring, but I don't know what to try
next. I am 99.999% sure memory and CPU are not the problem, but after that
point I'm getting into driver and filesystem code testing, which is a
little overwhelming to just dive into.

Jonathan
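
One possible way to narrow this down, sketched here as an illustration and not taken from the thread: hash the file in fixed-size blocks on each pass instead of hashing it whole, then compare the per-block digests from two passes to see which offsets changed. The function names and the 64 KiB block size are assumptions chosen for the example, and the cache flush between passes (reading a large unrelated file with dd, as described above) still has to be done by hand.

```python
import hashlib


def block_digests(path, block_size=65536):
    """Return one SHA-256 hex digest per block_size chunk of the file.

    block_size is an arbitrary choice for this sketch; a smaller value
    localizes a changed byte more precisely at the cost of more output.
    """
    digests = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            digests.append(hashlib.sha256(chunk).hexdigest())
    return digests


def differing_blocks(first, second):
    """Return the indices of blocks whose digests differ between two passes.

    A block present in only one pass (file length changed) also counts
    as differing.
    """
    diffs = []
    for i in range(max(len(first), len(second))):
        a = first[i] if i < len(first) else None
        b = second[i] if i < len(second) else None
        if a != b:
            diffs.append(i)
    return diffs
```

Multiplying a reported block index by the block size gives the approximate file offset of the mismatch. If the differing blocks move between passes, that matches the "differences move around" observation and points at the read path; if they stay fixed, the data on disk is more likely the problem.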