Date: Wed, 13 Sep 2006 08:09:03 -0400
From: Jonathan Stewart <jonathan@kc8onw.net>
To: freebsd-stable@FreeBSD.ORG, jonathan@kc8onw.net
Subject: Re: Anyone??? (was Reproducible data corruption on 6.1-Stable)
Message-ID: <4507F4DF.3000605@kc8onw.net>
In-Reply-To: <200609130912.k8D9CaUP063256@lurza.secnetix.de>
References: <200609130912.k8D9CaUP063256@lurza.secnetix.de>
Oliver Fromme wrote:
> Jonathan Stewart <jonathan@kc8onw.net> wrote:
> > I set up a new server recently and transferred all the information from
> > my old server over. I tried to use unison to synchronize the backup of
> > pictures I have taken and noticed that a large number of pictures were
> > marked as changed on the server. After checking the pictures by hand I
> > confirmed that many of the pictures on the server were corrupted. I
> > attempted to use unison to update the files on the server with the
> > correct local copies but it would fail on almost all the files with the
> > message "destination updated during synchronization."
> >
> > It appears the corruption happens during the read process because when I
> > recompare the files in a graphical diff tool between cache flushes the
> > differences move around!?!?!? The differences also appear to be very
> > small for the most part, single bytes scattered throughout the file. I
> > really have no idea what is causing the problem and would like to pin it
> > down so I can either replace hardware if it's bad or fix whatever the
> > bug is.
>
> That very much sounds like bad RAM, or overclocked CPU
> or bus. I assume you do not overclock, so I recommend
> you replace your RAM modules and check if the symptoms
> are gone.
>
> Also check your BIOS settings for the RAM timings.
> Setting the timings to more conservative values might
> already solve the problem.

Thanks for the suggestions, but I have tried lowering the clock rate on
the processor and the RAM speed with no luck whatsoever. I appear to
have forgotten to mention that the problem appears no matter how I read
the file: unison, md5, etc. Maybe 1 time out of 100 it will read
correctly. I have another drive that I use for the OS, and I have done
many buildworlds/kernels on that drive without problems, as well as
compiling some very large software packages.
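For anyone wanting to reproduce the "read it many times and compare" test, here is a minimal sketch in Python. It is only an illustration, not something I have run on the affected machine: the function names (read_digest, tally_reads, differing_offsets) are made up for this example, and it assumes the file cache is flushed between passes by some other means (for example unmounting and remounting the filesystem), since otherwise repeated reads would presumably be served from memory rather than from the disk.

```python
import hashlib
import sys


def read_digest(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of one full pass over the file."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read in fixed-size chunks so large files do not need to fit in RAM.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def tally_reads(path, passes=10):
    """Read the file `passes` times and count how often each digest occurs."""
    counts = {}
    for _ in range(passes):
        digest = read_digest(path)
        counts[digest] = counts.get(digest, 0) + 1
    return counts


def differing_offsets(a, b):
    """Return the byte offsets at which two reads of the same length differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: reread.py FILE")
    # A healthy read path should produce exactly one distinct digest.
    for digest, n in tally_reads(sys.argv[1]).items():
        print(n, digest)
```

If more than one digest shows up, saving two of the reads and passing them to differing_offsets would show whether the bad bytes land at any regular spacing, which might hint at where in the read path they are introduced.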
I'm wondering if a possible cause is the controller ignoring read errors
from the hard drive, but I would think more than the occasional single
byte would be changed. I'm thinking about trying to dd the file from the
raw device to see whether the problem is occurring in the filesystem
code or at a lower level yet. Any suggestions on how to locate the file
on the disk or how to isolate the problem better are welcome. I don't
mind doing the work; I just have no idea where to look or what to try
next.

Thanks,
Jonathan