Date: Tue, 5 Jul 2005 20:24:41 -0300 (ADT)
From: "Marc G. Fournier" <scrappy@hub.org>
To: freebsd-stable@freebsd.org
Subject: FreeBSD 4.x - SATA problems ... ?
Message-ID: <20050705195656.B940@ganymede.hub.org>

Recently, I added a new server to our network, using the 3Ware RAID controller (the 9500S-4LP card) and 3x140G SATA drives ... overall, the system works, but I'm getting a very odd behaviour that I've never seen before ...

I have a process that runs an rsync from another server to 'duplicate' the VPSs ... a 'live backup' sort of thing ... this is running on all our servers, without incident, *except*, it appears, the SATA server ... I had disabled it for a time, and just re-enabled it this morning, and somehow or another, it seems to be causing file system corruption ...

As most 'old timers' here know, we use UNIONFS on all our servers ... when the corruption occurs, it looks like the "directory structures" are being changed ... this one is hard to explain :(

For example, /usr/local/cyrus/bin has a bunch of binaries in it ... the binaries are kept on the "lower layer", so the upper layer only has a /usr/local/cyrus/bin directory created/ghosted, but no copies of the binaries ... so, when you are in the VPS, and do an ls of that directory, you see:

# ls /usr/local/cyrus/bin
arbitron      cyr_expire  lmtpd       notifyd      smmapd
chk_cyrus     cyrdump     masssievec  pop3d        squatter
ctl_cyrusdb   deliver     master      pop3proxyd   timsieved
ctl_deliver   fud         mbexamine   quota        tls_prune
ctl_mboxlist  imapd       mbpath      reconstruct
cvt_cyrusdb   ipurge      mkimap      sievec

When the 'corruption' happens, those all disappear, almost as if someone did a 'rm -rf' of the directory within the VPS, and then a 'mkdir' ... except that, from what I've been able to tell, this only happens randomly, it happens on any of the VPSs *and* only around the time that the rsync process is running ...

As if, somehow, the rsync is taxing the system and causing bad writes ... but I can't find anything anywhere to indicate a problem ...

To "fix" things, I umount the UNIONFS layer, and then do a 'find / cpio' to copy the "top layer" back over to fix the directory structure itself ...
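
Since the listings only seem to vanish around the time the rsync runs, one way to pin down the timing might be a small watcher run every minute or so from cron. The following is only a sketch under assumptions: the function names `count_entries` and `check_dir`, the entry threshold, and the log path are all hypothetical and not part of the setup described above. It records a timestamp, and whether an rsync process is visible, whenever a watched directory shows fewer entries than expected.

```shell
#!/bin/sh
# Sketch only: names, threshold, and log location are assumptions.

# Print the number of entries visible in a directory (excluding . and ..).
# A missing or unreadable directory counts as 0.
count_entries() {
    ls -A "$1" 2>/dev/null | wc -l | tr -d ' '
}

# check_dir DIR MIN LOG
# If DIR shows fewer than MIN entries, append a timestamped line to LOG
# noting the count and whether an rsync process is currently running.
# Returns 1 when the directory looks emptied, 0 otherwise.
check_dir() {
    dir=$1
    min=$2
    log=$3
    n=$(count_entries "$dir")
    if [ "$n" -lt "$min" ]; then
        # The [r] trick keeps grep from matching its own process entry.
        if ps ax | grep '[r]sync' >/dev/null 2>&1; then
            rs=yes
        else
            rs=no
        fi
        echo "$(date '+%Y-%m-%d %H:%M:%S') $dir entries=$n rsync_running=$rs" >> "$log"
        return 1
    fi
    return 0
}
```

Run from inside a VPS, something like `check_dir /usr/local/cyrus/bin 20 /var/log/dirwatch.log` would leave a trail that could be lined up against the rsync schedule; the threshold of 20 and the log path are placeholders to adjust.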

The thing is, I don't even know *where* to begin debugging this issue, since there aren't any errors being reported anywhere ... but maybe someone out there has an idea?

thanks ...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664