From: Cristian KLEIN
Date: Mon, 26 Nov 2007 12:58:09 +0200
To: Gert Lynge
Cc: freebsd-questions@freebsd.org
Subject: Re: SV: RAID1 synchronisation - howto OR not necessary?
Message-ID: <474AA6C1.6040908@net.utcluj.ro>

Gert Lynge wrote:
>> The disks themselves handle the checksumming to detect bad blocks.
>> With modern disks it is *very* rare that a block on the disk goes bad
>> without the disk being able to report it as such.
>> This means that if you have a functioning RAID1 setup and one of the
>> disks reports a bad block, then the controller can simply read the
>> corresponding block from the other disk, and rewrite it to the disk
>> with the bad block. If a disk has problems writing a block it will
>> transparently re-map the block to another.
>> The problems can occur when one disk in a RAID-array has failed and you
>> try to rebuild it from the other disk(s). If you then encounter a bad block
>> on that disk you have a problem since you don't have a good copy of that
>> block.
>> This is what verification (which, btw, is not the same as synchronization)
>> tries to prevent by reading every block on each disk on a regular basis.
>> Then the RAID controller can recover the data on any bad blocks from the
>> other disk(s) in the array.
>
> I've been wondering how to do this with a BIOS assisted soft raid for some
> time.
> I have a server with ad4 ad6 in a mirror detected as ar0:
> ----
> ws# atacontrol status ar0
> ar0: ATA RAID1 subdisks: ad4 ad6 status: READY
> ----
> ws# cat /var/run/dmesg.boot
> [...]
> ar0: 76316MB status: READY
> ar0: disk0 READY (master) using ad4 at ata2-master
> ar0: disk1 READY (mirror) using ad6 at ata3-master
> [...]
> ----
>
> ...and was wondering if dd could not do the job for me?
> ----
> ws# man dd
> [...]
> EXAMPLES
> Check that a disk drive contains no bad blocks:
> dd if=/dev/ad0 of=/dev/null bs=1m
> [...]
> ----
>
> What if I run:
> dd if=/dev/ad4 of=/dev/null bs=1m
> dd if=/dev/ad6 of=/dev/null bs=1m
>
> ...once a week - will that not verify that the two drives can read all
> blocks?
>
> It would be nice to limit the load (the throughput of dd) though - anyone
> know if that is possible? Maybe by piping through a second command (I guess
> a throughput limiter could easily be programmed?).

Hi,

To achieve this, I use smartmontools and configure smartd to regularly issue
an "offline test" to the drive. I receive a mail if any bad sector is found.

The good thing is that this verification happens in the drive itself, and
reading from or writing to the drive automatically suspends the test. As a
result, the test appears to run without any performance penalty.

The bad thing is that this verification happens in the drive itself. If the
drive has faulty firmware[1], or if other errors occur (such as problems with
IDE cables), these won't be detected.

All in all, smartd + geom_mirror gives me more confidence that I won't lose
data.
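
On the throttling question quoted above: one way to limit the load without any extra tools is to read the disk in slices and sleep between them. The following is only a rough sketch, not something I have run against a failing disk. The function name throttled_read, the default slice size and the default pause are all made up for illustration, and it assumes that dd exits non-zero on a read error and prints the usual "N+M records in" summary.

```shell
#!/bin/sh
# throttled_read DEVICE [CHUNK_MIB] [PAUSE_SECONDS]
# Hypothetical helper: reads DEVICE front to back in CHUNK_MIB-sized slices
# and sleeps PAUSE_SECONDS between slices so that other I/O gets a turn.
# Prints the number of full 1 MiB blocks read; returns 1 on a read error.
throttled_read() {
    src=$1
    chunk=${2:-16}   # MiB read per slice (assumed default)
    pause=${3:-1}    # seconds to sleep between slices (assumed default)
    skip=0           # offset of the next slice, in 1 MiB blocks

    while :; do
        # bs is given in bytes because the size suffix differs between
        # BSD dd (1m) and GNU dd (1M). LC_ALL=C keeps dd's summary in English.
        if ! out=$(LC_ALL=C dd if="$src" of=/dev/null bs=1048576 \
                       skip="$skip" count="$chunk" 2>&1); then
            echo "read error on $src near MiB offset $skip" >&2
            printf '%s\n' "$out" >&2
            return 1
        fi

        # dd reports "N+M records in"; N is the number of full blocks read.
        full=$(printf '%s\n' "$out" |
               sed -n 's/^\([0-9][0-9]*\)+[0-9][0-9]* records in.*$/\1/p')
        if [ -z "$full" ]; then
            echo "could not parse dd output for $src" >&2
            return 2
        fi

        skip=$((skip + full))
        # A short slice means the end of the device was reached.
        if [ "$full" -lt "$chunk" ]; then
            break
        fi
        sleep "$pause"
    done
    echo "$skip"
}
```

Something like `throttled_read /dev/ad4 16 1` followed by `throttled_read /dev/ad6 16 1` would then read at most about 16 MiB per second from each disk. Like the plain dd commands, this only checks that every block can be read; it does not compare the two halves of the mirror.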