Date: Wed, 03 Aug 2016 19:50:33 -0500
From: Brandon J. Wandersee <brandon.wandersee@gmail.com>
To: "William A. Mahaffey III" <wam@hiwaay.net>
Cc: freebsd-questions@freebsd.org
Subject: Re: Ominous smartd messages ....
Message-ID: <86eg658ihy.fsf@WorkBox.Home>
In-Reply-To: <e622244a-7836-068d-0554-1898689531a4@hiwaay.net>
References: <e5a65f8a-27a0-65e7-42db-28bef824e0c0@hiwaay.net> <020caa94-b329-d5a6-5bd4-bfcc575c039f@freebsd.org> <b20550ac-1637-07d4-20dc-7cdc3a5173a9@hiwaay.net> <4b35b969-606b-9084-5ce3-688eddfc5e70@FreeBSD.org> <e622244a-7836-068d-0554-1898689531a4@hiwaay.net>
William A. Mahaffey III writes:

> On 08/03/16 15:19, Matthew Seaman wrote:
>> On 03/08/2016 20:13, William A. Mahaffey III wrote:
>>> What does this mean ?
>>
>> That there's a bad spot on the disk, which may also mean that you've
>> got a corrupted filesystem -- depends if the bad spot was in use by
>> zfs or not. 'zpool scrub' should tell you if the filesystem is
>> corrupted.
>
> Can I do that 'zpool scrub' live ?

Ordinarily, yes. A scrub might lower performance a little while it's underway, but it's safe to use the system while you do it. However, depending on how much data you have on that pool, a scrub can take a long time to finish. A scrub of the measly ~1.8TB on my pool takes the better part of five hours to complete.

The risk I would worry about in this particular situation is whether leaving the system running long enough for a scrub to complete would result in more sectors on the disk failing, in an area the scrub had already passed over. If that happened, you'd wind up with more corrupted files (assuming there already are corrupted files in the first place due to a filesystem problem). Finding and fixing those would mean running another scrub, taking up twice the time. Ordinarily, then, I'd recommend running the scrub after replacing the disk.

In this particular situation, if you want to try to get out of this with absolutely no corrupted files, then if at all possible use `zfs send | zfs receive` to clone the existing pool to a new pool on another machine, and run the scrub there. The problem is that if you intend to recreate your current pool in a RAIDZ layout you'll need to back up your data, and if you back up your data using rsync (as you have been) and then restore it to the new pool using rsync, the checksums for the previously good files will be lost and the corrupted files will be given new checksums. ZFS won't realize they're corrupted.
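For concreteness, the scrub and send/receive steps above might look something like the sketch below. The pool names (`tank`, `backup`) and host name (`otherhost`) are hypothetical placeholders, not anything from this thread -- substitute your own:

```shell
# Start a scrub on the live pool -- safe while the system is in use,
# though it may cost some I/O performance until it finishes:
zpool scrub tank

# Check scrub progress and any errors found or repaired:
zpool status tank

# To clone the pool to another machine, take a recursive snapshot and
# stream it with send/receive:
zfs snapshot -r tank@migrate
zfs send -R tank@migrate | ssh otherhost zfs receive -F backup
```

The point of send/receive over rsync, as noted above, is that ZFS's own block checksums travel with the data, so a scrub on the receiving side can still tell good files from bad.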
Bear in mind, though, that none of this is to say that any of your files currently are corrupted or will be corrupted. This is just a "best approach to worst case" as I see it.

> I was/am already thinking along those lines, w/ 1 complication. I have
> another box (NetBSD 6.1.5) w/ a RAID5 that I wound up building w/
> mis-aligned disk/RAID blocks in spite of a fair amount of effort to
> avoid that. I/O writes are horrible, 15-20 MB/s. My understanding is
> that RAIDZn is like RAID5 in many ways & that you always want 2^n+1
> (3,5,9, ...) drives in a RAID5 to mitigate those misalignments,
> presumably in a RAIDZ also. Is that so w/ RAIDZ as well ? If so, I lose
> more than a small amount of total storage, which is why I went as I did
> when I built the box whenever that was.

I don't have enough knowledge/experience with RAIDZ to answer your specific questions, but if nothing else you could still combine the disks into mirrored vdevs, which are more flexible than RAIDZ but slightly less robust. You'd have a maximum of half the storage space, and more redundancy than you have now (though significantly less redundancy than with a RAIDZ setup).

-- 
:: Brandon J. Wandersee
:: brandon.wandersee@gmail.com
:: --------------------------------------------------
:: 'The best design is as little design as possible.'
:: --- Dieter Rams ----------------------------------
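P.S. A sketch of the mirrored-vdev layout I mention, with hypothetical device names (ada0 through ada3 -- yours will differ). The sysctl line is one way on FreeBSD to force 4K-aligned allocation at pool creation, which speaks to the alignment worry from the RAID5 box:

```shell
# Make new vdevs use at least 4K-aligned allocation (ashift=12), so a
# disk misreporting its sector size can't cause the kind of write
# misalignment described above (FreeBSD sysctl):
sysctl vfs.zfs.min_auto_ashift=12

# Two two-way mirrors from four disks: half the raw capacity, and any
# one disk in each mirror can fail. "newtank" is a placeholder name.
zpool create newtank mirror ada0 ada1 mirror ada2 ada3

# The RAIDZ alternative for comparison (one disk's worth of parity):
# zpool create newtank raidz ada0 ada1 ada2 ada3
```

Mirrored vdevs are also easier to grow later: you can attach a new pair as another mirror, whereas a RAIDZ vdev's width is fixed once created.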