Date: Tue, 24 Mar 2020 16:47:10 -0700
From: David Christensen <dpchrist@holgerdanske.com>
To: Lukasz <FreeBSD@chroot.pl>
Cc: freebsd-questions@freebsd.org
Subject: Re: replace disk in zpool
Message-ID: <4a8d409e-ecac-77c8-3ad9-025aefdfb4ef@holgerdanske.com>
In-Reply-To: <f6297dfe-e0c4-12ef-523c-1944a9c735ff@chroot.pl>
References: <d329c84a-8777-1eca-787c-dad9e0eae752@chroot.pl> <18a94704-5411-3b44-a525-2ae50121a467@holgerdanske.com> <f6297dfe-e0c4-12ef-523c-1944a9c735ff@chroot.pl>

On 2020-03-24 14:15, Lukasz wrote:
> Ohh… I forgot mention:
> it's 12.1-p3
>
> # zpool status -v mypool
>   pool: mypool
>  state: DEGRADED
> status: One or more devices has experienced an error resulting in data
>         corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore the
>         entire pool from backup.
>    see: http://illumos.org/msg/ZFS-8000-8A
>   scan: resilvered 180G in 0 days 16:00:55 with 2 errors on Sun Mar 22
>         05:18:46 2020
> config:
>
>         NAME                             STATE     READ WRITE CKSUM
>         mypool                           DEGRADED     0     0     2
>           raidz1-0                       DEGRADED     0     0     4
>             diskid/DISK-WD-WMC1F0521131  ONLINE       0     0     0
>             replacing-1                  DEGRADED     0     0     0
>               15838717335844820448       UNAVAIL      0     0     0  was /dev/diskid/DISK-WD-WCC130964640
>               diskid/DISK-K4JG5D2B       ONLINE       0     0     0
>             ada6                         ONLINE       0     0     0
>             ada1                         ONLINE       0     0     0
>             diskid/DISK-WD-WCC130650055  ONLINE       0     0     0
>
> errors: Permanent errors have been detected in the following files:
>         mypool/XXXXXXXXXXXX
>
> Yes, I did exacly as you wrote - removed the failed drive, installed a replacement drive, and issued a 'zpool replace' command.
> I tried this way to:
> I disabled running services in that pool, unmounted and mounted it again. Even I exported/imported that pool.
> It has no readonly property.
> Of course I have a backup.

My guess is that resilvering is stuck because ZFS has encountered data
corruption. This could be caused by drive(s), cable(s), and/or data
port(s) (motherboard or expansion card).

What was the failure mode of the bad drive? Did you test it in any
other machines?

Are there any items of concern in the SMART reports for the current set
of drives? Please post anything that looks questionable.

Unplug and replug all of your drive power and data cables. Make sure
they seat well. If unsure about a data cable, replace it with a new,
locking cable.

I have experienced too many problems with red SATA cables. Few, if any,
are marked with their rated speed (I did mark some StarTech SATA III
cables).
So, I stocked up on various lengths and configurations of Cable Matters
SATA III cables. They are black, marked "6G", and have locking
connectors. Now, whenever I am in a system case, I replace most every
red SATA cable just to be safe.

It appears that you have Western Digital hard drives. Download Data
Lifeguard Diagnostic (DLG) for DOS, burn it to a USB flash drive, boot
it, and test all of your drives. Please post the results:

https://support.wdc.com/downloads.aspx?p=2

David
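
When checking which devices to look at first, the rows of the 'zpool status' config table that are not clean can be picked out mechanically. What follows is only a minimal Python sketch, not part of any ZFS tooling: the function name suspect_vdevs and the regular expression are assumptions made for illustration, and it only understands the plain five-column layout (NAME STATE READ WRITE CKSUM) seen in the quoted output.

```python
import re

# One config row: name, state, then the READ/WRITE/CKSUM counters.
# Counters may carry a unit suffix on busy pools (for example "1.2K").
# Trailing notes such as "was /dev/..." are ignored.
_COUNTER = r"(\d[\d.]*[KMGTP]?)"
_ROW = re.compile(
    r"^\s*(\S+)\s+([A-Z]+)\s+" + _COUNTER + r"\s+" + _COUNTER + r"\s+" + _COUNTER + r"(?:\s|$)"
)


def suspect_vdevs(status_text):
    """Return (name, state, read, write, cksum) for rows that are not clean.

    A row counts as suspect if its state is not ONLINE or if any of its
    three error counters is not "0". Counters are kept as strings.
    """
    suspects = []
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_table = True      # the table starts after the header row
            continue
        if line.strip().startswith("errors:"):
            break                # the table ends before the error summary
        if not in_table:
            continue
        match = _ROW.match(line)
        if match is None:
            continue             # blank lines and anything unrecognised
        name, state, read, write, cksum = match.groups()
        if state != "ONLINE" or (read, write, cksum) != ("0", "0", "0"):
            suspects.append((name, state, read, write, cksum))
    return suspects
```

Fed the text printed by 'zpool status -v mypool', it returns the pool, vdev, and device rows whose state or counters deserve a closer look. It does not replace reading the SMART reports or running the vendor diagnostic.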