Date: Wed, 25 Mar 2020 09:18:14 +0100
From: Jacques Foucry
To: freebsd-questions@freebsd.org
Subject: Re: replace disk in zpool
Message-ID: <20200325081814.GK35528@mithril.foucry.net>
In-Reply-To: <4a8d409e-ecac-77c8-3ad9-025aefdfb4ef@holgerdanske.com>

On Tuesday, 24 March 2020 at 16:47:10 (-0700), David Christensen wrote:
> On 2020-03-24 14:15, Lukasz wrote:
> > Ohh… I forgot to mention:
> > it's 12.1-p3
> >
> > # zpool status -v mypool
> >   pool: mypool
> >  state: DEGRADED
> > status: One or more devices has experienced an error resulting in data
> >         corruption.  Applications may be affected.
> > action: Restore the file in question if possible.  Otherwise restore the
> >         entire pool from backup.
> >    see: http://illumos.org/msg/ZFS-8000-8A
> >   scan: resilvered 180G in 0 days 16:00:55 with 2 errors on Sun Mar 22 05:18:46 2020
> > config:
> >
> >         NAME                              STATE     READ WRITE CKSUM
> >         mypool                            DEGRADED     0     0     2
> >           raidz1-0                        DEGRADED     0     0     4
> >             diskid/DISK-WD-WMC1F0521131   ONLINE       0     0     0
> >             replacing-1                   DEGRADED     0     0     0
> >               15838717335844820448        UNAVAIL      0     0     0  was /dev/diskid/DISK-WD-WCC130964640
> >               diskid/DISK-K4JG5D2B        ONLINE       0     0     0
> >             ada6                          ONLINE       0     0     0
> >             ada1                          ONLINE       0     0     0
> >             diskid/DISK-WD-WCC130650055   ONLINE       0     0     0
> >
> > errors: Permanent errors have been detected in the following files:
> >
> >         mypool/XXXXXXXXXXXX
> >
> > Yes, I did exactly as you wrote: removed the failed drive, installed a
> > replacement drive, and issued a 'zpool replace' command.
> > I also tried this:
> > I disabled the services running in that pool, then unmounted and mounted it
> > again. I even exported and imported the pool.
> > It has no readonly property.
> > Of course I have a backup.
>
> My guess is that resilvering is stuck because ZFS has encountered data
> corruption. This could be caused by drive(s), cable(s), and/or data port(s)
> (motherboard or expansion card).
>
> What was the failure mode of the bad drive? Did you test it in any other
> machines?
>
> Are there any items of concern in the SMART reports for the current set of
> drives? Please post anything that looks questionable.
>
> Unplug and replug all of your drive power and data cables. Make sure they
> seat well. If unsure about a data cable, replace it with a new, locking
> cable. I have experienced too many problems with red SATA cables. Few, if
> any, are marked with their rated speed (I did mark some StarTech SATA III
> cables). So, I stocked up on various lengths and configurations of Cable
> Matters SATA III cables. They are black, marked "6G", and have locking
> connectors. Now, whenever I am in a system case, I replace most every red
> SATA cable just to be safe.
>
> It appears that you have Western Digital hard drives.
> Download Data Lifeguard Diagnostic (DLG) for DOS, burn it to a USB flash
> drive, boot it, and test all of your drives. Please post the results:
>
> https://support.wdc.com/downloads.aspx?p=2

If I may offer some advice: whenever possible, ALWAYS buy and use disks from
different brands (mix Seagate, WD, etc.) so that the drives in a pool do not
all come from the same series with the same MTBF.

I know this is too late to help in this case, but keep it in mind. Please
excuse my intervention if it's inappropriate.

-- 
Jacques Foucry