From: David Christensen
To: Lukasz
Cc: freebsd-questions@freebsd.org
Subject: Re: replace disk in zpool
Date: Tue, 24 Mar 2020 16:47:10 -0700
Message-ID: <4a8d409e-ecac-77c8-3ad9-025aefdfb4ef@holgerdanske.com>
References: <18a94704-5411-3b44-a525-2ae50121a467@holgerdanske.com>

On 2020-03-24 14:15, Lukasz wrote:
> Ohh… I forgot mention:
> it's 12.1-p3
>
> # zpool status -v mypool
>   pool: mypool
>  state: DEGRADED
> status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>    see: http://illumos.org/msg/ZFS-8000-8A
>   scan: resilvered 180G in 0 days 16:00:55 with 2 errors on Sun Mar 22 05:18:46 2020
> config:
>
>         NAME                              STATE     READ WRITE CKSUM
>         mypool                            DEGRADED     0     0     2
>           raidz1-0                        DEGRADED     0     0     4
>             diskid/DISK-WD-WMC1F0521131   ONLINE       0     0     0
>             replacing-1                   DEGRADED     0     0     0
>               15838717335844820448        UNAVAIL      0     0     0  was /dev/diskid/DISK-WD-WCC130964640
>               diskid/DISK-K4JG5D2B        ONLINE       0     0     0
>             ada6                          ONLINE       0     0     0
>             ada1                          ONLINE       0     0     0
>             diskid/DISK-WD-WCC130650055   ONLINE       0     0     0
>
> errors: Permanent errors have been detected in the following files:
>         mypool/XXXXXXXXXXXX
>
> Yes, I did exacly as you wrote - removed the failed drive, installed a
> replacement drive, and issued a 'zpool replace' command.
> I tried this way to:
> I disabled running services in that pool, unmounted and mounted it again.
> Even I exported/imported that pool.
> It has no readonly property.
> Of course I have a backup.

My guess is that resilvering is stuck because ZFS has encountered data
corruption. This could be caused by drive(s), cable(s), and/or data
port(s) (motherboard or expansion card).

What was the failure mode of the bad drive? Did you test it in any
other machines?

Are there any items of concern in the SMART reports for the current set
of drives? Please post anything that looks questionable.

Unplug and plug all of your drive power and data cables. Make sure they
seat well. If unsure about a data cable, replace it with a new, locking
cable.

I have experienced too many problems with red SATA cables. Few, if any,
are marked with their rated speed (I did mark some StarTech SATA III
cables). So, I stocked up on various lengths and configurations of
Cable Matters SATA III cables. They are black, marked "6G", and have
locking connectors. Now, whenever I am in a system case, I replace most
every red SATA cable just to be safe.

It appears that you have Western Digital hard drives. Download Data
Lifeguard Diagnostic (DLG) for DOS, write it to a USB flash drive, boot
it, and test all of your drives. Please post the results:

https://support.wdc.com/downloads.aspx?p=2

David
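
P.S. When deciding what "looks questionable", it may help to pull out
only the rows of 'zpool status' that are not ONLINE or that show
nonzero error counters. The following is a rough Python sketch, not a
finished tool. The names STATUS, ROW, and questionable() are made up for
illustration, and the sample text is just the config section quoted
above. It assumes the counters are plain integers; real output may
abbreviate large counts, which this pattern would skip.

```python
import re

# Sample input: the config section of the 'zpool status' output quoted above.
STATUS = """\
NAME                              STATE     READ WRITE CKSUM
mypool                            DEGRADED     0     0     2
  raidz1-0                        DEGRADED     0     0     4
    diskid/DISK-WD-WMC1F0521131   ONLINE       0     0     0
    replacing-1                   DEGRADED     0     0     0
      15838717335844820448        UNAVAIL      0     0     0  was /dev/diskid/DISK-WD-WCC130964640
      diskid/DISK-K4JG5D2B        ONLINE       0     0     0
    ada6                          ONLINE       0     0     0
    ada1                          ONLINE       0     0     0
    diskid/DISK-WD-WCC130650055   ONLINE       0     0     0
"""

# Hypothetical pattern: name, a known state keyword, then three integer counters.
ROW = re.compile(
    r"^\s*(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+(\d+)\s+(\d+)\s+(\d+)"
)


def questionable(status_text):
    """Return (name, state, read, write, cksum) for rows that are not
    ONLINE or that have any nonzero error counter."""
    rows = []
    for line in status_text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header line or anything that is not a device row
        name, state = match.group(1), match.group(2)
        read, write, cksum = (int(match.group(i)) for i in (3, 4, 5))
        if state != "ONLINE" or read or write or cksum:
            rows.append((name, state, read, write, cksum))
    return rows


if __name__ == "__main__":
    for row in questionable(STATUS):
        print(*row)
```

On a live system the text would come from 'zpool status mypool' instead
of the embedded sample. The SMART reports still need to be read
separately.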