From: Michelle Sullivan
Date: Wed, 08 May 2019 21:29:18 +1000
Subject: Re: ZFS...
To: Borja Marcos, Walter Parker
Cc: freebsd-stable@freebsd.org
Message-id: <6d3274c5-130f-8398-f272-af01d9551448@sorbs.net>

Borja Marcos via freebsd-stable wrote:
>
>> On 8 May 2019, at 05:09, Walter Parker wrote:
>> Would a disk rescue program for ZFS be a good idea? Sure. Should the lack
>> of a disk recovery program stop you from using ZFS? No. If you think so, I
>> suggest that you have your data integrity priorities in the wrong order
>> (focusing on small, rare events rather than the common base case).
> ZFS is certainly different from other filesystems. Its self healing capabilities help it survive problems
> that would destroy others. But if you reach a level of damage past that “tolerable” threshold consider
> yourself dead.

bingo.

>
> Is it possible at all to write an effective repair tool? It would be really complicated.

which is why I don't think a 'repair tool' is the correct way to go..
I get the ZFS devs saying 'no' to it, I really do. A tool to scan and salvage (if possible) the data on it is what it needs, I think... copy off, rebuild the structure (reformat) and copy back.

This tool is what I was pointed at:

https://www.klennet.com/zfs-recovery/default.aspx

... no idea if it works yet.. but if it does what it says it does, it is the 'missing link' I'm looking for... I am just having issues getting Windows 7 with SP1 on a USB stick to get .NET 4.5 on it to run the software... :/ (only been at it 2 days though, so time yet.)

>
> By the way, ddrescue can help in a multiple drive failure scenario with ZFS.

Been there done that - that's how I rescued it when it was damaged in shipping.. though I think I used 'recoverdisk' rather than ddrescue ... pretty much the same thing if not the same code. Sector copied all three dead drives to new drives, put the three dead back in, brought them back online and then let it resilver... the data was recovered intact and is not reporting any permanent errors.

> If some of the drives are
> showing the typical problem of “flaky” sectors with a lot of retries slowing down the whole pool you can
> shut down the system or at least export the pool, copy the required drive/s to fresh ones, replace the
> flaky drives and try to import the pool. I would first do the experiment to make sure it’s harmless,
> but ZFS relies on labels written on the disks to import a pool regardless of disk controller topology,
> device names, uuids, or whatever. So a full disk copy should work.

Don't need to test it... been there done that - it works.

>
> Michelle, were you doing periodic scrubs? I’m not sure you mentioned it.
>
>

Yes, though once a month, as it took 2 weeks to complete.

Michelle

--
Michelle Sullivan
http://www.mhix.org/
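
For anyone wanting to try the drive-copy procedure discussed above (export the pool, copy each flaky drive to a fresh one, swap the drives, import the pool), here is a minimal dry-run sketch. It only builds the command list and runs nothing. The pool name "tank", the device paths, the map-file directory and the helper name plan_drive_copy are all made up for illustration, and it assumes GNU ddrescue; recoverdisk would need its own arguments.

```python
import shlex


def plan_drive_copy(pool, copies, mapdir="/root"):
    """Return the shell commands for the copy procedure, in order.

    Nothing is executed here. `pool`, the device paths in `copies`
    and `mapdir` are hypothetical placeholders, not values from the thread.
    """
    # Take the pool offline so the flaky drives are no longer in use.
    cmds = [f"zpool export {shlex.quote(pool)}"]
    for src, dst in copies:
        # One ddrescue map file per source drive so a copy can be resumed.
        mapfile = f"{mapdir}/{src.split('/')[-1]}.map"
        # -f is required by GNU ddrescue when the output is a device.
        cmds.append(
            "ddrescue -f "
            f"{shlex.quote(src)} {shlex.quote(dst)} {shlex.quote(mapfile)}"
        )
    # Physically replace the flaky drives with the copies before importing;
    # ZFS finds the pool through the on-disk labels, not the device names.
    cmds.append(f"zpool import {shlex.quote(pool)}")
    cmds.append(f"zpool status -v {shlex.quote(pool)}")
    return cmds


if __name__ == "__main__":
    for cmd in plan_drive_copy("tank", [("/dev/da3", "/dev/da6")]):
        print(cmd)
```

Review the printed list and run the commands by hand; a destination device typed wrong will be overwritten.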