From: Michelle Sullivan <michelle@sorbs.net>
Subject: Re: ZFS...
Date: Sun, 05 May 2019 13:06:43 +1000
To: Chris
Cc: Pete French, FreeBSD Stable <freebsd-stable@freebsd.org>
List-Id: Production branch of FreeBSD source code <freebsd-stable@freebsd.org>
Michelle Sullivan
http://www.mhix.org/

Sent from my iPad

> On 05 May 2019, at 05:36, Chris wrote:
>
> Sorry, to clarify, Michelle: I do believe your tale of events, I just
> meant that it reads like a tale as it's so unusual.

There are multiple separate instances of problems over 8 years, but the final killer was without a doubt a catalogue of disasters...

>
> I also agree that there probably, at this point in time, should be more
> zfs tools written for the few situations that do happen when things
> get broken.

This is my thought... though I am in agreement with the devs that a ZFS "fsck" is not the way to go. I think we (anyone using ZFS) need to have a "salvage what data you can to elsewhere" type tool...
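A crude version of that "salvage to elsewhere" workflow is already approximated by stock OpenZFS tools — a hedged sketch, assuming a damaged pool named `storage`, an existing snapshot `@last-good`, and a healthy pool `rescue` to receive the data (none of these names come from the thread, and these commands need a live system to run):

```shell
# Inspect the exported, damaged pool with zdb before touching it.
zdb -e -d storage

# Import read-only, forcing (-f), asking for a rewind to an earlier
# transaction group if the newest one is corrupt (-F), without
# mounting any datasets yet (-N).
zpool import -o readonly=on -f -F -N storage

# Replicate whatever datasets still read cleanly to another pool.
# (send works from existing snapshots even on a read-only import.)
zfs send -R storage/data@last-good | zfs recv -F rescue/data
```

This only salvages what the import and rewind machinery can still reach; a dedicated recovery tool, as discussed above, would go further by scavenging blocks a normal import refuses to touch.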
I am yet to explore the one written under Windows that a dev sent me, to see if that works (only because of the logistics of getting a Windows 7 image onto a USB drive that I can put into the server for recovery attempts). If it works, a command-line version would be the real answer to my prayers (and others', I imagine).

>
> Although I still stand by my opinion that ZFS is a huge amount more
> robust than UFS. UFS always felt like I only had to sneeze the wrong
> way and I would get issues. There was even one occasion when simply
> installing the OS on its defaults gave me corrupted data on UFS (the 9.0
> release had a nasty UFS journalling bug which corrupted data without any
> power cuts etc.).

Which I find interesting in itself, as I have a machine running 9.3 which started life as a 5.x (which tells you how old it is), and it's still running on the same *Compaq* RAID5 with UFS on it... with the original drives, with a hot spare that still hasn't been used... and the only thing done to it hardware-wise is that I replaced the motherboard 12 months ago when it just stopped POSTing and I couldn't work out what had failed... never had a drive corruption barring the fscks following hard power issues... it went with me from Brisbane to Canberra, back to Brisbane in the back of a car, then to Malta, back from Malta, and is still downstairs... it's my primary MX server and primary resolver for home and handles around 5k emails per day...

>
> In future I suggest you use mirror if the data matters. I know it
> costs more in capacity for redundancy but in today's era of large
> drives it's the only real sensible option.

Now it is, and it was on my list of things to start just before this happened... in fact I have already got 4*6T drives to copy everything off, ready to rebuild the entire pool with 16*6T drives in a RAID-10-like config... the power/corruption beat me to it.
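For reference, a 16-drive "RAID-10-like" ZFS layout is a pool of eight striped two-way mirrors. A hedged sketch with a hypothetical pool name and device names (the thread does not give the actual layout or devices, and the command needs real disks to run):

```shell
# Eight two-way mirror vdevs striped together: roughly 48T usable
# from 16 x 6T drives, and the pool survives one drive failure per
# mirror pair.
zpool create tank \
  mirror da0 da1   mirror da2 da3 \
  mirror da4 da5   mirror da6 da7 \
  mirror da8 da9   mirror da10 da11 \
  mirror da12 da13 mirror da14 da15
```

Compared with a wide RAID-Z vdev, the mirrored layout also resilvers a failed drive much faster, since only its mirror partner is read.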
>
> On the drive failures you have clearly been quite unlucky, and the
> other stuff is unusual.
>

Drive-failure-wise, I think my "luck" has been normal... remember this is an 8-year-old system and the drives are only certified for 3 years... getting 5 years at 24x7 is not bad (especially considering its workload). The problem has always been how ZFS copes, and this has been getting better over time, but this metadata corruption is something I have seen similar to before, and that is where I have a problem with it... (especially when ZFS devs start making statements about how the system is always right and everything else is down to hardware, and that if you're not running enterprise hardware you deserve what you get... then advocating installing it on laptops etc.!)

> Best of luck

Thanks, I'll need it, as my changes to the code did not allow the mount, though they did allow zdb to parse the drive... it seems what I thought was there in zdb is not the same code as in the zfs module.

Michelle

>
>> On Sat, 4 May 2019 at 09:54, Pete French wrote:
>>
>>
>>> On 04/05/2019 01:05, Michelle Sullivan wrote:
>>> New batteries are only $19 on eBay for most battery types...
>>
>> Indeed, my problem is actual physical access to the machine, which I
>> haven't seen in ten years :-) I even have a replacement server sitting
>> behind my desk which we never quite got around to installing. I think
>> the next move it makes will be to the cloud though, so I am not too worried.
_______________________________________________
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"