From: Roland Smith
To: "Svein Skogen (Listmail account)"
Cc: freebsd-questions@freebsd.org
Subject: Re: ZFS License and Future
Date: Mon, 8 Nov 2010 19:38:22 +0100
Message-ID: <20101108183821.GA48373@slackbox.erewhon.net>
In-Reply-To: <4CD82081.50309@stillbilde.net>
References: <20101106203016.GB13095@guilt.hydra> <20101106213836.GA77198@slackbox.erewhon.net> <4CD8194D.7080208@qeng-ho.org> <4CD82081.50309@stillbilde.net>

On Mon, Nov 08, 2010 at 05:08:33PM +0100, Svein Skogen (Listmail account) wrote:
> >>> The GEOM_ELI class provides optional authentication/checksumming. See
> >>> geli(8),
> >>> especially the -a option.
> >> I'm not sure whether that would be a viable replacement, as it has to be
> >> a fairly good checksum to avoid clashes, whilst also being quick so it
> >> doesn't adversely affect disk performance. Also what does it do if it
> >> detects the checksum doesn't match etc?

Personally I've never enabled the checksumming because, and I quote from
geli(8), "This will reduce size of available storage and also reduce speed".

> > Good point. Geli uses a crypto standard hash (HMAC/SHA256 is
> > recommended) as it's all about authentication in the face of potentially
> > malicious attack, and that's fairly expensive. ZFS by default uses the
> > fletcher2 (= fletcher32) hash, which is simple and fast, as it's used to
> > make sure that hardware hasn't accidentally mangled your data.

But with geli(8) one can choose between HMAC/MD5, HMAC/SHA1, HMAC/RIPEMD160,
HMAC/SHA256, HMAC/SHA384 and HMAC/SHA512. With the recommended HMAC/SHA256
you'll lose 11% of the provider's capacity. Presumably MD5 is the fastest and
SHA512 the slowest, but MD5 also has a higher chance of collisions.

> But it's still not capable of true forward-error-correction. If we are
> to embark upon creating a new solution, using something that is cheap
> for "normal cases" but can still be used (albeit more expensively) for
> error recovery would (imho) be better. Even if that means we get less
> net storage out of the gross pool (it could perhaps be configurable?)

I'm not sure what you mean by "true forward-error-correction". But if you want
to make _really sure_ that a spinning disk hasn't mangled the data you should:

- Calculate a checksum of a data block in memory.
- Write the data block to disk, with write caching disabled to make as sure as
  possible that the data is on disk when the write finishes. That is a _huge_
  performance penalty.
- Read the data back from disk (and not from the cache!), calculate its
  checksum and compare it with the original checksum.
- If the checksums do not match, mark the block as bad and repeat at
  another location.

Personally I don't see how this is going to be fast without compromising on
correctness. If you keep the disk write cache enabled then, to the best of my
knowledge, there is no way for the OS to know for sure that the data is
actually on the platters, so the read-back and comparison stage might not
mean anything.

And for SSDs we might need another type of filesystem entirely. Some concepts
in UFS2 (e.g. cylinder groups) are pretty much useless on SSDs.

Roland
-- 
R.F.Smith                                   http://www.xs4all.nl/~rsmith/
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 1A2B 477F 9970 BA3C 2914 B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)
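
The write-and-verify steps listed in the message can be sketched in Python as
follows. This only illustrates the control flow and is not what any existing
filesystem does: the names write_verified, read_block and BLOCK_SIZE, and the
choice of SHA-256, are assumptions made for the example. On an ordinary file
the read-back will most likely be served from the OS page cache, and
os.fsync() may not force the drive's own write cache to be flushed, which is
the very limitation the message points out.

```python
import hashlib
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def checksum(data: bytes) -> bytes:
    """Checksum of a block; SHA-256 is an arbitrary choice for this sketch."""
    return hashlib.sha256(data).digest()


def write_verified(fd, data, offsets, bad_offsets, read_block=os.pread):
    """Write `data` at the first offset in `offsets` that reads back intact.

    Offsets that fail verification are added to the set `bad_offsets`.
    `read_block` is a hypothetical hook so that a bad read can be simulated.
    """
    expected = checksum(data)              # step 1: checksum in memory
    for offset in offsets:
        if offset in bad_offsets:
            continue                       # skip locations already marked bad
        os.pwrite(fd, data, offset)        # step 2: write the block
        os.fsync(fd)                       # flush OS buffers; the drive cache
                                           # may still be holding the data
        readback = read_block(fd, len(data), offset)  # step 3: read it back
        if checksum(readback) == expected:
            return offset                  # verified at this location
        bad_offsets.add(offset)            # step 4: mark bad, try elsewhere
    raise OSError("no candidate location passed verification")


if __name__ == "__main__":
    block = os.urandom(BLOCK_SIZE)
    bad = set()
    with tempfile.TemporaryFile() as f:
        where = write_verified(f.fileno(), block, [0, BLOCK_SIZE], bad)
        print("block stored at offset", where)
```

Every write in this sketch costs a flush plus a read, so even this toy version
shows where the performance penalty comes from.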