Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 18 Feb 2013 21:14:35 -0000
From:      "Steven Hartland" <smh@freebsd.org>
To:        "John-Mark Gurney" <jmg@funkthat.com>, <freebsd-fs@FreeBSD.org>
Subject:   Re: ZFS on 9.1 doesn't see errors on geli volumes...
Message-ID:  <EF8E2A888F8D44FBA7362610150DAEA1@multiplay.co.uk>
References:  <20130218191242.GI55866@funkthat.com> <20130218200121.GJ55866@funkthat.com>

next in thread | previous in thread | raw e-mail | index | archive | help
This is a multi-part message in MIME format.

------=_NextPart_000_0B02_01CE0E1C.F42EBD20
Content-Type: text/plain;
	format=flowed;
	charset="iso-8859-1";
	reply-type=original
Content-Transfer-Encoding: 7bit


----- Original Message ----- 
From: "John-Mark Gurney" <jmg@funkthat.com>
To: <freebsd-fs@FreeBSD.org>
Sent: Monday, February 18, 2013 8:01 PM
Subject: Re: ZFS on 9.1 doesn't see errors on geli volumes...


> John-Mark Gurney wrote this message on Mon, Feb 18, 2013 at 11:12 -0800:
>> I'm running 9.1:
>> FreeBSD gold.funkthat.com 9.1-PRERELEASE FreeBSD 9.1-PRERELEASE #26 r241041M: Wed Dec 12 23:02:31 PST 2012 
>> jmg@gold.funkthat.com:/usr/src.9stable/sys/amd64/compile/gold  amd64
>>
>> The modifications are limited to improving AES-NI performance.
>>
>> On a box, and decided to go full ZFS w/ geli encrypted volumes (including
>> root fs)...  One of the hard drives started going bad, so I started
>> seeing:
>> hpt27xx: Device error information 0x1000000
>> hpt27xx: Task file error, StatusReg=0x51, ErrReg=0x40, LBA[0-3]=0xf495e928,LBA[4-7]=0x0.
>> (da3:hpt27xx0:0:3:0): READ(10). CDB: 28 0 f4 95 e8 f8 0 0 80 0
>> (da3:hpt27xx0:0:3:0): CAM status: Auto-Sense Retrieval Failed
>> (da3:hpt27xx0:0:3:0): Error 5, Unretryable error
>> GEOM_ELI: g_eli_read_done() failed label/toby.eli[READ(offset=2100974186496, length=90112)]
>>
>> and:
>> (da3:hpt27xx0:0:3:0): WRITE(10). CDB: 2a 0 ef cc 10 90 0 0 8 0
>> (da3:hpt27xx0:0:3:0): CAM status: Auto-Sense Retrieval Failed
>> (da3:hpt27xx0:0:3:0): Error 5, Unretryable error
>> GEOM_ELI: Crypto WRITE request failed (error=5). label/toby.eli[WRITE(offset=2059841654784, length=4096)]
>>
>> So we can see that geli is failing, but zpool status command doesn't show
>> any errors at all...  The READ and WRITE columns both show 0 for the device..
>>
>> Now I do know that the WRITEs are not retried, because if I do a scrub
>> afterward, it detects cksum errors, and does properly increases the
>> count in the CKSUM column...
>>
>> Now if I pull a device, it will see that the device is lost, but no
>> matter how many read or write errors get returned by geli, zfs doesn't
>> seem to count them...
>>
>> Has anyone else seen this w/ ZFS?  Is it possible that it's a problem w/
>> geli, and not ZFS?
>>
>> I haven't tried to run a test w/ gnop to fail some read/writes on -current..
>>
>> P.S. Please keep me cc'd, as I'm not on the list.
>
> Well, after some digging w/ help from smh@, it looks like the write
> case in geli is broken...  in g_eli_write_done, we have the code:
>        if (pbp->bio_error != 0) {
>                G_ELI_LOGREQ(0, pbp, "Crypto WRITE request failed (error=%d).",
>                    pbp->bio_error);
>                pbp->bio_completed = 0;
>        }
>        /*
>         * Write is finished, send it up.
>         */
>        pbp->bio_completed = pbp->bio_length;
>        sc = pbp->bio_to->geom->softc;
>        g_io_deliver(pbp, pbp->bio_error);
>        atomic_subtract_int(&sc->sc_inflight, 1);
>
> so, we just end up overwriting the bio_completed error...
>
> pjd, should we just put the bio_completed = line under an else?
>
> something like:
> if (pbp->bio_error != 0) {
> G_ELI_LOGREQ(0, pbp, "Crypto WRITE request failed (error=%d).",
>     pbp->bio_error);
> pbp->bio_completed = 0;
> } else
> pbp->bio_completed = pbp->bio_length;
>
> /* Write is finished, send it up. */
> g_io_deliver(pbp, pbp->bio_error);
> sc = pbp->bio_to->geom->softc;
> atomic_subtract_int(&sc->sc_inflight, 1);
>
> But doesn't explain why read's aren't being counted though...

Looks like the read case will loose the error if its not the last
bio in sector group.

The attached should fix both cases.

A question for someone familiar with geom: why is bio_completed
not set to bio_length in the read success case? Is this correct
or is this another little bug?

On a related note, if anyone's got some pointers to docs about
the internals of geom, I'd be interested :)

    Regards
    Steve
    Regards
    Steve 

------=_NextPart_000_0B02_01CE0E1C.F42EBD20
Content-Type: application/octet-stream;
	name="g_eli-error-loss.patch"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
	filename="g_eli-error-loss.patch"

--- sys/geom/eli/g_eli.c.orig	2013-02-18 20:50:53.838663732 +0000=0A=
+++ sys/geom/eli/g_eli.c	2013-02-18 21:04:59.429602837 +0000=0A=
@@ -158,7 +158,7 @@=0A=
 =0A=
 	G_ELI_LOGREQ(2, bp, "Request done.");=0A=
 	pbp =3D bp->bio_parent;=0A=
-	if (pbp->bio_error =3D=3D 0)=0A=
+	if (pbp->bio_error =3D=3D 0 && bp->bio_error !=3D 0)=0A=
 		pbp->bio_error =3D bp->bio_error;=0A=
 	g_destroy_bio(bp);=0A=
 	/*=0A=
@@ -169,7 +169,8 @@=0A=
 		return;=0A=
 	sc =3D pbp->bio_to->geom->softc;=0A=
 	if (pbp->bio_error !=3D 0) {=0A=
-		G_ELI_LOGREQ(0, pbp, "%s() failed", __func__);=0A=
+		G_ELI_LOGREQ(0, pbp, "%s() failed (error=3D%d)", __func__,=0A=
+		    pbp->bio_error);=0A=
 		pbp->bio_completed =3D 0;=0A=
 		if (pbp->bio_driver2 !=3D NULL) {=0A=
 			free(pbp->bio_driver2, M_ELI);=0A=
@@ -198,10 +199,8 @@=0A=
 =0A=
 	G_ELI_LOGREQ(2, bp, "Request done.");=0A=
 	pbp =3D bp->bio_parent;=0A=
-	if (pbp->bio_error =3D=3D 0) {=0A=
-		if (bp->bio_error !=3D 0)=0A=
-			pbp->bio_error =3D bp->bio_error;=0A=
-	}=0A=
+	if (pbp->bio_error =3D=3D 0 && bp->bio_error !=3D 0)=0A=
+		pbp->bio_error =3D bp->bio_error;=0A=
 	g_destroy_bio(bp);=0A=
 	/*=0A=
 	 * Do we have all sectors already?=0A=
@@ -212,14 +211,15 @@=0A=
 	free(pbp->bio_driver2, M_ELI);=0A=
 	pbp->bio_driver2 =3D NULL;=0A=
 	if (pbp->bio_error !=3D 0) {=0A=
-		G_ELI_LOGREQ(0, pbp, "Crypto WRITE request failed (error=3D%d).",=0A=
+		G_ELI_LOGREQ(0, pbp, "%s() failed (error=3D%d)", __func__,=0A=
 		    pbp->bio_error);=0A=
 		pbp->bio_completed =3D 0;=0A=
-	}=0A=
+	} else=0A=
+		pbp->bio_completed =3D pbp->bio_length;=0A=
+=0A=
 	/*=0A=
 	 * Write is finished, send it up.=0A=
 	 */=0A=
-	pbp->bio_completed =3D pbp->bio_length;=0A=
 	sc =3D pbp->bio_to->geom->softc;=0A=
 	g_io_deliver(pbp, pbp->bio_error);=0A=
 	atomic_subtract_int(&sc->sc_inflight, 1);=0A=

------=_NextPart_000_0B02_01CE0E1C.F42EBD20--





Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?EF8E2A888F8D44FBA7362610150DAEA1>