From: John-Mark Gurney
Date: Mon, 18 Feb 2013 11:12:42 -0800
To: freebsd-fs@FreeBSD.org
Subject: ZFS on 9.1 doesn't see errors on geli volumes...
Message-ID: <20130218191242.GI55866@funkthat.com>

I'm running 9.1 on a box:

    FreeBSD gold.funkthat.com 9.1-PRERELEASE FreeBSD 9.1-PRERELEASE #26 r241041M: Wed Dec 12 23:02:31 PST 2012 jmg@gold.funkthat.com:/usr/src.9stable/sys/amd64/compile/gold amd64

The modifications are limited to improving AES-NI performance.

I decided to go full ZFS w/ geli encrypted volumes (including the root fs)...

One of the hard drives started going bad, so I started seeing:

    hpt27xx: Device error information 0x1000000
    hpt27xx: Task file error, StatusReg=0x51, ErrReg=0x40, LBA[0-3]=0xf495e928,LBA[4-7]=0x0.
    (da3:hpt27xx0:0:3:0): READ(10). CDB: 28 0 f4 95 e8 f8 0 0 80 0
    (da3:hpt27xx0:0:3:0): CAM status: Auto-Sense Retrieval Failed
    (da3:hpt27xx0:0:3:0): Error 5, Unretryable error
    GEOM_ELI: g_eli_read_done() failed label/toby.eli[READ(offset=2100974186496, length=90112)]

and:

    (da3:hpt27xx0:0:3:0): WRITE(10). CDB: 2a 0 ef cc 10 90 0 0 8 0
    (da3:hpt27xx0:0:3:0): CAM status: Auto-Sense Retrieval Failed
    (da3:hpt27xx0:0:3:0): Error 5, Unretryable error
    GEOM_ELI: Crypto WRITE request failed (error=5). label/toby.eli[WRITE(offset=2059841654784, length=4096)]

So we can see that geli is reporting failures, but the zpool status command doesn't show any errors at all. The READ and WRITE columns both show 0 for the device.

I do know that the WRITEs are not retried, because if I do a scrub afterward, it detects cksum errors and properly increases the count in the CKSUM column.

If I pull a device, ZFS will see that the device is lost, but no matter how many read or write errors get returned by geli, ZFS doesn't seem to count them.

Has anyone else seen this w/ ZFS?
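
To put a number on the mismatch, the GEOM_ELI failures in the kernel log can be tallied per provider and compared by hand against the READ and WRITE columns of zpool status. The following is only a rough sketch: the function name tally_geli_errors and the regular expression are my own, and the pattern assumes the two GEOM_ELI message shapes quoted above (other geli messages may look different and would be skipped).

```python
import re
from collections import Counter

# Assumed pattern, derived only from the two GEOM_ELI lines quoted in this
# post: "<provider>[READ|WRITE(offset=N, length=N)]" at the end of the line.
GELI_ERROR = re.compile(
    r"GEOM_ELI: .*?\s(?P<provider>[^\s\[]+)"
    r"\[(?P<op>READ|WRITE)\(offset=(?P<offset>\d+), length=(?P<length>\d+)\)\]"
)


def tally_geli_errors(lines):
    """Count failed geli requests per (provider, operation) pair."""
    counts = Counter()
    for line in lines:
        match = GELI_ERROR.search(line)
        if match:
            counts[(match["provider"], match["op"])] += 1
    return counts


# The two GEOM_ELI lines from the log excerpts above, plus one unrelated
# CAM line to show that non-geli messages are ignored.
SAMPLE = [
    "GEOM_ELI: g_eli_read_done() failed "
    "label/toby.eli[READ(offset=2100974186496, length=90112)]",
    "(da3:hpt27xx0:0:3:0): Error 5, Unretryable error",
    "GEOM_ELI: Crypto WRITE request failed (error=5). "
    "label/toby.eli[WRITE(offset=2059841654784, length=4096)]",
]

if __name__ == "__main__":
    for (provider, op), n in sorted(tally_geli_errors(SAMPLE).items()):
        print(f"{provider} {op} errors: {n}")
```

Feeding it the real log (for example the contents of /var/log/messages, read line by line) instead of SAMPLE would give per-provider totals; any nonzero total next to a 0 in the matching zpool status column is the discrepancy I'm describing.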
Is it possible that it's a problem w/ geli, and not ZFS?

I haven't yet tried running a test w/ gnop on -current to make some reads/writes fail.

P.S. Please keep me cc'd, as I'm not on the list.

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."