Message-ID: <4E6B1285.70508@FreeBSD.org>
Date: Sat, 10 Sep 2011 10:32:21 +0300
From: Andriy Gapon
To: freebsd-current@FreeBSD.org, freebsd-fs@FreeBSD.org, Pawel Jakub Dawidek, Dimitry Andric
Cc: Sebastian Chmielewski
In-Reply-To: <4E679D3D.1000007@FreeBSD.org>
Subject: Re: ZFS: i/o error - all block copies unavailable after upgrading to r225312

on 07/09/2011 19:35 Andriy Gapon said the following:
> Thanks to a lot of excellent testing, debugging and analysis from Sebastian (which
> went behind the scenes) we now have this patch:
> http://people.freebsd.org/~avg/zfs-boot-gang.diff
>
> The patch introduces the following changes:
> - checksum is now verified for gang header blocks
> - checksum is now verified for reconstituted data of whole gang blocks
>   (previously it is verified only for individual gang member leaf blocks)
> - reconstituted data of a whole gang block is now decompressed if the gang block
>   is compressed
>
> The last change is _the_ change.

I am now investigating what looks like a miscompilation of the code by *gcc*
after applying the patch.  It seems that the -mrtd option is to blame.

I have found an older discussion about the -mrtd option causing trouble with
clang:
http://lists.freebsd.org/pipermail/freebsd-current/2011-August/026263.html
There was a patch that made clang happy without disabling the flag, so I wonder
if I made some subtle mistake in my patch.  Or maybe it's better to disable
-mrtd altogether for the zfs boot blocks, just to stay on the safe side.

Some technical details in the form of a diff with some superimposed comments:

--- /home/avg/tmp/vdev_read_phys-mrtd.s    2011-09-10 01:50:54.500620864 +0300
+++ /home/avg/tmp/vdev_read_phys-no-mrtd.s 2011-09-10 01:49:59.157701373 +0300
@@ -29,16 +29,17 @@
... <- in the code before this %edi gets assigned a pointer to a function
        movl    60(%ecx), %eax
        movl    %eax, 24(%esp)
        movl    %ecx, 20(%esp)
+       movl    %edi, %ecx      <- non-mrtd code saves the pointer
        popl    %ebx
        popl    %esi
        popl    %edi            <- %edi gets over-written with an unrelated value
        popl    %ebp
-       jmp     *%edi           <- mrtd code calls some garbage code
+       jmp     *%ecx           <- non-mrtd code calls the correct code
 .L601:
        movl    $5, %eax
        popl    %ebx
        popl    %esi
        popl    %edi
        popl    %ebp
-       ret     $24
+       ret

The problem is in the patched vdev_read_phys function in zfsimpl.c.
The unpatched version of the function doesn't seem to be affected.

Any help/ideas will be greatly appreciated!
-- 
Andriy Gapon
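
For anyone who wants to experiment with this outside the boot code, here is a
minimal sketch of the source-level shape that can end in an indirect tail jump
like the "jmp *%edi" above.  It is not the real zfsimpl.c code: fake_vdev,
phys_read_fn, mem_read and fake_read_phys are hypothetical names, and the
argument list is only chosen so that it occupies 24 bytes on i386, which is
what "ret $24" suggests.  The early return of 5 mirrors "movl $5, %eax".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins, not the real zfsimpl.c types. */
typedef int phys_read_fn(void *priv, long long offset, void *buf, size_t size);

struct fake_vdev {
	phys_read_fn *read;	/* the function pointer that must survive the epilogue */
	void *priv;		/* opaque backend state */
};

/* Toy backend: copies bytes out of an in-memory "disk". */
static int
mem_read(void *priv, long long offset, void *buf, size_t size)
{
	assert(priv != NULL && buf != NULL);
	memcpy(buf, (const char *)priv + offset, size);
	return (0);
}

/*
 * The final statement returns the result of an indirect call unchanged,
 * so an optimizing compiler is allowed to emit it as a jump instead of
 * a call followed by a return.
 */
static int
fake_read_phys(struct fake_vdev *vd, const void *bp, void *buf,
    long long offset, size_t size)
{
	(void)bp;		/* unused in this sketch */
	if (size == 0)
		return (5);	/* error path, like the .L601 block */
	return (vd->read(vd->priv, offset, buf, size));
}
```

Compiling something like this for i386 with and without -mrtd (for example
with gcc -m32 -O2 -S) and diffing the two .s files is one way to check whether
a given compiler keeps the pointer in a register that the epilogue then pops
over.  Whether this small case reproduces the problem is not guaranteed.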