Date:      Sat, 10 Sep 2011 10:32:21 +0300
From:      Andriy Gapon <avg@FreeBSD.org>
To:        freebsd-current@FreeBSD.org, freebsd-fs@FreeBSD.org, Pawel Jakub Dawidek <pjd@FreeBSD.org>, Dimitry Andric <dim@FreeBSD.org>
Cc:        Sebastian Chmielewski <chmiels@o2.pl>
Subject:   Re: ZFS: i/o error - all block copies unavailable after upgrading to r225312
Message-ID:  <4E6B1285.70508@FreeBSD.org>
In-Reply-To: <4E679D3D.1000007@FreeBSD.org>
References:  <20110901223646.14b8aae8@o2.pl> <4E60DBBD.1040703@FreeBSD.org> <4E679D3D.1000007@FreeBSD.org>

on 07/09/2011 19:35 Andriy Gapon said the following:
> Thanks to a lot of excellent testing, debugging and analysis from Sebastian (which
> went on behind the scenes) we now have this patch:
> http://people.freebsd.org/~avg/zfs-boot-gang.diff
> 
> The patch introduces the following changes:
> - checksum is now verified for gang header blocks
> - checksum is now verified for reconstituted data of whole gang blocks
>   (previously it was verified only for individual gang member leaf blocks)
> - reconstituted data of a whole gang block is now decompressed if the gang block
>   is compressed
> 
> The last change is _the_ change.

I am now investigating what looks like a miscompilation of the code by *gcc*
after applying the patch.  It seems that the -mrtd option is to blame.
I have found an older discussion about the -mrtd option causing trouble with clang:
http://lists.freebsd.org/pipermail/freebsd-current/2011-August/026263.html
There was a patch that made clang happy without disabling the flag, so I wonder
if I made some subtle mistake in my patch.  Or maybe it's better to disable -mrtd
altogether for the zfs boot blocks, just to stay on the safe side.

Some technical details in the form of a diff with some superimposed comments:
--- /home/avg/tmp/vdev_read_phys-mrtd.s	2011-09-10 01:50:54.500620864 +0300
+++ /home/avg/tmp/vdev_read_phys-no-mrtd.s	2011-09-10 01:49:59.157701373 +0300
@@ -29,16 +29,17 @@
	... <- in the code before this %edi gets assigned a pointer to a function
 	movl	60(%ecx), %eax
 	movl	%eax, 24(%esp)
 	movl	%ecx, 20(%esp)
+	movl	%edi, %ecx  <- non-mrtd code saves the pointer
 	popl	%ebx
 	popl	%esi
 	popl	%edi <- %edi gets over-written with an unrelated value
 	popl	%ebp
-	jmp	*%edi  <- mrtd code calls some garbage code
+	jmp	*%ecx  <- non-mrtd code calls the correct code
 .L601:
 	movl	$5, %eax
 	popl	%ebx
 	popl	%esi
 	popl	%edi
 	popl	%ebp
-	ret	$24
+	ret

The problem is in the patched vdev_read_phys function in zfsimpl.c.
The unpatched version of the function does not seem to be affected.
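
For anyone who wants to poke at this outside the boot code, here is a minimal
sketch of the general shape that ends in an indirect tail call like the
"jmp *%edi" above.  It is not the code from zfsimpl.c: struct dev, read_fn_t,
mem_read and dev_read are hypothetical names made up for illustration.  The
argument lists are chosen so that both the caller and the callee take 24 bytes
of stack arguments on i386, matching the "ret $24" in the -mrtd listing.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct dev;

/* Hypothetical per-device read callback (24 bytes of arguments on i386). */
typedef int (*read_fn_t)(struct dev *d, void *priv, uint64_t offset,
    void *buf, size_t size);

struct dev {
	read_fn_t	 read;	/* pointer that ends up in a register */
	void		*priv;	/* opaque backing store for the callback */
};

/* Toy callback: copies bytes out of an in-memory buffer. */
static int
mem_read(struct dev *d, void *priv, uint64_t offset, void *buf, size_t size)
{
	(void)d;
	assert(buf != NULL);
	memcpy(buf, (const char *)priv + offset, size);
	return (0);
}

/*
 * Wrapper whose last action is a call through a function pointer.
 * The compiler may turn this into a sibling call: restore the callee-saved
 * registers, then jump.  The pointer must therefore not live in a register
 * that the epilogue restores (as %edi does in the -mrtd listing).
 * Not static, so that the compiler has to emit it as a real function.
 */
int
dev_read(struct dev *d, const void *bp, void *buf, uint64_t offset,
    size_t size)
{
	(void)bp;	/* unused in this sketch */
	if (d == NULL || d->read == NULL)
		return (EIO);
	return (d->read(d, d->priv, offset, buf, size));
}
```

Compiling this with something like "gcc -m32 -O2 -S", once with -mrtd and once
without, and diffing the two outputs gives a listing comparable to the one
above.  Whether a given gcc version actually shows the bad register choice on
this reduced example is not guaranteed.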

Any help/ideas will be greatly appreciated!

-- 
Andriy Gapon


