From owner-freebsd-current@FreeBSD.ORG Fri Nov 13 04:53:17 2009
From: Robert Noland
To: Matt Reimer
Cc: freebsd-fs@freebsd.org, Radek Valášek, freebsd-current@freebsd.org
Subject: Re: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"
Date: Thu, 12 Nov 2009 22:53:03 -0600
Organization: FreeBSD
Message-Id: <1258087983.2303.23.camel@balrog.2hip.net>
References: <4AD710D6.70404@buchlovice.org>

On Thu, 2009-11-12 at 16:54 -0800, Matt Reimer wrote:
> 2009/10/15 Radek Valášek:
> > Hi,
> >
> > I want to ask if there is something new in adding support to
> > gptzfsboot/zfsboot for reading
> > gang-blocks?
> >
> > From Sun's docs:
> >
> > Gang blocks
> >
> > When there is not enough contiguous space to write a complete block, the ZIO
> > pipeline will break the I/O up into smaller 'gang blocks' which can later be
> > assembled transparently to appear as complete blocks.
> >
> > Everything works fine for me, until I rewrite kernel/world after system
> > upgrade to latest one (releng_8). After this am I no longer able to boot
> > from zfs raidz1 pool with following messages:
> >
> > > ZFS: i/o error - all block copies unavailable
> > > ZFS: can't read MOS
> > > ZFS: unexpected object set type lld
> > > ZFS: unexpected object set type lld
> > >
> > > FreeBSD/i386 boot
> > > Default: z:/boot/kernel/kernel
> > > boot:
> > > ZFS: unexpected object set type lld
> > >
> > > FreeBSD/i386 boot
> > > Default: tank:/boot/kernel/kernel
> > > boot:
>
> Radek,
>
> Try the attached patch (sponsored by VPOP Technologies). I found an
> overflow in /sys/cddl/boot/zfs/zfssubr.c:vdev_raidz_read() that was
> causing my 6x1TB raidz2 array to fail to boot.
>
> Apply the patch, build everything in /sys/boot, and then make sure you
> update both gptzfsboot and /boot/loader.
>
> Robert, I'm guessing you couldn't replicate this because your array
> was small enough not to result in block numbers overflowing an int.

This is likely. All of my raidz tests used vnode-backed 1GB memory
disks, so my largest configuration was a 6 x 1GB raidz2.

> The kernel source for the corresponding functionality is in
> /sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c:vdev_raidz_map_alloc().
> There all these variables are uint64_t, but I think unnecessarily. I
> tried changing the boot loader's vdev_raidz_read() variables to all
> uint64_t but then gptzfsboot would reboot itself, likely due to a
> stack overflow. The attached patch just changes a few variables that,
> after a quick analysis, seemed likely to overflow.
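
For anyone following the thread, the size dependence discussed above can
be sketched in a few lines of C. This is not the attached patch and not
the code in zfssubr.c. The helper names (child_offset64, child_offset32,
fits_in_int32) are made up for illustration, and the arithmetic is a
simplified assumption about how a RAID-Z layout turns a byte offset into
the vdev into a byte offset on one child disk. It also assumes 512-byte
sectors and treats 1GB and 1TB as 2^30 and 2^40 bytes.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: 512-byte sectors, i.e. a unit shift of 9. */
#define UNIT_SHIFT 9

/*
 * Hypothetical helper: byte offset on one child disk for a byte offset
 * into the whole RAID-Z vdev, computed entirely in 64 bits.
 */
static uint64_t
child_offset64(uint64_t vdev_offset, uint64_t dcols)
{
	assert(dcols > 0);
	uint64_t b = vdev_offset >> UNIT_SHIFT;	/* sector number in the vdev */
	return ((b / dcols) << UNIT_SHIFT);
}

/*
 * What is left if the same result is kept in 32 bits. Converting to
 * uint32_t is well defined in C (reduction modulo 2^32), so this shows
 * exactly which bits are lost without relying on signed overflow.
 */
static uint32_t
child_offset32(uint64_t vdev_offset, uint64_t dcols)
{
	return ((uint32_t)child_offset64(vdev_offset, dcols));
}

/* Nonzero if the value can be held in a 32-bit signed int. */
static int
fits_in_int32(uint64_t v)
{
	return (v <= (uint64_t)INT32_MAX);
}

#ifdef DEMO_MAIN
int
main(void)
{
	/* Assumed disk sizes: 1GB = 2^30 bytes, 1TB = 2^40 bytes. */
	uint64_t sizes[] = { 1ULL << 30, 1ULL << 40 };

	for (int i = 0; i < 2; i++) {
		/* Offset of the last sector in a 6-disk vdev. */
		uint64_t last = 6 * sizes[i] - 512;
		uint64_t o = child_offset64(last, 6);

		printf("child offset %" PRIu64 " (low 32 bits 0x%" PRIx32 "): %s\n",
		    o, child_offset32(last, 6),
		    fits_in_int32(o) ? "fits in int" : "does not fit in int");
	}
	return (0);
}
#endif
```

Building with -DDEMO_MAIN runs the two cases. Under these assumptions,
six 1GB disks give a largest child offset of 2^30 - 512, which fits in a
32-bit int, while six 1TB disks give 2^40 - 512, which does not: only the
low 32 bits (0xFFFFFE00) would survive. That would be consistent with the
failure showing up on a 6x1TB array and not on 1GB memory disks.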
>
> If this looks good, would someone commit it?

ps@ grabbed it up already, but I may handle the MFC for him. I have
some other minor fixups in my tree right now... like teaching printf to
handle %llx.

Thanks for finding this... It's been really frustrating that I couldn't
produce a failing system.

robert.

> Matt

-- 
Robert Noland
FreeBSD