Date: Wed, 20 Aug 2008 09:47:38 +0200
From: Pawel Jakub Dawidek <pjd@FreeBSD.org>
To: Colin Moller <colin@lefty.tv>
Cc: swank@storefront.com, colin@storefront.com, current@freebsd.org, elo@storefront.com
Subject: Re: zpool import hanging on unexpectedly-rebooted machine
Message-ID: <20080820074738.GA1701@garage.freebsd.pl>
In-Reply-To: <48A95C6F.2010002@lefty.tv>
References: <48A95C6F.2010002@lefty.tv>
On Mon, Aug 18, 2008 at 04:26:39AM -0700, Colin Moller wrote:
> Hey all,
>
> I've got an interestingly frustrating problem on my hands with our
> 7.0-STABLE boxes running ZFS. Sun X4500 box running amd64, 16GB of
> RAM., 46x1TB disks in RAIDZ1. (other two for the OS.)
>
> Uname for the box is:
> FreeBSD sf-nas1-c160a.storefront.com 7.0-STABLE FreeBSD 7.0-STABLE #1:
> Sat May 31 14:54:22 PDT 2008
> root@sf-nas1-c160a.storefront.com:/usr/obj/usr/src/sys/X4500 amd64
>
> The box has been running relatively reliably for some months now, but
> our hosting provider decided to reboot it on us without asking. After
> the box came back, it had lost /boot/zfs/zpool.cache, so I needed to
> reimport the only zpool on the machine (named zfsdata).
>
> Running zpool import gives me the output I'm expecting, showing a single
> zpool called zfsdata, status of ONLINE, and all the disks are showing up.
>
> However, when I run zpool import -f <numerical_pool_id>, the zpool
> command simply hangs up with no disk and no CPU activity.
> I've run truss on the zpool import, and the last thing I see happening is:
>
> open("/dev/ad96",O_RDONLY,030115000) = 6 (0x6)
> ioctl(6,DIOCGIDENT,0xffff9480) = 0 (0x0)
> close(6) = 0 (0x0)
>
> After turning on vfs.zfs.debug, I also see this on the console:
>
> zfs_ereport_post:293[1]: time=1219057172.795893475 ereport_version=0
> class=fs.zfs.checksum zfs_scheme_version=0 pool=zfsdata
> pool_guid=316648131406719055 pool_context=2
> vdev_guid=7326417523786577584 vdev_type=disk vdev_path=/dev/ad12
> vdev_devid=ad:GTF000PAHX5TMF parent_guid=6708978418893991394
> parent_type=raidz zio_err=0 zio_offset=89290496000 zio_size=512
> zio_object=132 zio_level=0 zio_blkid=244

If I read this correctly, it reports a checksum error on disk /dev/ad12,
but because this is RAIDZ, it probably tries to self-heal and maybe
something goes wrong there. I have never seen a similar problem, so I'm
not sure how to help you.

Even if upgrading to -CURRENT is not an option for you, maybe you can
still install -CURRENT on a USB pen drive and recompile it with the new
patch?

You may also try to remove this disk (ad12) and see if it behaves any
better.

Anyway, please keep me informed on what's going on.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
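
As an aside, the zfs_ereport_post console line quoted above is a run of
whitespace-separated key=value pairs, so the failing device can be picked
out mechanically instead of by eye. The following is only a minimal sketch:
the function name parse_ereport is hypothetical, and it assumes that values
never contain spaces, which holds for the line in this thread but is not
something the message itself guarantees.

```python
import re

# The debug line from the quoted message, joined back into one string.
SAMPLE = (
    "zfs_ereport_post:293[1]: time=1219057172.795893475 ereport_version=0 "
    "class=fs.zfs.checksum zfs_scheme_version=0 pool=zfsdata "
    "pool_guid=316648131406719055 pool_context=2 "
    "vdev_guid=7326417523786577584 vdev_type=disk vdev_path=/dev/ad12 "
    "vdev_devid=ad:GTF000PAHX5TMF parent_guid=6708978418893991394 "
    "parent_type=raidz zio_err=0 zio_offset=89290496000 zio_size=512 "
    "zio_object=132 zio_level=0 zio_blkid=244"
)


def parse_ereport(line):
    """Return the key=value fields of an ereport debug line as a dict.

    Hypothetical helper: assumes values contain no whitespace. The
    "zfs_ereport_post:293[1]:" prefix has no '=' and is skipped naturally.
    """
    return dict(re.findall(r"(\w+)=(\S+)", line))


if __name__ == "__main__":
    fields = parse_ereport(SAMPLE)
    # Report the error class, the affected device, and its parent vdev type.
    print(fields["class"], fields["vdev_path"], fields["parent_type"])
```

All values come back as strings; anything numeric (zio_offset, zio_size)
would need an explicit int() before doing arithmetic on it.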