Date: Mon, 28 Oct 2013 20:00:24 +0100
From: "O. Hartmann" <ohartman@zedat.fu-berlin.de>
To: "Steven Hartland"
Cc: FreeBSD CURRENT <freebsd-current@freebsd.org>
Subject: Re: ZFS buggy in CURRENT? Stuck in [zio->io_cv] forever!

Hello all,

after a third attempt I realised that some remnant labels seem to have
been causing the problem. Those labels didn't go away with
"zpool create -f" or "zfs clearlabel provider"; I had to issue
"zfs destroy -F provider" to make sure everything was cleared out.
After the last unsuccessful attempt I waited 14 hours for the "busy"
drives as reported, and since they were still doing something after
that time, I rebooted the box.

Besides the confusion about how to properly use ZFS (I miss
documentation written for a normal user rather than for a core
developer, and several blogs carry outdated information), there is
still this issue of the whole system blocking, solvable only by a
hard reset.

After the pool had been created successfully and a snapshot had been
received with the -vdF options, a reimport of the pool wasn't possible
as described below, and any attempt to list pools available for import
(zpool import) ended up in a stuck console that no kill or Ctrl-C
could interrupt. The damaged pool's drives showed some activity, but
even the pools considered unharmed didn't show up. This total blockade
also prevented the system from rebooting properly: "shutdown -r" or
"reboot" ended up waiting forever after the last block had been
synchronised; only a power-off or a full reset could bring the box
back to life again.

I think this is not intended and can be considered a bug?

Thanks for your patience.

oh
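For reference, a minimal sketch of inspecting and wiping leftover ZFS
labels from a provider before it is reused; gpt/disk00 stands in for
any of the providers named in this thread, and it is assumed the pool
the device last belonged to has already been exported or destroyed
(zdb and the zpool labelclear subcommand should both be available on
10.0 and -CURRENT):

   # print any ZFS labels still present on the provider (up to four copies)
   zdb -l /dev/gpt/disk00

   # force-clear those labels from a provider that is no longer part of a pool
   zpool labelclear -f /dev/gpt/disk00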
On Sun, 27 Oct 2013 16:32:13 -0000, "Steven Hartland" wrote:

> ----- Original Message -----
> From: "O. Hartmann"
>
> > I have set up a RAIDZ pool comprised of four 3 TB HDDs. To maintain
> > 4k block alignment I followed the instructions given on several
> > sites, and I'll sketch them here for the record.
> >
> > The operating system is 11.0-CURRENT and 10.0-BETA2.
> >
> > Create a GPT scheme on each drive and add one partition covering
> > the whole disk:
> >
> > gpart add -t freebsd-zfs -b 1M -l disk0[0-3] ada[3-6]
> >
> > gnop create -S4096 gpt/disk0[0-3]
> >
> > Because I added a disk to an existing RAIDZ, I exported the former
> > ZFS pool, then deleted the partition on each disk and destroyed the
> > GPT scheme. The former pool had a ZIL and a cache device residing
> > on the same, partitioned SSD; I didn't kill or destroy the
> > partitions on that SSD. To align to 4k blocks, I also created NOP
> > overlays on the existing gpt/log00 and gpt/cache00 via
> >
> > gnop create -S4096 gpt/log00|gpt/cache00
> >
> > After I created a new pool via zpool create POOL gpt/disk0[0-3].nop
> > log gpt/log00.nop cache gpt/cache00.nop
>
> You don't need any of the nop hacks in 10 or 11 any more, as it has
> proper sector size detection. The caveat is a disk which advertises
> 512b sectors but really is 4k and for which we don't have a 4k quirk
> in the kernel yet.
>
> If anyone comes across a case of this, feel free to drop me the
> details from camcontrol.
>
> If due to this you still need to use the gnop hack, then you only
> need to apply it to one device, as zpool create uses the largest
> ashift of the disks.
>
> I would then, as the very first step, export and import, as at this
> point there is much less data on the devices to scan through; not
> that this should be needed, but...
>
> > I "received" a snapshot that had been taken and sent to another
> > storage array, after the newly created pool didn't show any signs
> > of illness or corruption.
> >
> > After ~10 hours of receiving the backup, I exported that pool along
> > with the backup pool, destroyed the corresponding .nop device
> > entries via
> >
> > gnop destroy gpt/disk0[0-3]
> >
> > and the same for cache and log, and tried to check via
> >
> > zpool import
> >
> > whether my pool (as well as the backup pool) shows up. And here the
> > nasty mess starts!
> >
> > The "zpool import" command issued on the console is now stuck for
> > hours and cannot be interrupted via Ctrl-C! No pool shows up!
> > Hitting Ctrl-T shows a state like
> >
> > ... cmd: zpool 4317 [zio->io_cv]: 7345.34r 0.00 [...]
> >
> > Looking with
> >
> > systat -vm 1
> >
> > at the throughput of the CAM devices, I realise that two of the
> > four drives comprising the RAIDZ show activity, with 7000 - 8000
> > tps and ~30 MB/s bandwidth - the other two show nothing!
> >
> > And the pool is still inactive; the console is stuck.
> >
> > Well, this made my day! At this point I try to understand what's
> > going wrong and to recall what I did differently the last time,
> > when the same procedure with three disks on the same hardware
> > worked for me.
> >
> > Now, after a 10-hour copy orgy and with the working array needed,
> > I start to believe that using ZFS on FreeBSD is still peppered with
> > too many development-stage flaws, rendering it risky. Colleagues I
> > consulted who work with ZFS on Solaris have never seen the kind of
> > stuck behaviour I am observing at this moment.
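As a side note, a minimal sketch of collecting more detail on such a
stuck zpool process for a bug report; pid 4317 is taken from the
Ctrl-T output quoted above, and ps(1) and procstat(1) are the stock
FreeBSD tools:

   # show the wait channel and state of the stuck process
   ps -o pid,state,wchan,time,command -p 4317

   # dump the in-kernel stack traces of its threads
   procstat -kk 4317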
> While we only run 8.3-RELEASE currently, as we've decided to skip 9.X
> and move straight to 10 once we've tested it, we've found ZFS is not
> only very stable but has now become critical to the way we run
> things.
>
> > I do not want to repeat the procedure again. There must be a way to
> > import the pool - even the backup pool, which is working and was
> > untouched by all of this, should be importable - but it isn't.
> > While this crap "zpool import" command is still blocking the
> > console, not willing to die even with "killall -9 zpool", I cannot
> > import the backup pool via "zpool import BACKUP00". The console
> > gets stuck immediately and for eternity, without any notice.
> > Hitting Ctrl-T says something like
> >
> > load: 3.59 cmd: zpool 46199 [spa_namespace_lock] 839.18r 0.00u
> > 0.00s 0% 3036k
> >
> > which means I cannot even import the backup pool, and that is
> > really no fun.
>
> I'm not sure there's enough information here to determine where any
> issue may lie, but as a guess it could be that ZFS is having trouble
> locating the one changed device and is scanning the entire disk to
> try to determine that. This would explain the I/O on the one device
> but not the others.
>
> Did you perchance have one of the disks in use for something else, so
> that it may have old label information on it that wasn't cleaned
> down?
>
> Regards
> Steve
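Putting Steven's advice together, a minimal sketch of the creation
sequence with a single nop overlay followed by the immediate
export/import he suggests; the pool name POOL, the raidz layout and
the gpt labels are taken from the thread, while the exact command
sequence is an illustration rather than something anyone in the
thread actually ran:

   # one .nop provider is enough: zpool create picks the largest ashift
   gnop create -S4096 gpt/disk00
   zpool create POOL raidz gpt/disk00.nop gpt/disk01 gpt/disk02 gpt/disk03 \
       log gpt/log00 cache gpt/cache00

   # export while the pool is still empty, drop the overlay, reimport
   zpool export POOL
   gnop destroy gpt/disk00.nop
   zpool import POOL

After the reimport the pool should reference the plain gpt/disk00
provider again, and the ashift that was chosen can be checked with
zdb.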