Date: Sat, 21 Sep 2013 00:40:13 +0000
From: "Teske, Devin" <Devin.Teske@fisglobal.com>
To: freebsd-fs <freebsd-fs@freebsd.org>
Cc: Devin Teske <dteske@freebsd.org>, "Teske, Devin" <Devin.Teske@fisglobal.com>
Subject: Re: zfs upgrade hang
Message-ID: <13CA24D6AB415D428143D44749F57D720FBE1299@LTCFISWMSGMB21.FNFIS.com>
In-Reply-To: <13CA24D6AB415D428143D44749F57D720FBE11B0@LTCFISWMSGMB21.FNFIS.com>
References: <13CA24D6AB415D428143D44749F57D720FBE11B0@LTCFISWMSGMB21.FNFIS.com>

On Sep 20, 2013, at 5:27 PM, Teske, Devin wrote:

> Hi,
>
> Seem to be having an issue with "zfs upgrade" hanging.
>
> Please note that "zpool upgrade" seems to be fine... it's "zfs upgrade" that hangs.
>
> The system is 8.4-STABLE @ r255470M amd64.
>
> The dataset versions prior to upgrade are 3 and after upgrade are 5.
>
> It doesn't always hang. But once it does, it's always in the following ^T state:
>
>     tx->tx_sync_done_cv
>
> You can let it sit there for minutes or hours, but it never completes or enters a
> different state. Also, you can't Ctrl-C it, you can't Ctrl-Z it, you can't kill it, not even
> with `-9'. Further, anything like "zfs list" will hang. All the meanwhile, the filesystem
> is readable and fine.
>
> The sure-fire way to hit this for us is to attempt a "-a" or "-r" or "-ra" to do many datasets
> at once.
>
> However, doing one dataset at a time will work... until it too leads to the same state.
>
> Once we hit this state (hung upgrade) we have to reboot.
>
> We've been able to get through all the datasets on a box by doing one-at-a-time and
> rebooting when one hangs and ends up in this state but it's frustrating because we can
> usually only do a handful at a time before hitting the problem.
>
> Scripting it won't help.
>
> Also, we've tried unmounting the filesystems prior to upgrade too, that didn't help.
> Updating libraries/binaries to r255747 didn't seem to help either. I guess next step is to
> update the kernel to latest stable/8 (which is probably not far ahead of r255470).
>
> Advice?

Before you chime-in, I think I might have more to add to the puzzle.

It would seem that the LSI-RAID1 pool is the problem.

We always hang when trying to execute "zfs upgrade LSI-RAID1" despite the
fact that we've done a "zfs upgrade" of everything underneath it.

We've also done a "zfs upgrade" of other pools and their descendants on the
same system without this hang.

So I was thinking...
what makes the "LSI-RAID1" pool different from, say, the "NEC1_POOL_A" pool.

The answer is mount-point. LSI-RAID1 does not have a mountpoint, while
everything else does.

Here's the layout we have:

hvm2b# zfs get version
NAME                                            PROPERTY  VALUE  SOURCE
LSI-RAID1                                       version   3      -
LSI-RAID1/vm                                    version   5      -
LSI-RAID1/vm/golden0                            version   5      -
LSI-RAID1/vm/golden0@pre-cfg0-snap1             version   3      -
LSI-RAID1/vm/golden0@non-cfg0-snap1             version   3      -
LSI-RAID1/vm/golden0@zxfer_4709_20130905025825  version   3      -
LSI-RAID1/vm/ipu0c                              version   5      -
LSI-RAID1/vm/ipu1c                              version   5      -
LSI-RAID1/vm/ipu2c                              version   5      -
LSI-RAID1/vm/oos0c                              version   5      -
LSI-RAID1/vmbak                                 version   5      -
LSI-RAID1/vmbak/vm                              version   5      -
NEC2_POOL_B                                     version   5      -
NEC2_POOL_B/oos0c                               version   5      -

As you can see, while trying to work around this hang, we've been able to
upgrade all the datasets with the exception of the one culprit (at the top).

So...

Did I discover a bug? Perhaps relating to "zfs upgrade" touching datasets that
don't have a mountpoint set?
-- 
Devin

_____________
The information contained in this message is proprietary and/or confidential. If you are not the intended recipient, please: (i) delete the message and all copies; (ii) do not disclose, distribute or use the message in any manner; and (iii) notify the sender immediately. In addition, please be aware that any message addressed to our domain is subject to archiving and review by persons other than the intended recipient. Thank you.
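
One way to check the mountpoint theory above across every pool is to list each filesystem that is still below version 5 next to its mountpoint. What follows is a rough Python sketch, not part of the original report. The helper names (parse_zfs_get, pending_upgrades, main) are made up for illustration, it assumes Python 3.7 or later is installed, and it relies on "zfs get -H -o name,property,value" printing tab-separated fields.

```python
import subprocess


def parse_zfs_get(text):
    """Parse `zfs get -H -o name,property,value ...` output.

    Returns {dataset_name: {property: value}}.
    """
    props = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, prop, value = line.split("\t", 2)
        props.setdefault(name, {})[prop] = value
    return props


def pending_upgrades(props, target=5):
    """Return (name, version, mountpoint) for filesystems below `target`."""
    rows = []
    for name, p in props.items():
        # Snapshots are skipped: in the listing above they stay at
        # version 3 even after their parent was upgraded.
        if "@" in name:
            continue
        version = p.get("version", "-")
        if not version.isdigit():  # no numeric version reported
            continue
        if int(version) < target:
            rows.append((name, int(version), p.get("mountpoint", "-")))
    return rows


def main():
    # Hypothetical entry point: call main() on the affected host.
    # It only reads properties; it never runs "zfs upgrade".
    out = subprocess.run(
        ["zfs", "get", "-H", "-o", "name,property,value",
         "version,mountpoint"],
        capture_output=True, text=True, check=True,
    ).stdout
    for name, version, mountpoint in pending_upgrades(parse_zfs_get(out)):
        print(f"{name}\tversion={version}\tmountpoint={mountpoint}")
```

If the theory holds, every filesystem this prints on a box that hangs should show the same kind of mountpoint value as LSI-RAID1, and none of the ones that upgraded cleanly should. Note that "zfs get" itself may block once an upgrade has already hung, so this is only useful before attempting the upgrade.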