Date: Wed, 21 Nov 2012 19:08:29 +0100
From: Willem Jan Withagen <wjw@digiware.nl>
To: Andriy Gapon <avg@FreeBSD.org>, "stable@freebsd.org" <stable@freebsd.org>
Subject: Re: Some new hardware with 9.1 does not reboot easily
Message-ID: <50AD189D.4040902@digiware.nl>
In-Reply-To: <50AD17E4.50104@FreeBSD.org>
References: <50ACA518.4050309@digiware.nl> <50ACEEFF.8010001@FreeBSD.org> <50AD0A20.2070408@digiware.nl> <50AD0AC2.5070804@FreeBSD.org> <50AD0B29.6060602@FreeBSD.org> <50AD0F00.5020600@digiware.nl> <50AD13EE.8050901@digiware.nl> <50AD17E4.50104@FreeBSD.org>

On 2012-11-21 19:05, Andriy Gapon wrote:
> on 21/11/2012 19:48 Willem Jan Withagen said the following:
>> On 2012-11-21 18:27, Willem Jan Withagen wrote:
>>> On 2012-11-21 18:11, Andriy Gapon wrote:
>>>> on 21/11/2012 19:09 Andriy Gapon said the following:
>>>>> on 21/11/2012 19:06 Willem Jan Withagen said the following:
>>>>>> Nothing that stands out for me, but then I'm not into FreeBSD kernels.
>>>>>> But there are certainly no more userspace processes running other than
>>>>>> reboot.....
>>>>>>
>>>>>> Certainly no postfix, that could complain about missing libpcre.so.1
>>>>>> That seems to be something that should have been flushed from the
>>>>>> print_buffer before.
>>>>>>
>>>>>> What I do see is a huge amount of ZFS threads....
>>>>>>
>>>>>> Rebooting from DDB is instantaneous...
>>>>>>
>>>>>> So I'm not certain what to look for further?
>>>>>
>>>>> Perhaps share the output if you are able to capture it...
>>>>
>>>> State of the init process should be more interesting.
>>>> You can switch to it (using thread <id>) and capture its stack trace ('bt').
>>>
>>> The box is not on a serial connection.
>>> So capturing will be a picture with an iPhone and retyping it.
>>>
>>> init process should be 1, right?
>>> I'll give it a shot
>>
>> Just private since it includes an image of the bt...
>>
>> Init is there, its state is 'RLs', but it does not have threads and
>> thread 1 does not work.
>> But 'bt 1' does the trick.
>>
>> It seems to be waiting/working in the ZFS code to get things unmounted.
>
> Yeah, oops, this is a known ZFS deadlock in zfs_freebsd_reclaim -> zfs_zget path.
> I may commit my fix for it to head on the next weekend.
> You may share this information with the list.

Any chance of getting this back into 9.1?
Preferably before 9.1-RELEASE, but otherwise real soon after that.

I'm the perfect test guinea-pig, it happens every time I reboot.

--WjW

>
>> Disk situation:
>> 4* SATA seagate 1T (2 on sandy bridge 2 on LSI 2008)
>> 4* SAS seagate 600Gb/15K all on LSI 2008
>> 2* intel SSD 540 200GB both on Sata-3 on sandy bridge
>>
>> ZFS config
>> zfsboot= 50Gb 4way mirror on 4* SATA
>>          2*2Gb cache on both SSDs
>> sataraid=remainder of SATA disks in raidz
>>          2*1Gb log on SSDs
>>          2*50Gb cache on SSDs
>> sasraid=full disk raidz of sas disks
>>          2*1GB log on SSDs
>>          2*100GB cache on SSDs
>>
>> www# zpool status -v
>>   pool: sasraid
>>  state: ONLINE
>>   scan: none requested
>> config:
>>
>>         NAME                  STATE     READ WRITE CKSUM
>>         sasraid               ONLINE       0     0     0
>>           raidz1-0            ONLINE       0     0     0
>>             gpt/sasraid0      ONLINE       0     0     0
>>             gpt/sasraid1      ONLINE       0     0     0
>>             gpt/sasraid2      ONLINE       0     0     0
>>             gpt/sasraid3      ONLINE       0     0     0
>>         logs
>>           gpt/log-sasraid0    ONLINE       0     0     0
>>           gpt/log-sasraid1    ONLINE       0     0     0
>>         cache
>>           ada0p5              ONLINE       0     0     0
>>           ada1p5              ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: sataraid
>>  state: ONLINE
>>   scan: none requested
>> config:
>>
>>         NAME                  STATE     READ WRITE CKSUM
>>         sataraid              ONLINE       0     0     0
>>           raidz1-0            ONLINE       0     0     0
>>             gpt/sataraid0     ONLINE       0     0     0
>>             gpt/sataraid1     ONLINE       0     0     0
>>             gpt/sataraid2     ONLINE       0     0     0
>>             gpt/sataraid3     ONLINE       0     0     0
>>         logs
>>           gpt/log-sataraid0   ONLINE       0     0     0
>>           gpt/log-sataraid1   ONLINE       0     0     0
>>         cache
>>           ada0p3              ONLINE       0     0     0
>>           ada1p3              ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: zfsboot
>>  state: ONLINE
>>   scan: resilvered 513M in 0h0m with 0 errors on Tue Nov 20 13:41:00 2012
>> config:
>>
>>         NAME          STATE     READ WRITE CKSUM
>>         zfsboot       ONLINE       0     0     0
>>           mirror-0    ONLINE       0     0     0
>>             ada2p3    ONLINE       0     0     0
>>             ada3p3    ONLINE       0     0     0
>>             da3p3     ONLINE       0     0     0
>>             da2p3     ONLINE       0     0     0
>>         cache
>>           ada0p1      ONLINE       0     0     0
>>           ada1p1      ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>> --WjW