Date:      Wed, 21 Nov 2012 19:08:29 +0100
From:      Willem Jan Withagen <wjw@digiware.nl>
To:        Andriy Gapon <avg@FreeBSD.org>, "stable@freebsd.org" <stable@freebsd.org>
Subject:   Re: Some new hardware with 9.1 does not reboot easily
Message-ID:  <50AD189D.4040902@digiware.nl>
In-Reply-To: <50AD17E4.50104@FreeBSD.org>
References:  <50ACA518.4050309@digiware.nl> <50ACEEFF.8010001@FreeBSD.org> <50AD0A20.2070408@digiware.nl> <50AD0AC2.5070804@FreeBSD.org> <50AD0B29.6060602@FreeBSD.org> <50AD0F00.5020600@digiware.nl> <50AD13EE.8050901@digiware.nl> <50AD17E4.50104@FreeBSD.org>

On 2012-11-21 19:05, Andriy Gapon wrote:
> on 21/11/2012 19:48 Willem Jan Withagen said the following:
>> On 2012-11-21 18:27, Willem Jan Withagen wrote:
>>> On 2012-11-21 18:11, Andriy Gapon wrote:
>>>> on 21/11/2012 19:09 Andriy Gapon said the following:
>>>>> on 21/11/2012 19:06 Willem Jan Withagen said the following:
>>>>>> Nothing that stands out for me, but then I'm not into FreeBSD kernels.
>>>>>> But there is certainly no more userspace processes running other than
>>>>>> reboot.....
>>>>>>
>>>>>> Certainly no postfix, that could complain about missing libpcre.so.1
>>>>>> That seems to be something that should have been flushed from the
>>>>>> print_buffer before.
>>>>>>
>>>>>> What I do see is a huge number of ZFS threads....
>>>>>>
>>>>>> Rebooting from DDB is instantaneous...
>>>>>>
>>>>>> So I'm not certain what to look for further?
>>>>>
>>>>> Perhaps share the output if you are able to capture it...
>>>>
>>>> State of the init process should be more interesting.
>>>> You can switch to it (using thread <id>) and capture its stack trace ('bt').
>>>
>>> The box is not on a serial connection.
>>> So capturing will mean taking a picture with an iPhone and retyping it.
>>>
>>> init process should be 1, right?
>>> I'll give it a shot
>>
>> Just private since it includes an image of the bt...
>>
>> Init is there and its state is 'RLs', but it does not list any threads
>> and 'thread 1' does not work. However, 'bt 1' does the trick.
>>
>> It seems to be waiting/working in the ZFS code to get things unmounted.
> 
> Yeah, oops, this is a known ZFS deadlock in zfs_freebsd_reclaim -> zfs_zget path.
> I may commit my fix for it to head on the next weekend.
> You may share this information with the list.

Any chance of getting this fix merged back into 9.1?
Preferably before 9.1-RELEASE, but otherwise real soon after that.

I'm the perfect test guinea pig: it happens every time I reboot.

--WjW

> 
>> Disk situation:
>> 	4* SATA Seagate 1TB (2 on Sandy Bridge, 2 on LSI 2008)
>> 	4* SAS Seagate 600GB/15K, all on LSI 2008
>> 	2* Intel SSD 540 200GB, both on SATA-3 on Sandy Bridge
>>
>> ZFS config
>> zfsboot= 50GB 4-way mirror on 4* SATA
>> 		2*2GB cache on both SSDs
>> sataraid=remainder of SATA disks in raidz
>> 		2*1GB log on SSDs
>> 		2*50GB cache on SSDs
>> sasraid=full disk raidz of SAS disks
>> 		2*1GB log on SSDs
>> 		2*100GB cache on SSDs
>>
>> www# zpool status -v
>>   pool: sasraid
>>  state: ONLINE
>>   scan: none requested
>> config:
>>
>>         NAME                STATE     READ WRITE CKSUM
>>         sasraid             ONLINE       0     0     0
>>           raidz1-0          ONLINE       0     0     0
>>             gpt/sasraid0    ONLINE       0     0     0
>>             gpt/sasraid1    ONLINE       0     0     0
>>             gpt/sasraid2    ONLINE       0     0     0
>>             gpt/sasraid3    ONLINE       0     0     0
>>         logs
>>           gpt/log-sasraid0  ONLINE       0     0     0
>>           gpt/log-sasraid1  ONLINE       0     0     0
>>         cache
>>           ada0p5            ONLINE       0     0     0
>>           ada1p5            ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: sataraid
>>  state: ONLINE
>>   scan: none requested
>> config:
>>
>>         NAME                 STATE     READ WRITE CKSUM
>>         sataraid             ONLINE       0     0     0
>>           raidz1-0           ONLINE       0     0     0
>>             gpt/sataraid0    ONLINE       0     0     0
>>             gpt/sataraid1    ONLINE       0     0     0
>>             gpt/sataraid2    ONLINE       0     0     0
>>             gpt/sataraid3    ONLINE       0     0     0
>>         logs
>>           gpt/log-sataraid0  ONLINE       0     0     0
>>           gpt/log-sataraid1  ONLINE       0     0     0
>>         cache
>>           ada0p3             ONLINE       0     0     0
>>           ada1p3             ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>   pool: zfsboot
>>  state: ONLINE
>>   scan: resilvered 513M in 0h0m with 0 errors on Tue Nov 20 13:41:00 2012
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         zfsboot     ONLINE       0     0     0
>>           mirror-0  ONLINE       0     0     0
>>             ada2p3  ONLINE       0     0     0
>>             ada3p3  ONLINE       0     0     0
>>             da3p3   ONLINE       0     0     0
>>             da2p3   ONLINE       0     0     0
>>         cache
>>           ada0p1    ONLINE       0     0     0
>>           ada1p1    ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>> --WjW
>>
>>
>>
> 
> 



