Date:      Thu, 15 Oct 2015 09:27:28 -0500
From:      Karl Denninger <karl@denninger.net>
To:        freebsd-fs@freebsd.org
Subject:   Re: Panic in ZFS during zfs recv (while snapshots being destroyed)
Message-ID:  <561FB7D0.4080107@denninger.net>
In-Reply-To: <55DF76AA.3040103@denninger.net>
References:  <55BB443E.8040801@denninger.net> <55CF7926.1030901@denninger.net> <55DF7191.2080409@denninger.net> <sig.0681f4fd27.ADD991B6-BCF2-4B11-A5D6-EF1DB585AA33@chittenden.org> <55DF76AA.3040103@denninger.net>


Got another one of these this morning, after a long absence...

Same symptom -- happened during a backup (send/receive) and it's in the
same place -- when the snapshot is unmounted.  I have a clean dump and
this is against a quite-recent checkout, so the most recently
rolled-forward ZFS changes are in this kernel.  (A rough kgdb recipe for
pulling the trace back out of the dump follows the console output below.)


Fatal trap 12: page fault while in kernel mode
cpuid = 14; apic id = 34
fault virtual address   = 0x378
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80940ae0
stack pointer           = 0x28:0xfffffe066796c680
frame pointer           = 0x28:0xfffffe066796c700
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 81187 (zfs)
trap number             = 12
panic: page fault
cpuid = 14
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe066796c160
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe066796c210
vpanic() at vpanic+0x126/frame 0xfffffe066796c250
panic() at panic+0x43/frame 0xfffffe066796c2b0
trap_fatal() at trap_fatal+0x36b/frame 0xfffffe066796c310
trap_pfault() at trap_pfault+0x2ed/frame 0xfffffe066796c3b0
trap() at trap+0x47a/frame 0xfffffe066796c5c0
calltrap() at calltrap+0x8/frame 0xfffffe066796c5c0
--- trap 0xc, rip = 0xffffffff80940ae0, rsp = 0xfffffe066796c680, rbp = 0xfffffe066796c700 ---
__mtx_lock_sleep() at __mtx_lock_sleep+0x1b0/frame 0xfffffe066796c700
dounmount() at dounmount+0x58/frame 0xfffffe066796c780
zfs_unmount_snap() at zfs_unmount_snap+0x114/frame 0xfffffe066796c7a0
dsl_dataset_user_release_impl() at dsl_dataset_user_release_impl+0x103/frame 0xfffffe066796c920
dsl_dataset_user_release_onexit() at dsl_dataset_user_release_onexit+0x5c/frame 0xfffffe066796c940
zfs_onexit_destroy() at zfs_onexit_destroy+0x56/frame 0xfffffe066796c970
zfsdev_close() at zfsdev_close+0x52/frame 0xfffffe066796c990
devfs_fpdrop() at devfs_fpdrop+0xa9/frame 0xfffffe066796c9b0
devfs_close_f() at devfs_close_f+0x45/frame 0xfffffe066796c9e0
_fdrop() at _fdrop+0x29/frame 0xfffffe066796ca00
closef() at closef+0x21e/frame 0xfffffe066796ca90
closefp() at closefp+0x98/frame 0xfffffe066796cad0
amd64_syscall() at amd64_syscall+0x35d/frame 0xfffffe066796cbf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe066796cbf0
--- syscall (6, FreeBSD ELF64, sys_close), rip = 0x801a01f5a, rsp = 0x7fffffffd728, rbp = 0x7fffffffd740 ---
Uptime: 3d15h37m26s
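
For reference, this is roughly how I pull the trace back out of the dump
(a sketch, not a transcript -- it assumes savecore dropped the core in
/var/crash and the matching kernel is still installed; substitute the
right vmcore number, and the frame number will vary):

  kgdb /boot/kernel/kernel /var/crash/vmcore.0
  (kgdb) bt            # same backtrace as the console output above
  (kgdb) frame 12      # the dounmount()/__mtx_lock_sleep() frame
  (kgdb) info locals   # look at the mount point and the mutex being taken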


On 8/27/2015 15:44, Karl Denninger wrote:
> No, but that does sound like it might be involved.....
>
> And yeah, this did start when I moved the root pool to a mirrored pair
> of Intel 530s off a pair of spinning-rust WD RE4s.... (The 530s are darn
> nice performance-wise, reasonably inexpensive and thus very suitable for
> a root filesystem drive and they also pass the "pull the power cord"
> test, incidentally.)
>
> You may be onto something -- I'll try shutting it off.  But since I
> can't make this happen on demand and it's an "every week or two" panic
> -- ALWAYS while the zfs send | zfs recv is running, and always on the
> same filesystem -- it will be a fair while before I know if it's fixed
> (like over a month, given the usual pattern here, as that would be 4
> "average" periods without a panic).....
>
> I also wonder if I could tune this out with some of the other TRIM
> parameters instead of losing it entirely; a rough sketch of what I'd
> try follows the list below.
>
> vfs.zfs.trim.max_interval: 1
> vfs.zfs.trim.timeout: 30
> vfs.zfs.trim.txg_delay: 32
> vfs.zfs.trim.enabled: 1
> vfs.zfs.vdev.trim_max_pending: 10000
> vfs.zfs.vdev.trim_max_active: 64
> vfs.zfs.vdev.trim_min_active: 1
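>
> Untested sketch of what I'd try (I believe vfs.zfs.trim.enabled is a
> boot-time tunable, so it belongs in loader.conf; the others look to be
> live-settable, and the deadman check only applies if those OIDs exist
> on this build):
>
>   # back TRIM off before turning it off outright
>   sysctl vfs.zfs.trim.txg_delay=64
>   sysctl vfs.zfs.vdev.trim_max_active=16
>
>   # or disable TRIM entirely at next boot
>   echo 'vfs.zfs.trim.enabled=0' >> /boot/loader.conf
>
>   # see whether the ZFS deadman timer is in the picture
>   sysctl vfs.zfs.deadman_enabled vfs.zfs.deadman_synctime_ms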
>
> That it's panicking in mtx_lock_sleep() might point this way.... the
> trace shows it coming from zfs_onexit_destroy(), which ends up calling
> zfs_unmount_snap() and then blows up in dounmount() while executing
> mtx_lock_sleep().
>
> I do wonder if I'm begging for new and innovative performance issues if
> I run with TRIM off for an extended period of time, however..... :-)
>
>
> On 8/27/2015 15:30, Sean Chittenden wrote:
>> Have you tried disabling TRIM?  We recently ran into an issue where a
>> `zfs delete` on a large dataset caused the host to panic because TRIM
>> was tripping over the ZFS deadman timer.  Disabling TRIM worked as a
>> valid workaround for us.  You mentioned a recent move to SSDs, so this
>> can happen, especially after the drive has experienced a little bit of
>> actual work.  -sc
>>
>>
>> --
>> Sean Chittenden
>> sean@chittenden.org
>>
>>
>>> On Aug 27, 2015, at 13:22, Karl Denninger <karl@denninger.net> wrote:
>>>
>>> On 8/15/2015 12:38, Karl Denninger wrote:
>>>> Update:
>>>>
>>>> This /appears/ to be related to attempting to send or receive a
>>>> /cloned/ snapshot.
>>>>
>>>> I use /beadm/ to manage boot environments and the crashes have all
>>>> come while send/recv-ing the root pool, which is the one where these
>>>> clones get created.  It is /not/ consistent within a given snapshot
>>>> when it crashes and a second attempt (which does a "recovery"
>>>> send/receive) succeeds every time -- I've yet to have it panic twice
>>>> sequentially.
>>>>
>>>> I surmise that the problem comes about when a file in the cloned
>>>> snapshot is modified, but this is a guess at this point.
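>>>>
>>>> If it is the clones, something like this should show which datasets
>>>> in the root pool are clones and which snapshots they hang off of
>>>> (pool name below is a placeholder, not my actual layout):
>>>>
>>>>   zfs list -r -o name,origin zroot            # origin other than "-" means a clone
>>>>   zfs get -r -t snapshot clones zroot         # snapshots that have clones attached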
>>>>
>>>> I'm going to try to force replication of the problem on my test system.
>>>>
>>>> On 7/31/2015 04:47, Karl Denninger wrote:
>>>>> I have an automated script that runs zfs send/recv copies to bring a
>>>>> backup data set into congruence with the running copies nightly.  The
>>>>> source has automated snapshots running on a fairly frequent basis
>>>>> through zfs-auto-snapshot.
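>>>>>
>>>>> The script is essentially an incremental send piped into a forced
>>>>> receive, along these lines (pool and snapshot names here are
>>>>> placeholders, not the real script):
>>>>>
>>>>>   PREV=nightly-prev; NEW=nightly-$(date +%Y%m%d)
>>>>>   zfs snapshot -r zroot@${NEW}
>>>>>   zfs send -R -I @${PREV} zroot@${NEW} | zfs receive -duvF backup
>>>>>   # once the receive completes cleanly, ${PREV} is destroyed and
>>>>>   # ${NEW} becomes the base for the next night's increment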
>>>>>
>>>>> Recently I have started having a panic show up about once a week during
>>>>> the backup run, but it's inconsistent.  It is in the same place, but I
>>>>> cannot force it to repeat.
>>>>>
>>>>> The trap itself is a page fault in kernel mode in the zfs code at
>>>>> zfs_unmount_snap(); here's the traceback from the kvm (sorry for the
>>>>> image link but I don't have a better option right now.)
>>>>>
>>>>> I'll try to get a dump, this is a production machine with encrypted swap
>>>>> so it's not normally turned on.
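>>>>>
>>>>> (For completeness: as far as I know the kernel can't dump to the
>>>>> geli-encrypted swap device, so enabling dumps means pointing
>>>>> dumpdev at the raw partition -- the device name below is a
>>>>> placeholder:)
>>>>>
>>>>>   # /etc/rc.conf
>>>>>   dumpdev="/dev/gpt/swap0"    # the plain partition, not the .eli device
>>>>>
>>>>>   # or just for this session
>>>>>   dumpon /dev/gpt/swap0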
>>>>>
>>>>> Note that the pool that appears to be involved (the backup pool) has
>>>>> passed a scrub and thus I would assume the on-disk structure is ok.....
>>>>> but that might be an unfair assumption.  It is always occurring in the
>>>>> same dataset although there are a half-dozen that are sync'd -- if this
>>>>> one (the first one) successfully completes during the run then all the
>>>>> rest will as well (that is, whenever I restart the process it has always
>>>>> failed here.)  The source pool is also clean and passes a scrub.
>>>>>
>>>>> traceback is at http://www.denninger.net/kvmimage.png; apologies for the
>>>>> image traceback but this is coming from a remote KVM.
>>>>>
>>>>> I first saw this on 10.1-STABLE and it is still happening on FreeBSD
>>>>> 10.2-PRERELEASE #9 r285890M, which I updated to in an attempt to see if
>>>>> the problem was something that had been addressed.
>>>>>
>>>>>
>>>> -- 
>>>> Karl Denninger
>>>> karl@denninger.net <mailto:karl@denninger.net>
>>>> /The Market Ticker/
>>>> /[S/MIME encrypted email preferred]/
>>> Second update: I have now taken another panic on 10.2-Stable, same deal,
>>> but without any cloned snapshots in the source image. I had thought that
>>> removing cloned snapshots might eliminate the issue; that is now out the
>>> window.
>>>
>>> It ONLY happens on this one filesystem (the root one, incidentally)
>>> which is fairly-recently created as I moved this machine from spinning
>>> rust to SSDs for the OS and root pool -- and only when it is being
>>> backed up by using zfs send | zfs recv (with the receive going to a
>>> different pool in the same machine.)  I have yet to be able to provoke
>>> it when using zfs send to copy to a different machine on the same LAN,
>>> but given that it is not able to be reproduced on demand I can't be
>>> certain it's timing related (e.g. performance between the two pools in
>>> question) or just that I haven't hit the unlucky combination.
>>>
>>> This looks like some sort of race condition and I will continue to see
>>> if I can craft a case to make it occur "on demand"
>>>
>>> -- 
>>> Karl Denninger
>>> karl@denninger.net <mailto:karl@denninger.net>
>>> /The Market Ticker/
>>> /[S/MIME encrypted email preferred]/
>>
>>

-- 
Karl Denninger
karl@denninger.net <mailto:karl@denninger.net>
/The Market Ticker/
/[S/MIME encrypted email preferred]/

