Date:      Wed, 11 Jan 2017 08:50:39 -0600
From:      Karl Denninger <karl@denninger.net>
To:        freebsd-stable@freebsd.org
Subject:   Re: Ugh -- attempted to update this morning, and got a nasty panic in ZFS....
Message-ID:  <a5a4bd1e-c7e5-8d16-6398-469e1f0bb11a@denninger.net>
In-Reply-To: <f05fcab3-ec17-17b3-3459-73256f35fbc7@denninger.net>
References:  <f05fcab3-ec17-17b3-3459-73256f35fbc7@denninger.net>

[-- Attachment #1 --]
A second attempt to come up on the new kernel was successful -- so this
had to be due to queued I/Os that were pending at the time of the
shutdown....


On 1/11/2017 08:31, Karl Denninger wrote:
> During the reboot, immediately after the daemons started up on the
> machine (the boot got beyond mounting all the disks and was well into
> starting up all the background stuff it runs), I got a double-fault.
>
> ..... (there was a LOT more of the same; it pretty clearly was a
> recursive call sequence that ran the system out of stack space)
>
> #294 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #295 0xffffffff8230130e in zio_vdev_io_start (zio=0xfffff8010c8f27b0)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3127
> #296 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #297 0xffffffff822e464d in vdev_queue_io_done (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c:913
> #298 0xffffffff823014c9 in zio_vdev_io_done (zio=0xfffff8010cff0b88)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3152
> #299 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #300 0xffffffff8230130e in zio_vdev_io_start (zio=0xfffff8010cff0b88)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3127
> #301 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #302 0xffffffff822e464d in vdev_queue_io_done (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c:913
> #303 0xffffffff823014c9 in zio_vdev_io_done (zio=0xfffff8010c962000)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3152
> #304 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #305 0xffffffff8230130e in zio_vdev_io_start (zio=0xfffff8010c962000)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3127
> #306 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #307 0xffffffff822e464d in vdev_queue_io_done (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c:913
> #308 0xffffffff823014c9 in zio_vdev_io_done (zio=0xfffff80102175000)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3152
> #309 0xffffffff822fdcfd in zio_execute (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1666
> #310 0xffffffff80b2585a in taskqueue_run_locked (queue=<value optimized out>)
>     at /usr/src/sys/kern/subr_taskqueue.c:454
> #311 0xffffffff80b26a48 in taskqueue_thread_loop (arg=<value optimized out>)
>     at /usr/src/sys/kern/subr_taskqueue.c:724
> #312 0xffffffff80a7eb05 in fork_exit (
>     callout=0xffffffff80b26960 <taskqueue_thread_loop>,
>     arg=0xfffff800b8824c30, frame=0xfffffe0667430c00)
>     at /usr/src/sys/kern/kern_fork.c:1040
> #313 0xffffffff80f87c3e in fork_trampoline ()
>     at /usr/src/sys/amd64/amd64/exception.S:611
> #314 0x0000000000000000 in ?? ()
> Current language:  auto; currently minimal
> (kgdb)
>
> .....
>
>
> NewFS.denninger.net dumped core - see /var/crash/vmcore.3
>
> Wed Jan 11 08:15:33 CST 2017
>
> FreeBSD NewFS.denninger.net 11.0-STABLE FreeBSD 11.0-STABLE #14 r311927M: Wed Jan 11 07:55:20 CST 2017     karl@NewFS.denninger.net:/usr/obj/usr/src/sys/KSD-SMP  amd64
>
> panic: double fault
>
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
>
> Unread portion of the kernel message buffer:
>
> Fatal double fault
> rip = 0xffffffff822e3c5d
> rsp = 0xfffffe066742af90
> rbp = 0xfffffe066742b420
> cpuid = 15; apic id = 35
> panic: double fault
> cpuid = 15
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0649ddee30
> vpanic() at vpanic+0x186/frame 0xfffffe0649ddeeb0
> panic() at panic+0x43/frame 0xfffffe0649ddef10
> dblfault_handler() at dblfault_handler+0xa2/frame 0xfffffe0649ddef30
> Xdblfault() at Xdblfault+0xac/frame 0xfffffe0649ddef30
> --- trap 0x17, rip = 0xffffffff822e3c5d, rsp = 0xfffffe066742af90, rbp = 0xfffffe066742b420 ---
>
> # Workaround for this CPU from 11.x errata
> vm.pmap.pcid_enabled=0
> #
> #
> # Try to avoid kernel stack exhaustion due to TRIM storms.
> kern.kstack_pages="6"
>
> I have kstack_pages set to "6" to try to avoid another panic that I got
> occasionally during zfs backup operations which appeared to be linked to
> "too many" TRIMs, and looks very similar to this one.
>
> I rebooted back to kernel.old, which was built in October, and the
> machine came up normally.  I'll try the newer build again and see if
> this was transient and tied to delayed TRIM operations on the disks
> triggered by the installworld/installkernel.  But if it is, then it
> remains a problem -- and setting kstack_pages didn't help!
>
> I've got the dump if anything in particular would be of help.
>
> What prompted the update in the first place was the OpenSSH CVE that
> was recently issued.....
>
>

-- 
Karl Denninger
karl@denninger.net
/The Market Ticker/
/[S/MIME encrypted email preferred]/

