From: Dustin Wenz
Subject: Periodic jobs triggering panics in 10.1 and 10.2
Date: Tue, 8 Dec 2015 16:31:38 -0600
To: freebsd-stable@freebsd.org
Message-Id: <34FA7D40-8758-460D-AC14-20B21D2E3F8D@ebureau.com>

I have multiple machines that occasionally panic while the daily and weekly periodic scripts run. The panic is always "Fatal trap 18: integer divide fault while in kernel mode". I've appended a kgdb trace below with more detail.

A notable common factor is that all affected systems have ZFS-based startup disks and 20-40 jails. Each jail has its own filesystem that was created by cloning the boot filesystem. I suspect this is a ZFS bug that is triggered by the access patterns in the periodic scripts. There is significant load on the system when the scheduled processes start, because all jails execute the same scripts at the same time.

I've been able to alleviate this problem by disabling the security scans within the jails while leaving them enabled on the root host. If this is not a known issue in FreeBSD 10.2, I'll file a PR on it.

- .Dustin Wenz

Logged error:

Dec 5 04:16:47 svr-033-08 kernel:
Dec 5 04:16:47 svr-033-08 kernel:
Dec 5 04:16:47 svr-033-08 kernel: Fatal trap 18: integer divide fault while in kernel mode
Dec 5 04:16:47 svr-033-08 kernel: cpuid = 19; apic id = 27
Dec 5 04:16:47 svr-033-08 kernel: instruction pointer = 0x20:0xffffffff819f54d4
Dec 5 04:16:47 svr-033-08 kernel: stack pointer = 0x28:0xfffffe085fec76f0
Dec 5 04:16:47 svr-033-08 kernel: frame pointer = 0x28:0xfffffe085fec7740
Dec 5 04:23:18 svr-033-08 syslogd: kernel boot file is /boot/kernel/kernel
Dec 5 04:23:18 svr-033-08 kernel: code segment = base 0x0, limit 0xfffff, type 0x1b
Dec 5 04:23:18 svr-033-08 kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
Dec 5 04:23:18 svr-033-08 kernel: processor eflags = interrupt enabled, resume, IOPL = 0
Dec 5 04:23:18 svr-033-08 kernel: current process = 20355 (find)
Dec 5 04:23:18 svr-033-08 kernel: trap number = 18
Dec 5 04:23:18 svr-033-08 kernel: panic: integer divide fault
Dec 5 04:23:18 svr-033-08 kernel: cpuid = 19

kgdb trace:

Unread portion of the kernel message buffer:
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 20355 (find)
trap number = 18
panic: integer divide fault
cpuid = 19
KDB: stack backtrace:
#0 0xffffffff80986710 at kdb_backtrace+0x60
#1 0xffffffff80949e76 at vpanic+0x126
#2 0xffffffff80949d43 at panic+0x43
#3 0xffffffff80d5d3db at trap_fatal+0x36b
#4 0xffffffff80d5d05c at trap+0x75c
#5 0xffffffff80d42f12 at calltrap+0x8
#6 0xffffffff819f4fc8 at dmu_tx_assign+0xf8
#7 0xffffffff81a7a887 at zfs_inactive+0x157
#8 0xffffffff81a8369d at zfs_freebsd_inactive+0xd
#9 0xffffffff80e85ed7 at VOP_INACTIVE_APV+0xa7
#10 0xffffffff809ed182 at vinactive+0x102
#11 0xffffffff809ed572 at vputx+0x272
#12 0xffffffff809f40ea at sys_fchdir+0x2aa
#13 0xffffffff80d5dcf7 at amd64_syscall+0x357
#14 0xffffffff80d431fb at Xfast_syscall+0xfb
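
For anyone who wants to try the workaround mentioned in the message (disabling the security scans inside the jails), here is a minimal sketch. It assumes the change is made in each jail's own /etc/periodic.conf and that daily_status_security_enable is the relevant knob; the message does not say which setting was actually changed on the affected systems, so treat both as assumptions.

```shell
# Hypothetical /etc/periodic.conf fragment for a jail (not the host).
# Assumption: the daily security run is what drives the find(1) load;
# the original message does not name the exact setting that was changed.
daily_status_security_enable="NO"
```

The host keeps its default settings, so the security scan still runs once on the root system.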