Date: Fri, 19 Oct 2018 13:22:21 -0400
From: Mark Johnston <markj@freebsd.org>
To: Sebastian Wojtczak <sebastian.wojtczak@gmail.com>
Cc: freebsd-stable@freebsd.org
Subject: Re: FreeBSD 11.2 kernel crash when dd
Message-ID: <20181019172221.GA21156@raichu>
In-Reply-To: <CAEfQnDmWE8mq8=XqvPu3zn0S0kOka274T7O7_GXXT=Xg3ObcgA@mail.gmail.com>
References: <CAEfQnDmWE8mq8=XqvPu3zn0S0kOka274T7O7_GXXT=Xg3ObcgA@mail.gmail.com>

On Fri, Oct 19, 2018 at 01:10:15PM +0200, Sebastian Wojtczak wrote:
> Hi,
>
> I would like to report a kernel crash while dd on ssd drive.
>
> Just found that my PC crashed several times during below command:
> dd if=/dev/ada2 of=file_name bs=10m.
>
> I was trying to make an image from my ssd drive. Once dump file hit size
> 41G or 52G kernel crashes and reboot the system.
>
> Oct 18 12:30:11 username syslogd: kernel boot file is /boot/kernel/kernel
> Oct 18 12:30:11 username kernel:
> Oct 18 12:30:11 username kernel:
> Oct 18 12:30:11 username kernel: Fatal trap 12: page fault while in kernel
> mode
> Oct 18 12:30:11 username kernel: cpuid = 1; apic id = 01
> Oct 18 12:30:11 username kernel: fault virtual address = 0x5a
> Oct 18 12:30:11 username kernel: fault code = supervisor read
> data, page not present
> Oct 18 12:30:11 username kernel: instruction pointer =
> 0x20:0xffffffff80e67f6d
> Oct 18 12:30:11 username kernel: stack pointer =
> 0x28:0xfffffe084b408f40
> Oct 18 12:30:11 username kernel: frame pointer =
> 0x28:0xfffffe084b408f80
> Oct 18 12:30:11 username kernel: code segment = base 0x0, limit
> 0xfffff, type 0x1b
> Oct 18 12:30:11 username kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
> Oct 18 12:30:11 username kernel: processor eflags = interrupt
> enabled, resume, IOPL = 0
> Oct 18 12:30:11 username kernel: current process = 0
> (zio_write_issue_8)
> Oct 18 12:30:11 username kernel: trap number = 12
> Oct 18 12:30:11 username kernel: panic: page fault
> Oct 18 12:30:11 username kernel: cpuid = 1
> Oct 18 12:30:11 username kernel: KDB: stack backtrace:
> Oct 18 12:30:11 username kernel: #0 0xffffffff80b50087 at kdb_backtrace+0x67
> Oct 18 12:30:11 username kernel: #1 0xffffffff80b099f7 at vpanic+0x177
> Oct 18 12:30:11 username kernel: #2 0xffffffff80b09873 at panic+0x43
> Oct 18 12:30:11 username kernel: #3 0xffffffff80fe105f at trap_fatal+0x35f
> Oct 18 12:30:11 username kernel: #4 0xffffffff80fe10b9 at trap_pfault+0x49
> Oct 18 12:30:11 username kernel: #5 0xffffffff80fe0887 at trap+0x2c7
> Oct 18 12:30:11 username kernel: #6 0xffffffff80fc04cc at calltrap+0x8
> Oct 18 12:30:11 username kernel: #7 0xffffffff80e56df2 at kmem_back+0xf2
> Oct 18 12:30:11 username kernel: #8 0xffffffff80e56cd0 at kmem_malloc+0x60
> Oct 18 12:30:11 username kernel: #9 0xffffffff80e4e752 at
> keg_alloc_slab+0xe2
> Oct 18 12:30:11 username kernel: #10 0xffffffff80e5118e at
> keg_fetch_slab+0x14e
> Oct 18 12:30:11 username kernel: #11 0xffffffff80e509a4 at
> zone_fetch_slab+0x64
> Oct 18 12:30:11 username kernel: #12 0xffffffff80e50a7f at zone_import+0x3f
> Oct 18 12:30:11 username kernel: #13 0xffffffff80e4d199 at
> uma_zalloc_arg+0x3d9
> Oct 18 12:30:11 username kernel: #14 0xffffffff832d2ab2 at
> zio_write_compress+0x1e2
> Oct 18 12:30:11 username kernel: #15 0xffffffff832d174c at zio_execute+0xac
> Oct 18 12:30:11 username kernel: #16 0xffffffff80b617e4 at
> taskqueue_run_locked+0x154
> Oct 18 12:30:11 username kernel: #17 0xffffffff80b62918 at
> taskqueue_thread_loop+0x98
> Oct 18 12:30:11 username kernel: Uptime: 5m50s
>
> One virtual machine is started with bhyve at startup but even if I shutdown
> it, same crash happen. Disabling vmm does not help but only extend time to
> crash during ssd dump.
>
> Current zfs setup is zraid on 3 (500GB) hdd drives with compress=on. Drive
> ada0 is not part of zraid and is not attached/mount what ever.
>
> Any help how to investigate it is appreciated.

The stack suggests a bug in the kmem_* KPI, but I'm having trouble
seeing the problem. In particular, the fault address suggests that we
crashed while testing (m->flags & PG_ZERO) == 0, but it shouldn't be
possible for m to be NULL there.

My attempts to reproduce this on 12-CURRENT haven't yielded anything
yet.

Would you (or anyone else seeing the problem) be willing to share a
kernel dump? I'd need the vmcore, the contents of /boot/kernel and
/usr/lib/debug/boot/kernel.
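
To make the fault-address reasoning above concrete: when a field is read through a NULL pointer, the faulting address is simply the field's offset within the structure. The sketch below is only an illustration of that arithmetic. struct fake_vm_page is a hypothetical stand-in, not the real struct vm_page, and it assumes (as the reported fault address of 0x5a implies) that flags sits 0x5a bytes into the structure. FAKE_PG_ZERO and both helper names are likewise made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * HYPOTHETICAL stand-in for the kernel's struct vm_page. The padding is
 * chosen so that 'flags' lands at offset 0x5a, which is an assumption
 * drawn from the fault address in the panic, not from the real header.
 */
struct fake_vm_page {
	uint8_t opaque[0x5a];	/* everything that precedes 'flags' */
	uint8_t flags;		/* page flags, tested against FAKE_PG_ZERO */
};

#define FAKE_PG_ZERO	0x08	/* assumed bit value, illustration only */

/*
 * Address the CPU would try to read for m->flags, computed without
 * dereferencing m. With m == 0 this is just the field offset.
 */
static uintptr_t
flags_read_address(uintptr_t m)
{
	return (m + offsetof(struct fake_vm_page, flags));
}

/*
 * The same test as in the reply, (m->flags & PG_ZERO) == 0, with an
 * explicit check so that a NULL m is reported instead of faulting.
 */
static int
page_needs_zeroing(const struct fake_vm_page *m)
{
	assert(m != NULL);
	return ((m->flags & FAKE_PG_ZERO) == 0);
}
```

Under these assumptions, flags_read_address(0) evaluates to 0x5a, matching the "fault virtual address" line in the panic, which is why a NULL m is the natural suspect even though the code path should not allow it.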