Date: Sun, 6 Sep 2020 18:02:40 +0900
From: Tomoaki AOKI <junchoon@dec.sakura.ne.jp>
To: freebsd-current@freebsd.org
Subject: Re: Fatal trap 18 on boot after OpenZFS import
Message-ID: <20200906180240.e61a2869b1258f96c3e7d398@dec.sakura.ne.jp>
In-Reply-To: <20200904220301.7fac6b4008f1bc7ad8d803c9@dec.sakura.ne.jp>
References: <20200904220301.7fac6b4008f1bc7ad8d803c9@dec.sakura.ne.jp>

Filed PR.

Bug 249147 - [ZFS][Panic]Fatal trap 18 on boot after OpenZFS import
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=249147

On Fri, 4 Sep 2020 22:03:01 +0900
Tomoaki AOKI <junchoon@dec.sakura.ne.jp> wrote:

> Hi.
>
> Encountering boot failure with fatal trap 18 on boot,
> happening at (maybe) just before init() starts. Possibly on
> root remount by kernel or zpool import by rc.d script.
> The last revision tried is r365316 (r364788 is the last tried
> clean rebuild).
>
> The last healthy revision is r364744, just before the actual switch
> to OpenZFS. amd64 on ThinkPad P52 (Core i7-8750H) w/discrete nvidia GPU.
>
> r364751 with the diffs of r364777 and r364788 (applied so that it
> builds successfully without changes unrelated to OpenZFS) fails.
>
> Any suggestions and fixes are appreciated.
>
>
> Trap screen is something like below (text attached),
> typed up from a relatively clear photo, so there could be some typos.
>
> This is shown just after the usual kernel startup output.
> boot1.efi (as EFI/bootx64.efi on ESP) starts /boot/loader.efi
> properly, and loader.efi seems to boot the kernel properly.
>
> As even the single user shell selection doesn't appear, loader.efi
> is of r364744. But they work even when I follow this irregular
> procedure:
>
>  1) Update src tree
>  2) Clean obj tree
>  3) buildworld
>  4) etcupdate -p
>  5) buildkernel
>  6) installkernel
>  7) shutdown to single user WITHOUT reboot <- Irregular!
>  8) installworld
>  9) etcupdate
> 10) rebuild src/sys-dependent ports (kmods, nvidia-driver, ...)
> 11) reboot
>
> loader.efi looks to be doing its job, and the panic happens after
> kernel startup ends.
> Needless to say, rolling back to r364744 state from stable/12 on nvd0
> fixes the issue.
>
> Regards.
>
> =====
>
> Fatal trap 18: integer divide fault while in kernel mode
> cpuid = 2; apic id = 02
> instruction pointer = 0x20:0xffffffff82bfa320
> stack pointer       = 0x28:0xfffffe00e20c6900
> frame pointer       = 0x28:0xfffffe00e20c6960
> code segment        = base 0x0, limit 0xfffff, type 0x1b
>                     = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags    = interrupt enabled, resume, IOPL = 0
> current process     = 27 (vdev_open)
> trap number         = 18
> panic: integer divide fault
> cpuid = 2
> time = 16
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00e20c6610
> vpanic() at vpanic+0x182/frame fffffe00e20c6660
> panic() at panic+0x43/frame fffffe00e20c66c0
> trap_fatal() at trap_fatal+0x387/frame fffffe00e20c6720
> trap() at trap+0x8e/frame fffffe00e20c6830
> calltrap() at calltrap+0x8/frame fffffe00e20c6830
> --- trap 0x12, rip = 0xffffffff82bfa320, rsp = 0xfffffe00e20c6900, rbp = 0xfffffe00e20c6960 ---
> zio_wait() at zio_wait+0x60/frame 0xfffffe00e20c6960
> vdev_open() at vdev_open+0x74d/frame 0xfffffe00e20c69c0
> vdev_open_child() at vdev_open_child+0x1e/frame 0xfffffe00e20c69e0
> taskq_run() at taskq_run+0x1f/frame 0xfffffe00e20c6a00
> taskqueue_run_locked() at taskqueue_run_locked+0x181/frame 0xfffffe00e20c6a80
> taskqueue_thread_loop() at taskqueue_thread_loop+0x118/frame 0xfffffe00e20c6ab0
> fork_exit() at fork_exit+0x7d/frame 0xfffffe00e20c6af0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00e20c6af0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> KDB: enter: panic
> [ thread pid 27 tid 100570 ]
> Stopped at kdb_enter+0x37: movq $0,0x1091556(%rip)
> db>
>
> =====
>
> Additional info:
> *Clean build with CPUTYPE removed from the command line and
> make.conf (so it should be equivalent to nocona) didn't help.
>
> *Clean build with the WITH_KERNEL_RETPOLINE line and the
> WITH_RETPOLINE line commented out in src.conf didn't help.
>
> *Combination of the above two didn't help either (at r364788).
>
> *There are two root pools on different physical drives:
> stable/12 on nvd0 (primary) and head on ada0 (secondary).
>
> *GENERIC-NODEBUG based (added options CAM_IOSCHED_DYNAMIC)
> kernel.
>
> -- 
> Tomoaki AOKI <junchoon@dec.sakura.ne.jp>


-- 
Tomoaki AOKI <junchoon@dec.sakura.ne.jp>
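
For anyone who wants to feed the quoted backtrace into a debugger, the frames can be split into function/offset pairs first. The following is only a minimal sketch in Python, not part of the original report: the helper name `parse_frames` is hypothetical, and it assumes every frame follows the `name() at name+0xOFF/frame ADDR` shape seen in the trace, with the `0x` before the frame address optional because the hand-typed trace omits it on some lines.

```python
import re

# One KDB frame looks like: "zio_wait() at zio_wait+0x60/frame 0xfffffe00e20c6960".
# The "0x" before the frame address is optional, since some frames in the
# hand-typed trace omit it.
FRAME_RE = re.compile(
    r"(?P<func>\w+)\(\) at (?P=func)\+0x(?P<off>[0-9a-f]+)"
    r"/frame\s+(?:0x)?(?P<frame>[0-9a-f]+)"
)


def parse_frames(text):
    """Return (function, offset, frame_address) tuples in backtrace order.

    parse_frames is a hypothetical helper name used for illustration only.
    """
    # Drop mail quote markers and rejoin wrapped lines, so a frame whose
    # address was wrapped onto the next line is still matched.
    flat = " ".join(line.lstrip("> ").strip() for line in text.splitlines())
    return [
        (m.group("func"), int(m.group("off"), 16), int(m.group("frame"), 16))
        for m in FRAME_RE.finditer(flat)
    ]
```

Each resulting pair, such as `zio_wait` with offset `0x60`, could then be looked up against a kernel built from the same revision with debug symbols, for example with gdb's `list *(zio_wait+0x60)`.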