From owner-freebsd-current@freebsd.org Sun Sep 20 13:06:55 2020
Date: Sun, 20 Sep 2020 22:06:52 +0900
From: Tomoaki AOKI
To: freebsd-current@freebsd.org
Subject: Re: Fatal trap 18 on boot after OpenZFS import
Message-Id: <20200920220652.34aa1ea0031c337e1790473b@dec.sakura.ne.jp>
In-Reply-To: <20200906180240.e61a2869b1258f96c3e7d398@dec.sakura.ne.jp>
References: <20200904220301.7fac6b4008f1bc7ad8d803c9@dec.sakura.ne.jp>
 <20200906180240.e61a2869b1258f96c3e7d398@dec.sakura.ne.jp>
Reply-To: junchoon@dec.sakura.ne.jp
Organization: Junchoon corps
List-Id: Discussions about the use of FreeBSD-current

I forgot to mention this here.
As I already mentioned on Bugzilla, this problem is fixed at r365894.

Thanks again, Ryan and Matthew!


On Sun, 6 Sep 2020 18:02:40 +0900
Tomoaki AOKI wrote:

> Filed PR.
> Bug 249147 - [ZFS][Panic]Fatal trap 18 on boot after OpenZFS import
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=249147
>
>
> On Fri, 4 Sep 2020 22:03:01 +0900
> Tomoaki AOKI wrote:
>
> > Hi.
> >
> > I'm encountering a boot failure with fatal trap 18,
> > happening (maybe) just before init() starts. Possibly on
> > root remount by the kernel or zpool import by an rc.d script.
> > The last revision tried is r365316 (r364788 is the last tried
> > clean rebuild).
> >
> > The last healthy revision is r364744, just before the actual switch
> > to OpenZFS. amd64 on ThinkPad P52 (Core i7-8750H) w/discrete NVIDIA GPU.
> >
> > r364751 with the diff of r364777 and r364788 (to build successfully
> > without unrelated-to-OpenZFS changes) fails.
> >
> > Any suggestions and fixes are appreciated.
> >
> >
> > The trap screen is something like the one below (text attached),
> > typed up from a relatively clear photo, so there could be some typos.
> >
> > This is shown just after the usual kernel startup output.
> > boot1.efi (as EFI/bootx64.efi on ESP) starts /boot/loader.efi
> > properly, and loader.efi seems to boot the kernel properly.
> >
> > As even the single user shell selection doesn't appear, loader.efi
> > is of r364744. But they work even though I followed an irregular
> > process:
> >
> >  1) Update src tree
> >  2) Clean obj tree
> >  3) buildworld
> >  4) etcupdate -p
> >  5) buildkernel
> >  6) installkernel
> >  7) shutdown to single user WITHOUT reboot  <- Irregular!
> >  8) installworld
> >  9) etcupdate
> > 10) rebuild src/sys-dependent ports (kmods, nvidia-driver, ...)
> > 11) reboot
> >
> > loader.efi appears to do its job, and the panic occurs after kernel
> > startup ends.
> > Needless to say, rolling back to the r364744 state from stable/12 on nvd0
> > fixes the issue.
> >
> > Regards.
> >
> > =====
> >
> > Fatal trap 18: integer divide fault while in kernel mode
> > cpuid = 2; apic id = 02
> > instruction pointer = 0x20:0xffffffff82bfa320
> > stack pointer       = 0x28:0xfffffe00e20c6900
> > frame pointer       = 0x28:0xfffffe00e20c6960
> > code segment        = base 0x0, limit 0xfffff, type 0x1b
> >                     = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags    = interrupt enabled, resume, IOPL = 0
> > current process     = 27 (vdev_open)
> > trap number         = 18
> > panic: integer divide fault
> > cpuid = 2
> > time = 16
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00e20c6610
> > vpanic() at vpanic+0x182/frame fffffe00e20c6660
> > panic() at panic+0x43/frame fffffe00e20c66c0
> > trap_fatal() at trap_fatal+0x387/frame fffffe00e20c6720
> > trap() at trap+0x8e/frame fffffe00e20c6830
> > calltrap() at calltrap+0x8/frame fffffe00e20c6830
> > --- trap 0x12, rip = 0xffffffff82bfa320, rsp = 0xfffffe00e20c6900, rbp = 0xfffffe00e20c6960 ---
> > zio_wait() at zio_wait+0x60/frame 0xfffffe00e20c6960
> > vdev_open() at vdev_open+0x74d/frame 0xfffffe00e20c69c0
> > vdev_open_child() at vdev_open_child+0x1e/frame 0xfffffe00e20c69e0
> > taskq_run() at taskq_run+0x1f/frame 0xfffffe00e20c6a00
> > taskqueue_run_locked() at taskqueue_run_locked+0x181/frame 0xfffffe00e20c6a80
> > taskqueue_thread_loop() at taskqueue_thread_loop+0x118/frame 0xfffffe00e20c6ab0
> > fork_exit() at fork_exit+0x7d/frame 0xfffffe00e20c6af0
> > fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00e20c6af0
> > --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> > KDB: enter: panic
> > [ thread pid 27 tid 100570 ]
> > Stopped at kdb_enter+0x37: movq $0,0x1091556(%rip)
> > db>
> >
> > =====
> >
> > Additional info:
> >  * A clean build with CPUTYPE removed from the command line and
> >    make.conf (so it should be equivalent to nocona) didn't help.
> >
> >  * A clean build with the WITH_KERNEL_RETPOLINE line and the
> >    WITH_RETPOLINE line commented out in src.conf didn't help.
> >
> >  * The combination of the above two didn't help either (at r364788).
> >
> >  * There are two root pools on different physical drives:
> >    stable/12 on nvd0 (primary) and head on ada0 (secondary).
> >
> >  * GENERIC-NODEBUG based (added options CAM_IOSCHED_DYNAMIC)
> >    kernel.
> >
> > --
> > Tomoaki AOKI
>
>
> --
> Tomoaki AOKI
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"


--
Tomoaki AOKI