From owner-freebsd-current@freebsd.org Fri Sep 4 13:03:13 2020
Return-Path:
Delivered-To: freebsd-current@mailman.nyi.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.nyi.freebsd.org (Postfix) with ESMTP id 9D0053C1A65
 for ; Fri, 4 Sep 2020 13:03:13 +0000 (UTC)
 (envelope-from junchoon@dec.sakura.ne.jp)
Received: from dec.sakura.ne.jp (dec.sakura.ne.jp [210.188.226.8])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 4BjdC36DGqz41l2;
 Fri, 4 Sep 2020 13:03:11 +0000 (UTC)
 (envelope-from junchoon@dec.sakura.ne.jp)
Received: from kalamity.joker.local (180-198-4-200.nagoya1.commufa.jp [180.198.4.200])
 (authenticated bits=0)
 by dec.sakura.ne.jp (8.15.2/8.15.2/[SAKURA-WEB]/20080708) with ESMTPA id 084D32gn088363;
 Fri, 4 Sep 2020 22:03:02 +0900 (JST)
 (envelope-from junchoon@dec.sakura.ne.jp)
Date: Fri, 4 Sep 2020 22:03:01 +0900
From: Tomoaki AOKI
To: freebsd-current@freebsd.org
Cc: mmacy@FreeBSD.org
Subject: Fatal trap 18 on boot after OpenZFS import
Message-Id: <20200904220301.7fac6b4008f1bc7ad8d803c9@dec.sakura.ne.jp>
Reply-To: junchoon@dec.sakura.ne.jp
Organization: Junchoon corps
X-Mailer: Sylpheed 3.7.0 (GTK+ 2.24.32; amd64-portbld-freebsd12.1)
Mime-Version: 1.0
Content-Type: multipart/mixed;
 boundary="Multipart=_Fri__4_Sep_2020_22_03_01_+0900_B7zojx0.zqE3+2O9"
X-Rspamd-Queue-Id: 4BjdC36DGqz41l2
X-Spamd-Bar: /
X-Spamd-Result: default: False [0.45 / 15.00];
 HAS_REPLYTO(0.00)[junchoon@dec.sakura.ne.jp];
 RCVD_VIA_SMTP_AUTH(0.00)[];
 MV_CASE(0.50)[];
 HAS_ATTACHMENT(0.00)[];
 REPLYTO_ADDR_EQ_FROM(0.00)[];
 TO_DN_NONE(0.00)[];
 HAS_ORG_HEADER(0.00)[];
 NEURAL_HAM_SHORT(-0.17)[-0.170];
 RCPT_COUNT_TWO(0.00)[2];
 FROM_EQ_ENVFROM(0.00)[];
 RCVD_TLS_LAST(0.00)[];
 R_DKIM_NA(0.00)[];
 ASN(0.00)[asn:9370, ipnet:210.188.224.0/19, country:JP];
 MID_RHS_MATCH_FROM(0.00)[];
 RECEIVED_SPAMHAUS_PBL(0.00)[180.198.4.200:received];
 MIME_TRACE(0.00)[0:+,1:+,2:~];
 ARC_NA(0.00)[];
 NEURAL_HAM_MEDIUM(-0.77)[-0.768];
 FROM_HAS_DN(0.00)[];
 TO_MATCH_ENVRCPT_ALL(0.00)[];
 NEURAL_HAM_LONG(-0.01)[-0.012];
 MIME_GOOD(-0.10)[multipart/mixed,text/plain];
 DMARC_NA(0.00)[sakura.ne.jp];
 AUTH_NA(1.00)[];
 R_SPF_NA(0.00)[no SPF record];
 RCVD_COUNT_TWO(0.00)[2];
 MAILMAN_DEST(0.00)[freebsd-current]
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.33
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 04 Sep 2020 13:03:13 -0000

This is a multi-part message in MIME format.

--Multipart=_Fri__4_Sep_2020_22_03_01_+0900_B7zojx0.zqE3+2O9
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

Hi.

I'm encountering a boot failure with fatal trap 18, happening (maybe)
just before init() starts, possibly on the root remount by the kernel or
on zpool import by the rc.d script.

The last revision tried is r365316 (r364788 is the last one tried with a
clean rebuild). The last healthy revision is r364744, just before the
actual switch to OpenZFS.

amd64 on ThinkPad P52 (Core i7-8750H) with discrete NVIDIA GPU.

r364751 with the diffs of r364777 and r364788 applied (to build
successfully without changes unrelated to OpenZFS) also fails.

Any suggestions and fixes are appreciated.

The trap screen is something like below (text attached). It was typed up
from a relatively clear photo, so there could be some typos.
It is shown just after the usual kernel startup output.

boot1.efi (as EFI/bootx64.efi on ESP) starts /boot/loader.efi properly,
and loader.efi seems to boot the kernel properly.

As even the single-user shell selection doesn't appear, loader.efi is
from r364744. But they work even if I follow this irregular process:

 1)Update src tree
 2)Clean obj tree
 3)buildworld
 4)etcupdate -p
 5)buildkernel
 6)installkernel
 7)shutdown to single user WITHOUT reboot <- Irregular!
 8)installworld
 9)etcupdate
 10)rebuild src/sys-dependent ports (kmods, nvidia-driver, ...)
 11)reboot

loader.efi looks to be doing its job, and the panic happens after kernel
startup ends.

Needless to say, rolling back to the r364744 state from stable/12 on
nvd0 fixes the issue.

Regards.

=====
Fatal trap 18: integer divide fault while in kernel mode
cpuid = 2; apic id = 02
instruction pointer     = 0x20:0xffffffff82bfa320
stack pointer           = 0x28:0xfffffe00e20c6900
frame pointer           = 0x28:0xfffffe00e20c6960
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 27 (vdev_open)
trap number             = 18
panic: integer divide fault
cpuid = 2
time = 16
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00e20c6610
vpanic() at vpanic+0x182/frame fffffe00e20c6660
panic() at panic+0x43/frame fffffe00e20c66c0
trap_fatal() at trap_fatal+0x387/frame fffffe00e20c6720
trap() at trap+0x8e/frame fffffe00e20c6830
calltrap() at calltrap+0x8/frame fffffe00e20c6830
--- trap 0x12, rip = 0xffffffff82bfa320, rsp = 0xfffffe00e20c6900, rbp = 0xfffffe00e20c6960 ---
zio_wait() at zio_wait+0x60/frame 0xfffffe00e20c6960
vdev_open() at vdev_open+0x74d/frame 0xfffffe00e20c69c0
vdev_open_child() at vdev_open_child+0x1e/frame 0xfffffe00e20c69e0
taskq_run() at taskq_run+0x1f/frame 0xfffffe00e20c6a00
taskqueue_run_locked() at taskqueue_run_locked+0x181/frame 0xfffffe00e20c6a80
taskqueue_thread_loop() at taskqueue_thread_loop+0x118/frame 0xfffffe00e20c6ab0
fork_exit() at fork_exit+0x7d/frame 0xfffffe00e20c6af0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00e20c6af0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 27 tid 100570 ]
Stopped at      kdb_enter+0x37: movq    $0,0x1091556(%rip)
db>
=====

Additional info:
 *A clean build with CPUTYPE removed from the command line and make.conf
  (so it should be equivalent to nocona) didn't help.
 *A clean build with the WITH_KERNEL_RETPOLINE and WITH_RETPOLINE lines
  in src.conf commented out didn't help.
 *The combination of the above two didn't help either (at r364788).
 *There are two root pools on different physical drives:
  stable/12 on nvd0 (primary) and head on ada0 (secondary).
 *The kernel is GENERIC-NODEBUG based (with options CAM_IOSCHED_DYNAMIC
  added).

-- 
Tomoaki AOKI

--Multipart=_Fri__4_Sep_2020_22_03_01_+0900_B7zojx0.zqE3+2O9
Content-Type: application/octet-stream;
 name="Fatal_trap_18_on_head_after_r364744.log"
Content-Disposition: attachment;
 filename="Fatal_trap_18_on_head_after_r364744.log"
Content-Transfer-Encoding: base64

RmF0YWwgdHJhcCAxODogaW50ZWdlciBkaXZpZGUgZmF1bHQgd2hpbGUgaW4ga2VybmVsIG1vZGUK
Y3B1aWQgPSAyOyBhcGljIGlkID0gMDIKaW5zdHJ1Y3Rpb24gcG9pbnRlciAgICAgPSAweDIwOjB4
ZmZmZmZmZmY4MmJmYTMyMApzdGFjayBwb2ludGVyICAgICAgICAgICA9IDB4Mjg6MHhmZmZmZmUw
MGUyMGM2OTAwCmZyYW1lIHBvaW50ZXIgICAgICAgICAgID0gMHgyODoweGZmZmZmZTAwZTIwYzY5
NjAKY29kZSBzZWdtZW50ICAgICAgICAgICAgPSBiYXNlIDB4MCwgbGltaXQgMHhmZmZmZiwgdHlw
ZSAweDFiCiAgICAgICAgICAgICAgICAgICAgICAgID0gRFBMIDAsIHByZXMgMSwgbG9uZyAxLCBk
ZWYzMiAwLCBncmFuIDEKcHJvY2Vzc29yIGVmbGFncyAgICAgICAgPSBpbnRlcnJ1cHQgZW5hYmxl
ZCwgcmVzdW1lLCBJT1BMID0gMApjdXJyZW50IHByb2Nlc3MgICAgICAgICA9IDI3ICh2ZGV2X29w
ZW4pCnRyYXAgbnVtYmVyICAgICAgICAgICAgID0gMTgKcGFuaWM6IGludGVnZXIgZGl2aWRlIGZh
dWx0CmNwdWlkID0gMgp0aW1lID0gMTYKS0RCOiBzdGFjayBiYWNrdHJhY2U6CmRiX3RyYWNlX3Nl
bGZfd3JhcHBlcigpIGF0IGRiX3RyYWNlX3NlbGZfd3JhcHBlcisweDJiL2ZyYW1lIDB4ZmZmZmZl
MDBlMjBjNjYxMAp2cGFuaWMoKSBhdCB2cGFuaWMrMHgxODIvZnJhbWUgZmZmZmZlMDBlMjBjNjY2
MApwYW5pYygpIGF0IHBhbmljKzB4NDMvZnJhbWUgZmZmZmZlMDBlMjBjNjZjMAp0cmFwX2ZhdGFs
KCkgYXQgdHJhcF9mYXRhbCsweDM4Ny9mcmFtZSBmZmZmZmUwMGUyMGM2NzIwCnRyYXAoKSBhdCB0
cmFwKzB4OGUvZnJhbWUgZmZmZmZlMDBlMjBjNjgzMApjYWxsdHJhcCgpIGF0IGNhbGx0cmFwKzB4
OC9mcmFtZSBmZmZmZmUwMGUyMGM2ODMwCi0tLSB0cmFwIDB4MTIsIHJpcCA9IDB4ZmZmZmZmZmY4
MmJmYTMyMCwgcnNwID0gMHhmZmZmZmUwMGUyMGM2OTAwLCByYnAgPSAweGZmZmZmZTAwZTIwYzY5
NjAgLS0tCnppb193YWl0KCkgYXQgemlvX3dhaXQrMHg2MC9mcmFtZSAweGZmZmZmZTAwZTIwYzY5
NjAKdmRldl9vcGVuKCkgYXQgdmRldl9vcGVuKzB4NzRkL2ZyYW1lIDB4ZmZmZmZlMDBlMjBjNjlj
MAp2ZGV2X29wZW5fY2hpbGQoKSBhdCB2ZGV2X29wZW5fY2hpbGQrMHgxZS9mcmFtZSAweGZmZmZm
ZTAwZTIwYzY5ZTAKdGFza3FfcnVuKCkgYXQgdGFza3FfcnVuKzB4MWYvZnJhbWUgMHhmZmZmZmUw
MGUyMGM2YTAwCnRhc2txdWV1ZV9ydW5fbG9ja2VkKCkgYXQgdGFza3F1ZXVlX3J1bl9sb2NrZWQr
MHgxODEvZnJhbWUgMHhmZmZmZmUwMGUyMGM2YTgwCnRhc2txdWV1ZV90aHJlYWRfbG9vcCgpIGF0
IHRhc2txdWV1ZV90aHJlYWRfbG9vcCsweDExOC9mcmFtZSAweGZmZmZmZTAwZTIwYzZhYjAKZm9y
a19leGl0KCkgYXQgZm9ya19leGl0KzB4N2QvZnJhbWUgMHhmZmZmZmUwMGUyMGM2YWYwCmZvcmtf
dHJhbXBvbGluZSgpIGF0IGZvcmtfdHJhbXBvbGluZSsweGUvZnJhbWUgMHhmZmZmZmUwMGUyMGM2
YWYwCi0tLSB0cmFwIDAsIHJpcCA9IDAsIHJzcCA9IDAsIHJicCA9IDAgLS0tCktEQjogZW50ZXI6
IHBhbmljClsgdGhyZWFkIHBpZCAyNyB0aWQgMTAwNTcwIF0KU3RvcHBlZCBhdCAgICAgIGtkYl9l
bnRlcisweDM3OiBtb3ZxICAgICQwLDB4MTA5MTU1NiglcmlwKQpkYj4gCg==

--Multipart=_Fri__4_Sep_2020_22_03_01_+0900_B7zojx0.zqE3+2O9--