Date:      Fri, 4 Sep 2020 22:03:01 +0900
From:      Tomoaki AOKI <junchoon@dec.sakura.ne.jp>
To:        freebsd-current@freebsd.org
Cc:        mmacy@FreeBSD.org
Subject:   Fatal trap 18 on boot after OpenZFS import
Message-ID:  <20200904220301.7fac6b4008f1bc7ad8d803c9@dec.sakura.ne.jp>

This is a multi-part message in MIME format.

--Multipart=_Fri__4_Sep_2020_22_03_01_+0900_B7zojx0.zqE3+2O9
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

Hi.

I'm encountering a boot failure with fatal trap 18, which
happens (probably) just before init() starts, possibly during the
root remount by the kernel or the zpool import by the rc.d script.
The last revision I tried is r365316 (r364788 is the last one I
tried with a clean rebuild).

The last healthy revision is r364744, just before the actual switch
to OpenZFS. The machine is amd64, a ThinkPad P52 (Core i7-8750H)
with a discrete NVIDIA GPU.

r364751 with the diffs of r364777 and r364788 applied (needed to build
successfully without changes unrelated to OpenZFS) also fails.

Any suggestions and fixes are appreciated.


The trap screen looks something like the one below (text attached).
I typed it up from a relatively clear photo, so there could be some typos.

This is shown just after the usual kernel startup output.
boot1.efi (as EFI/bootx64.efi on the ESP) starts /boot/loader.efi
properly, and loader.efi seems to boot the kernel properly.

Since not even the single-user shell prompt appears, loader.efi
is still the one from r364744. But the loaders have worked even when
I followed this irregular update procedure:

  1)Update src tree
  2)Clean obj tree
  3)buildworld
  4)etcupdate -p
  5)buildkernel
  6)installkernel
  7)shutdown to single user WITHOUT reboot  <- Irregular!
  8)installworld
  9)etcupdate
 10)rebuild src/sys-dependent ports (kmods, nvidia-driver, ...)
 11)reboot

loader.efi appears to do its job, and the panic occurs after kernel
startup ends. Needless to say, rolling back to the r364744 state from
stable/12 on nvd0 fixes the issue.

Regards.

=====

Fatal trap 18: integer divide fault while in kernel mode
cpuid = 2; apic id = 02
instruction pointer     = 0x20:0xffffffff82bfa320
stack pointer           = 0x28:0xfffffe00e20c6900
frame pointer           = 0x28:0xfffffe00e20c6960
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 27 (vdev_open)
trap number             = 18
panic: integer divide fault
cpuid = 2
time = 16
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00e20c6610
vpanic() at vpanic+0x182/frame fffffe00e20c6660
panic() at panic+0x43/frame fffffe00e20c66c0
trap_fatal() at trap_fatal+0x387/frame fffffe00e20c6720
trap() at trap+0x8e/frame fffffe00e20c6830
calltrap() at calltrap+0x8/frame fffffe00e20c6830
--- trap 0x12, rip = 0xffffffff82bfa320, rsp = 0xfffffe00e20c6900, rbp = 0xfffffe00e20c6960 ---
zio_wait() at zio_wait+0x60/frame 0xfffffe00e20c6960
vdev_open() at vdev_open+0x74d/frame 0xfffffe00e20c69c0
vdev_open_child() at vdev_open_child+0x1e/frame 0xfffffe00e20c69e0
taskq_run() at taskq_run+0x1f/frame 0xfffffe00e20c6a00
taskqueue_run_locked() at taskqueue_run_locked+0x181/frame 0xfffffe00e20c6a80
taskqueue_thread_loop() at taskqueue_thread_loop+0x118/frame 0xfffffe00e20c6ab0
fork_exit() at fork_exit+0x7d/frame 0xfffffe00e20c6af0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00e20c6af0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 27 tid 100570 ]
Stopped at      kdb_enter+0x37: movq    $0,0x1091556(%rip)
db> 

=====
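
For anyone reading the backtrace above: the frame printed right after
the "--- trap 0x12 ..." marker is the code that was interrupted, so
the divide fault is reported at zio_wait+0x60 (0x12 is 18 decimal).
Below is a minimal sketch of pulling that out of a saved dump with
Python. The helper name faulting_frame and the two regular expressions
are my own assumptions about the line layout, based only on the sample
lines copied from the dump, and are not part of any FreeBSD tool.

```python
import re

# Sample lines copied from the dump above, one frame per line.
BACKTRACE = """\
trap() at trap+0x8e/frame fffffe00e20c6830
calltrap() at calltrap+0x8/frame fffffe00e20c6830
--- trap 0x12, rip = 0xffffffff82bfa320, rsp = 0xfffffe00e20c6900, rbp = 0xfffffe00e20c6960 ---
zio_wait() at zio_wait+0x60/frame 0xfffffe00e20c6960
vdev_open() at vdev_open+0x74d/frame 0xfffffe00e20c69c0
vdev_open_child() at vdev_open_child+0x1e/frame 0xfffffe00e20c69e0
"""

# "func() at func+0xNN/frame ADDR"; the 0x prefix on ADDR is optional
# because some frames in the typed-up dump lack it.
FRAME_RE = re.compile(r"^(\w+)\(\) at \1\+(0x[0-9a-f]+)/frame (?:0x)?[0-9a-f]+$")
# "--- trap N, rip = ADDR, ..." marker separating trap handler frames
# from the interrupted code.
TRAP_RE = re.compile(r"^--- trap (0x[0-9a-f]+|\d+), rip = (0x[0-9a-f]+|\d+),")


def faulting_frame(text):
    """Return (trap_number, rip, function, offset) for the first trap
    marker in text, or None if no marker with a following frame exists."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        marker = TRAP_RE.match(line)
        if marker is None:
            continue
        trap_no = int(marker.group(1), 0)
        rip = int(marker.group(2), 0)
        # The first frame line after the marker is the interrupted code.
        for following in lines[i + 1:]:
            frame = FRAME_RE.match(following)
            if frame is not None:
                return trap_no, rip, frame.group(1), int(frame.group(2), 16)
        return None
    return None


if __name__ == "__main__":
    trap_no, rip, func, off = faulting_frame(BACKTRACE)
    print(f"trap {trap_no} at {func}+{off:#x} (rip {rip:#x})")
```

The function+offset pair is what one would normally hand to a
debugger to find the source line; which module the address belongs to
is not something this sketch tries to determine.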

Additional info:
 *A clean build with CPUTYPE removed from the command line and
  make.conf (so it should be equivalent to nocona) didn't help.

 *A clean build with the WITH_KERNEL_RETPOLINE and WITH_RETPOLINE
  lines commented out in src.conf didn't help.

 *The combination of the above two didn't help either (at r364788).

 *There are two root pools on different physical drives:
  stable/12 on nvd0 (primary) and head on ada0 (secondary).

 *The kernel is based on GENERIC-NODEBUG (with options
  CAM_IOSCHED_DYNAMIC added).

-- 
Tomoaki AOKI    <junchoon@dec.sakura.ne.jp>

--Multipart=_Fri__4_Sep_2020_22_03_01_+0900_B7zojx0.zqE3+2O9
Content-Type: application/octet-stream;
 name="Fatal_trap_18_on_head_after_r364744.log"
Content-Disposition: attachment;
 filename="Fatal_trap_18_on_head_after_r364744.log"
Content-Transfer-Encoding: base64


--Multipart=_Fri__4_Sep_2020_22_03_01_+0900_B7zojx0.zqE3+2O9--


