Date: Thu, 14 Apr 2022 13:06:50 +0300
From: Ze Dupsys <zedupsys@gmail.com>
To: freebsd-fs@freebsd.org
Cc: roger.pau@citrix.com
Subject: ZFS, kernel panic due to unconditional NULL de-reference for (v)db
Message-ID: <741ca49e-9ebd-be92-1389-4ab2227e6cb7@gmail.com>

Hello everyone,

In RELEASE-13.0 source, /usr/src/sys/contrib/openzfs/module/zfs/dbuf.c:4456,

    dbuf_write_children_ready(zio_t *zio, arc_buf_t *buf, void *vdb)
    ..
    dmu_buf_impl_t *db = vdb;
    ..
    ASSERT3U(db->db_level, >, 0);
    ..
    for (i = 0, bp = db->db.db_data; i < 1ULL << epbs; i++, bp++) {

If vdb == NULL, this function panics. And this is what the kgdb backtrace shows:

    #9 0xffffffff821dc99d in dbuf_write_children_ready (zio=<optimized out>, buf=<optimized out>, vdb=0x0) at /usr/src/sys/contrib/openzfs/module/zfs/dbuf.c:4642

We do not know the internals of ZFS, and the kgdb backtrace is somewhat imprecise, so the question is: in which scenarios could a call to dbuf_write_children_ready have the vdb pointer set to NULL? Any hints or ideas?

FWIW, more often than not this happens when the machine is powered down, so maybe some data structure is "half-freed". The call into ZFS code happens through (*dev_data->csw->d_strategy)(bios[bio_idx]). At the moment we suspect that there might be something wrong with the data provided to d_strategy, but we have no clue what could cause vdb to be NULL in the code above.

Panic info below.

Thanks.
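To make the "unconditional NULL de-reference" in the subject line concrete, here is a minimal user-space sketch of the pattern: a callback that receives its private data as a void pointer and reads a field through it without checking. The names fake_db, callback_unguarded and callback_guarded are hypothetical and are not OpenZFS code. The guarded variant is shown only for contrast; it is not a proposed fix, since a check like this would turn the crash into an error return without explaining why the pointer is NULL in the first place.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object; NOT the real dmu_buf_impl_t. */
struct fake_db {
	int level;
};

/*
 * Unguarded shape, mirroring the excerpt: the private argument is cast
 * and dereferenced unconditionally, so vdb == NULL faults here.
 */
int
callback_unguarded(void *vdb)
{
	struct fake_db *db = vdb;

	return (db->level);
}

/*
 * Guarded shape (illustration only): reject a NULL private argument
 * before touching any field, and report the level through an out
 * parameter so the return value can carry an errno-style status.
 */
int
callback_guarded(void *vdb, int *level_out)
{
	struct fake_db *db = vdb;

	if (db == NULL || level_out == NULL)
		return (EINVAL);
	*level_out = db->level;
	return (0);
}
```

With a valid object, callback_guarded() returns 0 and stores the level; with NULL it returns EINVAL instead of faulting.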

Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 06
fault virtual address   = 0x68
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff821dc99d
stack pointer           = 0x28:0xfffffe00c6b497d0
frame pointer           = 0x28:0xfffffe00c6b49870
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (xbbd26 taskq)
trap number             = 12
panic: page fault
cpuid = 3
time = 1649915274
KDB: stack backtrace:
#0 0xffffffff80c57385 at kdb_backtrace+0x65
#1 0xffffffff80c09d61 at vpanic+0x181
#2 0xffffffff80c09bd3 at panic+0x43
#3 0xffffffff8108b187 at trap+0xbc7
#4 0xffffffff8108b1df at trap+0xc1f
#5 0xffffffff8108a83d at trap+0x27d
#6 0xffffffff81061818 at calltrap+0x8
#7 0xffffffff821c035a at dmu_read+0x2a
#8 0xffffffff8218da3a at zvol_geom_bio_strategy+0x2aa
#9 0xffffffff80a7f074 at xbd_instance_create+0xa3d4
#10 0xffffffff80a7b00a at xbd_instance_create+0x636a
#11 0xffffffff80c6b021 at taskqueue_run+0x2a1
#12 0xffffffff80c6c33c at taskqueue_thread_loop+0xac
#13 0xffffffff80bc7c9e at fork_exit+0x7e
#14 0xffffffff8106289e at fork_trampoline+0xe

cat panic.log| sed -Ee 's/^#[0-9]* //' -e 's/ .*//' | xargs addr2line -e /usr/lib/debug/boot/kernel/kernel.debug
/usr/src/sys/kern/subr_bus.c:2410
/usr/src/sys/kern/kern_racct.c:632
/usr/src/sys/kern/kern_racct.c:617
/usr/src/sys/dev/isci/isci_sysctl.c:92
/usr/src/sys/dev/isci/isci_sysctl.c:0
/usr/src/sys/dev/isci/isci_oem_parameters.c:130
/usr/src/sys/dev/hyperv/input/hv_kbd.c:540
??:0
??:0
/usr/src/sys/dev/xen/blkback/blkback.c:3083
/usr/src/sys/xen/xenbus/xenbusvar.h:96
/usr/src/sys/kern/subr_kobj.c:145
/usr/src/sys/kern/subr_module.c:255
/usr/src/sys/kern/kern_event.c:0
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:1158

(kgdb) backtrace
#0 __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1 doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:399
#2 0xffffffff80c09956 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:486
#3 0xffffffff80c09dd0 in vpanic (fmt=<optimized out>, ap=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:919
#4 0xffffffff80c09bd3 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:843
#5 0xffffffff8108b187 in trap_fatal (frame=0xfffffe00c6b49710, eva=104) at /usr/src/sys/amd64/amd64/trap.c:915
#6 0xffffffff8108b1df in trap_pfault (frame=frame@entry=0xfffffe00c6b49710, usermode=false, signo=<optimized out>, signo@entry=0x0, ucode=<optimized out>, ucode@entry=0x0) at /usr/src/sys/amd64/amd64/trap.c:732
#7 0xffffffff8108a83d in trap (frame=0xfffffe00c6b49710) at /usr/src/sys/amd64/amd64/trap.c:398
#8 <signal handler called>
#9 0xffffffff821dc99d in dbuf_write_children_ready (zio=<optimized out>, buf=<optimized out>, vdb=0x0) at /usr/src/sys/contrib/openzfs/module/zfs/dbuf.c:4642
#10 0xffffffff821c035a in arc_evict_impl (state=<optimized out>, spa=<optimized out>, bytes=<optimized out>, type=<optimized out>) at /usr/src/sys/contrib/openzfs/module/zfs/arc.c:4377
#11 arc_evict_meta_balanced (meta_used=<optimized out>) at /usr/src/sys/contrib/openzfs/module/zfs/arc.c:4443
#12 arc_evict_meta (meta_used=<optimized out>) at /usr/src/sys/contrib/openzfs/module/zfs/arc.c:4533
#13 arc_evict () at /usr/src/sys/contrib/openzfs/module/zfs/arc.c:4627
#14 arc_evict_cb (arg=<optimized out>, zthr=<optimized out>) at /usr/src/sys/contrib/openzfs/module/zfs/arc.c:4938
#15 0xffffffff8218da3a in zfs_deleteextattr (ap=0x1430f6000) at /usr/src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:5592
#16 0xffffffff80a7f074 in xbb_dispatch_dev (xbb=0xfffff8011a6ff800, reqlist=<optimized out>, operation=<optimized out>, bio_flags=0) at /usr/src/sys/dev/xen/blkback/blkback.c:2207
#17 0xffffffff80a7b00a in xbb_dispatch_io (xbb=0xfffff8011a6ff800, reqlist=<optimized out>) at /usr/src/sys/dev/xen/blkback/blkback.c:1767
#18 xbb_run_queue (context=0xfffff8011a6ff800, pending=<optimized out>) at /usr/src/sys/dev/xen/blkback/blkback.c:1987
#19 0xffffffff80c6b021 in taskqueue_run_locked (queue=queue@entry=0xfffff8011a9f1e00) at /usr/src/sys/kern/subr_taskqueue.c:476
#20 0xffffffff80c6c33c in taskqueue_thread_loop (arg=<optimized out>, arg@entry=0xfffff8011a6ff800) at /usr/src/sys/kern/subr_taskqueue.c:793
#21 0xffffffff80bc7c9e in fork_exit (callout=0xffffffff80c6c290 <taskqueue_thread_loop>, arg=0xfffff8011a6ff800, frame=0xfffffe00c6b49c00) at /usr/src/sys/kern/kern_fork.c:1069
#22 <signal handler called>
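
The trap output reports fault virtual address = 0x68 (eva=104 in frame #5), while kgdb reports vdb=0x0. One way to read these two numbers together: when a field is read through a NULL struct pointer, the address that faults equals the field's offset inside the struct. The sketch below only illustrates that arithmetic. The struct fake_dbuf, its padding and the helper fault_address() are hypothetical; the layout is made up so that one field sits at 0x68 and does not reflect the real dmu_buf_impl_t, whose offsets would have to be checked in kgdb against matching debug symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for illustration only; NOT the real dmu_buf_impl_t. */
struct fake_dbuf {
	char    pad[0x68];	/* stand-in for whatever precedes the field */
	uint8_t level;		/* made-up field placed at offset 0x68 */
};

/* Compile-time check that the made-up field really sits at 0x68. */
static_assert(offsetof(struct fake_dbuf, level) == 0x68,
    "level must be at offset 0x68 in this illustration");

/*
 * Address the CPU would touch when reading `level` through a pointer
 * whose numeric value is `base`. No memory is accessed here; this is
 * pure address arithmetic.
 */
uintptr_t
fault_address(uintptr_t base)
{
	return (base + offsetof(struct fake_dbuf, level));
}
```

Under these assumptions, fault_address(0) evaluates to 0x68, the same value as the reported fault virtual address, which is consistent with a small-offset field read through a NULL base pointer.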
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?741ca49e-9ebd-be92-1389-4ab2227e6cb7>