Date:      Thu, 14 Apr 2022 09:49:27 +0200
From:      Roger Pau Monné <roger.pau@citrix.com>
To:        Ze Dupsys <zedupsys@gmail.com>
Cc:        <freebsd-xen@freebsd.org>, <buhrow@nfbcal.org>
Subject:   Re: ZFS + FreeBSD XEN dom0 panic
Message-ID:  <YlfSB3mCVZiy5dpI@Air-de-Roger>
In-Reply-To: <2dbf24f9-2bc8-a8d5-e31f-90ec2c4b64c5@gmail.com>
References:  <Yj16hdrxawD61mAL@Air-de-Roger> <639f7ce0-8a07-884c-c1cf-8257b9f3d9e8@gmail.com> <Yj7YrW9CG2aXT+iC@Air-de-Roger> <4da2302b-0745-ea1d-c868-5a8a5fc66b18@gmail.com> <Yj8lZWqeHbD+kfOQ@Air-de-Roger> <48b74c39-abb3-0a3e-91a8-b5ab1e1223ce@gmail.com> <YkAqxjiMM1M1QdgR@Air-de-Roger> <22643831-70d3-5a3e-f973-fb80957e80dc@gmail.com> <Ykxev3fangqRGQcn@Air-de-Roger> <2dbf24f9-2bc8-a8d5-e31f-90ec2c4b64c5@gmail.com>

On Thu, Apr 14, 2022 at 10:20:25AM +0300, Ze Dupsys wrote:
> On 2022.04.05. 18:22, Roger Pau Monné wrote:
> > I've pushed the changes to:
> > 
> > http://xenbits.xen.org/gitweb/?p=people/royger/freebsd.git;a=shortlog;h=refs/heads/for-leak
> > 
> > (This is on top of main branch).
> > 
> > I'm also attaching the two patches on this email.
> > 
> > Let me know if those make a difference to stabilize the system.
> > 
> 
> I do not know whether I should start a new thread, but I have captured
> another panic with a new trace. This is on a different machine with a
> similar setup, RELEASE-13.0 + the 2 mentioned patches.
> 
> I do not know how to reliably reproduce it, nor the cause. But I suspect
> that this happens when doing some of these steps: create a new ZVOL, turn
> one VM off, add the new HDD/ZVOL path to the VM in its cfg file, start the
> VM back up, inside this VM put some load on the newly added HDD (install
> stuff, extract data, etc.) + one of: shut all VMs down one by one, then do
> init 0 or 6, or create another new VM. On this machine I can't experiment
> too much, and no serial output is available either.

So you haven't seen this panic with the 3rd patch applied?

I guess that's also possible because when testing the 3rd patch you
are using a HEAD kernel rather than stable/13, so the ZFS code might
have changed.

> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 3; apic id = 06
> fault virtual address   = 0x68
> fault code              = supervisor read data, page not present
> instruction pointer     = 0x20:0xffffffff821dc99d
> stack pointer           = 0x28:0xfffffe00c6b497d0
> frame pointer           = 0x28:0xfffffe00c6b49870
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 0 (xbbd26 taskq)
> trap number             = 12
> panic: page fault
> cpuid = 3
> time = 1649915274
> KDB: stack backtrace:
> #0 0xffffffff80c57385 at kdb_backtrace+0x65
> #1 0xffffffff80c09d61 at vpanic+0x181
> #2 0xffffffff80c09bd3 at panic+0x43
> #3 0xffffffff8108b187 at trap+0xbc7
> #4 0xffffffff8108b1df at trap+0xc1f
> #5 0xffffffff8108a83d at trap+0x27d
> #6 0xffffffff81061818 at calltrap+0x8
> #7 0xffffffff821c035a at dmu_read+0x2a
> #8 0xffffffff8218da3a at zvol_geom_bio_strategy+0x2aa
> #9 0xffffffff80a7f074 at xbd_instance_create+0xa3d4
> #10 0xffffffff80a7b00a at xbd_instance_create+0x636a
> #11 0xffffffff80c6b021 at taskqueue_run+0x2a1
> #12 0xffffffff80c6c33c at taskqueue_thread_loop+0xac
> #13 0xffffffff80bc7c9e at fork_exit+0x7e
> #14 0xffffffff8106289e at fork_trampoline+0xe
> Uptime: 24m0s
> (ada0:ahcich0:0:0:0): spin-down
> (ada1:ahcich1:0:0:0): spin-down
> (ada2:ahcich2:0:0:0): spin-down
> Dumping 2922 out of 6104
> 
> 
> 
> cat panic.log| sed -Ee 's/^#[0-9]* //' -e 's/ .*//' | xargs addr2line -e
> /usr/lib/debug/boot/kernel/kernel.debug
> /usr/src/sys/kern/subr_bus.c:2410
> /usr/src/sys/kern/kern_racct.c:632
> /usr/src/sys/kern/kern_racct.c:617
> /usr/src/sys/dev/isci/isci_sysctl.c:92
> /usr/src/sys/dev/isci/isci_sysctl.c:0
> /usr/src/sys/dev/isci/isci_oem_parameters.c:130
> /usr/src/sys/dev/hyperv/input/hv_kbd.c:540
> ??:0
> ??:0
> /usr/src/sys/dev/xen/blkback/blkback.c:3083
> /usr/src/sys/xen/xenbus/xenbusvar.h:96
> /usr/src/sys/kern/subr_kobj.c:145
> /usr/src/sys/kern/subr_module.c:255
> /usr/src/sys/kern/kern_event.c:0
> /usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:1158
> 
> 
> Full output of (kgdb) backtrace
> #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
> #1  doadump (textdump=<optimized out>) at
> /usr/src/sys/kern/kern_shutdown.c:399
> #2  0xffffffff80c09956 in kern_reboot (howto=260) at
> /usr/src/sys/kern/kern_shutdown.c:486
> #3  0xffffffff80c09dd0 in vpanic (fmt=<optimized out>, ap=<optimized out>)
> at /usr/src/sys/kern/kern_shutdown.c:919
> #4  0xffffffff80c09bd3 in panic (fmt=<unavailable>) at
> /usr/src/sys/kern/kern_shutdown.c:843
> #5  0xffffffff8108b187 in trap_fatal (frame=0xfffffe00c6b49710, eva=104) at
> /usr/src/sys/amd64/amd64/trap.c:915
> #6  0xffffffff8108b1df in trap_pfault (frame=frame@entry=0xfffffe00c6b49710,
> usermode=false, signo=<optimized out>, signo@entry=0x0, ucode=<optimized
> out>, ucode@entry=0x0) at /usr/src/sys/amd64/amd64/trap.c:732
> #7  0xffffffff8108a83d in trap (frame=0xfffffe00c6b49710) at
> /usr/src/sys/amd64/amd64/trap.c:398
> #8  <signal handler called>
> #9  0xffffffff821dc99d in dbuf_write_children_ready (zio=<optimized out>,
> buf=<optimized out>, vdb=0x0) at
> /usr/src/sys/contrib/openzfs/module/zfs/dbuf.c:4642

If this trace is correct, the error comes from passing vdb == NULL to
dbuf_write_children_ready():

https://cgit.freebsd.org/src/tree/sys/contrib/openzfs/module/zfs/dbuf.c?h=stable/13#n4551

The function will unconditionally dereference (v)db, so passing NULL
will trigger a page fault.

I have no idea, however, how you can get to this state.  It might be
worth posting the trace to freebsd-fs@freebsd.org in order to get some
feedback from the ZFS people.  It's possible the issue is with
blkback, but I would benefit from some help figuring out what's wrong
with the data I'm providing to d_strategy.

Please Cc me on the email if you send to freebsd-fs@.

Thanks, Roger.


