From: Ze Dupsys
Date: Sun, 27 Mar 2022 20:42:23 +0300
Subject: Re: ZFS + FreeBSD XEN dom0 panic
To: Roger Pau Monné
Cc: freebsd-xen@freebsd.org, Brian Buhrow
List-Archive: https://lists.freebsd.org/archives/freebsd-xen

On Sun, Mar 27, 2022 at 12:13 PM Roger Pau Monné wrote:
>
> On Sun, Mar 27, 2022 at 12:38:00AM +0200, Ze Dupsys wrote:
> > On 2022.03.26. 16:38, Roger Pau Monné wrote:
> > > ..
> > > It's weird, because here you get a page fault, but there are also
> > > traces with:
> > > ..
> > > general protection fault while in kernel mode
> > > ..
> > > That show a general protection fault instead of a page fault.
> >
> > Yes indeed, i had not noticed this. Grepped across 34 stored panic log
> > files, i see that 28 are page fault, 4 are general protection fault, 2
> > other. I though maybe RAM size influences this, but page faults have 2G, 4G,
> > 6G, 8G Dom0, general protection faults have 2G, 4G, 8G.
> >
> > I have no idea what triggers what, since stress tests and command line args
> > are more or less the same. Builds are different with patches, some debug
> > info, etc. Almost all panic traces have "rman_is_region_manager" in mid,
> > actually looking all of them together seemed interesting. I'll attach unique
> > panic traces, since some included snprintf, kvprintf as well, maybe helpful.
> > Unfortunately i do not know which version and what patches were applied.
> >
> >
> > > I've also noticed it seems to always be 'devmatch' the process that
> > > triggers the panic.
> >
> > Yes, it seems to be the case most of the time. There are 3 cases when
> > process is "xbbd* taskq".
> > 2 cases with 2G RAM, 1 with 6G.
> >
> >
> > > I've been able to get a better trace with gdb and your debug symbols,
> > > and this is:
> > >
> > > (gdb) info line *0xffffffff80c6a2b2
> > > Line 1386 of "/usr/src/sys/kern/subr_bus.c" starts at address 0xffffffff80c6a2b2
> > > and ends at 0xffffffff80c6a2b6.
> > > (gdb) info line *0xffffffff80c86ed1
> > > Line 1052 of "/usr/src/sys/kern/subr_rman.c" starts at address 0xffffffff80c86ecc
> > > and ends at 0xffffffff80c86ed5.
> >
> > This is a nice find!
> >
> >
> > > I'm trying to figure out how the device could be removed or
> > > disconnected from the rman. I will try to create a patch to catch the
> > > device that leaves rman regions when destroyed/removed.
> >
> > Okay, i'll apply when it will be possible.
> >
> > I did run xen-debug on system with applied blkback.patch as you sent in next
> > message to this.
> >
> > System had panic with new trace:
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 2; apic id = 04
> > fault virtual address = 0xa4
> > fault code = supervisor read data, page not present
> > instruction pointer = 0x20:0xffffffff80c90ed0
> > stack pointer = 0x28:0xfffffe0051927ab0
> > frame pointer = 0x28:0xfffffe0051927ad0
> > code segment = base 0x0, limit 0xfffff, type 0x1b
> > = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags = interrupt enabled, resume, IOPL = 0
> > current process = 16 (xenwatch)
> > trap number = 12
> > panic: page fault
> > cpuid = 1
> > time = 1648331592
> > KDB: stack backtrace:
> > #0 0xffffffff80c7c275 at kdb_backtrace+0x65
> > #1 0xffffffff80c2e2d1 at vpanic+0x181
> > #2 0xffffffff80c2e143 at panic+0x43
> > #3 0xffffffff810c8b97 at trap+0xba7
> > #4 0xffffffff810c8bef at trap+0xbff
> > #5 0xffffffff810c8243 at trap+0x253
> > #6 0xffffffff810a0838 at calltrap+0x8
> > #7 0xffffffff80a98515 at xbd_instance_create+0x7895
> > #8 0xffffffff80a98462 at xbd_instance_create+0x77e2
> > #9 0xffffffff80a9619b at xbd_instance_create+0x551b
> > #10 0xffffffff80f95c54 at xenbusb_localend_changed+0x7c4
> > #11 0xffffffff80ab0ef4 at xs_unlock+0x704
> > #12 0xffffffff80beaede at fork_exit+0x7e
> > #13 0xffffffff810a18ae at fork_trampoline+0xe
> >
> > cat /tmp/panic.log| sed -Ee 's/^#[0-9]* //' -e 's/ .*//' | xargs addr2line -e /usr/lib/debug/boot/kernel/kernel.debug
> >
> > /usr/src/sys/kern/subr_kdb.c:443
> > /usr/src/sys/kern/kern_shutdown.c:0
> > /usr/src/sys/kern/kern_shutdown.c:844
> > /usr/src/sys/amd64/amd64/trap.c:944
> > /usr/src/sys/amd64/amd64/trap.c:0
> > /usr/src/sys/amd64/amd64/trap.c:0
> > /usr/src/sys/amd64/amd64/exception.S:292
> > /usr/src/sys/dev/xen/blkback/blkback.c:2789
> > /usr/src/sys/dev/xen/blkback/blkback.c:3431
> > /usr/src/sys/dev/xen/blkback/blkback.c:3912
> > /usr/src/sys/xen/xenbus/xenbusb_back.c:238
> > /usr/src/sys/dev/xen/xenstore/xenstore.c:1007
> > /usr/src/sys/kern/kern_fork.c:1099
> > /usr/src/sys/amd64/amd64/exception.S:1091
>
> Thanks, unfortunately that patch was incomplete. I have an updated
> version that I think is better now, and I've slightly tested it
> (creating and destroying a domain with it doesn't seem to crash).
> Appended patch at the end of the message.
> >
> > Full serial log in attachment.
>
> Thanks.
>
> > ==== COUNT: 1
> > Fatal trap 9: general protection fault while in kernel mode
> > cpuid = 0; apic id = 00
> > instruction pointer = 0x20:0xffffffff80c45892
> > stack pointer = 0x28:0xfffffe00d2d2b930
> > frame pointer = 0x28:0xfffffe00d2d2b930
> > code segment = base 0x0, limit 0xfffff, type 0x1b
> > = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags = interrupt enabled, resume, IOPL = 0
> > current process = 10984 (devmatch)
> > trap number = 9
> > panic: general protection fault
> > cpuid = 0
> > time = 1646305680
> > KDB: stack backtrace:
> > #0 0xffffffff80c57525 at kdb_backtrace+0x65
> > #1 0xffffffff80c09f01 at vpanic+0x181
> > #2 0xffffffff80c09d73 at panic+0x43
> > #3 0xffffffff8108b1a7 at trap+0xbc7
> > #4 0xffffffff8108a66e at trap+0x8e
> > #5 0xffffffff81061b18 at calltrap+0x8
> > #6 0xffffffff80c62011 at rman_is_region_manager+0x241
> > #7 0xffffffff80c1a051 at sbuf_new_for_sysctl+0x101
> > #8 0xffffffff80c1949c at kernel_sysctl+0x43c
> > #9 0xffffffff80c19b13 at userland_sysctl+0x173
> > #10 0xffffffff80c1995f at sys___sysctl+0x5f
> > #11 0xffffffff8108baac at amd64_syscall+0x10c
> > #12 0xffffffff8106243e at Xfast_syscall+0xfe
> > Uptime: 1h15m46s
> >
> >
> > ==== COUNT: 3
> > Fatal trap 9: general protection fault while in kernel mode
> > cpuid = 0; apic id = 00
> > instruction pointer = 0x20:0xffffffff80d0728f
> > stack pointer = 0x28:0xfffffe00a17ea790
> > frame pointer = 0x28:0xfffffe00a17ea790
> > code segment = base 0x0, limit 0xfffff, type 0x1b
> > = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags = interrupt enabled, resume, IOPL = 0
> > current process = 2785 (devmatch)
> > trap number = 9
> > panic: general protection fault
> > cpuid = 1
> > time = 1646419029
> > KDB: stack backtrace:
> > #0 0xffffffff80c57525 at kdb_backtrace+0x65
> > #1 0xffffffff80c09f01 at vpanic+0x181
> > #2 0xffffffff80c09d73 at panic+0x43
> > #3 0xffffffff8108b1a7 at trap+0xbc7
> > #4 0xffffffff8108a66e at trap+0x8e
> > #5 0xffffffff81061b18 at calltrap+0x8
> > #6 0xffffffff80c5da17 at kvprintf+0x1007
> > #7 0xffffffff80c5e719 at snprintf+0x59
> > #8 0xffffffff80c6204b at rman_is_region_manager+0x27b
> > #9 0xffffffff80c1a051 at sbuf_new_for_sysctl+0x101
> > #10 0xffffffff80c1949c at kernel_sysctl+0x43c
> > #11 0xffffffff80c19b13 at userland_sysctl+0x173
> > #12 0xffffffff80c1995f at sys___sysctl+0x5f
> > #13 0xffffffff8108baac at amd64_syscall+0x10c
> > #14 0xffffffff8106243e at Xfast_syscall+0xfe
>
> > Unique on "current process" and trace fingerprint #0-#*.
> >
> > ==== COUNT: 23
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 0; apic id = 00
> > fault virtual address = 0x22710028
> > fault code = supervisor read data, page not present
> > instruction pointer = 0x20:0xffffffff80c45892
> > stack pointer = 0x28:0xfffffe0096600930
> > frame pointer = 0x28:0xfffffe0096600930
> > code segment = base 0x0, limit 0xfffff, type 0x1b
> > = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags = interrupt enabled, resume, IOPL = 0
> > current process = 1496 (devmatch)
> > trap number = 12
> > panic: page fault
> > cpuid = 0
> > time = 1646123791
> > KDB: stack backtrace:
> > #0 0xffffffff80c57525 at kdb_backtrace+0x65
> > #1 0xffffffff80c09f01 at vpanic+0x181
> > #2 0xffffffff80c09d73 at panic+0x43
> > #3 0xffffffff8108b1a7 at trap+0xbc7
> > #4 0xffffffff8108b1ff at trap+0xc1f
> > #5 0xffffffff8108a85d at trap+0x27d
> > #6 0xffffffff81061b18 at calltrap+0x8
> > #7 0xffffffff80c62011 at rman_is_region_manager+0x241
> > #8 0xffffffff80c1a051 at sbuf_new_for_sysctl+0x101
> > #9 0xffffffff80c1949c at kernel_sysctl+0x43c
> > #10 0xffffffff80c19b13 at userland_sysctl+0x173
> > #11 0xffffffff80c1995f at sys___sysctl+0x5f
> > #12 0xffffffff8108baac at amd64_syscall+0x10c
> > #13 0xffffffff8106243e at Xfast_syscall+0xfe
> >
> >
> >
> > ==== COUNT: 2
> > current process = 20284 (devmatch)
> > trap number = 12
> > panic: page fault
> > cpuid = 3
> > time = 1647247618
> > KDB: stack backtrace:
> > #0 0xffffffff80c7c615 at kdb_backtrace+0x65
> > #1 0xffffffff80c2e621 at vpanic+0x181
> > #2 0xffffffff80c2e493 at panic+0x43
> > #3 0xffffffff810c8b97 at trap+0xba7
> > #4 0xffffffff810c8bef at trap+0xbff
> > #5 0xffffffff810c8243 at trap+0x253
> > #6 0xffffffff810a09d8 at calltrap+0x8
> > #7 0xffffffff80c82c77 at kvprintf+0x1007
> > #8 0xffffffff80c83a09 at snprintf+0x59
> > #9 0xffffffff80c8729b at rman_is_region_manager+0x27b
> > #10 0xffffffff80c3ee81 at sbuf_new_for_sysctl+0x101
> > #11 0xffffffff80c3e2cc at kernel_sysctl+0x3ec
> > #12 0xffffffff80c3e943 at userland_sysctl+0x173
> > #13 0xffffffff80c3e78f at sys___sysctl+0x5f
> > #14 0xffffffff810c949c at amd64_syscall+0x10c
> > #15 0xffffffff810a12eb at Xfast_syscall+0xfb
>
> Thanks, those all seem to be related to a device being removed without
> cleaning it's rman regions properly. So far I've spotted an issue in
> blkback in this regard, but I wouldn't discard other issues in either
> blkback or netback. Let's see if the updated blkback patch makes a
> difference now.
>
> > ==== COUNT: 2
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 1; apic id = 02
> > fault virtual address = 0x68
> > fault code = supervisor read data, page not present
> > instruction pointer = 0x20:0xffffffff824a599d
> > stack pointer = 0x28:0xfffffe00b1e27910
> > frame pointer = 0x28:0xfffffe00b1e279b0
> > code segment = base 0x0, limit 0xfffff, type 0x1b
> > = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags = interrupt enabled, resume, IOPL = 0
> > current process = 0 (xbbd7 taskq)
> > trap number = 12
> > panic: page fault
> > cpuid = 1
> > time = 1646122723
> > KDB: stack backtrace:
> > #0 0xffffffff80c57525 at kdb_backtrace+0x65
> > #1 0xffffffff80c09f01 at vpanic+0x181
> > #2 0xffffffff80c09d73 at panic+0x43
> > #3 0xffffffff8108b1a7 at trap+0xbc7
> > #4 0xffffffff8108b1ff at trap+0xc1f
> > #5 0xffffffff8108a85d at trap+0x27d
> > #6 0xffffffff81061b18 at calltrap+0x8
> > #7 0xffffffff8248935a at dmu_read+0x2a
> > #8 0xffffffff82456a3a at zvol_geom_bio_strategy+0x2aa
> > #9 0xffffffff80a7f214 at xbd_instance_create+0xa394
> > #10 0xffffffff80a7b1ea at xbd_instance_create+0x636a
> > #11 0xffffffff80c6b1c1 at taskqueue_run+0x2a1
> > #12 0xffffffff80c6c4dc at taskqueue_thread_loop+0xac
> > #13 0xffffffff80bc7e3e at fork_exit+0x7e
> > #14 0xffffffff81062b9e at fork_trampoline+0xe
> >
> >
> > ==== COUNT: 1
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 1; apic id = 02
> > fault virtual address = 0x148
> > fault code = supervisor read data, page not present
> > instruction pointer = 0x20:0xffffffff8248cef4
> > stack pointer = 0x28:0xfffffe009941d9a0
> > frame pointer = 0x28:0xfffffe009941d9a0
> > code segment = base 0x0, limit 0xfffff, type 0x1b
> > = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags = interrupt enabled, resume, IOPL = 0
> > current process = 0 (xbbd1 taskq)
> > trap number = 12
> > panic: page fault
> > cpuid = 1
> > time = 1646129773
> > KDB: stack backtrace:
> > #0 0xffffffff80c57525 at kdb_backtrace+0x65
> > #1 0xffffffff80c09f01 at vpanic+0x181
> > #2 0xffffffff80c09d73 at panic+0x43
> > #3 0xffffffff8108b1a7 at trap+0xbc7
> > #4 0xffffffff8108b1ff at trap+0xc1f
> > #5 0xffffffff8108a85d at trap+0x27d
> > #6 0xffffffff81061b18 at calltrap+0x8
> > #7 0xffffffff825cb76e at zil_open+0xe
> > #8 0xffffffff82456d02 at zvol_ensure_zilog+0xb2
> > #9 0xffffffff82456818 at zvol_geom_bio_strategy+0x88
> > #10 0xffffffff80a7f214 at xbd_instance_create+0xa394
> > #11 0xffffffff80a7b1ea at xbd_instance_create+0x636a
> > #12 0xffffffff80c6b1c1 at taskqueue_run+0x2a1
> > #13 0xffffffff80c6c4dc at taskqueue_thread_loop+0xac
> > #14 0xffffffff80bc7e3e at fork_exit+0x7e
> > #15 0xffffffff81062b9e at fork_trampoline+0xe
>
> Hm, those last ones are in ZFS code, can you try to get the line
> numbers for those?
>
> Maybe it's blkback providing bad data to the disk open functions.
>
> Since you are doing so much testing, it might make sense for you to
> use a debug FreeBSD kernel rather than a production one (one with
> WITNESS and INVARIANTS enabled).
>
> Thanks, Roger.
>
> ---8<---
> diff --git a/sys/dev/xen/blkback/blkback.c b/sys/dev/xen/blkback/blkback.c
> index 33414295bf5e..4007a93a54c7 100644
> --- a/sys/dev/xen/blkback/blkback.c
> +++ b/sys/dev/xen/blkback/blkback.c
> @@ -2774,19 +2774,12 @@ xbb_free_communication_mem(struct xbb_softc *xbb)
>  static int
>  xbb_disconnect(struct xbb_softc *xbb)
>  {
> -	struct gnttab_unmap_grant_ref ops[XBB_MAX_RING_PAGES];
> -	struct gnttab_unmap_grant_ref *op;
> -	u_int ring_idx;
> -	int error;
> -
>  	DPRINTF("\n");
>  
> -	if ((xbb->flags & XBBF_RING_CONNECTED) == 0)
> -		return (0);
> -
>  	mtx_unlock(&xbb->lock);
>  	xen_intr_unbind(&xbb->xen_intr_handle);
> -	taskqueue_drain(xbb->io_taskqueue, &xbb->io_task);
> +	if (xbb->io_taskqueue != NULL)
> +		taskqueue_drain(xbb->io_taskqueue, &xbb->io_task);
>  	mtx_lock(&xbb->lock);
>  
>  	/*
> @@ -2796,19 +2789,28 @@ xbb_disconnect(struct xbb_softc *xbb)
>  	if (xbb->active_request_count != 0)
>  		return (EAGAIN);
>  
> -	for (ring_idx = 0, op = ops;
> -	     ring_idx < xbb->ring_config.ring_pages;
> -	     ring_idx++, op++) {
> -		op->host_addr = xbb->ring_config.gnt_addr
> -		    + (ring_idx * PAGE_SIZE);
> -		op->dev_bus_addr = xbb->ring_config.bus_addr[ring_idx];
> -		op->handle = xbb->ring_config.handle[ring_idx];
> -	}
> +	if (xbb->flags & XBBF_RING_CONNECTED) {
> +		struct gnttab_unmap_grant_ref ops[XBB_MAX_RING_PAGES];
> +		struct gnttab_unmap_grant_ref *op;
> +		unsigned int ring_idx;
> +		int error;
> +
> +		for (ring_idx = 0, op = ops;
> +		     ring_idx < xbb->ring_config.ring_pages;
> +		     ring_idx++, op++) {
> +			op->host_addr = xbb->ring_config.gnt_addr
> +			    + (ring_idx * PAGE_SIZE);
> +			op->dev_bus_addr = xbb->ring_config.bus_addr[ring_idx];
> +			op->handle = xbb->ring_config.handle[ring_idx];
> +		}
>  
> -	error = HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, ops,
> -	    xbb->ring_config.ring_pages);
> -	if (error != 0)
> -		panic("Grant table op failed (%d)", error);
> +		error = HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, ops,
> +		    xbb->ring_config.ring_pages);
> +		if (error != 0)
> +			panic("Grant table op failed (%d)", error);
> +
> +		xbb->flags &= ~XBBF_RING_CONNECTED;
> +	}
>  
>  	xbb_free_communication_mem(xbb);
>  
> @@ -2839,7 +2841,6 @@ xbb_disconnect(struct xbb_softc *xbb)
>  		xbb->request_lists = NULL;
>  	}
>  
> -	xbb->flags &= ~XBBF_RING_CONNECTED;
>  	return (0);
>  }

Hello,

I applied the given patch. I did not have enough time to test it
thoroughly, but the system ran for 3 hours without a panic, whereas
previously it would crash after around 1.5 hours under similar settings.
I will not be able to test again until Thursday.

About those ZFS panic traces: I will try to get line numbers, but the
problem is that I do not have /usr/lib/debug/boot/kernel/kernel.debug
for FreeBSD 13.0-RELEASE-p7. I tried to set up 13.0-RELEASE in
VirtualBox on a laptop, but freebsd-update now updates to -p10 rather
than -p7, and I did not find a way to get -p7. It seems to be an
unsupported feature.

What do you mean by using a debug kernel with WITNESS and INVARIANTS?
Should I build a custom kernel from GENERIC plus those two options, or
is there a common kernel build config used by devs that already
includes them?

Thanks.
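
For reference, the grouping of stored panic logs mentioned earlier in this thread (unique on "current process" and on the trace fingerprint #0-#*) could be reproduced along the following lines. This is only a minimal sketch in Python, not the script that produced the COUNT values quoted above. The regular expressions, the helper names fingerprint and count_unique, and the choice to replace digits in the process name with "*" (so that "xbbd7 taskq" and "xbbd1 taskq" compare equal) are all assumptions made for illustration.

```python
import re
from collections import Counter

# Matches KDB backtrace frames such as
# "#6 0xffffffff80c62011 at rman_is_region_manager+0x241"
# and captures only the function name (hypothetical pattern).
FRAME_RE = re.compile(r"#\d+ 0x[0-9a-f]+ at (\w+)\+0x[0-9a-f]+")

# Matches "current process = 10984 (devmatch)" and captures the name.
PROC_RE = re.compile(r"current process\s*=\s*\d+ \(([^)]*)\)")


def fingerprint(log_text):
    """Return (process name, tuple of frame function names) for one log."""
    proc = PROC_RE.search(log_text)
    name = proc.group(1) if proc else "unknown"
    # Assumption: instance numbers are noise, so "xbbd7" becomes "xbbd*".
    name = re.sub(r"\d+", "*", name)
    return name, tuple(FRAME_RE.findall(log_text))


def count_unique(logs):
    """Count how many logs share each fingerprint."""
    return Counter(fingerprint(text) for text in logs)


# Shortened excerpt of one of the traces quoted in this thread.
SAMPLE = """\
current process = 10984 (devmatch)
KDB: stack backtrace:
#0 0xffffffff80c57525 at kdb_backtrace+0x65
#1 0xffffffff80c09f01 at vpanic+0x181
#6 0xffffffff80c62011 at rman_is_region_manager+0x241
"""
```

Calling count_unique on the text of every stored log file returns a Counter keyed by fingerprint. Offsets and addresses are deliberately left out of the key, so the same call path on differently patched kernels still lands in one group.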