Date: Thu, 3 Mar 2022 11:39:43 +0100
From: Roger Pau Monné <roger.pau@citrix.com>
To: Ze Dupsys <zedupsys@gmail.com>
Cc: <freebsd-xen@freebsd.org>, <buhrow@nfbcal.org>
Subject: Re: ZFS + FreeBSD XEN dom0 panic
Message-ID: <YiCa70%2BHQScsoaKX@Air-de-Roger>
In-Reply-To: <CAOEWpzfsajhbvXfAw5-F1p83jjmSggobANBEyeYFAfiumAWRCA@mail.gmail.com>
References: <CAOEWpzc2WVViMJHrrtuU-G_7yck4eehm6b=JQPSZU1MH-bzmiw@mail.gmail.com> <202203011540.221FeR4f028103@nfbcal.org> <CAOEWpzdC41ithfd7R_qa66%2Bsh_UXeku7OcVC_b%2BXUaLr_9SSTA@mail.gmail.com> <Yh93uLIBqk5NC2xf@Air-de-Roger> <CAOEWpzfsajhbvXfAw5-F1p83jjmSggobANBEyeYFAfiumAWRCA@mail.gmail.com>

On Wed, Mar 02, 2022 at 07:26:18PM +0200, Ze Dupsys wrote:
> Today managed to crash lab Dom0 with:
> xen_cmdline="dom0_mem=6144M dom0_max_vcpus=2 dom0=pvh,verbose=1
> console=vga,com1 com1=9600,8n1 guest_loglvl=all loglvl=all sync_console=1
> reboot=no"

Hm, it's weird that reboot=no doesn't work for you. Does noreboot
instead make a difference?

>
> I wrote ' vmstat -m | sort -k 2 -r' each 120 seconds, the latest one was as
> in attachment, panic was with the same fingerprint as the one with
> "rman_is_region_manager" line already reported.
> The scripts i ran in parallel generally were the same as attached in bug
> report, just a bit modified.
> 1) ./libexec.sh zfs_volstress_fast_4g (this just creates new ZVOLs and
> instead of 2GB, it writes 4GB in each ZVOL created dd if=/dev/zero)
> 2) ./test_vm1_zvol_3gb.sh (this loops commands: start first DomU, write
> 3GB in it's /tmp, restart DomU, removes /tmp, repeat)
> 3) ./test_vm2_zvol_5_on_off.sh (this loops: start second DomU, which has 5
> disks attached, turn off DomU, repeat)

Right. So the trigger for this seems to be related to creating (and
destroying) VMs in a loop? Do you still see the same if you only
execute steps 1 and 4 from the repro described above?

> 4) monitoring, sleep 120 seconds, print vmstat | sort in serial output.
>
> Around dom id 108, system started to behave suspiciously, xl list showed
> DomUs created, but they did not really start up, script timeout-ed for ssh
> connection, no vnc. When i did xl destroy manually, and xl create, system
> panic happened.

Could you also add the output of `top -n1` to see where memory is
going?

I'm quite sure we have a leak in some of the backends, maybe the
bounce buffer used by blkback.

Thanks, Roger.
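
For reference, the monitoring step with the extra `top -n1` output could be sketched roughly as below. This is only an illustration, not the script from the bug report: the function name `snapshot` and the timestamp header are assumptions, while the two commands are the ones quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper (name not from the original scripts): print one
# memory snapshot, suitable for sending to the serial console.
snapshot() {
    # Timestamp header so consecutive snapshots can be told apart
    # (an addition for readability, not part of the original report).
    echo "=== $(date) ==="
    # Kernel malloc statistics, sorted as in the original report.
    vmstat -m | sort -k 2 -r
    # Single top snapshot, as requested above.
    top -n1
}
```

Calling it from a loop such as `while :; do snapshot; sleep 120; done` would keep the 120 second interval mentioned in step 4.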