Date:      Thu, 3 Mar 2022 11:39:43 +0100
From:      Roger Pau Monné <roger.pau@citrix.com>
To:        Ze Dupsys <zedupsys@gmail.com>
Cc:        <freebsd-xen@freebsd.org>, <buhrow@nfbcal.org>
Subject:   Re: ZFS + FreeBSD XEN dom0 panic
Message-ID:  <YiCa70+HQScsoaKX@Air-de-Roger>
In-Reply-To: <CAOEWpzfsajhbvXfAw5-F1p83jjmSggobANBEyeYFAfiumAWRCA@mail.gmail.com>
References:  <CAOEWpzc2WVViMJHrrtuU-G_7yck4eehm6b=JQPSZU1MH-bzmiw@mail.gmail.com> <202203011540.221FeR4f028103@nfbcal.org> <CAOEWpzdC41ithfd7R_qa66+sh_UXeku7OcVC_b+XUaLr_9SSTA@mail.gmail.com> <Yh93uLIBqk5NC2xf@Air-de-Roger> <CAOEWpzfsajhbvXfAw5-F1p83jjmSggobANBEyeYFAfiumAWRCA@mail.gmail.com>

On Wed, Mar 02, 2022 at 07:26:18PM +0200, Ze Dupsys wrote:
> Today managed to crash lab Dom0 with:
> xen_cmdline="dom0_mem=6144M dom0_max_vcpus=2 dom0=pvh,verbose=1
> console=vga,com1 com1=9600,8n1 guest_loglvl=all loglvl=all sync_console=1
> reboot=no"

Hm, it's weird that reboot=no doesn't work for you. Does noreboot
instead make a difference?
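
If you want to try that, the option would go on the Xen command line in place of reboot=no. A sketch, assuming you set xen_cmdline in /boot/loader.conf and keep the rest of your options exactly as quoted above:

```sh
# /boot/loader.conf (assumed location of your xen_cmdline setting)
# Same options as in your report, with reboot=no swapped for noreboot.
xen_cmdline="dom0_mem=6144M dom0_max_vcpus=2 dom0=pvh,verbose=1 console=vga,com1 com1=9600,8n1 guest_loglvl=all loglvl=all sync_console=1 noreboot"
```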

> 
> I wrote ' vmstat -m | sort -k 2 -r' each 120 seconds, the latest one was as
> in attachment, panic was with the same fingerprint as the one with
> "rman_is_region_manager" line already reported.



> The scripts i ran in parallel generally were the same as attached in bug
> report, just a bit modified.
> 1) ./libexec.sh zfs_volstress_fast_4g (this just creates new ZVOLs and
> instead of 2GB, it writes 4GB to each ZVOL created, using dd if=/dev/zero)
> 2)  ./test_vm1_zvol_3gb.sh (this loops commands: start first DomU, write
> 3GB in its /tmp, restart DomU, remove /tmp, repeat)
> 3) ./test_vm2_zvol_5_on_off.sh (this loops: start second DomU, which has 5
> disks attached, turn off DomU, repeat)

Right. So the trigger for this seems to be related to creating (and
destroying) VMs in a loop?

Do you still see the same panic if you only execute steps 1 and 4
(the ZVOL stress script and the monitoring) from your repro?

> 4) monitoring, sleep 120 seconds, print vmstat | sort in serial output.
> 
> Around dom id 108, system started to behave suspiciously, xl list showed
> DomUs created, but they did not really start up, script timeout-ed for ssh
> connection, no vnc. When i did xl destroy manually, and xl create, system
> panic happened.

Could you also add the output of `top -n1` to see where memory is
going?
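
Something along these lines might do for the monitoring step. It is only a sketch: the function names `snapshot` and `monitor` are made up for illustration, the 120 second interval and the vmstat pipeline are taken from your description, and I'm assuming plain sh on the dom0.

```sh
#!/bin/sh
# Hypothetical helper: print a timestamped header, then run the given
# command, folding its stderr into the same output stream.
snapshot() {
    label=$1
    shift
    printf '=== %s @ %s ===\n' "$label" "$(date '+%Y-%m-%d %H:%M:%S')"
    "$@" 2>&1
}

# Hypothetical loop: take both snapshots every INTERVAL seconds,
# COUNT times in total (COUNT=0 means run until interrupted).
monitor() {
    interval=${1:-120}
    count=${2:-0}
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        # Kernel malloc statistics, sorted as in the original report.
        snapshot "vmstat -m" sh -c 'vmstat -m | sort -k 2 -r'
        # One-shot process and memory overview, as requested above.
        snapshot "top" top -n1
        i=$((i + 1))
        sleep "$interval"
    done
}
```

Calling `monitor 120` and redirecting its output to wherever your current vmstat output goes should give both snapshots side by side, so the last pair before the panic can be compared.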

I'm quite sure we have a leak in some of the backends, maybe the
bounce buffer used by blkback.

Thanks, Roger.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?YiCa70%2BHQScsoaKX>